A Comparison Of K-Means And Fuzzy C-Means Clustering Methods For A Sample Of Gulf Cooperation Council Stock Markets

Open access


This article compares two data-mining clustering methods, k-means and fuzzy c-means, on a sample of banking and energy companies listed on the Gulf Cooperation Council (GCC) stock markets. We examined these companies for patterns reflecting the effect of news on banking-sector stocks during October, November, and December 2012. The diagnostic variables for the clustering methods were the correlation coefficients and t-statistics for the good news indicator (GNI) and the bad news indicator (BNI), together with financial factors such as the price-to-earnings ratio (PER), price-to-book value (PBV), dividend yield (DY), and rate of return.
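To make the contrast between the two methods concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes squared Euclidean distance, a fuzzifier of m = 2, and purely synthetic two-dimensional data standing in for the diagnostic variables; the function names `kmeans` and `fuzzy_cmeans` and all parameter values are illustrative. K-means assigns each observation to exactly one cluster, whereas fuzzy c-means returns a membership degree for every cluster.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd-style k-means; returns (centers, hard labels)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct observations chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute each center; keep the old one if its cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

def fuzzy_cmeans(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Basic fuzzy c-means; returns (centers, membership matrix U)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships, each row normalised to sum to 1.
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the observations.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # Membership update: u_ij proportional to d_ij^(-2/(m-1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Synthetic stand-in data (NOT the GCC sample): two well-separated groups.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.3, size=(20, 2)),
    rng.normal(loc=[3.0, 3.0], scale=0.3, size=(20, 2)),
])

km_centers, km_labels = kmeans(X, k=2)
fcm_centers, U = fuzzy_cmeans(X, c=2)
fcm_labels = U.argmax(axis=1)  # harden memberships for comparison

print("k-means labels:      ", km_labels)
print("fuzzy c-means labels:", fcm_labels)
print("first three membership rows:\n", U[:3].round(3))
```

On clearly separated data like this, the hardened fuzzy c-means labels should coincide with the k-means partition up to a relabelling of clusters; the methods differ mainly for observations lying between groups, where the membership degrees carry extra information.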

Folia Oeconomica Stetinensia

The Journal of University of Szczecin


