Particle Swarm Optimization Based Fuzzy Clustering Approach to Identify Optimal Number of Clusters

Open access


Fuzzy clustering is a popular unsupervised learning method used in cluster analysis. Unlike hard clustering, it allows a data point to belong to two or more clusters. Fuzzy c-means is the best-known fuzzy clustering method; its shortcoming, however, is that the number of clusters needs to be predefined. This paper proposes a clustering approach based on Particle Swarm Optimization (PSO). The PSO approach determines the optimal number of clusters automatically with the help of a threshold vector. The algorithm first randomly partitions the data set within a preset number of clusters, and then uses a reconstruction criterion to evaluate the performance of the clustering results. The experiments conducted demonstrate that the proposed algorithm automatically finds the optimal number of clusters. Furthermore, principal component analysis projection, conventional Sammon mapping, and fuzzy Sammon mapping were used to visualize the results.
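The abstract does not spell out the reconstruction criterion, so the following is only a minimal sketch under stated assumptions. It assumes the standard fuzzy c-means membership update with fuzzifier m, and a commonly used form of reconstruction in which each point is rebuilt as the membership-weighted average of the cluster centers and the squared differences are summed. The function names `fcm_memberships` and `reconstruction_error` are hypothetical, and the PSO search and the threshold vector are not modeled here.

```python
import math


def fcm_memberships(data, centers, m=2.0):
    """Standard fuzzy c-means membership update.

    Returns a list u where u[k][i] is the membership of point k in cluster i.
    Each row sums to 1. The fuzzifier m must be greater than 1.
    """
    power = 2.0 / (m - 1.0)
    memberships = []
    for x in data:
        dists = [math.dist(x, v) for v in centers]
        if 0.0 in dists:
            # The point coincides with one or more centers:
            # split full membership evenly among those centers.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        inverse = [d ** -power for d in dists]
        total = sum(inverse)
        memberships.append([w / total for w in inverse])
    return memberships


def reconstruction_error(data, centers, memberships, m=2.0):
    """Sum of squared differences between each point and its reconstruction.

    Assumed form: x_hat = sum_i(u_i^m * v_i) / sum_i(u_i^m).
    """
    error = 0.0
    for x, u in zip(data, memberships):
        weights = [ui ** m for ui in u]
        total = sum(weights)
        x_hat = [
            sum(w * v[d] for w, v in zip(weights, centers)) / total
            for d in range(len(x))
        ]
        error += sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return error


if __name__ == "__main__":
    # Toy data with two visually separate groups (illustrative only).
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    two_centers = [(0.0, 0.5), (10.0, 10.5)]
    one_center = [(5.0, 5.5)]
    for centers in (one_center, two_centers):
        u = fcm_memberships(data, centers)
        print(len(centers), reconstruction_error(data, centers, u))
```

On this toy data the two-center partition reconstructs the points far more closely than the single-center one. A criterion of this kind gives a search procedure such as PSO a score for comparing candidate partitions; how the proposed algorithm combines it with the threshold vector is not described in the abstract.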


Journal of Artificial Intelligence and Soft Computing Research

The Journal of Polish Neural Network Society, the University of Social Sciences in Lodz & Czestochowa University of Technology
