Clustering Large-Scale Data Based On Modified Affinity Propagation Algorithm

Ahmed M. Serdah; Wesam M. Ashour

Open Access

Published Online: Jan 13, 2016

Journal of Artificial Intelligence and Soft Computing Research

Volume 6 (2016): Issue 1 (January 2016)

Page range: 23 - 33

DOI: https://doi.org/10.1515/jaiscr-2016-0003

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.

Traditional clustering algorithms are no longer suitable for data mining applications that work with large-scale data. Many large-scale data clustering algorithms have been proposed in recent years, but most of them do not achieve high-quality clustering. Although Affinity Propagation (AP) is effective and accurate for clustering data sets of ordinary size, it is not effective for large-scale data. This paper proposes two methods for large-scale data clustering that are based on a modified version of the AP algorithm. The proposed methods are designed to ensure both low time complexity and good clustering accuracy. First, the data set is divided into several subsets using one of two methods: random fragmentation or K-means. Second, each subset is clustered into K clusters using the K-Affinity Propagation (KAP) algorithm to select local cluster exemplars. Third, the inverse weighted clustering algorithm is applied to all local cluster exemplars to select well-suited global exemplars for the whole data set. Finally, every data point is assigned to a cluster according to its similarity to the global exemplars. Results show that the proposed methods significantly reduce clustering time and produce more accurate clustering results than the AP, KAP, and HAP algorithms.
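The four steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: a plain k-medoids routine stands in for both KAP (local exemplar selection) and the inverse weighted clustering algorithm (global exemplar selection), similarity is assumed to be negative squared Euclidean distance, and only the random fragmentation variant of step one is shown. All function names (`k_medoids`, `random_fragmentation`, `cluster_large`) and parameters are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance; similarity is assumed to be its negative."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, exemplars):
    """Index of the exemplar most similar (closest) to the given point."""
    return min(range(len(exemplars)), key=lambda i: sq_dist(point, exemplars[i]))


def k_medoids(points, k, iters=20, seed=0):
    """Stand-in exemplar selector (NOT KAP or inverse weighted clustering)."""
    rng = random.Random(seed)
    k = min(k, len(points))
    medoids = rng.sample(points, k)
    for _ in range(iters):
        # Assign every point to its closest current medoid.
        groups = [[] for _ in medoids]
        for p in points:
            groups[nearest(p, medoids)].append(p)
        # Move each medoid to the group member with the lowest total distance.
        updated = [
            min(g, key=lambda c: sum(sq_dist(c, q) for q in g)) if g else m
            for g, m in zip(groups, medoids)
        ]
        if updated == medoids:
            break
        medoids = updated
    return medoids


def random_fragmentation(points, n_subsets, seed=0):
    """Step 1: shuffle the data and deal it into n_subsets disjoint subsets."""
    rng = random.Random(seed)
    shuffled = points[:]
    rng.shuffle(shuffled)
    return [shuffled[i::n_subsets] for i in range(n_subsets)]


def cluster_large(points, k, n_subsets=4, seed=0):
    """Run the four-step pipeline and return (global_exemplars, labels)."""
    subsets = random_fragmentation(points, n_subsets, seed)
    # Step 2: select k local exemplars in each subset (KAP in the paper).
    local = [e for s in subsets for e in k_medoids(s, k, seed=seed)]
    # Step 3: reduce local exemplars to k global exemplars (IWC in the paper).
    global_exemplars = k_medoids(local, k, seed=seed)
    # Step 4: assign every data point to its most similar global exemplar.
    labels = [nearest(p, global_exemplars) for p in points]
    return global_exemplars, labels
```

Under these assumptions, the expensive exemplar selection runs only on small subsets and on the pooled local exemplars, while the full data set is touched once in the final assignment pass, which is where the reduction in clustering time would come from. A call such as `cluster_large(points, k=3, n_subsets=8)` takes a list of coordinate tuples and returns the global exemplars together with one cluster label per point.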

eISSN: 2083-2567
Language: English

Publication timeframe: 4 times per year
Journal Subjects: Computer Sciences, Artificial Intelligence, Databases and Data Mining

© 2016 Academy of Management (SWSPiZ), Lodz
