Open Access

Improvement of the Fast Clustering Algorithm Improved by K-Means in the Big Data



Y. Y. Tang, Y. Tao and E. C. M. Lam, (2002), New method for feature extraction based on fractal behavior, Pattern Recognition, 35, 1071–1081, DOI: 10.1016/S0031-3203(01)00095-4.

Y. Y. Tang, L. Yang and J. Liu, (2000), Characterization of Dirac-structure edges with wavelet transform, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 30, 93–109, DOI: 10.1109/3477.826950.

T. Zhang, B. Fang, Y. Yuan, Y. Y. Yang, Z. Shang and B. Xu, (2010), Generalized discriminant analysis: A matrix exponential approach, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 40, 186–197, DOI: 10.1109/TSMCB.2009.2024759.

Y. Y. Tang and X. You, (2003), Skeletonization of ribbon-like shapes based on a new wavelet function, IEEE Transactions on Pattern Analysis and Machine Intelligence, 25, 1118–1133, DOI: 10.1109/TPAMI.2003.1227987.

T. Xie, P. Ren, T. Zhang and Y. Y. Tang, (2018), Distribution preserving learning for unsupervised feature selection, Neurocomputing, 289, 231–240, DOI: 10.1016/j.neucom.2018.02.032.

T. Zhang, B. Fang, Y. Y. Tang, G. He and J. Wen, (2008), Topology preserving non-negative matrix factorization for face recognition, IEEE Transactions on Image Processing, 17, 574–584, DOI: 10.1109/CIS.2007.82.

T. Zhang, Y. Y. Tang, Z. Shang and X. Liu, (2009), Face recognition under varying illumination using gradientfaces, IEEE Transactions on Image Processing, 18, 2599–2606, DOI: 10.1109/TIP.2009.2028255.

J. Han, M. Kamber and J. Pei, (2011), Data Mining: Concepts and Techniques, Morgan Kaufmann.

S. Guha, R. Rastogi and K. Shim, (2001), CURE: An efficient clustering algorithm for large databases, Information Systems, 26, 35–58, DOI: 10.1016/S0306-4379(01)00008-4.

T. Xie and F. Chen, (2018), Non-convex clustering via proximal alternating linearized minimization method, International Journal of Wavelets, Multiresolution and Information Processing, 16, 13–25, DOI: 10.1142/S0219691318400131.

P. Hoyer, (2004), Non-negative matrix factorization with sparseness constraints, Journal of Machine Learning Research, 5, 1457–1469.

D. D. Lee and H. S. Seung, (1999), Learning the parts of objects by non-negative matrix factorization, Nature, 401, 788–791, DOI: 10.1038/44565.

B. Ren, P. Laurent, G. B. Zhu and D. Gaspard, (2018), Non-negative matrix factorization: robust extraction of extended structures, The Astrophysical Journal, 852, 104–121, DOI: 10.3847/1538-4357/aaa1f2.

Y. X. Wang and Y. J. Zhang, (2013), Nonnegative matrix factorization: A comprehensive review, IEEE Transactions on Knowledge and Data Engineering, 25, 1336–1353, DOI: 10.1109/TKDE.2012.51.

D. Comaniciu and P. Meer, (2002), Mean shift: A robust approach toward feature space analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, 24, 603–619, DOI: 10.1109/34.1000236.

Y. Du, B. Sun, R. Lu, C. Zhang and H. Wu, (2019), A method for detecting high-frequency oscillations using semi-supervised k-means and mean shift clustering, Neurocomputing, 350, 102–107, DOI: 10.1016/j.neucom.2019.03.055.

T. Duong, G. Beck, H. Azzag and M. Lebbah, (2016), Nearest neighbor estimators of density derivatives, with application to mean shift clustering, Pattern Recognition Letters, 80, 224–230, DOI: 10.1016/j.patrec.2016.06.021.

D. Cai and X. Chen, (2015), Large scale spectral clustering via landmark-based sparse representation, IEEE Transactions on Cybernetics, 45, 1669–1680, DOI: 10.1109/TCYB.2014.2358564.

U. von Luxburg, (2007), A tutorial on spectral clustering, Statistics and Computing, 17, 395–416, DOI: 10.1007/s11222-007-9033-z.

J. Shi and J. Malik, (2000), Normalized cuts and image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22, 888–905, DOI: 10.1109/34.868688.

M. Brbić and I. Kopriva, (2018), Multi-view low-rank sparse subspace clustering, Pattern Recognition, 73, 247–258, DOI: 10.1016/j.patcog.2017.08.024.

E. Elhamifar and R. Vidal, (2013), Sparse subspace clustering: Algorithm, theory, and applications, IEEE Transactions on Pattern Analysis and Machine Intelligence, 35, 2765–2781, DOI: 10.1109/TPAMI.2013.57.

Y. Ma, A. Y. Yang, H. Derksen and R. Fossum, (2008), Estimation of subspace arrangements with applications in modeling and segmenting mixed data, SIAM Review, 50, 413–458, DOI: 10.1137/060655523.

A. K. Jain, (2010), Data clustering: 50 years beyond K-means, Pattern Recognition Letters, 31, 651–666, DOI: 10.1016/j.patrec.2009.09.011.

S. Lloyd, (1982), Least squares quantization in PCM, IEEE Transactions on Information Theory, 28, 129–137, DOI: 10.1109/TIT.1982.1056489.

H. Park and C. Jun, (2009), A simple and fast algorithm for K-medoids clustering, Expert Systems with Applications, 36, 3336–3341, DOI: 10.1016/j.eswa.2008.01.039.

S. Yu, S. Chu, C. Wang, Y. Chan and T. Chang, (2018), Two improved K-means algorithms, Applied Soft Computing, 68, 747–755, DOI: 10.1016/j.asoc.2017.08.032.

D. Arthur and S. Vassilvitskii, (2007), k-means++: The advantages of careful seeding, Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, 1027–1035, DOI: 10.1145/1283383.1283494.

O. Bachem, M. Lucic, S. H. Hassani and A. Krause, (2016), Fast and provably good seedings for k-means, The 30th Conference on Neural Information Processing Systems (NIPS), 2016, 76–85.

M. E. Celebi, H. A. Kingravi and P. A. Vela, (2013), A comparative study of efficient initialization methods for the k-means clustering algorithm, Expert Systems with Applications, 40, 200–210, DOI: 10.1016/j.eswa.2012.07.021.

I. S. Dhillon, Y. Guan and B. Kulis, (2004), Kernel k-means, spectral clustering and normalized cuts, ACM International Conference on Knowledge Discovery and Data Mining, 2004, 551–556.

G. Ball and D. Hall, (1965), ISODATA, a novel method of data analysis and pattern classification, Stanford Research Institute Press.

S. A. E. Rahman, (2015), Hyperspectral imaging classification using ISODATA algorithm: big data challenge, IEEE 5th International Conference on e-Learning, 2015, 271–280.

J. C. Dunn, (1973), A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters, Journal of Cybernetics, 3, 32–57, DOI: 10.1080/01969727308546046.

I. S. Dhillon and D. S. Modha, (2001), Concept decompositions for large sparse text data using clustering, Machine Learning, 42, 143–175, DOI: 10.1023/A:1007612920971.

A. Banerjee, (2004), Clustering with Bregman divergences, SIAM International Conference on Data Mining, 2004, 234–245, DOI: 10.1137/1.9781611972740.22.

Y. Linde, A. Buzo and R. Gray, (1980), An algorithm for vector quantizer design, IEEE Transactions on Communications, 28, 84–94, DOI: 10.1109/TCOM.1980.1094577.

J. Mao and A. K. Jain, (1996), A self-organizing network for hyperellipsoidal clustering, IEEE Transactions on Neural Networks, 7, 16–29, DOI: 10.1109/ICNN.1994.374705.

Online, (2019), ORL database, http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html.

Online, (2019), Yale database, http://cvc.yale.edu/projects/yalefaces.html.

Online, (2019), COIL-20 database, ftp://zen.cs.columbia.edu/.

Online, (2019), CMD data, http://www.cottonssr.org/.

Online, (2019), DLBCL data, http://flowrepository.org/id/FR-FCM-ZZYY/.

Online, (2019), Lung data, http://biogps.org/dataset/tag/lung/.

Online, (2019), Prostate data, http://statweb.stanford.edu/~tibs/ElemStatLearn/prostate.data/.

eISSN: 2444-8656
Language: English
Publication timeframe: Volume Open
Journal Subjects: Life Sciences, Mathematics (Applied Mathematics, General Mathematics), Physics