Parallel MCNN (pMCNN) with Application to Prototype Selection on Large and Streaming Data

V. Susheela Devi 1  and Lakhpat Meena 1
  • 1 Department of Computer Science and Automation, Indian Institute of Science, Bangalore, India


The Modified Condensed Nearest Neighbour (MCNN) algorithm for prototype selection is order-independent, unlike the Condensed Nearest Neighbour (CNN) algorithm, whose output depends on the order in which the training patterns are presented. Although MCNN gives better performance, it requires much more time than CNN. To mitigate this, we propose a distributed approach called Parallel MCNN (pMCNN), which cuts down the time drastically while maintaining good performance. We also propose two incremental algorithms that use MCNN to carry out prototype selection on large and streaming data. The results of these algorithms, using both MCNN and pMCNN, are compared with those of an existing algorithm for streaming data.
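The order dependence of CNN mentioned above can be illustrated with a minimal sketch of Hart's condensed nearest neighbour rule. This is not the MCNN or pMCNN algorithm of the paper; the function names (`cnn_condense`, `nearest_label`, `sq_dist`) and the one-dimensional toy data are assumptions made purely for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, ...]
Sample = Tuple[Point, str]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_label(store: List[Sample], x: Point) -> str:
    """Label of the stored prototype closest to x (1-NN rule)."""
    return min(store, key=lambda s: sq_dist(s[0], x))[1]


def cnn_condense(samples: List[Sample]) -> List[Sample]:
    """Hart's CNN rule (illustrative sketch).

    Start with the first sample as the only prototype, scan the samples
    in the given order, and add every sample that the current prototypes
    misclassify. Repeat until a full pass adds nothing.
    """
    store = [samples[0]]
    changed = True
    while changed:
        changed = False
        for x, y in samples:
            # Skip samples already stored so the loop always terminates.
            if (x, y) not in store and nearest_label(store, x) != y:
                store.append((x, y))
                changed = True
    return store


# Hypothetical toy data: class A on the left, class B on the right.
data: List[Sample] = [
    ((0.0,), "A"), ((1.0,), "A"), ((2.0,), "A"),
    ((3.0,), "B"), ((4.0,), "B"), ((5.0,), "B"),
]

forward = cnn_condense(data)
backward = cnn_condense(list(reversed(data)))

print(sorted(p[0] for p, _ in forward))
print(sorted(p[0] for p, _ in backward))
```

Both runs produce a prototype set that classifies every training sample correctly, but the two sets differ because the scan order differs. An order-independent method such as MCNN is intended to avoid exactly this sensitivity.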



