Multi-label classification using error correcting output codes

Open access

A framework for multi-label classification extended by Error Correcting Output Codes (ECOCs) is introduced and empirically examined in this article. The solution treats the base multi-label classifiers as a noisy channel and applies ECOCs to recover from the classification errors made by individual classifiers. The framework was evaluated through exhaustive studies over combinations of three distinct classification algorithms and four ECOC methods applied to the multi-label classification problem. The experimental results revealed that (i) the Bose-Chaudhuri-Hocquenghem (BCH) code matched with any multi-label classifier results in better classification quality; (ii) the accuracy of the binary relevance classification method strongly depends on the coding scheme; (iii) the label power-set and the RAkEL classifiers require the same computation time irrespective of the coding used; (iv) in general, the label power-set and RAkEL classifiers are not suitable for ECOCs because they cannot benefit from the error-correcting abilities of ECOCs; (v) the all-pairs code combined with binary relevance is not suitable for datasets with larger label sets.
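The noisy-channel idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a label set of exactly four labels, swaps in the Hamming(7,4) code as a simple stand-in for the codes studied in the article, and simulates the base classifiers by flipping one bit by hand. The names `G`, `encode`, `decode`, and `CODEBOOK` are hypothetical and chosen only for this illustration.

```python
from itertools import product

# Generator matrix of the Hamming(7,4) code in systematic form [I | P].
# Assumption: 4 labels are encoded into 7 bits, one binary classifier per bit.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def encode(labels):
    """Map a 4-bit label vector to a 7-bit codeword (arithmetic mod 2)."""
    return tuple(
        sum(l * g for l, g in zip(labels, column)) % 2 for column in zip(*G)
    )


def hamming_distance(a, b):
    """Count the positions in which two bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


# Codebook: every possible label vector paired with its codeword.
CODEBOOK = {labels: encode(labels) for labels in product((0, 1), repeat=4)}


def decode(predicted_bits):
    """Return the label vector whose codeword is nearest to the predicted bits."""
    return min(
        CODEBOOK,
        key=lambda labels: hamming_distance(CODEBOOK[labels], predicted_bits),
    )


# Simulated prediction: one of the seven per-bit classifiers makes a mistake.
true_labels = (1, 0, 1, 1)
codeword = encode(true_labels)
noisy = list(codeword)
noisy[2] ^= 1  # hand-injected single-bit error, standing in for a classifier error
recovered = decode(tuple(noisy))
print(recovered == true_labels)
```

Because this code has minimum distance 3, nearest-codeword decoding corrects any single wrong bit; two or more simultaneous errors may decode to a wrong label vector. In this sketch each codeword bit would be learned by its own binary classifier.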
