Cite

In this paper, we look closely at the issue of contaminated data sets, where apart from legitimate (proper) patterns we encounter erroneous patterns. In a typical scenario, the classification of a contaminated data set is always negatively influenced by garbage patterns (referred to as foreign patterns). Ideally, we would like to remove them from the data set entirely. The paper is devoted to comparison and analysis of three different models capable to perform classification of proper patterns with rejection of foreign patterns. It should be stressed that the studied models are constructed using proper patterns only, and no knowledge about the characteristics of foreign patterns is needed. The methods are illustrated with a case study of handwritten digits recognition, but the proposed approach itself is formulated in a general manner. Therefore, it can be applied to different problems. We have distinguished three structures: global, local, and embedded, all capable to eliminate foreign patterns while performing classification of proper patterns at the same time. A comparison of the proposed models shows that the embedded structure provides the best results but at the cost of a relatively high model complexity. The local architecture provides satisfying results and at the same time is relatively simple.

eISSN:
2449-6499
Language:
English
Publication timeframe:
4 times per year
Journal Subjects:
Computer Sciences, Databases and Data Mining, Artificial Intelligence