The aim of this work is to create a web-based system that assists its users in the cancer diagnosis process by automatically classifying cytological images obtained during fine needle aspiration biopsy. This paper describes a study of the quality of various algorithms used for segmentation and for classification of breast cancer malignancy. The objective of the study is to classify the degree of malignancy of breast cancer cases from fine needle aspiration biopsy images into one of two malignancy classes, high or intermediate. For that purpose we compared three segmentation methods (k-means, fuzzy c-means and watershed), and based on these segmentations we constructed a 25-element feature vector. The feature vector was used as input to eight classifiers, and their accuracy was evaluated.
The results show that the multilayer perceptron achieved the highest classification accuracy, 89.02%. Fuzzy c-means proved to be the most accurate segmentation algorithm, but it is also the most computationally intensive of the three segmentation methods studied.
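To illustrate why fuzzy c-means tends to cost more than k-means, the following is a minimal sketch of the standard fuzzy c-means updates applied to 1-D grayscale intensities. It is not the implementation used in the study: the function name `fuzzy_c_means`, the fuzzifier `m = 2`, the fixed iteration count and the toy pixel values are all assumptions chosen for illustration. Unlike k-means, which stores one label per pixel, each iteration here recomputes a membership value for every pixel and cluster pair.

```python
import random


def fuzzy_c_means(values, n_clusters=2, m=2.0, n_iter=50, seed=0):
    """Cluster 1-D intensities; returns (centers, memberships).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in values:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([r / total for r in row])

    centers = [0.0] * n_clusters
    for _ in range(n_iter):
        # Center update: mean of the values weighted by membership**m.
        for j in range(n_clusters):
            weights = [u[i][j] ** m for i in range(len(values))]
            centers[j] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
        # Membership update: normalized inverse distances raised to 2/(m-1).
        for i, x in enumerate(values):
            dists = [abs(x - c) + 1e-12 for c in centers]  # avoid division by zero
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            u[i] = [v / total for v in inv]
    return centers, u


# Toy "image": dark background pixels and bright nucleus-like pixels (made-up values).
pixels = [10, 12, 11, 200, 205, 198]
centers, memberships = fuzzy_c_means(pixels)
# Hard segmentation: assign each pixel to its highest-membership cluster.
labels = [row.index(max(row)) for row in memberships]
```

The memberships can be kept as soft values or thresholded, as in the last line, to obtain a hard segmentation comparable to a k-means result.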