Unprecedented amounts of heterogeneous data are now stored, processed, and transmitted over the Internet. One of the most important problems in data analysis is to verify whether data observed or collected over time are genuine and stationary, i.e. whether the information sources have kept their characteristics unchanged. Data come in many types: texts, images, audio or video files or streams, metadata descriptions, and ordinary numbers. All of them can change in many ways. If a change happens, the next questions are what the nature of the change is, and when and where it occurred. The main focus of this paper is the detection of a change and the classification of its type. Many algorithms have been proposed to detect abnormalities and deviations in data. In this paper we propose a new approach to abrupt change detection based on Parzen kernel estimation of the partial derivatives of multivariate regression functions in the presence of probabilistic noise. The proposed change detection algorithm is applied to one- and two-dimensional patterns to detect abrupt changes.
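To make the idea of derivative-based change detection concrete, the following one-dimensional sketch may help. It is not the algorithm proposed in the paper: the Gaussian kernel, the equispaced design on [0, 1], the bandwidth `h`, the synthetic test signal, and all function names are assumptions chosen for illustration. The sketch estimates the first derivative of the regression function with a kernel estimator and reports the interior point where its absolute value is largest as the candidate location of an abrupt change.

```python
import math
import random


def kernel_derivative(u):
    """Derivative of the standard Gaussian kernel (assumed kernel choice)."""
    return -u * math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def estimate_derivative(xs, ys, x, h):
    """Kernel estimate of m'(x), assuming an equispaced design on [0, 1]."""
    n = len(xs)
    total = sum(y * kernel_derivative((x - xi) / h) for xi, y in zip(xs, ys))
    return total / (n * h * h)


def detect_jump(xs, ys, h):
    """Return the interior design point where |m'(x)| is largest.

    Points closer than 3*h to either end are skipped, because the
    estimate is strongly biased near the boundary of the design.
    """
    interior = [x for x in xs if 3 * h <= x <= 1 - 3 * h]
    return max(interior, key=lambda x: abs(estimate_derivative(xs, ys, x, h)))


def make_data(n, jump_at, jump_size, sigma, seed=0):
    """Hypothetical test signal: m(x) = x plus a step, with Gaussian noise."""
    rng = random.Random(seed)
    xs = [(i + 0.5) / n for i in range(n)]
    ys = [
        x + (jump_size if x >= jump_at else 0.0) + rng.gauss(0.0, sigma)
        for x in xs
    ]
    return xs, ys


# Noisy observations of a function with an abrupt change at x = 0.6.
xs, ys = make_data(n=200, jump_at=0.6, jump_size=1.5, sigma=0.1)
print("estimated change location:", detect_jump(xs, ys, h=0.05))
```

The bandwidth controls the usual trade-off: a smaller `h` localizes the change more sharply but amplifies the noise in the derivative estimate, while a larger `h` smooths the noise and blurs the location.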