Nonparametric statistical analysis for multiple comparison of machine learning regression algorithms

eISSN: 2083-8492
ISSN: 1641-876X
Language: English

Publication timeframe: 4 times per year
Journal Subjects: Mathematics, Applied Mathematics

Published Online: Dec 28, 2012

Page range: 867 - 881

DOI: https://doi.org/10.2478/v10006-012-0064-z

This content is open access.