An algorithm for reducing the dimension and size of a sample for data exploration procedures

Aarts, E., Korst, J. and van Laarhoven, P. (1997). Simulated annealing, in E. Aarts and J. Lenstra (Eds.), Local Search in Combinatorial Optimization, Wiley, Chichester, pp. 91-120.

Alba, E. (2005). Parallel Metaheuristics: A New Class of Algorithms, Wiley, New York, NY, DOI: 10.1002/0471739383.

Aswani Kumar, C. and Srinivas, S. (2006). Latent semantic indexing using eigenvalue analysis for efficient information retrieval, International Journal of Applied Mathematics and Computer Science 16(4): 551-558.

Aswani Kumar, C. (2009). Analysis of unsupervised dimensionality techniques, Computer Science and Information Systems 6(2): 217-227, DOI: 10.2298/CSIS0902217K.

Azencot, R. (1992). Simulated Annealing: Parallelization Techniques, Wiley, New York, NY.

Bartenhagen, C., Klein, H.-U., Ruckert, C., Jiang, X. and Dugas, M. (2010). Comparative study of unsupervised dimension reduction techniques for the visualization of microarray gene expression data, BMC Bioinformatics 11, paper no. 567.

Bartkuté, V. and Sakalauskas, L. (2009). Statistical inferences for termination of Markov type random search algorithms, Journal of Optimization Theory and Applications 141(3): 475-493, DOI: 10.1007/s10957-008-9502-3.

Ben-Ameur, W. (2004). Computing the initial temperature of simulated annealing, Computational Optimization and Applications 29(3): 367-383, DOI: 10.1023/B:COAP.0000044187.23143.bd.

Borg, I. and Groenen, P. (2005). Modern Multidimensional Scaling. Theory and Applications, Springer-Verlag, Berlin.

Camastra, F. (2003). Data dimensionality estimation methods: A survey, Pattern Recognition 36(12): 2945-2954, DOI: 10.1016/S0031-3203(03)00176-6.

Charytanowicz, M., Niewczas, J., Kulczycki, P., Kowalski, P., Łukasik, S. and Żak, S. (2010). Complete gradient clustering algorithm for features analysis of x-ray images, in E. Piętka and J. Kawa (Eds.), Information Technologies in Biomedicine, Vol. 2, Springer-Verlag, Berlin, pp. 15-24, DOI: 10.1007/978-3-642-13105-9_2.

Cortez, P., Cerdeira, A., Almeida, F., Matos, T. and Reis, J. (2009). Modeling wine preferences by data mining from physicochemical properties, Decision Support Systems 47(4): 547-553, DOI: 10.1016/j.dss.2009.05.016.

Cox, T. and Cox, M. (2000). Multidimensional Scaling, Chapman and Hall, Boca Raton, FL, DOI: 10.1201/9780367801700.

Cunningham, P. (2007). Dimension reduction, Technical report, UCD School of Computer Science and Informatics, Dublin.

Czarnowski, I. and Jędrzejowicz, P. (2011). Application of agent-based simulated annealing and tabu search procedures to solving the data reduction problem, International Journal of Applied Mathematics and Computer Science 21(1): 57-68, DOI: 10.2478/v10006-011-0004-3.

David, H. and Nagaraja, H. (2003). Order Statistics, Wiley, New York, NY, DOI: 10.1002/0471722162.

Deng, Z., Chung, F.-L. and Wang, S. (2008). FRSDE: Fast reduced set density estimator using minimal enclosing ball approximation, Pattern Recognition 41(4): 1363-1372, DOI: 10.1016/j.patcog.2007.09.013.

François, D., Wertz, V. and Verleysen, M. (2007). The concentration of fractional distances, IEEE Transactions on Knowledge and Data Engineering 19(7): 873-886, DOI: 10.1109/TKDE.2007.1037.

Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distribution and the Bayesian restoration in images, IEEE Transactions on Pattern Analysis and Machine Intelligence 6: 721-741, DOI: 10.1109/TPAMI.1984.4767596.

Gendreau, M. and Potvin, J.-Y. (2010). Handbook of Metaheuristics, Springer, New York, NY, DOI: 10.1007/978-1-4419-1665-5.

Han, J. and Kamber, M. (2006). Data Mining: Concepts and Techniques, Morgan Kaufmann, San Francisco, CA.

Ingber, L. (1996). Adaptive simulated annealing (ASA): Lessons learned, Control and Cybernetics 25(1): 33-54.

Inza, I., Larranaga, P., Etxeberria, R. and Sierra, B. (2000). Feature subset selection by Bayesian network-based optimization, Artificial Intelligence 123(1-2): 157-184, DOI: 10.1016/S0004-3702(00)00052-7.

Ishibuchi, H., Nakashima, T. and Murata, T. (2001). Three-objective genetics-based machine learning for linguistic rule extraction, Information Sciences 136(1-4): 109-133, DOI: 10.1016/S0020-0255(01)00144-X.

Kerdprasop, K., Kerdprasop, N. and Sattayatham, P. (2005). Weighted k-means for density-biased clustering, in A. Tjoa and J. Trujillo (Eds.), Data Warehousing and Knowledge Discovery, Lecture Notes in Computer Science, Vol. 3589, Springer-Verlag, Berlin, pp. 488-497, DOI: 10.1007/11546849_48.

Kulczycki, P. (2005). Kernel Estimators in System Analysis, WNT, Warsaw, (in Polish).

Kulczycki, P. (2008). Kernel estimators in industrial applications, in B. Prasad (Ed.), Soft Computing Applications in Industry, Springer-Verlag, Berlin, pp. 69-91, DOI: 10.1007/978-3-540-77465-5_4.

Kulczycki, P. and Charytanowicz, M. (2010). A complete gradient clustering algorithm formed with kernel estimators, International Journal of Applied Mathematics and Computer Science 20(1): 123-134, DOI: 10.2478/v10006-010-0009-3.

Kulczycki, P. and Kowalski, P. (2011). Bayes classification of imprecise information of interval type, Control and Cybernetics 40(1): 101-123.

Kulczycki, P. and Łukasik, S. (2014). Reduction of dimension and size of data set by parallel fast simulated annealing, in L.T. Koczy, C.R. Pozna, R. Claudiu and J. Kacprzyk (Eds.), Issues and Challenges of Intelligent Systems and Computational Intelligence, Springer-Verlag, Berlin, pp. 273-292, DOI: 10.1007/978-3-319-03206-1_19.

Kuo, Y. (2010). Using simulated annealing to minimize fuel consumption for the time-dependent vehicle routing problem, Computers & Industrial Engineering 59(1): 157-165, DOI: 10.1016/j.cie.2010.03.012.

Łukasik, S. and Kulczycki, P. (2011). An algorithm for sample and data dimensionality reduction using fast simulated annealing, in J. Tang, I. King, L. Chen and J. Wang (Eds.), Advanced Data Mining and Applications, Lecture Notes in Computer Science, Vol. 7120, Springer-Verlag, Berlin, pp. 152-161, DOI: 10.1007/978-3-642-25853-4_12.

Łukasik, S. and Kulczycki, P. (2013). Using topology preservation measures for multidimensional intelligent data analysis in the reduced feature space, in L. Rutkowski, M. Korytkowski, R. Scherer, R. Tadeusiewicz, L. Zadeh and J. Zurada (Eds.), Artificial Intelligence and Soft Computing, Lecture Notes in Computer Science, Vol. 7895, Springer-Verlag, Berlin, pp. 184-193, DOI: 10.1007/978-3-642-38610-7_18.

Maaten, van der, L. (2009). Feature Extraction from Visual Data, Ph.D. thesis, Tilburg University, Tilburg.

Mangasarian, O. and Wolberg, W. (1990). Cancer diagnosis via linear programming, SIAM News 23(5): 1-18.

Mitra, P., Murthy, C. and Pal, S. (2002). Density-based multiscale data condensation, IEEE Transactions on Pattern Analysis and Machine Intelligence 24(6): 734-747, DOI: 10.1109/TPAMI.2002.1008381.

Nam, D., Lee, J.-S. and Park, C. (2004). n-dimensional Cauchy neighbor generation for the fast simulated annealing, IEICE Transactions on Information and Systems E87-D(11): 2499-2502.

Oliveira, J. and Pedrycz, W. (Eds.) (2007). Advances in Fuzzy Clustering and Its Applications, Wiley, Chichester.

Pal, S. and Mitra, P. (2004). Pattern Recognition Algorithms for Data Mining, Chapman and Hall, Boca Raton, FL, DOI: 10.1201/9780203998076.

Rand, W. (1971). Objective criteria for the evaluation of clustering methods, Journal of the American Statistical Association 66(336): 846-850, DOI: 10.1080/01621459.1971.10482356.

Parvin, H., Alizadeh, H. and Minati, B. (2010). A modification on k-nearest neighbor classifier, Global Journal of Computer Science and Technology 10(14): 37-41.

Sait, S. and Youssef, H. (2000). Iterative Computer Algorithms with Applications in Engineering: Solving Combinatorial Optimization Problems, IEEE Computer Society Press, Los Alamitos, CA.

Sammon, J. (1969). A nonlinear mapping for data structure analysis, IEEE Transactions on Computers 18(5): 401-409, DOI: 10.1109/T-C.1969.222678.

Saxena, A., Pal, N. and Vora, M. (2010). Evolutionary methods for unsupervised feature selection using Sammon’s stress function, Fuzzy Information and Engineering 2(3): 229-247, DOI: 10.1007/s12543-010-0047-4.

Strickert, M., Teichmann, S., Sreenivasulu, N. and Seiffert, U. (2005). DIPPP online self-improving linear map for distance-preserving data analysis, 5th Workshop on Self-Organizing Maps, WSOM’05, Paris, France, pp. 661-668.

Sumi, S.M., Zaman, M.F. and Hirose, H. (2012). A rainfall forecasting method using machine learning models and its application to the Fukuoka city case, International Journal of Applied Mathematics and Computer Science 22(4): 841-854, DOI: 10.2478/v10006-012-0062-1.

Szu, H. and Hartley, R. (1987). Fast simulated annealing, Physics Letters A 122(3-4): 157-162, DOI: 10.1016/0375-9601(87)90796-1.

Tian, T., Wilcox, R. and James, G. (2010). Data reduction in classification: A simulated annealing based projection method, Statistical Analysis and Data Mining 3(5): 319-331, DOI: 10.1002/sam.10087.

UC Irvine Machine Learning Repository (2013). http://archive.ics.uci.edu/ml/.

Vanstrum, M. and Starks, S. (1981). An algorithm for optimal linear maps, Southeastcon Conference, Huntsville, AL, USA, pp. 106-110.

Wand, M. and Jones, M. (1995). Kernel Smoothing, Chapman and Hall, London, DOI: 10.1007/978-1-4899-4493-1.

Wilson, D. and Martinez, T. (2000). Reduction techniques for instance-based learning algorithms, Machine Learning 38(3): 257-286, DOI: 10.1023/A:1007626913721.

Xu, R. and Wunsch, D. (2009). Clustering, Wiley, Hoboken, NJ, DOI: 10.1002/9780470382776.

Zhigljavsky, A. and Žilinskas, A. (2008). Stochastic Global Optimization, Springer-Verlag, Berlin.

ISSN: 1641-876X
Language: English

Publication timeframe: 4 times per year
Journal Subjects: Mathematics, Applied Mathematics

Published Online: Mar 25, 2014

Page range: 133 - 149

DOI: https://doi.org/10.2478/amcs-2014-0011

Keywords: dimension reduction, sample size reduction, linear transformation, simulated annealing, data mining.

This content is open access.
