Methodologies of Knowledge Discovery from Data and Data Mining Methods in Mechanical Engineering

Open access


This paper reviews methodologies for the process of knowledge discovery from data, together with the data exploration (Data Mining, DM) methods most frequently used in mechanical engineering. The methodologies describe various scenarios for exploring data, and DM methods are applied within them. The paper presents the premises for using DM methods in industry, along with their advantages and disadvantages. It also traces the development of knowledge discovery methodologies and classifies the most widespread Data Mining methods by the type of task they perform. The paper concludes with selected applications of Data Mining in mechanical engineering.

Management and Production Engineering Review

The Journal of Production Engineering Committee of Polish Academy of Sciences and Polish Association for Production Management

