The Use of Principal Component Analysis and Logistic Regression in Prediction of Infertility Treatment Outcome

Open access

Abstract

Principal Component Analysis (PCA) is a data mining method for analyzing multidimensional datasets. Its main objectives are to reduce the number of studied variables while retaining as much information as possible, to uncover and visualize the structure of the data, and to classify objects within the space defined by the newly created components. PCA is very often used as a preliminary step in data preparation, creating independent components for further analysis. We used PCA as the first step in analyzing data from IVF (in vitro fertilization). The next step, and the main purpose of the analysis, was to create models that predict pregnancy. To this end, 805 IVF cycles of different types were analyzed, and the resulting models correctly classified pregnancy in 61-80% of cases, depending on the analyzed group.
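The two-step approach described in the abstract (PCA as a preparation step, then a logistic regression model on the component scores) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than the IVF records, the variable counts, the number of retained components, and the helper names (`pca_fit`, `pca_transform`, `fit_logistic`, `predict`) are hypothetical, and the plain gradient-descent fit is a stand-in for whatever statistical software the authors used.

```python
import numpy as np


def pca_fit(X, n_components):
    """Standardize X and return mean, scale, top principal axes, explained-variance ratios."""
    mean = X.mean(axis=0)
    scale = X.std(axis=0)
    Z = (X - mean) / scale
    # Rows of Vt are principal axes, ordered by decreasing singular value.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    ratio = s ** 2 / np.sum(s ** 2)
    return mean, scale, Vt[:n_components], ratio[:n_components]


def pca_transform(X, mean, scale, components):
    """Project X onto the principal axes learned by pca_fit."""
    return ((X - mean) / scale) @ components.T


def fit_logistic(T, y, lr=0.1, n_iter=2000):
    """Fit logistic regression by gradient descent on the mean log-loss (intercept first)."""
    A = np.column_stack([np.ones(len(T)), T])
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(A @ w)))
        w -= lr * A.T @ (p - y) / len(y)
    return w


def predict(T, w):
    """Classify as 1 when the predicted probability is at least 0.5."""
    A = np.column_stack([np.ones(len(T)), T])
    return (1.0 / (1.0 + np.exp(-(A @ w))) >= 0.5).astype(int)


# Synthetic stand-in data (NOT the study's data): two latent factors drive
# six correlated observed variables and a binary outcome.
rng = np.random.default_rng(0)
n = 400
latent = rng.normal(size=(n, 2))
loadings = np.array([[1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]])
X = latent @ loadings + 0.3 * rng.normal(size=(n, 6))
logit = 1.5 * latent[:, 0] - 1.0 * latent[:, 1]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

# Hold out the last 100 rows; PCA parameters are learned on the training rows only.
X_train, X_test, y_train, y_test = X[:300], X[300:], y[:300], y[300:]
mean, scale, components, explained = pca_fit(X_train, n_components=2)
T_train = pca_transform(X_train, mean, scale, components)
T_test = pca_transform(X_test, mean, scale, components)

w = fit_logistic(T_train, y_train)
accuracy = float(np.mean(predict(T_test, w) == y_test))
print("explained variance ratios:", explained)
print("held-out accuracy:", accuracy)
```

Fitting the PCA on the training rows alone and reusing its mean, scale, and axes for the held-out rows keeps the accuracy estimate from being influenced by the test data. Because the component scores are uncorrelated, the regression also avoids the multicollinearity that the original correlated variables would introduce.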

Studies in Logic, Grammar and Rhetoric

The Journal of University of Bialystok

