Generalized canonical correlation analysis for functional data

Tomasz Górecki 1, Mirosław Krzyśko 2 and Waldemar Wołyński 1
  • 1 Faculty of Mathematics and Computer Science, Adam Mickiewicz University, Uniwersytetu Poznańskiego 4, 61-614, Poznań, Poland
  • 2 Interfaculty Institute of Mathematics and Statistics, The President Stanisław Wojciechowski State University of Applied Sciences in Kalisz, 62-800, Kalisz, Poland

Summary

There is a growing need to analyze data sets in which several sets of variables are observed on the same set of individuals. Such complex data structures are known as multiblock (or multiple-set) data sets, and they are encountered in diverse fields, including bioinformatics, chemometrics, and food analysis. Generalized Canonical Correlation Analysis (GCCA) is a powerful method for studying the relationships between blocks. It allows the joint analysis of several sets of data through dimensionality reduction, and it can also be viewed as a method for integrating information from K > 2 distinct sources (Takane and Oshima-Takane 2002). The central problem of GCCA is to construct a series of components that maximize the association among the multiple variable sets. In this paper, GCCA is presented for multivariate functional data, which are treated as realizations of multivariate random processes. Finally, a practical example is discussed.
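To make the idea of "components that maximize the association among the variable sets" concrete, the following is a minimal sketch of classical (non-functional) GCCA in the spirit of the formulation usually attributed to Carroll (1968a): a common group component is taken as a leading eigenvector of the sum of the block projection matrices. It is not the functional method of this paper. The function name `gcca_maxvar`, the small ridge term `reg`, and the synthetic data are assumptions made for illustration only. For functional data, one common practice (again an assumption here) would be to apply such a procedure to basis-expansion coefficients of the curves rather than to raw variables.

```python
import numpy as np

def gcca_maxvar(blocks, n_components=1, reg=1e-8):
    """Illustrative GCCA sketch for K blocks observed on the same n individuals.

    blocks: list of (n, p_k) arrays. `reg` is a hypothetical ridge term added
    only for numerical stability; it is not part of the classical definition.
    """
    centered = [X - X.mean(axis=0) for X in blocks]
    n = centered[0].shape[0]
    M = np.zeros((n, n))
    grams = []
    for X in centered:
        A = X.T @ X + reg * np.eye(X.shape[1])
        grams.append(A)
        # (Regularized) projection onto the column space of block X.
        M += X @ np.linalg.solve(A, X.T)
    M = (M + M.T) / 2.0  # enforce exact symmetry before eigh

    eigvals, eigvecs = np.linalg.eigh(M)            # ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    G = eigvecs[:, order]                           # group components (n, r)

    # Block weights: regress the group components on each block.
    weights = [np.linalg.solve(A, X.T @ G) for A, X in zip(grams, centered)]
    components = [X @ W for X, W in zip(centered, weights)]
    return G, weights, components, eigvals[order]

# Synthetic demonstration: three blocks driven by one shared latent variable.
rng = np.random.default_rng(0)
latent = rng.standard_normal(100)
demo_blocks = [
    latent[:, None] * np.ones(p) + 0.1 * rng.standard_normal((100, p))
    for p in (3, 4, 5)
]
G, weights, components, eigvals = gcca_maxvar(demo_blocks, n_components=1)
print("leading eigenvalue (at most K = 3):", eigvals[0])
```

Each eigenvalue is the sum, over blocks, of the squared multiple correlations between the group component and that block, so a value close to the number of blocks K indicates a direction shared by all of them.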


  • Carroll, J.D. (1968a): Generalization of canonical correlation analysis to three or more sets of variables. Proceedings of the 76th Annual Convention of the American Psychological Association 3:227–228.

  • Carroll, J.D. (1968b): Equations and tables for a generalization of canonical correlation analysis to three or more sets of variables. Unpublished companion paper to Carroll (1968a).

  • Gower, J.C. (1989): Generalized canonical analysis. R. Coppi and S. Bolasco, Editors, Multiway Data Analysis. North Holland, Amsterdam, 221–232.

  • Górecki, T., Krzyśko, M., Wołyński, W. (2017): Correlation analysis for multivariate functional data. Data Science, Studies in Classification, Data Analysis, and Knowledge Organization 243–258.

  • Górecki, T., Krzyśko, M., Waszak, Ł., Wołyński, W. (2018): Selected statistical methods of data analysis for multivariate functional data. Statistical Papers 59(1): 153–182.

  • Hotelling, H. (1936): Relations between two sets of variates. Biometrika 28(3/4): 321–377.

  • Horváth, L., Kokoszka, P. (2012): Inference for Functional Data with Applications. Springer. New York.

  • Hwang, H., Jung, K., Takane, Y., Woodward, T.S. (2012): Functional multiple-set canonical correlation analysis. Psychometrika 77(1): 48–64.

  • Hwang, H., Jung, K., Takane, Y., Woodward, T.S. (2013): A unified approach to multiple-set canonical correlation analysis and principal components analysis. British Journal of Mathematical and Statistical Psychology 66: 308–321.

  • Kettenring, J.R. (1971): Canonical analysis of several sets of variables. Biometrika 58(3): 433–451.

  • Leurgans, S.E., Moyeed, R.A., Silverman, B.W. (1993): Canonical correlation analysis when the data are curves. Journal of the Royal Statistical Society. Series B 55(3): 725–740.

  • Löfstedt, T., Hadj-Selem, F., Guillemot, V., Philippe, C., Raymond, N., Duchesney, E., Frouin, V., Tenenhaus, A. (2018): A general multiblock method for structured variable selection. arXiv:1610.09490v1 [stat.ML]

  • Markos, A., D’Enza, A.I. (2016): Incremental generalized canonical correlation analysis. Analysis of Large and Complex Data, Studies in Classification, Data Analysis, and Knowledge Organization: 185–194.

  • Ramsay, J.O., Silverman, B.W. (2005): Functional Data Analysis, 2nd edition. Springer, New York.

  • Ramsay, J.O., Wickham, H., Graves, S., Hooker, G. (2018): fda: Functional Data Analysis. R package version 2.4.8. https://CRAN.R-project.org/package=fda

  • R Core Team (2019): R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/

  • Takane, Y., Hwang, H., Abdi, H. (2008): Regularized multiple-set canonical correlation analysis. Psychometrika 73(4): 753–775.

  • Takane, Y., Oshima-Takane, Y. (2002): Nonlinear generalized canonical correlation analysis by neural network models. Measurement and Multivariate Analysis: 183–190.

  • Tenenhaus, A., Guillemot, V. (2017a): RGCCA: Regularized and Sparse Generalized Canonical Correlation Analysis for Multiblock Data. R package version 2.1.2. https://CRAN.R-project.org/package=RGCCA

  • Tenenhaus, A., Philippe, C., Frouin, V. (2015): Kernel generalized canonical correlation analysis. Computational Statistics & Data Analysis 90(C): 114–131.

  • Tenenhaus, A., Philippe, C., Guillemot, V., Le Cao, K.A., Grill, J., Frouin, V. (2014): Variable selection for generalized canonical correlation analysis. Biostatistics 15(3): 569–583.

  • Tenenhaus, A., Tenenhaus, M. (2011): Regularized generalized canonical correlation analysis. Psychometrika 76(2): 257–284.

  • Tenenhaus, M., Tenenhaus, A., Groenen, P. (2017b): Regularized generalized canonical correlation analysis: A framework for sequential multiblock component methods. Psychometrika 82(3): 737–777.

  • Van de Velden, M. (2011): On generalized canonical correlation analysis. Proc. 58th World Statistical Congress. Dublin, 758–765.

OPEN ACCESS
