Open Access

Impact of sample size on principal component analysis ordination of an environmental data set: effects on eigenstructure


In this study, we used bootstrap simulation of a real data set to investigate the impact of sample size (N = 20, 30, 40 and 50) on the eigenvalues and eigenvectors resulting from principal component analysis (PCA). For each sample size, 100 bootstrap samples were drawn from an environmental data matrix of water quality variables (p = 22) belonging to a small data set of 55 samples (stations from which water samples were collected). Because data sets in ecology and environmental sciences are invariably small owing to the high cost of collecting and analysing samples, we restricted our study to relatively small sample sizes. We focused on comparing the first 6 eigenvectors and the first 10 eigenvalues. Data sets were compared by agglomerative cluster analysis with Ward’s method, which does not require any stringent distributional assumptions.
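The resampling design described above can be illustrated with a minimal sketch. It is not the authors' code. Several points are assumptions: the 55 x 22 matrix is replaced by random placeholder data, PCA is run on the correlation matrix (the abstract does not say whether correlation or covariance was used), rows are drawn with replacement, and the function name `bootstrap_eigenvalues` is hypothetical. The Ward clustering step is not included.

```python
import numpy as np


def bootstrap_eigenvalues(data, sample_size, n_boot=100, n_keep=10, seed=0):
    """Return an (n_boot, n_keep) array of leading PCA eigenvalues.

    Each row comes from one bootstrap sample of `sample_size` rows,
    drawn with replacement from `data` (assumption, see lead-in).
    """
    rng = np.random.default_rng(seed)
    n_rows = data.shape[0]
    out = np.empty((n_boot, n_keep))
    for b in range(n_boot):
        # Draw row indices with replacement.
        idx = rng.integers(0, n_rows, size=sample_size)
        # Correlation-matrix PCA (assumed; the abstract does not specify).
        corr = np.corrcoef(data[idx], rowvar=False)
        # eigvalsh returns ascending eigenvalues; reverse to descending.
        eigvals = np.linalg.eigvalsh(corr)[::-1]
        out[b] = eigvals[:n_keep]
    return out


# Placeholder data standing in for the 55 stations x 22 variables matrix.
placeholder_rng = np.random.default_rng(42)
data = placeholder_rng.normal(size=(55, 22))

# One set of 100 bootstrap replicates per sample size studied.
results = {n: bootstrap_eigenvalues(data, n) for n in (20, 30, 40, 50)}

for n, eigvals in results.items():
    # Spread of the first eigenvalue across replicates for each N.
    print(n, eigvals[:, 0].mean(), eigvals[:, 0].std())
```

Comparing the spread of each eigenvalue across the four sample sizes gives a rough view of how stable the eigenstructure is as N changes. Eigenvectors would need extra care in such a comparison, because their signs are arbitrary.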

eISSN: 1337-947X
Language: English
Publication timeframe: 4 times per year
Journal Subjects: Life Sciences, Ecology, other, Chemistry, Environmental Chemistry, Geosciences, Geography