THESIS
2021
1 online resource (xvii, 156 pages) : illustrations (some color)
Abstract
In this thesis, we study the asymptotic behavior of the extreme eigenvalues and
eigenvectors of high-dimensional spiked sample covariance matrices in the
supercritical case, where reliable detection of spikes is possible. In particular,
we derive the joint distribution of the extreme eigenvalues and the generalized
components of the associated eigenvectors, i.e., the projections of the eigenvectors
onto an arbitrary given direction, assuming that the dimension and sample size
are comparably large. In general, the joint distribution is given in terms of linear
combinations of finitely many Gaussian and Chi-square variables, with parameters
depending on the projection direction and the spikes. Our assumptions
on the spikes are fully general. First, the strengths of spikes are only required
to be slightly above the critical threshold and no upper bound on the strengths
is needed. Second, multiple spikes, i.e., spikes with the same strength, are allowed.
Third, no structural assumption is imposed on the spikes. Thanks to
this general setting, the results apply to various high-dimensional
statistical hypothesis testing problems involving both the eigenvalues
and the eigenvectors. Specifically, we propose accurate and powerful statistics
to conduct hypothesis testing on the principal components. These statistics are
data-dependent and adaptive to the underlying true spikes. Numerical simulations
also confirm the accuracy and power of our proposed statistics and
illustrate significantly better performance compared to the existing methods in
the literature. More importantly, our methods are accurate and powerful even
when either the spikes are small or the dimension is large.