Independence test for high dimensional data based on regularized canonical correlation coefficients

Yanrong Yang, Guangming Pan

Research output: Contribution to journal › Article › Research › peer-review

Abstract

This paper proposes a new statistic to test independence between two high dimensional random vectors X (p1 × 1) and Y (p2 × 1). The proposed statistic is based on the sum of regularized sample canonical correlation coefficients of X and Y. The asymptotic distribution of the statistic under the null hypothesis is established as a corollary of general central limit theorems (CLT) for the linear statistics of classical and regularized sample canonical correlation coefficients when p1 and p2 are both comparable to the sample size n. As applications of the developed independence test, various types of dependent structures, such as factor models, ARCH models and a general uncorrelated but dependent case, etc., are investigated by simulations. As an empirical application, cross-sectional dependence of daily stock returns of companies between different sections in the New York Stock Exchange (NYSE) is detected by the proposed test.
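
To make the quantity in the abstract concrete, the following is a minimal sketch of how a sum of regularized sample canonical correlation coefficients could be computed. It assumes a ridge-type regularization that adds t times the identity to each sample covariance matrix; this is an illustrative assumption, not necessarily the paper's exact definition. The function name and the parameter t are hypothetical, and the centering and scaling needed to turn the sum into a test statistic with a known null distribution are not reproduced here.

```python
import numpy as np


def regularized_canonical_correlations(x, y, t=0.1):
    """Squared ridge-regularized sample canonical correlations (sketch).

    x: array of shape (n, p1); y: array of shape (n, p2).
    t: ridge parameter (an assumption for illustration); t = 0 gives the
       classical squared sample canonical correlations when the sample
       covariance matrices are invertible.
    """
    n = x.shape[0]
    xc = x - x.mean(axis=0)  # center each column
    yc = y - y.mean(axis=0)
    sxx = xc.T @ xc / n      # p1 x p1 sample covariance of X
    syy = yc.T @ yc / n      # p2 x p2 sample covariance of Y
    sxy = xc.T @ yc / n      # p1 x p2 sample cross-covariance
    p1, p2 = sxx.shape[0], syy.shape[0]
    # (Sxx + tI)^{-1} Sxy and (Syy + tI)^{-1} Syx via linear solves
    a = np.linalg.solve(sxx + t * np.eye(p1), sxy)
    b = np.linalg.solve(syy + t * np.eye(p2), sxy.T)
    # Eigenvalues of the product are real and lie in [0, 1] up to rounding
    eig = np.linalg.eigvals(a @ b).real
    return np.sort(np.clip(eig, 0.0, 1.0))[::-1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((200, 5))
    y = rng.standard_normal((200, 8))  # generated independently of x
    coeffs = regularized_canonical_correlations(x, y, t=0.1)
    print("sum of regularized coefficients:", coeffs.sum())
```

The sum alone is not a test: under the regime described in the abstract (p1 and p2 comparable to n), it must be compared against the limiting distribution established in the paper.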
Original language: English
Pages (from-to): 467-500
Number of pages: 34
Journal: Annals of Statistics
Volume: 43
Issue number: 2
DOI: 10.1214/14-AOS1284
Publication status: Published - 2015

Cite this

@article{59299ef467804785a3a1ff276aa73417,
title = "Independence test for high dimensional data based on regularized canonical correlation coefficients",
abstract = "This paper proposes a new statistic to test independence between two high dimensional random vectors $X: p_1 \times 1$ and $Y: p_2 \times 1$. The proposed statistic is based on the sum of regularized sample canonical correlation coefficients of $X$ and $Y$. The asymptotic distribution of the statistic under the null hypothesis is established as a corollary of general central limit theorems (CLT) for the linear statistics of classical and regularized sample canonical correlation coefficients when $p_1$ and $p_2$ are both comparable to the sample size $n$. As applications of the developed independence test, various types of dependent structures, such as factor models, ARCH models and a general uncorrelated but dependent case, etc., are investigated by simulations. As an empirical application, cross-sectional dependence of daily stock returns of companies between different sections in the New York Stock Exchange (NYSE) is detected by the proposed test.",
author = "Yanrong Yang and Guangming Pan",
year = "2015",
doi = "10.1214/14-AOS1284",
language = "English",
volume = "43",
pages = "467--500",
journal = "Annals of Statistics",
issn = "0090-5364",
publisher = "Institute of Mathematical Statistics",
number = "2",

}

Independence test for high dimensional data based on regularized canonical correlation coefficients. / Yang, Yanrong; Pan, Guangming.

In: Annals of Statistics, Vol. 43, No. 2, 2015, p. 467-500.

TY - JOUR

T1 - Independence test for high dimensional data based on regularized canonical correlation coefficients

AU - Yang, Yanrong

AU - Pan, Guangming

PY - 2015

Y1 - 2015

AB - This paper proposes a new statistic to test independence between two high dimensional random vectors X (p1 × 1) and Y (p2 × 1). The proposed statistic is based on the sum of regularized sample canonical correlation coefficients of X and Y. The asymptotic distribution of the statistic under the null hypothesis is established as a corollary of general central limit theorems (CLT) for the linear statistics of classical and regularized sample canonical correlation coefficients when p1 and p2 are both comparable to the sample size n. As applications of the developed independence test, various types of dependent structures, such as factor models, ARCH models and a general uncorrelated but dependent case, etc., are investigated by simulations. As an empirical application, cross-sectional dependence of daily stock returns of companies between different sections in the New York Stock Exchange (NYSE) is detected by the proposed test.

U2 - 10.1214/14-AOS1284

DO - 10.1214/14-AOS1284

M3 - Article

VL - 43

SP - 467

EP - 500

JO - Annals of Statistics

JF - Annals of Statistics

SN - 0090-5364

IS - 2

ER -