A multiple testing approach to the regularisation of large sample correlation matrices

Natalia Bailey, M. Hashem Pesaran, L. Vanessa Smith

Research output: Contribution to journal › Article › Research › peer-review

Abstract

This paper proposes a regularisation method for the estimation of large covariance matrices that uses insights from the multiple testing (MT) literature. The approach tests the statistical significance of individual pair-wise correlations and sets to zero those elements that are not statistically significant, taking account of the multiple testing nature of the problem. The effective p-values of the tests are set as a decreasing function of N (the cross section dimension), the rate of which is governed by the nature of dependence of the underlying observations, and the relative expansion rates of N and T (the time dimension). In this respect, the method specifies the appropriate thresholding parameter to be used under Gaussian and non-Gaussian settings. The MT estimator of the sample correlation matrix is shown to be consistent in the spectral and Frobenius norms, and in terms of support recovery, so long as the true covariance matrix is sparse. The performance of the proposed MT estimator is compared to a number of other estimators in the literature using Monte Carlo experiments. It is shown that the MT estimator performs well and tends to outperform the other estimators, particularly when N is larger than T.
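The thresholding step described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract states only that the effective p-value falls as N grows, so the specific critical value used here, Phi^{-1}(1 - p / (2 N^delta)) / sqrt(T), together with the parameter names `p` and `delta` and the function names, are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def sample_correlation(data):
    """Sample correlation matrix of a T x N data set given as a list of rows."""
    t = len(data)
    n = len(data[0])
    means = [sum(row[j] for row in data) / t for j in range(n)]
    centered = [[row[j] - means[j] for j in range(n)] for row in data]
    cov = [
        [sum(r[i] * r[j] for r in centered) / t for j in range(n)]
        for i in range(n)
    ]
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]


def mt_threshold(corr, t, p=0.05, delta=1.0):
    """Set to zero the off-diagonal correlations that fail an adjusted test.

    Assumed (hypothetical) rule: keep corr[i][j] only if its absolute value
    exceeds Phi^{-1}(1 - p / (2 * N**delta)) / sqrt(T), so the effective
    p-value shrinks as the cross-section dimension N grows.
    """
    n = len(corr)
    critical = NormalDist().inv_cdf(1.0 - p / (2.0 * n ** delta)) / math.sqrt(t)
    return [
        [corr[i][j] if i == j or abs(corr[i][j]) > critical else 0.0
         for j in range(n)]
        for i in range(n)
    ]


# Hand-made 3 x 3 correlation matrix, treated as if estimated from T = 100
# observations; only the 0.6 pair exceeds the critical value (about 0.24).
example = [[1.0, 0.6, 0.05], [0.6, 1.0, -0.1], [0.05, -0.1, 1.0]]
regularised = mt_threshold(example, t=100)
```

A larger `delta` tightens the test as N increases, which is one way to mimic the dependence of the threshold on the relative expansion rates of N and T mentioned in the abstract.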

Original language: English
Pages (from-to): 507-534
Number of pages: 28
Journal: Journal of Econometrics
Volume: 208
Issue number: 2
DOIs: 10.1016/j.jeconom.2018.10.006
Publication status: Published - Feb 2019

Keywords

  • High-dimensional data
  • Multiple testing
  • Non-Gaussian observations
  • Shrinkage
  • Sparsity
  • Thresholding

Cite this

Bailey, Natalia; Pesaran, M. Hashem; Smith, L. Vanessa. / A multiple testing approach to the regularisation of large sample correlation matrices. In: Journal of Econometrics. 2019; Vol. 208, No. 2. pp. 507-534.
@article{0254694640ff490cb873cc79386a78b7,
title = "A multiple testing approach to the regularisation of large sample correlation matrices",
abstract = "This paper proposes a regularisation method for the estimation of large covariance matrices that uses insights from the multiple testing (MT) literature. The approach tests the statistical significance of individual pair-wise correlations and sets to zero those elements that are not statistically significant, taking account of the multiple testing nature of the problem. The effective p-values of the tests are set as a decreasing function of N (the cross section dimension), the rate of which is governed by the nature of dependence of the underlying observations, and the relative expansion rates of N and T (the time dimension). In this respect, the method specifies the appropriate thresholding parameter to be used under Gaussian and non-Gaussian settings. The MT estimator of the sample correlation matrix is shown to be consistent in the spectral and Frobenius norms, and in terms of support recovery, so long as the true covariance matrix is sparse. The performance of the proposed MT estimator is compared to a number of other estimators in the literature using Monte Carlo experiments. It is shown that the MT estimator performs well and tends to outperform the other estimators, particularly when N is larger than T.",
keywords = "High-dimensional data, Multiple testing, Non-Gaussian observations, Shrinkage, Sparsity, Thresholding",
author = "Natalia Bailey and Pesaran, {M. Hashem} and Smith, {L. Vanessa}",
year = "2019",
month = "2",
doi = "10.1016/j.jeconom.2018.10.006",
language = "English",
volume = "208",
pages = "507--534",
journal = "Journal of Econometrics",
issn = "0304-4076",
publisher = "Elsevier",
number = "2",

}

TY - JOUR

T1 - A multiple testing approach to the regularisation of large sample correlation matrices

AU - Bailey, Natalia

AU - Pesaran, M. Hashem

AU - Smith, L. Vanessa

PY - 2019/2

Y1 - 2019/2

N2 - This paper proposes a regularisation method for the estimation of large covariance matrices that uses insights from the multiple testing (MT) literature. The approach tests the statistical significance of individual pair-wise correlations and sets to zero those elements that are not statistically significant, taking account of the multiple testing nature of the problem. The effective p-values of the tests are set as a decreasing function of N (the cross section dimension), the rate of which is governed by the nature of dependence of the underlying observations, and the relative expansion rates of N and T (the time dimension). In this respect, the method specifies the appropriate thresholding parameter to be used under Gaussian and non-Gaussian settings. The MT estimator of the sample correlation matrix is shown to be consistent in the spectral and Frobenius norms, and in terms of support recovery, so long as the true covariance matrix is sparse. The performance of the proposed MT estimator is compared to a number of other estimators in the literature using Monte Carlo experiments. It is shown that the MT estimator performs well and tends to outperform the other estimators, particularly when N is larger than T.

AB - This paper proposes a regularisation method for the estimation of large covariance matrices that uses insights from the multiple testing (MT) literature. The approach tests the statistical significance of individual pair-wise correlations and sets to zero those elements that are not statistically significant, taking account of the multiple testing nature of the problem. The effective p-values of the tests are set as a decreasing function of N (the cross section dimension), the rate of which is governed by the nature of dependence of the underlying observations, and the relative expansion rates of N and T (the time dimension). In this respect, the method specifies the appropriate thresholding parameter to be used under Gaussian and non-Gaussian settings. The MT estimator of the sample correlation matrix is shown to be consistent in the spectral and Frobenius norms, and in terms of support recovery, so long as the true covariance matrix is sparse. The performance of the proposed MT estimator is compared to a number of other estimators in the literature using Monte Carlo experiments. It is shown that the MT estimator performs well and tends to outperform the other estimators, particularly when N is larger than T.

KW - High-dimensional data

KW - Multiple testing

KW - Non-Gaussian observations

KW - Shrinkage

KW - Sparsity

KW - Thresholding

UR - http://www.scopus.com/inward/record.url?scp=85057963582&partnerID=8YFLogxK

U2 - 10.1016/j.jeconom.2018.10.006

DO - 10.1016/j.jeconom.2018.10.006

M3 - Article

VL - 208

SP - 507

EP - 534

JO - Journal of Econometrics

JF - Journal of Econometrics

SN - 0304-4076

IS - 2

ER -