
Factor Model-Based Large Covariance Estimation from Streaming Data Using a Knowledge-Based Sketch Matrix

Xiao Dong Tan, Zhaoyang Wang, Hao Qian, Jun Zhou, Peibo Duan, Dian Shen, Meng Wang, Beilun Wang

Research output: Chapter in Book/Report/Conference proceeding, Conference Paper, Research, peer-review

Abstract

Covariance matrix estimation is an important problem in statistics, with wide applications in finance, neuroscience, meteorology, oceanography, and other fields. However, when the data are high-dimensional and constantly generated and updated in a streaming fashion, covariance matrix estimation faces major challenges, including the curse of dimensionality and limited memory space. Existing methods either assume sparsity, ignoring any possible common factor among the variables, or perform poorly when recovering the covariance matrix directly from sketched data. To address these issues, we propose a novel method, KEEF (Knowledge-based Time and Memory Efficient Covariance Estimator in Factor Model), and its extended variation. Our method leverages historical data to train a knowledge-based sketch matrix, which is used to accelerate the factor analysis of streaming data and to estimate the covariance matrix directly from the sketched data. We provide theoretical guarantees showing the advantages of our method in terms of time and space complexity, as well as accuracy. We conduct extensive experiments on synthetic and real-world data, comparing KEEF with several state-of-the-art methods and demonstrating the superior performance of our method.
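The general idea in the abstract (learn a sketch matrix from historical data, then estimate a factor-model covariance from sketched streaming data) can be illustrated with a small NumPy sketch. This is not the KEEF algorithm, whose details are not given on this page. It assumes zero-mean data, assumes the sketch matrix is simply the top-k eigenvectors of the historical sample covariance, and assumes a low-rank-plus-diagonal covariance model. The names `learn_sketch` and `StreamingFactorCov` are hypothetical.

```python
import numpy as np

def learn_sketch(hist, k):
    """Assumed 'knowledge-based' sketch: top-k eigenvectors of the
    historical sample covariance (rows of hist are zero-mean samples)."""
    cov = hist.T @ hist / hist.shape[0]
    _, vecs = np.linalg.eigh(cov)      # eigenvalues come back in ascending order
    return vecs[:, -k:]                # p x k matrix with orthonormal columns

class StreamingFactorCov:
    """Keeps O(pk + k^2 + p) state instead of a full p x p accumulator."""
    def __init__(self, sketch):
        self.S = sketch
        p, k = sketch.shape
        self.M = np.zeros((k, k))      # running sum of sketched outer products
        self.sq = np.zeros(p)          # running per-variable sum of squares
        self.n = 0

    def update(self, batch):
        Y = batch @ self.S             # sketch each sample from p down to k dims
        self.M += Y.T @ Y
        self.sq += (batch ** 2).sum(axis=0)
        self.n += batch.shape[0]

    def estimate(self):
        C = self.M / self.n                        # k x k sketched covariance
        low_rank = self.S @ C @ self.S.T           # common-factor part
        resid = np.clip(self.sq / self.n - np.diag(low_rank), 0.0, None)
        return low_rank + np.diag(resid)           # low rank plus diagonal

# Synthetic check on data drawn from a 3-factor model (all values illustrative).
rng = np.random.default_rng(0)
p, k = 50, 3
B = rng.normal(size=(p, k))                        # factor loadings
psi = rng.uniform(0.5, 1.5, size=p)                # idiosyncratic variances

def sample(n):
    return rng.normal(size=(n, k)) @ B.T + rng.normal(size=(n, p)) * np.sqrt(psi)

estimator = StreamingFactorCov(learn_sketch(sample(2000), k))
for _ in range(50):                                # 50 batches of 100 samples
    estimator.update(sample(100))

est = estimator.estimate()
true_cov = B @ B.T + np.diag(psi)
rel_err = np.linalg.norm(est - true_cov) / np.linalg.norm(true_cov)
```

In this toy setup each update costs O(bpk) for a batch of b samples, and the full p x p matrix is only formed when `estimate()` is called. The paper's theoretical guarantees refer to the authors' own method, not to this illustration.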

Original language: English
Title of host publication: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management
Editors: Andrea D’Angelo, Angelica Liguori
Place of Publication: New York, NY, USA
Publisher: Association for Computing Machinery (ACM)
Pages: 2210-2219
Number of pages: 10
ISBN (Electronic): 9798400704369
DOIs
Publication status: Published - 2024
Event: ACM International Conference on Information and Knowledge Management 2024 - Boise, United States of America
Duration: 21 Oct 2024 to 25 Oct 2024
Conference number: 33rd
https://cikm2024.org/ (Website)
https://dl.acm.org/doi/proceedings/10.1145/3627673 (Proceedings)

Conference

Conference: ACM International Conference on Information and Knowledge Management 2024
Abbreviated title: CIKM 2024
Country/Territory: United States of America
City: Boise
Period: 21/10/24 to 25/10/24
Internet address

Keywords

  • covariance matrix
  • sketching algorithm
  • streaming data
