Estimating Bayesian networks for high-dimensional data with complex mean structure and random effects

Jessica Eleonore Kasza, Gary Glonek, Patty Solomon

Research output: Contribution to journal › Article › Research › peer-review

1 Citation (Scopus)

Abstract

The estimation of Bayesian networks given high-dimensional data, in particular gene expression data, has been the focus of much recent research. Whilst there are several methods available for the estimation of such networks, these typically assume that the data consist of independent and identically distributed samples. It is often the case, however, that the available data have a more complex mean structure, plus additional components of variance, which must then be accounted for in the estimation of a Bayesian network. In this paper, score metrics that take account of such complexities are proposed for use in conjunction with score-based methods for the estimation of Bayesian networks. We propose first, a fully Bayesian score metric, and second, a metric inspired by the notion of restricted maximum likelihood. We demonstrate the performance of these new metrics for the estimation of Bayesian networks using simulated data with known complex mean structures. We then present the analysis of expression levels of grape-berry genes adjusting for exogenous variables believed to affect the expression levels of the genes. Demonstrable biological effects can be inferred from the estimated conditional independence relationships and correlations amongst the grape-berry genes.
Original language: English
Pages (from-to): 169-187
Number of pages: 19
Journal: Australian & New Zealand Journal of Statistics
Volume: 54
Issue number: 2
Publication status: Published - 2012
Externally published: Yes
