Not every credible interval is credible: evaluating robustness in the presence of contamination in Bayesian data analysis

Lauren A. Kennedy, Daniel J. Navarro, Amy Perfors, Nancy Briggs

Research output: Contribution to journal › Article › Research › peer-review


Abstract

As Bayesian methods become more popular among behavioral scientists, they will inevitably be applied in situations that violate the assumptions underpinning typical models used to guide statistical inference. With this in mind, it is important to know something about how robust Bayesian methods are to the violation of those assumptions. In this paper, we focus on the problem of contaminated data (such as data with outliers or conflicts present), with specific application to the problem of estimating a credible interval for the population mean. We evaluate five Bayesian methods for constructing a credible interval, using toy examples to illustrate the qualitative behavior of different approaches in the presence of contaminants, and an extensive simulation study to quantify the robustness of each method. We find that the “default” normal model used in most Bayesian data analyses is not robust, and that approaches based on the Bayesian bootstrap are only robust in limited circumstances. A simple parametric model based on Tukey’s “contaminated normal model” and a model based on the t-distribution were markedly more robust. However, the contaminated normal model had the added benefit of estimating which data points were discounted as outliers and which were not.
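To make the abstract's point concrete, here is a minimal sketch (not from the paper itself; the component scales and contamination rate are assumed for illustration) of Tukey's contaminated normal model as a data-generating process, showing how a small fraction of wide-scale contaminants inflates the width of the "default" normal-model interval for the mean:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tukey's contaminated normal: with probability eps, an observation is
# drawn from a much wider "contaminant" component rather than the main one.
# (eps = 0.1 and the component scales 1.0 / 10.0 are illustrative choices.)
n, eps = 100, 0.1
is_contaminant = rng.random(n) < eps
x = np.where(is_contaminant,
             rng.normal(0.0, 10.0, n),   # contaminant component
             rng.normal(0.0, 1.0, n))    # main component

# Approximate 95% interval for the mean under the default normal model
# (with a flat prior this is essentially the usual mean +/- 1.96 * SE).
m, s = x.mean(), x.std(ddof=1)
half_width = 1.96 * s / np.sqrt(n)

clean = x[~is_contaminant]
half_width_clean = 1.96 * clean.std(ddof=1) / np.sqrt(len(clean))
print(f"with contaminants:    +/- {half_width:.3f}")
print(f"clean subsample only: +/- {half_width_clean:.3f}")
```

Because the normal model has no mechanism for discounting the contaminants, its interval width is driven up by the wide component; this is the non-robustness the paper quantifies, and the motivation for the contaminated-normal and t-distribution alternatives it evaluates.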

Original language: English
Pages (from-to): 2219-2234
Number of pages: 16
Journal: Behavior Research Methods
Volume: 49
Issue number: 6
DOIs
Publication status: Published - Dec 2017
Externally published: Yes

Keywords

  • Bayesian data analysis
  • Contaminated data
  • Robust methods
