Parametric models for biomarkers based on flexible size distributions

Apostolos Davillas, Andrew M. Jones

Research output: Contribution to journal, Article, Research, peer-review

Abstract

Recent advances in social science surveys include collection of biological samples. Although biomarkers offer a large potential for social science and economic research, they impose a number of statistical challenges, often being distributed asymmetrically with heavy tails. Using data from the UK Household Panel Survey, we illustrate the comparative performance of a set of flexible parametric distributions, which allow for a wide range of skewness and kurtosis: the four-parameter generalized beta of the second kind (GB2), the three-parameter generalized gamma, and their three-, two-, or one-parameter nested and limiting cases. Commonly used blood-based biomarkers for inflammation, diabetes, cholesterol, and stress-related hormones are modelled. Although some of the three-parameter distributions nested within the GB2 outperform the latter for most of the biomarkers considered, the GB2 can be used as a guide for choosing among competing parametric distributions for biomarkers. Going “beyond the mean” to estimate tail probabilities, we find that GB2 performs fairly well with some disparities at the very high levels of glycated hemoglobin and fibrinogen. Commonly used linear models are shown to perform worse than almost all the flexible distributions.

Original language: English
Pages (from-to): 1617-1624
Number of pages: 8
Journal: Health Economics
Volume: 27
Issue number: 10
DOI: 10.1002/hec.3787
Publication status: Published - 1 Oct 2018

Keywords

  • biomarkers
  • generalized beta of second kind
  • heavy tails
  • tail probabilities

Cite this

@article{f60dd18c05674a95b6b973edf382ce7d,
title = "Parametric models for biomarkers based on flexible size distributions",
abstract = "Recent advances in social science surveys include collection of biological samples. Although biomarkers offer a large potential for social science and economic research, they impose a number of statistical challenges, often being distributed asymmetrically with heavy tails. Using data from the UK Household Panel Survey, we illustrate the comparative performance of a set of flexible parametric distributions, which allow for a wide range of skewness and kurtosis: the four-parameter generalized beta of the second kind (GB2), the three-parameter generalized gamma, and their three-, two-, or one-parameter nested and limiting cases. Commonly used blood-based biomarkers for inflammation, diabetes, cholesterol, and stress-related hormones are modelled. Although some of the three-parameter distributions nested within the GB2 outperform the latter for most of the biomarkers considered, the GB2 can be used as a guide for choosing among competing parametric distributions for biomarkers. Going “beyond the mean” to estimate tail probabilities, we find that GB2 performs fairly well with some disparities at the very high levels of glycated hemoglobin and fibrinogen. Commonly used linear models are shown to perform worse than almost all the flexible distributions.",
keywords = "biomarkers, generalized beta of second kind, heavy tails, tail probabilities",
author = "Davillas, Apostolos and Jones, {Andrew M.}",
year = "2018",
month = "10",
day = "1",
doi = "10.1002/hec.3787",
language = "English",
volume = "27",
pages = "1617--1624",
journal = "Health Economics",
issn = "1057-9230",
publisher = "John Wiley \& Sons",
number = "10"
}

Parametric models for biomarkers based on flexible size distributions. / Davillas, Apostolos; Jones, Andrew M.

In: Health Economics, Vol. 27, No. 10, 01.10.2018, p. 1617-1624.

TY - JOUR
T1 - Parametric models for biomarkers based on flexible size distributions
AU - Davillas, Apostolos
AU - Jones, Andrew M.
PY - 2018/10/1
Y1 - 2018/10/1
AB - Recent advances in social science surveys include collection of biological samples. Although biomarkers offer a large potential for social science and economic research, they impose a number of statistical challenges, often being distributed asymmetrically with heavy tails. Using data from the UK Household Panel Survey, we illustrate the comparative performance of a set of flexible parametric distributions, which allow for a wide range of skewness and kurtosis: the four-parameter generalized beta of the second kind (GB2), the three-parameter generalized gamma, and their three-, two-, or one-parameter nested and limiting cases. Commonly used blood-based biomarkers for inflammation, diabetes, cholesterol, and stress-related hormones are modelled. Although some of the three-parameter distributions nested within the GB2 outperform the latter for most of the biomarkers considered, the GB2 can be used as a guide for choosing among competing parametric distributions for biomarkers. Going “beyond the mean” to estimate tail probabilities, we find that GB2 performs fairly well with some disparities at the very high levels of glycated hemoglobin and fibrinogen. Commonly used linear models are shown to perform worse than almost all the flexible distributions.
KW - biomarkers
KW - generalized beta of second kind
KW - heavy tails
KW - tail probabilities
UR - http://www.scopus.com/inward/record.url?scp=85052824012&partnerID=8YFLogxK
U2 - 10.1002/hec.3787
DO - 10.1002/hec.3787
M3 - Article
VL - 27
SP - 1617
EP - 1624
JO - Health Economics
JF - Health Economics
SN - 1057-9230
IS - 10
ER -