Misreporting and econometric modelling of zeros in survey data on social bads: An application to cannabis consumption

William Greene, Mark N. Harris, Preety Srivastava, Xueyan Zhao

Research output: Contribution to journal › Article › Research › peer-review

Abstract

When modelling "social bads," such as illegal drug consumption, researchers are often faced with a dependent variable characterised by a large number of zero observations. Building on the recent literature on hurdle and double-hurdle models, we propose a double-inflated modelling framework, where the zero observations are allowed to come from the following: nonparticipants; participant misreporters (who have larger loss functions associated with a truthful response); and infrequent consumers. Due to our empirical application, the model is derived for the case of an ordered discrete-dependent variable. However, it is similarly possible to augment other such zero-inflated models (e.g., zero-inflated count models, and double-hurdle models for continuous variables). The model is then applied to a consumer choice problem of cannabis consumption. We estimate that 17% of the reported zeros in the cannabis survey are from individuals who misreport their participation, 11% from infrequent users, and only 72% from true nonparticipants.
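The three sources of zeros named in the abstract can be illustrated with a small probability calculation. The sketch below is not the authors' specification: it assumes a simple sequential structure (participate, then misreport or not, then consume infrequently or not), and the function name `zero_shares` and its three probability arguments are hypothetical, introduced only for illustration.

```python
def zero_shares(p_participate, p_misreport, p_infrequent):
    """Split the probability of a reported zero into three sources.

    Assumed (hypothetical) structure, not taken from the paper:
      - nonparticipants always report zero;
      - participants misreport as zero with probability p_misreport;
      - truthful participants report zero with probability p_infrequent.
    """
    for p in (p_participate, p_misreport, p_infrequent):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")

    nonparticipant = 1.0 - p_participate
    misreporter = p_participate * p_misreport
    infrequent = p_participate * (1.0 - p_misreport) * p_infrequent
    p_zero = nonparticipant + misreporter + infrequent

    if p_zero == 0.0:
        raise ValueError("no zeros are possible under these probabilities")

    # Shares of reported zeros attributable to each source; they sum to 1.
    return {
        "p_zero": p_zero,
        "nonparticipant": nonparticipant / p_zero,
        "misreporter": misreporter / p_zero,
        "infrequent": infrequent / p_zero,
    }


if __name__ == "__main__":
    # Illustrative inputs only; these are not estimates from the paper.
    print(zero_shares(p_participate=0.5, p_misreport=0.2, p_infrequent=0.25))
```

In the paper these probabilities come from an estimated model with covariates; the sketch only shows how shares such as the reported 17%, 11%, and 72% relate to an overall probability of observing a zero.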

Original language: English
Pages (from-to): 372-389
Number of pages: 18
Journal: Health Economics
Volume: 27
Issue number: 2
DOIs: 10.1002/hec.3553
Publication status: Published - 2018

Keywords

  • Cannabis consumption
  • Discrete data
  • Misclassification
  • Ordered outcomes
  • Zero-inflated responses

Cite this

Greene, William; Harris, Mark N.; Srivastava, Preety; Zhao, Xueyan. Misreporting and econometric modelling of zeros in survey data on social bads: An application to cannabis consumption. In: Health Economics. 2018; Vol. 27, No. 2. pp. 372-389.
@article{a51be160e213441d98f8b2937cf88115,
title = "Misreporting and econometric modelling of zeros in survey data on social bads: An application to cannabis consumption",
abstract = "When modelling {"}social bads,{"} such as illegal drug consumption, researchers are often faced with a dependent variable characterised by a large number of zero observations. Building on the recent literature on hurdle and double-hurdle models, we propose a double-inflated modelling framework, where the zero observations are allowed to come from the following: nonparticipants; participant misreporters (who have larger loss functions associated with a truthful response); and infrequent consumers. Due to our empirical application, the model is derived for the case of an ordered discrete-dependent variable. However, it is similarly possible to augment other such zero-inflated models (e.g., zero-inflated count models, and double-hurdle models for continuous variables). The model is then applied to a consumer choice problem of cannabis consumption. We estimate that 17{\%} of the reported zeros in the cannabis survey are from individuals who misreport their participation, 11{\%} from infrequent users, and only 72{\%} from true nonparticipants.",
keywords = "Cannabis consumption, Discrete data, Misclassification, Ordered outcomes, Zero-inflated responses",
author = "William Greene and Harris, {Mark N.} and Preety Srivastava and Xueyan Zhao",
year = "2018",
doi = "10.1002/hec.3553",
language = "English",
volume = "27",
pages = "372--389",
journal = "Health Economics",
issn = "1057-9230",
publisher = "John Wiley {\&} Sons",
number = "2",

}

TY - JOUR

T1 - Misreporting and econometric modelling of zeros in survey data on social bads

T2 - An application to cannabis consumption

AU - Greene, William

AU - Harris, Mark N.

AU - Srivastava, Preety

AU - Zhao, Xueyan

PY - 2018

Y1 - 2018

N2 - When modelling "social bads," such as illegal drug consumption, researchers are often faced with a dependent variable characterised by a large number of zero observations. Building on the recent literature on hurdle and double-hurdle models, we propose a double-inflated modelling framework, where the zero observations are allowed to come from the following: nonparticipants; participant misreporters (who have larger loss functions associated with a truthful response); and infrequent consumers. Due to our empirical application, the model is derived for the case of an ordered discrete-dependent variable. However, it is similarly possible to augment other such zero-inflated models (e.g., zero-inflated count models, and double-hurdle models for continuous variables). The model is then applied to a consumer choice problem of cannabis consumption. We estimate that 17% of the reported zeros in the cannabis survey are from individuals who misreport their participation, 11% from infrequent users, and only 72% from true nonparticipants.

KW - Cannabis consumption

KW - Discrete data

KW - Misclassification

KW - Ordered outcomes

KW - Zero-inflated responses

UR - http://www.scopus.com/inward/record.url?scp=85026760833&partnerID=8YFLogxK

U2 - 10.1002/hec.3553

DO - 10.1002/hec.3553

M3 - Article

AN - SCOPUS:85026760833

VL - 27

SP - 372

EP - 389

JO - Health Economics

JF - Health Economics

SN - 1057-9230

IS - 2

ER -