Abstract
Objective: To evaluate, across multiple sample sizes, the degree to which data-driven methods result in (1) optimal cutoffs that differ from the population optimal cutoff and (2) bias in accuracy estimates. Study design and setting: To simulate studies of different sample sizes, 1,000 samples each of size 100, 200, 500, and 1,000 were randomly drawn from a database (n = 13,255) synthesized to assess Edinburgh Postnatal Depression Scale (EPDS) screening accuracy. Optimal cutoffs were selected by maximizing Youden's J (sensitivity + specificity – 1). Optimal cutoffs and accuracy estimates in simulated samples were compared to population values. Results: Optimal cutoffs in simulated samples ranged from ≥ 5 to ≥ 17 for n = 100, ≥ 6 to ≥ 16 for n = 200, ≥ 6 to ≥ 14 for n = 500, and ≥ 8 to ≥ 13 for n = 1,000. The percentage of simulated samples identifying the population optimal cutoff (≥ 11) was 30% for n = 100, 35% for n = 200, 53% for n = 500, and 71% for n = 1,000. Mean overestimation of sensitivity and underestimation of specificity were 6.5 percentage points (pp) and -1.3 pp for n = 100, 4.2 pp and -1.1 pp for n = 200, 1.8 pp and -1.0 pp for n = 500, and 1.4 pp and -1.0 pp for n = 1,000. Conclusions: Small accuracy studies that use data-driven methods may identify inaccurate optimal cutoffs and overstate accuracy estimates.
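The resampling design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The population is a hypothetical stand-in generated from two normal distributions (the prevalence, means, and standard deviations are invented for illustration), sampling without replacement is an assumption, and the function names `sens_spec`, `youden_cutoff`, `make_population`, and `simulate` are made up for this example. Sample estimates at each sample's own optimal cutoff are compared with population values at the population optimal cutoff, which is one reading of the comparison the abstract describes.

```python
import random
from collections import Counter
from statistics import mean

CUTOFFS = range(0, 31)  # EPDS total scores run from 0 to 30


def sens_spec(sample, cutoff):
    """Sensitivity and specificity when a score >= cutoff is a positive screen.

    `sample` is a list of (score, is_case) pairs.
    """
    tp = fn = tn = fp = 0
    for score, is_case in sample:
        positive = score >= cutoff
        if is_case:
            tp += positive
            fn += not positive
        else:
            fp += positive
            tn += not positive
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec


def youden_cutoff(sample, cutoffs=CUTOFFS):
    """Cutoff maximizing Youden's J = sensitivity + specificity - 1.

    Ties go to the lowest cutoff (an arbitrary choice for this sketch).
    """
    return max(cutoffs, key=lambda c: (sum(sens_spec(sample, c)) - 1, -c))


def make_population(size, rng, prevalence=0.10):
    """Hypothetical stand-in population; NOT the study's EPDS database."""
    people = []
    for _ in range(size):
        is_case = rng.random() < prevalence
        raw = rng.gauss(14, 5) if is_case else rng.gauss(6, 4)  # invented parameters
        people.append((min(30, max(0, round(raw))), is_case))
    return people


def simulate(population, n, n_samples, rng):
    """Draw repeated samples and compare data-driven results to the population."""
    pop_cut = youden_cutoff(population)
    pop_sens, pop_spec = sens_spec(population, pop_cut)
    cuts, sens_diffs, spec_diffs = [], [], []
    for _ in range(n_samples):
        sample = rng.sample(population, n)  # without replacement: an assumption
        cut = youden_cutoff(sample)
        sens, spec = sens_spec(sample, cut)
        cuts.append(cut)
        sens_diffs.append(sens - pop_sens)
        spec_diffs.append(spec - pop_spec)
    return {
        "population_cutoff": pop_cut,
        "cutoff_range": (min(cuts), max(cuts)),
        "share_matching_population": Counter(cuts)[pop_cut] / n_samples,
        "mean_sens_diff_pp": 100 * mean(sens_diffs),
        "mean_spec_diff_pp": 100 * mean(spec_diffs),
    }


if __name__ == "__main__":
    rng = random.Random(0)
    population = make_population(13_255, rng)
    # The study drew 1,000 samples per size; 100 keeps this demo quick.
    for n in (100, 200, 500, 1000):
        print(n, simulate(population, n, n_samples=100, rng=rng))
```

Because the population here is synthetic, the printed figures will not reproduce the values reported in the abstract; the sketch only shows the mechanics of selecting a cutoff within each sample and measuring how far the resulting estimates drift from the population values.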
Original language | English |
---|---|
Pages (from-to) | 137-147 |
Number of pages | 11 |
Journal | Journal of Clinical Epidemiology |
Volume | 137 |
DOIs | 10.1016/j.jclinepi.2021.03.031 |
Publication status | Published - Sept 2021 |
Keywords
- Accuracy estimates
- Bias
- Cherry-picking
- Data-driven methods
- Depression
- Optimal cutoff
In: Journal of Clinical Epidemiology, Vol. 137, 09.2021, p. 137-147.
Research output: Contribution to journal › Article › Research › peer-review
TY - JOUR
T1 - Data-driven methods distort optimal cutoffs and accuracy estimates of depression screening tools
T2 - a simulation study using individual participant data
AU - Bhandari, Parash Mani
AU - Levis, Brooke
AU - Neupane, Dipika
AU - Patten, Scott B.
AU - Shrier, Ian
AU - Thombs, Brett D.
AU - Benedetti, Andrea
AU - Sun, Ying
AU - He, Chen
AU - Rice, Danielle B.
AU - Krishnan, Ankur
AU - Wu, Yin
AU - Azar, Marleine
AU - Sanchez, Tatiana A.
AU - Chiovitti, Matthew J.
AU - Saadat, Nazanin
AU - Riehm, Kira E.
AU - Imran, Mahrukh
AU - Negeri, Zelalem
AU - Boruff, Jill T.
AU - Cuijpers, Pim
AU - Gilbody, Simon
AU - Ioannidis, John P.A.
AU - Kloda, Lorie A.
AU - Ziegelstein, Roy C.
AU - Comeau, Liane
AU - Mitchell, Nicholas D.
AU - Tonelli, Marcello
AU - Vigod, Simone N.
AU - Aceti, Franca
AU - Alvarado, Rubén
AU - Alvarado-Esquivel, Cosme
AU - Bakare, Muideen O.
AU - Barnes, Jacqueline
AU - Bavle, Amar D.
AU - Beck, Cheryl Tatano
AU - Bindt, Carola
AU - Boyce, Philip M.
AU - Bunevicius, Adomas
AU - Castro e Couto, Tiago
AU - Chaudron, Linda H.
AU - Correa, Humberto
AU - de Figueiredo, Felipe Pinheiro
AU - Eapen, Valsamma
AU - Favez, Nicolas
AU - Felice, Ethel
AU - Fernandes, Michelle
AU - Figueiredo, Barbara
AU - Fisher, Jane R.W.
AU - Garcia-Esteve, Lluïsa
AU - Giardinelli, Lisa
AU - Helle, Nadine
AU - Howard, Louise M.
AU - Khalifa, Dina Sami
AU - Kohlhoff, Jane
AU - Kozinszky, Zoltán
AU - Kusminskas, Laima
AU - Lelli, Lorenzo
AU - Leonardou, Angeliki A.
AU - Maes, Michael
AU - Meuti, Valentina
AU - Radoš, Sandra Nakić
AU - García, Purificación Navarro
AU - Nishi, Daisuke
AU - Luwa E-Andjafono, Daniel Okitundu
AU - Pawlby, Susan J.
AU - Quispel, Chantal
AU - Robertson-Blackmore, Emma
AU - Rochat, Tamsen J.
AU - Rowe, Heather J.
AU - Sharp, Deborah J.
AU - Siu, Bonnie W.M.
AU - Skalkidou, Alkistis
AU - Stein, Alan
AU - Stewart, Robert C.
AU - Su, Kuan Pin
AU - Sundström-Poromaa, Inger
AU - Tadinac, Meri
AU - Tandon, S. Darius
AU - Tendais, Iva
AU - Thiagayson, Pavaani
AU - Töreki, Annamária
AU - Torres-Giménez, Anna
AU - Tran, Thach D.
AU - Trevillion, Kylee
AU - Turner, Katherine
AU - Vega-Dienstmaier, Johann M.
AU - Wynter, Karen
AU - Yonkers, Kimberly A.
AU - the Depression Screening Data (DEPRESSD) EPDS Group
N1 - Funding Information: This study was funded by the Canadian Institutes of Health Research (CIHR, KRS-140994). Mr. Bhandari was supported by a studentship from the Research Institute of the McGill University Health Centre. Ms. Levis was supported by a CIHR Frederick Banting and Charles Best Canada Graduate Scholarship doctoral award and a Fonds de recherche du Québec - Santé (FRQ-S) Postdoctoral Training Award. Ms. Neupane was supported by a G.R. Caverhill Fellowship from the Faculty of Medicine, McGill University. Ms. Rice was supported by a Vanier Canada Graduate Scholarship. Dr. Wu was supported by an Utting Postdoctoral Fellowship from the Jewish General Hospital, Montreal, Quebec, Canada and a FRQ-S Postdoctoral Training Award. Ms. Azar was supported by a FRQ-S Masters Training Award. The primary study by Alvarado et al. was supported by the Ministry of Health of Chile. The primary study by Barnes et al. was supported by a grant from the Health Foundation (1665/608). The primary study by Beck et al. was supported by the Patrick and Catherine Weldon Donaghue Medical Research Foundation and the University of Connecticut Research Foundation. The primary study by Helle et al. was supported by the Werner Otto Foundation, the Kroschke Foundation, and the Feindt Foundation. Prof. Robertas Bunevicius, MD, PhD (1958-2016) was Principal Investigator of the primary study by Bunevicius et al., but passed away and was unable to participate in this project. The primary study by Couto et al. was supported by the National Counsel of Technological and Scientific Development (CNPq) (Grant no. 444254/2014-5) and the Minas Gerais State Research Foundation (FAPEMIG) (Grant no. APQ-01954-14). The primary study by Chaudron et al. was supported by a grant from the National Institute of Mental Health (grant K23 MH64476). The primary study by Figueira et al. was supported by the Brazilian Ministry of Health and by the National Counsel of Technological and Scientific Development (CNPq) (Grant no. 403433/2004-5). The primary study by de Figueiredo et al. was supported by Fundação de Amparo à Pesquisa do Estado de São Paulo. The primary study by Tissot et al. was supported by the Swiss National Science Foundation (grant 32003B 125493). The primary study by Fernandes et al. was supported by grants from the Child: Care Health and Development Trust and the Department of Psychiatry, University of Oxford, Oxford, UK, and by the Ashok Ranganathan Bursary from Exeter College, University of Oxford. Dr. Fernandes is supported by a University of Southampton National Institute for Health Research (NIHR) academic clinical fellowship in Paediatrics. The primary study by Tendais et al. was supported under the project POCI/SAU-ESP/56397/2004 by the Operational Program Science and Innovation 2010 (POCI 2010) of the Community Support Board III and by the European Community Fund FEDER. The primary study by Fisher et al. was supported by a grant under the Invest to Grow Scheme from the Australian Government Department of Families, Housing, Community Services and Indigenous Affairs. The primary study by Garcia-Esteve et al. was supported by grant 7/98 from the Ministerio de Trabajo y Asuntos Sociales, Women's Institute, Spain. The primary study by Howard et al. was supported by the NIHR under its Programme Grants for Applied Research Programme (Grant Reference Numbers RP-PG-1210-12002 and RP-DG-1108-10012) and by the South London Clinical Research Network. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. The primary study by Phillips et al. was supported by a scholarship from the National Health and Medical Research Council (NHMRC). The primary study by Roomruangwong et al. was supported by the Ratchadaphiseksomphot Endowment Fund 2013 of Chulalongkorn University (CU-56-457-HR). The primary study by Nakić Radoš et al. was supported by the Croatian Ministry of Science, Education, and Sports (134-0000000-2421). The primary study by Navarro et al. was supported by grant 13/00 from the Ministry of Work and Social Affairs, Institute of Women, Spain. The primary study by Usuda et al. was supported by Grant-in-Aid for Young Scientists (A) from the Japan Society for the Promotion of Science (primary investigator: Daisuke Nishi, MD, PhD), and by an Intramural Research Grant for Neurological and Psychiatric Disorders from the National Center of Neurology and Psychiatry, Japan. The primary study by Pawlby et al. was supported by a Medical Research Council UK Project Grant (number G89292999N). The primary study by Quispel et al. was supported by Stichting Achmea Gezondheid (grant number z-282). Dr. Robertson-Blackmore was supported by a Young Investigator Award from the Brain and Behavior Research Foundation and NIMH grant K23MH080290. The primary study by Rochat et al. was supported by grants from University of Oxford (HQ5035), the Tuixen Foundation (9940), and the Wellcome Trust (082384/Z/07/Z and 071571), and the American Psychological Association. Dr. Rochat receives salary support from a Wellcome Trust Intermediate Fellowship (211374/Z/18/Z). The primary study by Rowe et al. was supported by the diamond Consortium, beyondblue Victorian Centre of Excellence in Depression and Related Disorders. The primary study by Comasco et al. was supported by funds from the Swedish Research Council (VR: 521-2013-2339, VR:523-2014-2342), the Swedish Council for Working Life and Social Research (FAS: 2011-0627), the Marta Lundqvist Foundation (2013, 2014), and the Swedish Society of Medicine (SLS-331991). The primary study by Prenoveau et al. was supported by The Wellcome Trust (grant number 071571). The primary study by Stewart et al. was supported by Professor Francis Creed's Journal of Psychosomatic Research Editorship fund (BA00457) administered through University of Manchester. The primary study by Su et al. was supported by grants from the Department of Health (DOH94F044 and DOH95F022) and the China Medical University and Hospital (CMU94-105, DMR-92-92 and DMR94-46). The primary study by Tandon et al. was supported by the Thomas Wilson Sanitarium. The primary study by Tran et al. was supported by the Myer Foundation who funded the study under its Beyond Australia scheme. Dr. Tran was supported by an early career fellowship from the Australian National Health and Medical Research Council. The primary study by Vega-Dienstmaier et al. was supported by Tejada Family Foundation, Inc, and Peruvian-American Endowment, Inc. The primary study by Yonkers et al. was supported by a National Institute of Child Health and Human Development grant (5 R01HD045735). Dr. Thombs was supported by a Tier 1 Canada Research Chair. Dr. Benedetti was supported by FRQ-S Researcher Salary Awards. No other authors reported funding for primary studies or for their work on the present study. Publisher Copyright: © 2021 Elsevier Inc. Copyright: Copyright 2021 Elsevier B.V., All rights reserved.
PY - 2021/9
Y1 - 2021/9
KW - Accuracy estimates
KW - Bias
KW - Cherry-picking
KW - Data-driven methods
KW - Depression
KW - Optimal cutoff
UR - http://www.scopus.com/inward/record.url?scp=85105570449&partnerID=8YFLogxK
U2 - 10.1016/j.jclinepi.2021.03.031
DO - 10.1016/j.jclinepi.2021.03.031
M3 - Article
C2 - 33838273
AN - SCOPUS:85105570449
SN - 0895-4356
VL - 137
SP - 137
EP - 147
JO - Journal of Clinical Epidemiology
JF - Journal of Clinical Epidemiology
ER -