TY - JOUR
T1 - Large sample size, wide variant spectrum, and advanced machine-learning technique boost risk prediction for inflammatory bowel disease
AU - Wei, Zhi
AU - Wang, Wei
AU - Bradfield, Jonathan
AU - Li, Jin
AU - Cardinale, Christopher
AU - Frackelton, Edward
AU - Kim, Cecilia
AU - Mentch, Frank
AU - Van Steen, Kristel
AU - Visscher, Peter M.
AU - Baldassano, Robert N.
AU - Hakonarson, Hakon
AU - International IBD Genetics Consortium
AU - D'Amato, Mauro
PY - 2013/6/6
Y1 - 2013/6/6
N2 - We performed risk assessment for Crohn's disease (CD) and ulcerative colitis (UC), the two common forms of inflammatory bowel disease (IBD), by using data from the International IBD Genetics Consortium's Immunochip project. This data set contains ∼17,000 CD cases, ∼13,000 UC cases, and ∼22,000 controls from 15 European countries typed on the Immunochip. This custom chip provides a more comprehensive catalog of the most promising candidate variants by picking up the remaining common variants and certain rare variants that were missed in the first generation of GWAS. Given this unprecedentedly large sample size and wide variant spectrum, we employed the most recent machine-learning techniques to build optimal predictive models. Our final predictive models achieved areas under the curve (AUCs) of 0.86 and 0.83 for CD and UC, respectively, in an independent evaluation. To our knowledge, this is the best prediction performance reported for CD and UC to date.
AB - We performed risk assessment for Crohn's disease (CD) and ulcerative colitis (UC), the two common forms of inflammatory bowel disease (IBD), by using data from the International IBD Genetics Consortium's Immunochip project. This data set contains ∼17,000 CD cases, ∼13,000 UC cases, and ∼22,000 controls from 15 European countries typed on the Immunochip. This custom chip provides a more comprehensive catalog of the most promising candidate variants by picking up the remaining common variants and certain rare variants that were missed in the first generation of GWAS. Given this unprecedentedly large sample size and wide variant spectrum, we employed the most recent machine-learning techniques to build optimal predictive models. Our final predictive models achieved areas under the curve (AUCs) of 0.86 and 0.83 for CD and UC, respectively, in an independent evaluation. To our knowledge, this is the best prediction performance reported for CD and UC to date.
UR - http://www.scopus.com/inward/record.url?scp=84878829383&partnerID=8YFLogxK
U2 - 10.1016/j.ajhg.2013.05.002
DO - 10.1016/j.ajhg.2013.05.002
M3 - Article
C2 - 23731541
AN - SCOPUS:84878829383
SN - 0002-9297
VL - 92
SP - 1008
EP - 1012
JO - American Journal of Human Genetics
JF - American Journal of Human Genetics
IS - 6
ER -