Not all roads lead to the immune system: the genetic basis of multiple sclerosis severity

Vilija G. Jokubaitis, Maria Pia Campagna, Omar Ibrahim, Jim Stankovich, Pavlina Kleinova, Fuencisla Matesanz, Daniel Hui, Sara Eichau, Mark Slee, Jeannette Lechner-Scott, Rodney A. Lea, Trevor John Kilpatrick, Tomas Kalincik, Philip Laurence De Jager, Ashley H Beecham, Jacob L McCauley, Bruce V.M. Taylor, Steve Vucic, Louise Laverick, Karolina Vodehnalova, Maria-Isabel García-Sanchéz, Antonio Alcina, Anneke Van Der Walt, Eva Kubala Havrdova, Guillermo Izquierdo, Nikolaos A Patsopoulos, Dana Horáková, Helmut Butzkueven

Research output: Contribution to journal › Article › Research › peer-review

10 Citations (Scopus)


Multiple sclerosis is a leading cause of neurological disability in adults. Heterogeneity in multiple sclerosis clinical presentation has posed a major challenge for identifying genetic variants associated with disease outcomes. To overcome this challenge, we used prospectively ascertained clinical outcomes data from the largest international multiple sclerosis registry, MSBase. We assembled a cohort of deeply phenotyped individuals of European ancestry with relapse-onset multiple sclerosis. We used unbiased genome-wide association study and machine learning approaches to assess the genetic contribution to longitudinally defined multiple sclerosis severity phenotypes in 1813 individuals. Our primary analyses did not identify any genetic variants of moderate to large effect sizes that met genome-wide significance thresholds. The strongest signal was associated with rs7289446 (β = −0.4882, P = 2.73 × 10⁻⁷), intronic to SEZ6L on chromosome 22. However, we demonstrate that clinical outcomes in relapse-onset multiple sclerosis are associated with multiple genetic loci of small effect sizes. Using a machine learning approach incorporating over 62 000 variants together with clinical and demographic variables available at multiple sclerosis disease onset, we could predict severity with an area under the receiver operating characteristic curve of 0.84 (95% CI 0.79–0.88). Our machine learning algorithm achieved a positive predictive value for outcome assignment of 80% and a negative predictive value of 88%. This outperformed our machine learning algorithm that contained clinical and demographic variables alone (area under the receiver operating characteristic curve 0.54, 95% CI 0.48–0.60). Secondary, sex-stratified analyses identified two genetic loci that met genome-wide significance thresholds: one in females (rs10967273; β_female = 0.8289, P = 3.52 × 10⁻⁸) and the other in males (rs698805; β_male = −1.5395, P = 4.35 × 10⁻⁸), providing some evidence for sex dimorphism in multiple sclerosis severity.
Tissue enrichment and pathway analyses identified an overrepresentation of genes expressed in CNS compartments generally, and specifically in the cerebellum (P = 0.023). These involved mitochondrial function, synaptic plasticity, oligodendroglial biology, cellular senescence, calcium and G-protein receptor signalling pathways. We further identified six variants with strong evidence for regulating clinical outcomes, the strongest signal again intronic to SEZ6L (adjusted hazard ratio 0.72, P = 4.85 × 10⁻⁴). Here we report a milestone in our progress towards understanding the clinical heterogeneity of multiple sclerosis outcomes, implicating mechanisms that are functionally distinct from those underlying multiple sclerosis risk. Importantly, we demonstrate that machine learning using common single nucleotide variant clusters, together with clinical variables readily available at diagnosis, can improve prognostic capabilities at that time point and, with further validation, has the potential to translate to meaningful clinical practice change.
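
The abstract's distinction between the primary signal and the two sex-stratified loci rests on the genome-wide significance threshold. As a minimal sketch, assuming the conventional cut-off of P < 5 × 10⁻⁸ (the abstract does not state the exact threshold used), the reported P values can be compared as follows. The names `GENOME_WIDE_ALPHA` and `meets_genome_wide_significance` are illustrative and do not come from the study.

```python
# Assumption: the conventional genome-wide significance threshold.
# The abstract does not state the exact cut-off the authors applied.
GENOME_WIDE_ALPHA = 5e-8

# P values exactly as reported in the abstract.
reported_p_values = {
    "rs7289446 (primary analysis)": 2.73e-7,
    "rs10967273 (females)": 3.52e-8,
    "rs698805 (males)": 4.35e-8,
}


def meets_genome_wide_significance(p_value, alpha=GENOME_WIDE_ALPHA):
    """Return True if the P value is below the assumed threshold."""
    return p_value < alpha


# Map each variant label to whether it passes the assumed threshold.
significance = {
    label: meets_genome_wide_significance(p)
    for label, p in reported_p_values.items()
}

for label, passed in significance.items():
    p = reported_p_values[label]
    verdict = "meets" if passed else "does not meet"
    print(f"{label}: P = {p:.2e} {verdict} P < {GENOME_WIDE_ALPHA:.0e}")
```

Under this assumed threshold, the primary signal at rs7289446 falls short while both sex-stratified loci pass, which matches the abstract's description.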

Original language: English
Pages (from-to): 2316-2331
Number of pages: 16
Issue number: 6
Publication status: Published - Jun 2023


  • disease severity
  • genetics
  • machine learning
  • multiple sclerosis
  • prognostics
