Analysing the history of autism spectrum disorder using topic models

Adham Beykikhoshk, Dinh Phung, Ognjen Arandjelovic, Svetha Venkatesh

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research

6 Citations (Scopus)

Abstract

We describe a novel framework for discovering the underlying topics of a longitudinal collection of scholarly data and for tracking their lifetime and popularity over time. Unlike social media or news data, where the underlying topics evolve over time, in science the nuances within a topic give rise to new scientific directions. We therefore model longitudinal literature data with a new approach that uses topics which remain identifiable over time. Current studies that fix the topics over time either disregard the time dimension or treat it as an exchangeable covariate, while those that model time naturally do not share topics across epochs. We address these issues by adopting a non-parametric Bayesian approach. We assume the data is partially exchangeable and divide it into consecutive epochs. Then, by fixing the topics in a recurrent Chinese restaurant franchise, we impose a static topical structure on the corpus such that the topics are shared across epochs and across the documents within each epoch. We demonstrate the effectiveness of the proposed framework on a collection of medical literature related to autism spectrum disorder. We collect a large corpus of publications and carefully examine two important research issues of the domain as case studies. Moreover, we make the results of our experiments and the source code of the model freely available to the public. This aids other researchers in analysing our results or applying the model to their own data collections.

Original language: English
Title of host publication: Proceedings - 3rd IEEE International Conference on Data Science and Advanced Analytics, DSAA 2016
Subtitle of host publication: 17 - 19 Oct 2016, Montréal, Canada
Editors: Nizar Bouguila
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 762-771
Number of pages: 10
ISBN (Electronic): 9781509052066
DOIs: https://doi.org/10.1109/DSAA.2016.65
Publication status: Published - 2016
Externally published: Yes
Event: International Conference on Data Science and Advanced Analytics 2016 - Montreal, Canada
Duration: 17 Oct 2016 to 19 Oct 2016
Conference number: 3rd
https://sites.ualberta.ca/~dsaa16/

Conference

Conference: International Conference on Data Science and Advanced Analytics 2016
Abbreviated title: DSAA 2016
Country: Canada
City: Montreal
Period: 17/10/16 to 19/10/16
Internet address: https://sites.ualberta.ca/~dsaa16/

Keywords

  • Autism spectrum disorder
  • Bayesian nonparametrics
  • Data mining

Cite this

Beykikhoshk, A., Phung, D., Arandjelovic, O., & Venkatesh, S. (2016). Analysing the history of autism spectrum disorder using topic models. In N. Bouguila (Ed.), Proceedings - 3rd IEEE International Conference on Data Science and Advanced Analytics, DSAA 2016: 17 - 19 Oct 2016, Montréal, Canada (pp. 762-771). [7796964] IEEE, Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/DSAA.2016.65