Estimating support scores of autism communities in large-scale web information systems

Nguyen Thin, Nguyen Hung, Svetha Venkatesh, Dinh Phung

Research output: Chapter in Book/Report/Conference proceeding > Conference Paper > Research > peer-review

4 Citations (Scopus)


Individuals with Autism Spectrum Disorder (ASD) have been shown to prefer communication at a socio-spatial distance. So, while rarely found in the real world, autism communities are popular in Web-based forums, which are convenient for people with ASD to seek and share health-related information. Reddit is one such avenue for people of common interest to connect, forming communities of specific interest, namely subreddits. This work aims to estimate the support scores provided by Aspergers, a popular subreddit interested in ASD. The scores were measured in both the quantity and quality of the conversations in the forum, including conversational involvement, emotional support, and informational support. The support scores of the subreddit Aspergers were compared with those of an average subreddit derived from the entire Reddit, represented by two big corpora of approximately 200 million Reddit posts and 1.66 billion Reddit comments. The ASD subreddit was found to be a supportive community, having far higher support scores than the average subreddit. Apache Spark, an advanced cluster computing framework, is employed to speed up processing of the large corpora. Scalable machine learning techniques implemented in Spark help discriminate the content made in Aspergers versus other subreddits and automatically discover linguistic predictors of ASD within minutes, providing timely reports.

Original language: English
Title of host publication: Web Information Systems Engineering – WISE 2017
Subtitle of host publication: 18th International Conference, Puschino, Russia, October 7–11, 2017, Proceedings, Part I
Editors: Athman Bouguettaya, Yunjun Gao, Andrey Klimenko, Lu Chen, Xiangliang Zhang, Fedor Dzerzhinskiy, Weijia Jia, Stanislav V. Klimenko, Qing Li
Place of Publication: Cham, Switzerland
Number of pages: 9
ISBN (Electronic): 9783319687834
ISBN (Print): 9783319687827
Publication status: Published - 2017
Externally published: Yes
Event: International Conference on Web Information Systems Engineering 2017 - Puschino, Russian Federation
Duration: 7 Oct 2017 to 11 Oct 2017
Conference number: 18th (Proceedings)

Publication series

Name: Lecture Notes in Computer Science
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: International Conference on Web Information Systems Engineering 2017
Abbreviated title: WISE 2017
Country/Territory: Russian Federation


  • Apache Spark
  • Autism communities
  • Big data
  • Large-scale distributed computing
  • Support scores
