Identify topic relations in scientific literature using topic modeling

Hongshu Chen, Ximeng Wang, Shirui Pan, Fei Xiong

Research output: Contribution to journal > Article > Research > peer-review

Abstract

Over the past five years, topic models have been applied to bibliometrics research as an efficient tool for discovering latent and potentially useful content. The combination of topic modeling algorithms and bibliometrics has generated new challenges of interpreting and understanding the outcome of topic modeling. Motivated by these new challenges, this paper proposes a systematic methodology for topic analysis in scientific literature corpora to face the concerns of conducting post topic modeling analysis. By linking the corpus metadata with the discovered topics, we feature them with a number of topic-based analytic indices to explore their significance, developing trend, and received attention. A topic relation identification approach is then presented to quantitatively model the relations among the topics. To demonstrate the feasibility and effectiveness of our methodology, we present two case studies, using big data and dye-sensitized solar cell publications derived from searches in Web of Science. Possible application of the methodology in telling good stories of a target corpus is also explored to facilitate further research management and opportunity discovery.

Original language: English
Number of pages: 13
Journal: IEEE Transactions on Engineering Management
DOI: 10.1109/TEM.2019.2903115
Publication status: Accepted/In press - 9 Apr 2019

Keywords

  • Analytical models
  • Bibliometrics
  • Market research
  • Metadata
  • tech mining
  • Text mining
  • Tools
  • topic analysis

Cite this

@article{e154aa5a49e641b0b68e2b5eeb8b8832,
title = "Identify topic relations in scientific literature using topic modeling",
abstract = "Over the past five years, topic models have been applied to bibliometrics research as an efficient tool for discovering latent and potentially useful content. The combination of topic modeling algorithms and bibliometrics has generated new challenges of interpreting and understanding the outcome of topic modeling. Motivated by these new challenges, this paper proposes a systematic methodology for topic analysis in scientific literature corpora to face the concerns of conducting post topic modeling analysis. By linking the corpus metadata with the discovered topics, we feature them with a number of topic-based analytic indices to explore their significance, developing trend, and received attention. A topic relation identification approach is then presented to quantitatively model the relations among the topics. To demonstrate the feasibility and effectiveness of our methodology, we present two case studies, using big data and dye-sensitized solar cell publications derived from searches in Web of Science. Possible application of the methodology in telling good stories of a target corpus is also explored to facilitate further research management and opportunity discovery.",
keywords = "Analytical models, Bibliometrics, Market research, Metadata, tech mining, Text mining, Tools, topic analysis",
author = "Hongshu Chen and Ximeng Wang and Shirui Pan and Fei Xiong",
year = "2019",
month = "4",
day = "9",
doi = "10.1109/TEM.2019.2903115",
language = "English",
journal = "IEEE Transactions on Engineering Management",
issn = "0018-9391",
publisher = "IEEE, Institute of Electrical and Electronics Engineers",

}

Identify topic relations in scientific literature using topic modeling. / Chen, Hongshu; Wang, Ximeng; Pan, Shirui; Xiong, Fei.

In: IEEE Transactions on Engineering Management, 09.04.2019.

TY - JOUR

T1 - Identify topic relations in scientific literature using topic modeling

AU - Chen, Hongshu

AU - Wang, Ximeng

AU - Pan, Shirui

AU - Xiong, Fei

PY - 2019/4/9

Y1 - 2019/4/9

AB - Over the past five years, topic models have been applied to bibliometrics research as an efficient tool for discovering latent and potentially useful content. The combination of topic modeling algorithms and bibliometrics has generated new challenges of interpreting and understanding the outcome of topic modeling. Motivated by these new challenges, this paper proposes a systematic methodology for topic analysis in scientific literature corpora to face the concerns of conducting post topic modeling analysis. By linking the corpus metadata with the discovered topics, we feature them with a number of topic-based analytic indices to explore their significance, developing trend, and received attention. A topic relation identification approach is then presented to quantitatively model the relations among the topics. To demonstrate the feasibility and effectiveness of our methodology, we present two case studies, using big data and dye-sensitized solar cell publications derived from searches in Web of Science. Possible application of the methodology in telling good stories of a target corpus is also explored to facilitate further research management and opportunity discovery.

KW - Analytical models

KW - Bibliometrics

KW - Market research

KW - Metadata

KW - tech mining

KW - Text mining

KW - Tools

KW - topic analysis

UR - http://www.scopus.com/inward/record.url?scp=85064383368&partnerID=8YFLogxK

U2 - 10.1109/TEM.2019.2903115

DO - 10.1109/TEM.2019.2903115

M3 - Article

JO - IEEE Transactions on Engineering Management

JF - IEEE Transactions on Engineering Management

SN - 0018-9391

ER -