The global popularity of microblogs has led to the accumulation of large volumes of text data on microblogging platforms such as Twitter. These corpora are untapped resources for understanding social expressions on diverse subjects. Microblog analysis aims to unlock the value of such expressions by discovering insights and events of significance hidden among swathes of text. Besides velocity, the key challenges in microblog analysis are diversity of content, brevity, absence of structure and time-sensitivity. In this paper, we propose an unsupervised incremental machine learning and event detection technique to address these challenges. The proposed technique separates a microblog discussion into topics to address the key problem of diversity, and it maintains a record of the evolution of each topic over time. Brevity, time-sensitivity and the unstructured nature of the text are addressed by these individual topic pathways, which together generate a temporal, topic-driven structure of a microblog discussion. The proposed event detection method continuously monitors these topic pathways for events of significance using multiple domain-independent event indicators. The autonomous nature of topic separation, topic pathway generation, new topic identification and event detection makes the proposed technique suitable for extensive applications in microblog analysis. We demonstrate these capabilities on tweets containing #microsoft and tweets containing #obama.
|Number of pages||18|
|Journal||Association for Information Science and Technology|
|Publication status||Published - 6 Jul 2017|