Abstract
Recently, the use of short messaging service (SMS), or text messaging, has gradually shifted toward product and service promotion, and even fraud. Mobile phone users in Indonesia are affected as well. A simple way to address this issue is to maintain a blacklist of phone numbers or of certain keywords and phrases. However, this approach is ineffective because spammers can change their phone numbers or the content of their messages. Another approach is to use text classification algorithms such as Naive Bayes, k-Nearest Neighbor (kNN), and Support Vector Machine (SVM) to recognize patterns in the text messages. This research proposes using the Twitter-LDA algorithm to identify spam text messages in Bahasa Indonesia. The dataset contains 985 text messages in total, split into 774 messages for training and 211 messages for testing. Of the 985 messages, 860 are spam and 125 are ham (legitimate messages). All text messages are pre-processed before training and testing. Five experiments yield an average f-score of 94.26% and an average accuracy of 96.49%. Based on these results, the Twitter-LDA algorithm performs well in identifying spam text messages in Bahasa Indonesia.
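As a reading aid for the two reported metrics, the following minimal Python sketch shows how accuracy and f-score are conventionally derived from confusion-matrix counts. It is not the authors' evaluation code. The function name `classification_metrics` is hypothetical, and treating spam as the positive class is an assumption, since the abstract does not state which class is positive or how the f-score is averaged. The only figures taken from the abstract are the dataset counts.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and f-score from confusion-matrix counts.

    Hypothetical helper for illustration; spam is assumed to be the positive class.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("at least one message is required")
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Counts stated in the abstract.
N_TRAIN, N_TEST = 774, 211
N_SPAM, N_HAM = 860, 125
assert N_TRAIN + N_TEST == N_SPAM + N_HAM == 985

# Reference point under the stated assumption: a trivial filter that labels
# every message in the full dataset as spam. The class mix of the 211-message
# test set is not given in the abstract, so the full dataset is used here.
baseline = classification_metrics(tp=N_SPAM, fp=N_HAM, fn=0, tn=0)
print(f"accuracy={baseline['accuracy']:.2%}  f-score={baseline['f_score']:.2%}")
```

Because the dataset is dominated by spam, a reference point of this kind gives context for reading accuracy and f-score side by side.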
Original language | English |
---|---|
Title of host publication | 2018 IEEE International Conference on Communication, Networks and Satellite (Comnetsat) - Proceedings |
Editors | Romi Fadillah Rahmat |
Place of Publication | Piscataway NJ USA |
Publisher | IEEE, Institute of Electrical and Electronics Engineers |
Pages | 1-6 |
Number of pages | 6 |
ISBN (Electronic) | 9781538667170, 9781538667163 |
ISBN (Print) | 9781538667187 |
Publication status | Published - 2018 |
Event | IEEE International Conference on Communications, Networks, and Satellite (COMNETSAT) 2018 - Medan, Indonesia; Duration: 15 Nov 2018 → 17 Nov 2018; Conference number: 7th; https://ieeexplore.ieee.org/xpl/conhome/8681388/proceeding (Proceedings) |
Conference
Conference | IEEE International Conference on Communications, Networks, and Satellite (COMNETSAT) 2018 |
---|---|
Abbreviated title | COMNETSAT 2018 |
Country/Territory | Indonesia |
City | Medan |
Period | 15/11/18 → 17/11/18 |
Keywords
- short messaging service
- spam
- spam filtering
- spam text messages
- twitter-lda