MML inference of finite state automata for probabilistic spam detection

Vidya Saikrishna, David Leonard Dowe, Siddheswar Ray

    Research output: Chapter in Book/Report/Conference proceeding, Conference Paper, Research, peer-review

    3 Citations (Scopus)

    Abstract

    MML (Minimum Message Length) has emerged as a powerful tool in inductive inference of discrete, continuous and hybrid structures. The Probabilistic Finite State Automaton (PFSA) is one such discrete structure that needs to be inferred for classes of problems in the field of Computer Science including artificial intelligence, pattern recognition and data mining. MML has also served as a viable tool in many classes of problems in the field of Machine Learning including both supervised and unsupervised learning. The classification problem is the most common among them. This research is a two-fold solution to a problem where one part focusses on the best inferred PFSA using MML and the second part focusses on the classification problem of Spam Detection. Using the best PFSA inferred in part 1, the Spam Detection theory has been tested using MML on a publicly available Enron Spam dataset. The filter was evaluated on various performance parameters like precision and recall. The evaluation was also done taking into consideration the cost of misclassification in terms of weighted accuracy rate and weighted error rate. The results of our empirical evaluation indicate the classification accuracy to be around 93%, which outperforms well-known established spam filters.
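
    The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the cost-sensitive definitions commonly used in the spam-filtering literature, where misclassifying a legitimate message is treated as `lam` times as costly as missing a spam message; the abstract does not state the exact formulas or the weight used, so the function name `spam_filter_metrics`, the parameter `lam`, and the counts in the usage note are hypothetical and are not results from the paper.

```python
def spam_filter_metrics(tp, fp, tn, fn, lam=1.0):
    """Compute precision, recall, and cost-weighted accuracy/error.

    tp: spam correctly flagged as spam
    fp: legitimate mail wrongly flagged as spam
    tn: legitimate mail correctly passed
    fn: spam wrongly passed as legitimate
    lam: assumed relative cost of blocking a legitimate message
         (lam = 1 reduces to plain accuracy and error rate)
    """
    n_legit = tn + fp  # total legitimate messages
    n_spam = tp + fn   # total spam messages

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)

    # Each legitimate message counts lam times in the weighted measures.
    denom = lam * n_legit + n_spam
    weighted_accuracy = (lam * tn + tp) / denom
    weighted_error = (lam * fp + fn) / denom

    return {
        "precision": precision,
        "recall": recall,
        "weighted_accuracy": weighted_accuracy,
        "weighted_error": weighted_error,
    }


# Illustrative counts only (not taken from the paper).
metrics = spam_filter_metrics(tp=90, fp=5, tn=95, fn=10, lam=9)
```

    With these made-up counts, the weighted accuracy and weighted error rates sum to 1, and raising `lam` penalises false positives more heavily than missed spam.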
    Original language: English
    Title of host publication: 2015 The Eighth International Conference on Advances in Pattern Recognition (ICAPR 2015)
    Editors: Partha Pratim Mohanta, Swagatam Das, P N Suganthan
    Place of Publication: Piscataway NJ USA
    Publisher: IEEE, Institute of Electrical and Electronics Engineers
    Pages: 1 - 6
    Number of pages: 6
    ISBN (Print): 9781479974580
    DOIs: https://doi.org/10.1109/ICAPR.2015.7050655
    Publication status: Published - 2015
    Event: International Conference on Advances in Pattern Recognition 2015 - Electronics and Communication Sciences Unit, Indian Statistical Institute, Kolkata, India
    Duration: 4 Jan 2015 to 7 Jan 2015
    Conference number: 8th
    https://www.isical.ac.in/~icapr15/

    Conference

    Conference: International Conference on Advances in Pattern Recognition 2015
    Abbreviated title: ICAPR 2015
    Country: India
    City: Kolkata
    Period: 4/01/15 to 7/01/15
    Other: Conference Pub title (from RM) = 2015 The Eighth International Conference on Advances in Pattern Recognition (ICAPR 2015)
    ISBN: 978-1-4799-7458-0

    Cite this

    Saikrishna, V., Dowe, D. L., & Ray, S. (2015). MML inference of finite state automata for probabilistic spam detection. In P. P. Mohanta, S. Das, & P. N. Suganthan (Eds.), 2015 The Eighth International Conference on Advances in Pattern Recognition (ICAPR 2015) (pp. 1 - 6). Piscataway NJ USA: IEEE, Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/ICAPR.2015.7050655