Distributed classification for image spam detection

Research output: Contribution to journal, Article

Abstract

Spam appears in various forms, and the current trend in spamming is moving towards multimedia spam objects. Image spam is a new type of spam attack that attempts to bypass spam filters, which are mostly text-based. Spamming attacks users in many ways, and these attacks are usually countered by having a server filter out the spammers. This paper provides a fully distributed pattern recognition system within P2P networks that uses the distributed associative memory tree (DASMET) algorithm to detect spam; it is cost-efficient and, unlike server-based systems, not prone to a single point of failure. The algorithm is scalable for large and frequently updated data sets, and is specifically designed for data sets that consist of similarly occurring patterns. We have evaluated our system against centralised state-of-the-art algorithms (NN, k-NN, naive Bayes, BPNN and RBFN) and distributed P2P-based algorithms (Ivote-DPV, ensemble k-NN, ensemble naive Bayes, and P2P-GN). The experimental results show that our method is highly accurate, with a 98 to 99% accuracy rate, and incurs a small number of messages: in the best case, it requires only two messages per recall test. In summary, our experimental results show that DASMET performs best for spam detection, with a relatively small amount of resources, compared to other distributed methods.

Language: English
Pages: 13249-13278
Number of pages: 30
Journal: Multimedia Tools and Applications
Volume: 77
Issue number: 11
DOI: 10.1007/s11042-017-4944-y
State: Published - Jun 2018

Keywords

  • Distributed classification
  • Distributed data mining
  • Distributed pattern recognition
  • Image spam
  • P2P classification
  • P2P data mining
  • Spam detection

Cite this

@article{8a1ca8795b464cdc84bccd5a4d4645f5,
title = "Distributed classification for image spam detection",
abstract = "Spam appears in various forms, and the current trend in spamming is moving towards multimedia spam objects. Image spam is a new type of spam attack that attempts to bypass spam filters, which are mostly text-based. Spamming attacks users in many ways, and these attacks are usually countered by having a server filter out the spammers. This paper provides a fully distributed pattern recognition system within P2P networks that uses the distributed associative memory tree (DASMET) algorithm to detect spam; it is cost-efficient and, unlike server-based systems, not prone to a single point of failure. The algorithm is scalable for large and frequently updated data sets, and is specifically designed for data sets that consist of similarly occurring patterns. We have evaluated our system against centralised state-of-the-art algorithms (NN, k-NN, naive Bayes, BPNN and RBFN) and distributed P2P-based algorithms (Ivote-DPV, ensemble k-NN, ensemble naive Bayes, and P2P-GN). The experimental results show that our method is highly accurate, with a 98 to 99{\%} accuracy rate, and incurs a small number of messages: in the best case, it requires only two messages per recall test. In summary, our experimental results show that DASMET performs best for spam detection, with a relatively small amount of resources, compared to other distributed methods.",
keywords = "Distributed classification, Distributed data mining, Distributed pattern recognition, Image spam, P2P classification, P2P data mining, Spam detection",
author = "Amiza Amir and Bala Srinivasan and Khan, {Asad I.}",
year = "2018",
month = "6",
doi = "10.1007/s11042-017-4944-y",
language = "English",
volume = "77",
pages = "13249--13278",
journal = "Multimedia Tools and Applications",
issn = "1380-7501",
publisher = "Springer-Verlag London Ltd.",
number = "11",

}

Distributed classification for image spam detection. / Amir, Amiza; Srinivasan, Bala; Khan, Asad I.

In: Multimedia Tools and Applications, Vol. 77, No. 11, 06.2018, p. 13249-13278.

TY - JOUR

T1 - Distributed classification for image spam detection

AU - Amir, Amiza

AU - Srinivasan, Bala

AU - Khan, Asad I.

PY - 2018/6

Y1 - 2018/6

AB - Spam appears in various forms, and the current trend in spamming is moving towards multimedia spam objects. Image spam is a new type of spam attack that attempts to bypass spam filters, which are mostly text-based. Spamming attacks users in many ways, and these attacks are usually countered by having a server filter out the spammers. This paper provides a fully distributed pattern recognition system within P2P networks that uses the distributed associative memory tree (DASMET) algorithm to detect spam; it is cost-efficient and, unlike server-based systems, not prone to a single point of failure. The algorithm is scalable for large and frequently updated data sets, and is specifically designed for data sets that consist of similarly occurring patterns. We have evaluated our system against centralised state-of-the-art algorithms (NN, k-NN, naive Bayes, BPNN and RBFN) and distributed P2P-based algorithms (Ivote-DPV, ensemble k-NN, ensemble naive Bayes, and P2P-GN). The experimental results show that our method is highly accurate, with a 98 to 99% accuracy rate, and incurs a small number of messages: in the best case, it requires only two messages per recall test. In summary, our experimental results show that DASMET performs best for spam detection, with a relatively small amount of resources, compared to other distributed methods.

KW - Distributed classification

KW - Distributed data mining

KW - Distributed pattern recognition

KW - Image spam

KW - P2P classification

KW - P2P data mining

KW - Spam detection

UR - http://www.scopus.com/inward/record.url?scp=85021752441&partnerID=8YFLogxK

U2 - 10.1007/s11042-017-4944-y

DO - 10.1007/s11042-017-4944-y

M3 - Article

VL - 77

SP - 13249

EP - 13278

JO - Multimedia Tools and Applications

T2 - Multimedia Tools and Applications

JF - Multimedia Tools and Applications

SN - 1380-7501

IS - 11

ER -