
Fast privacy-preserving text classification based on secure multiparty computation

Amanda Resende, Davis Railsback, Rafael Dowsley, Anderson C.A. Nascimento, Diego F. Aranha

Research output: Contribution to journal › Article › Research › peer-review

Abstract

We propose a privacy-preserving Naive Bayes classifier and apply it to the problem of private text classification. In this setting, a party (Alice) holds a text message, while another party (Bob) holds a classifier. At the end of the protocol, Alice will only learn the result of the classifier applied to her text input and Bob learns nothing. Our solution is based on Secure Multiparty Computation (SMC). Our Rust implementation provides a fast and secure solution for the classification of unstructured text. Applying our solution to the case of spam detection (the solution is generic, and can be used in any other scenario in which the Naive Bayes classifier can be employed), we can classify an SMS as spam or ham in less than 340 ms in the case where the dictionary size of Bob’s model includes all words (n = 5200) and Alice’s SMS has at most m = 160 unigrams. In the case with n = 369 and m = 8 (the average of a spam SMS in the database), our solution takes only 21 ms.
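For orientation, the decision that the protocol evaluates privately can be sketched in the clear. The Rust snippet below is a minimal plaintext illustration only, not the authors' secure protocol or their implementation: it has no secret sharing and gives no privacy. It assumes a log-domain Naive Bayes score over the unigrams of a message. The names `NaiveBayesModel`, `is_spam`, and `toy_model`, the toy probabilities, and the choice to skip words outside the dictionary are all hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical plaintext model (Bob's side): class priors and per-word
/// likelihoods, all stored as natural logarithms.
struct NaiveBayesModel {
    log_prior_spam: f64,
    log_prior_ham: f64,
    /// word -> (ln P(word | spam), ln P(word | ham)); this table has n entries.
    log_likelihoods: HashMap<String, (f64, f64)>,
}

impl NaiveBayesModel {
    /// Scores the m unigrams of a message (Alice's side) and returns true
    /// when the spam score exceeds the ham score.
    /// Assumption: words missing from the dictionary are simply skipped.
    fn is_spam(&self, unigrams: &[&str]) -> bool {
        let mut spam = self.log_prior_spam;
        let mut ham = self.log_prior_ham;
        for w in unigrams {
            if let Some(&(ls, lh)) = self.log_likelihoods.get(*w) {
                // Sums of logarithms stand in for products of probabilities.
                spam += ls;
                ham += lh;
            }
        }
        spam > ham
    }
}

/// Toy model with made-up probabilities, for illustration only.
fn toy_model() -> NaiveBayesModel {
    let mut table = HashMap::new();
    table.insert("free".to_string(), (0.05_f64.ln(), 0.005_f64.ln()));
    table.insert("prize".to_string(), (0.03_f64.ln(), 0.001_f64.ln()));
    table.insert("lunch".to_string(), (0.002_f64.ln(), 0.02_f64.ln()));
    NaiveBayesModel {
        log_prior_spam: 0.2_f64.ln(),
        log_prior_ham: 0.8_f64.ln(),
        log_likelihoods: table,
    }
}
```

With this toy model, `toy_model().is_spam(&["free", "prize"])` evaluates to true and `toy_model().is_spam(&["lunch"])` to false. In the private setting described above, the same lookup, sum, and comparison would have to be carried out so that Alice learns only the final label and Bob learns nothing.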

Original language: English
Pages (from-to): 428-442
Number of pages: 15
Journal: IEEE Transactions on Information Forensics and Security
Volume: 17
Publication status: Published - 20 Jan 2022

Keywords

  • Additives
  • Classification algorithms
  • Computational modeling
  • Cryptography
  • Data models
  • Mobile handsets
  • Naive Bayes
  • Privacy-Preserving Classification
  • Protocols
  • Secure Multiparty Computation
  • Spam
