Abstract
We propose a privacy-preserving Naive Bayes
classifier and apply it to the problem of private text classification.
In this setting, one party (Alice) holds a text message, while another
party (Bob) holds a classifier. At the end of the protocol, Alice learns
only the result of the classifier applied to her text input, and
Bob learns nothing. Our solution is based on Secure Multiparty
Computation (SMC). Our Rust implementation provides a fast
and secure solution for the classification of unstructured text.
We apply our solution to spam detection, although it is generic
and can be used in any other scenario in which the
Naive Bayes classifier can be employed. We can classify an SMS
as spam or ham in less than 340 ms when the
dictionary of Bob’s model includes all words (n = 5200)
and Alice’s SMS has at most m = 160 unigrams. In the case
with n = 369 and m = 8 (the average for a spam SMS in the
database), our solution takes only 21 ms.
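
For orientation, the following is a minimal plaintext sketch of the function that the protocol evaluates privately: a Naive Bayes decision between spam and ham over the unigrams of a message. It is not the secure protocol and not the authors' implementation. The names `ClassModel`, `score`, and `is_spam` are hypothetical, and the choices to work with log-probabilities and to skip words outside the dictionary are assumptions of this sketch.

```rust
use std::collections::HashMap;

/// Hypothetical plaintext model for one class (spam or ham):
/// a log prior plus a log likelihood for each of the n dictionary words.
pub struct ClassModel {
    pub log_prior: f64,
    pub log_likelihood: HashMap<String, f64>,
}

/// Score one class as log P(c) plus the sum of log P(w | c) over the
/// message unigrams that appear in the dictionary.
/// Assumption of this sketch: words outside the dictionary are skipped.
pub fn score(model: &ClassModel, message: &[&str]) -> f64 {
    message
        .iter()
        .filter_map(|w| model.log_likelihood.get(*w))
        .fold(model.log_prior, |acc, ll| acc + ll)
}

/// Returns true when the spam score is strictly larger than the ham score.
pub fn is_spam(spam: &ClassModel, ham: &ClassModel, message: &[&str]) -> bool {
    score(spam, message) > score(ham, message)
}
```

In the private setting described above, Bob would hold the two `ClassModel` values and Alice the `message`, and only the boolean result would be revealed to Alice.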
| Original language | English |
|---|---|
| Pages (from-to) | 428-442 |
| Number of pages | 15 |
| Journal | IEEE Transactions on Information Forensics and Security |
| Volume | 17 |
| Publication status | Published - 20 Jan 2022 |
Keywords
- Additives
- Classification algorithms
- Computational modeling
- Cryptography
- Data models
- Mobile handsets
- Naive Bayes
- Privacy-Preserving Classification
- Protocols
- Secure Multiparty Computation
- Spam