DAG: a general model for privacy-preserving data mining

Sin Gee Teo, Jianneng Cao, Cheng Siong Lee

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Secure multi-party computation (SMC) allows parties to jointly compute a function over their inputs while keeping every input confidential. It has been applied extensively to tasks with privacy requirements, such as privacy-preserving data mining (PPDM), to learn the task output while protecting the privacy of the input data. However, existing SMC-based solutions are ad hoc: they are proposed for specific applications and thus cannot be applied directly to other applications. To address this issue, we propose a privacy model, DAG, that consists of a set of fundamental secure operators (e.g., +, -, ×, /, and power). Our model is general: its operators, if pipelined together, can implement various functions, even complicated ones like the Naïve Bayes classifier. It is also extendable: new secure operators can be defined to expand the set of functions the model supports. As a case study, we have applied our DAG model to two data mining tasks: kernel regression and Naïve Bayes. Experimental results show that DAG generates outputs that are almost the same as those obtained in a non-private setting, where multiple parties simply disclose their data. The experimental results also show that our DAG model runs in acceptable time.
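
The idea of pipelining a small set of operators into a larger function can be illustrated with a minimal Python sketch. This is not the authors' implementation: the operators below are ordinary plaintext arithmetic standing in for secure protocols, and the `Node` class, the `evaluate` function, and the reading of the model as a graph whose nodes are operators are all assumptions made for illustration, since the abstract does not describe the construction. The example wires +, × and / into a weighted average, the kind of ratio of sums that appears in kernel regression.

```python
import operator
from dataclasses import dataclass
from typing import Dict

# Plaintext stand-ins (assumption): in a privacy-preserving system each of
# these would be replaced by a secure multi-party protocol.
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "pow": operator.pow,
}


@dataclass(frozen=True)
class Node:
    """Hypothetical operator node: applies `op` to two named operands."""
    op: str
    left: str
    right: str


def evaluate(nodes: Dict[str, Node], inputs: Dict[str, float], target: str) -> float:
    """Evaluate `target` by recursively resolving its operands.

    Each intermediate value is computed once and cached, so shared
    sub-expressions are not recomputed. The graph must be acyclic.
    """
    cache = dict(inputs)

    def visit(name: str) -> float:
        if name not in cache:
            node = nodes[name]
            cache[name] = OPERATORS[node.op](visit(node.left), visit(node.right))
        return cache[name]

    return visit(target)


# Weighted average (w1*y1 + w2*y2) / (w1 + w2) built only from the operators.
nodes = {
    "p1": Node("*", "w1", "y1"),
    "p2": Node("*", "w2", "y2"),
    "num": Node("+", "p1", "p2"),
    "den": Node("+", "w1", "w2"),
    "out": Node("/", "num", "den"),
}
inputs = {"w1": 1.0, "y1": 2.0, "w2": 3.0, "y2": 4.0}
result = evaluate(nodes, inputs, "out")
print(result)
```

In this sketch, extending the model amounts to adding an entry to `OPERATORS`; the graph evaluation itself does not change.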

Original language: English
Number of pages: 14
Journal: IEEE Transactions on Knowledge and Data Engineering
DOI: 10.1109/TKDE.2018.2880743
Publication status: Accepted/In press - 2019

Keywords

  • Computational modeling
  • Cryptography
  • Data mining
  • Data models
  • Protocols
  • Task analysis

Cite this

@article{239c625e37494920b99e62ffc5855051,
title = "DAG: a general model for privacy-preserving data mining",
abstract = "SMC allows parties to jointly compute a function over their inputs, while keeping every input confidential. It has been extensively applied in tasks with privacy requirements, such as PPDM, to learn task output and at the same time protect input data privacy. However, existing SMC-based solutions are ad-hoc — they are proposed for specific applications, and thus cannot be applied to other applications directly. To address this issue, we propose a privacy model DAG that consists of a set of fundamental secure operators (e.g., +, -, ×, /, and power). Our model is general — its operators, if pipelined together, can implement various functions, even complicated ones like Na{\"i}ve Bayes classifier. It is also extendable — new secure operators can be defined to expand the functions the model supports. For case study, we have applied our DAG model to two data mining tasks: kernel regression and Na{\"i}ve Bayes. Experimental results show that DAG generates outputs that are almost the same as those by non-private setting, where multiple parties simply disclose their data. The experimental results also show that our DAG model runs in acceptable time.",
keywords = "Computational modeling, Cryptography, Data mining, Data models, Protocols, Task analysis",
author = "Teo, {Sin Gee} and Cao, Jianneng and Lee, {Cheng Siong}",
year = "2019",
doi = "10.1109/TKDE.2018.2880743",
language = "English",
journal = "IEEE Transactions on Knowledge and Data Engineering",
issn = "1041-4347",
publisher = "IEEE, Institute of Electrical and Electronics Engineers",

}

DAG: a general model for privacy-preserving data mining. / Teo, Sin Gee; Cao, Jianneng; Lee, Cheng Siong.

In: IEEE Transactions on Knowledge and Data Engineering, 2019.


TY - JOUR

T1 - DAG

T2 - a general model for privacy-preserving data mining

AU - Teo, Sin Gee

AU - Cao, Jianneng

AU - Lee, Cheng Siong

PY - 2019

Y1 - 2019

N2 - SMC allows parties to jointly compute a function over their inputs, while keeping every input confidential. It has been extensively applied in tasks with privacy requirements, such as PPDM, to learn task output and at the same time protect input data privacy. However, existing SMC-based solutions are ad-hoc — they are proposed for specific applications, and thus cannot be applied to other applications directly. To address this issue, we propose a privacy model DAG that consists of a set of fundamental secure operators (e.g., +, -, ×, /, and power). Our model is general — its operators, if pipelined together, can implement various functions, even complicated ones like Naïve Bayes classifier. It is also extendable — new secure operators can be defined to expand the functions the model supports. For case study, we have applied our DAG model to two data mining tasks: kernel regression and Naïve Bayes. Experimental results show that DAG generates outputs that are almost the same as those by non-private setting, where multiple parties simply disclose their data. The experimental results also show that our DAG model runs in acceptable time.

AB - SMC allows parties to jointly compute a function over their inputs, while keeping every input confidential. It has been extensively applied in tasks with privacy requirements, such as PPDM, to learn task output and at the same time protect input data privacy. However, existing SMC-based solutions are ad-hoc — they are proposed for specific applications, and thus cannot be applied to other applications directly. To address this issue, we propose a privacy model DAG that consists of a set of fundamental secure operators (e.g., +, -, ×, /, and power). Our model is general — its operators, if pipelined together, can implement various functions, even complicated ones like Naïve Bayes classifier. It is also extendable — new secure operators can be defined to expand the functions the model supports. For case study, we have applied our DAG model to two data mining tasks: kernel regression and Naïve Bayes. Experimental results show that DAG generates outputs that are almost the same as those by non-private setting, where multiple parties simply disclose their data. The experimental results also show that our DAG model runs in acceptable time.

KW - Computational modeling

KW - Cryptography

KW - Data mining

KW - Data models

KW - Protocols

KW - Task analysis

UR - http://www.scopus.com/inward/record.url?scp=85056328662&partnerID=8YFLogxK

U2 - 10.1109/TKDE.2018.2880743

DO - 10.1109/TKDE.2018.2880743

M3 - Article

JO - IEEE Transactions on Knowledge and Data Engineering

JF - IEEE Transactions on Knowledge and Data Engineering

SN - 1041-4347

ER -