DAG: a general model for privacy-preserving data mining

Sin G. Teo, Jianneng Cao, Vincent C.S. Lee

Research output: Contribution to journal › Article › Research › peer-review

4 Citations (Scopus)


Secure multi-party computation (SMC) allows parties to jointly compute a function over their inputs while keeping every input confidential. It has been extensively applied in tasks with privacy requirements, such as privacy-preserving data mining (PPDM), to learn the task output while protecting the privacy of the input data. However, existing SMC-based solutions are ad hoc: they are proposed for specific applications and thus cannot be applied directly to other applications. To address this issue, we propose a privacy model, DAG (Directed Acyclic Graph), that consists of a set of fundamental secure operators (e.g., +, -, ×, /, and power). Our model is general: its operators, if pipelined together, can implement various functions, even complicated ones such as the Naïve Bayes classifier. It is also extendable: new secure operators can be defined to expand the functions that the model supports. As a case study, we have applied our DAG model to two data mining tasks: kernel regression and Naïve Bayes. Experimental results show that DAG generates outputs that are almost the same as those of the non-private setting, in which multiple parties simply disclose their data. The experimental results also show that our DAG model runs in acceptable time. For example, in kernel regression with a training data size of 683,093, one prediction takes 5.93 sec in the non-private setting and 12.38 sec with our DAG model.
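To illustrate what "pipelining operators" into a DAG can look like, here is a minimal Python sketch. It is not the authors' protocol: the operators below are plain, non-secure arithmetic standing in for the secure operators, and the names `Node`, `OPS`, `evaluate`, and `kernel_regression` are hypothetical. The Gaussian kernel and the `bandwidth` parameter are also assumptions, since the abstract does not state which kernel is used.

```python
import math
from dataclasses import dataclass

# Plain (NON-secure) stand-ins for the operators named in the abstract.
# In a private setting each would be replaced by a secure protocol.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "pow": lambda a, b: a ** b,
}


@dataclass(frozen=True)
class Node:
    """One operator node; operands are other nodes or plain numbers."""
    op: str
    left: object
    right: object


def evaluate(node, cache=None):
    """Evaluate a DAG bottom-up, computing each shared node only once."""
    if cache is None:
        cache = {}
    if not isinstance(node, Node):
        return float(node)  # leaf: a constant or an input value
    key = id(node)
    if key not in cache:
        left = evaluate(node.left, cache)
        right = evaluate(node.right, cache)
        cache[key] = OPS[node.op](left, right)
    return cache[key]


def kernel_regression(query, xs, ys, bandwidth=1.0):
    """Build a DAG for a weighted-average prediction (assumed Gaussian kernel)."""
    numerator, denominator = 0.0, 0.0
    for x, y in zip(xs, ys):
        diff = Node("-", query, x)
        squared = Node("pow", diff, 2.0)
        scaled = Node("/", squared, 2.0 * bandwidth ** 2)
        # weight = e^(-scaled), built only from "-" and "pow"
        weight = Node("pow", math.e, Node("-", 0.0, scaled))
        # The same weight node feeds both sums, so the graph is a DAG, not a tree.
        numerator = Node("+", numerator, Node("*", weight, y))
        denominator = Node("+", denominator, weight)
    return Node("/", numerator, denominator)


if __name__ == "__main__":
    dag = kernel_regression(1.0, [0.0, 1.0, 2.0], [0.0, 1.0, 2.0])
    print(evaluate(dag))
```

With the symmetric toy data in the usage example, the two outer training points receive equal weights, so the prediction at the query point 1.0 should be approximately 1.0. Swapping each entry of `OPS` for a secure protocol is where the actual privacy machinery would go; that part is outside this sketch.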

Original language: English
Pages (from-to): 40-53
Number of pages: 14
Journal: IEEE Transactions on Knowledge and Data Engineering
Issue number: 1
Publication status: Published - 1 Jan 2020


  • Computational modeling
  • Cryptography
  • Data mining
  • Data models
  • Protocols
  • Task analysis
