DAG: a general model for privacy-preserving data mining

Sin G. Teo, Jianneng Cao, Vincent C.S. Lee

    Research output: Contribution to journal › Article › Research › peer-review

    1 Citation (Scopus)

    Abstract

    Secure multi-party computation (SMC) allows parties to jointly compute a function over their inputs while keeping every input confidential. It has been applied extensively to tasks with privacy requirements, such as privacy-preserving data mining (PPDM), to learn the task output while protecting the privacy of the input data. However, existing SMC-based solutions are ad hoc: they are proposed for specific applications and thus cannot be applied directly to other applications. To address this issue, we propose a privacy model, DAG (Directed Acyclic Graph), that consists of a set of fundamental secure operators (e.g., +, -, ×, /, and power). Our model is general: its operators, if pipelined together, can implement various functions, even complicated ones such as the Naïve Bayes classifier. It is also extendable: new secure operators can be defined to expand the set of functions that the model supports. As a case study, we have applied our DAG model to two data mining tasks: kernel regression and Naïve Bayes. Experimental results show that DAG generates outputs that are almost the same as those in the non-private setting, where multiple parties simply disclose their data. The experimental results also show that our DAG model runs in acceptable time; e.g., in kernel regression, when the training data size is 683,093, one prediction in the non-private setting takes 5.93 sec, and one by our DAG model takes 12.38 sec.
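
To make the idea of pipelining operators concrete, the following is a minimal plaintext sketch, not the authors' protocol. It assumes each DAG node applies one of the five operators named in the abstract to its two inputs; the `Node` class, the `evaluate` function, and the example expression are all hypothetical names chosen for illustration. In an actual SMC setting, each plaintext operator would be replaced by a secure two-party protocol, which this sketch does not attempt to implement.

```python
import operator
from dataclasses import dataclass
from typing import Dict, Optional, Union

# Plaintext stand-ins for the operators listed in the abstract (+, -, x, /, power).
# Assumption: a real deployment would swap each entry for a secure protocol.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "pow": operator.pow,
}

Operand = Union["Node", str, float]


@dataclass(frozen=True)
class Node:
    """One operator node; operands are other nodes, input names, or constants."""
    op: str
    left: Operand
    right: Operand


def evaluate(node: Operand, inputs: Dict[str, float],
             cache: Optional[Dict[int, float]] = None) -> float:
    """Evaluate the DAG bottom-up, computing each shared node only once."""
    if cache is None:
        cache = {}
    if isinstance(node, str):           # named input held by some party
        return inputs[node]
    if isinstance(node, (int, float)):  # public constant
        return node
    key = id(node)
    if key not in cache:
        left = evaluate(node.left, inputs, cache)
        right = evaluate(node.right, inputs, cache)
        cache[key] = OPS[node.op](left, right)
    return cache[key]


# Hypothetical example: party A holds a, party B holds b; compute (a + b)^2 / (a - b).
total = Node("+", "a", "b")
expr = Node("/", Node("pow", total, 2.0), Node("-", "a", "b"))
result = evaluate(expr, {"a": 5.0, "b": 3.0})
print(result)
```

The graph structure is what allows an intermediate result such as `total` to feed several downstream operators, and extending the model amounts to adding a new entry to `OPS`.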

    Original language: English
    Pages (from-to): 40-53
    Number of pages: 14
    Journal: IEEE Transactions on Knowledge and Data Engineering
    Volume: 32
    Issue number: 1
    Publication status: Published - 1 Jan 2020

    Keywords

    • Computational modeling
    • Cryptography
    • Data mining
    • Data models
    • Protocols
    • Task analysis
