Privacy-preserving scoring of tree ensembles: a novel framework for AI in healthcare

Kyle Fritchman, Keerthanaa Saminathan, Rafael Dowsley, Tyler Hughes, Martine De Cock, Anderson Nascimento, Ankur Teredesai

Research output: Chapter in Book/Report/Conference proceeding, Conference Paper, Research, peer-review

6 Citations (Scopus)

Abstract

Machine Learning (ML) techniques now impact a wide variety of domains. Highly regulated industries such as healthcare and finance have stringent compliance and data governance policies around data sharing. Advances in secure multiparty computation (SMC) for privacy-preserving machine learning (PPML) can help transform these regulated industries by allowing ML computations over encrypted data containing personally identifiable information (PII). Yet very little SMC-based PPML has been put into practice so far. In this paper we present the very first framework for privacy-preserving classification with tree ensembles, with application in healthcare. We first describe the underlying cryptographic protocols that enable a healthcare organization to send encrypted data securely to an ML scoring service and obtain encrypted class labels without the scoring service ever seeing the input in the clear. We then describe the deployment challenges we solved to integrate these protocols into a cloud-based, scalable risk-prediction platform with multiple ML models for healthcare AI. We include system internals and evaluations of our deployment, which supports physicians in driving better clinical outcomes in an accurate, scalable, and provably secure manner. To the best of our knowledge, this is the first such applied framework with SMC-based privacy-preserving machine learning for healthcare.
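
The abstract does not spell out the cryptographic protocols, so the following is only a minimal sketch of one generic SMC building block, additive secret sharing, and not the framework's actual protocol. It shows how a data owner could split integer feature values into random-looking shares so that no single party sees the input in the clear, while sums can still be computed locally on the shares. The modulus, the function names, and the sample feature values are all illustrative assumptions.

```python
import secrets

# Assumption: values live in the ring of integers modulo 2**32.
MODULUS = 2 ** 32


def share(value, n_parties=2):
    """Split an integer into n additive shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Recombine all shares; only a party holding every share learns the value."""
    return sum(shares) % MODULUS


def add_shares(a, b):
    """Add two shared values: each party adds its own shares locally."""
    return [(x + y) % MODULUS for x, y in zip(a, b)]


# Hypothetical patient features (for example, heart rate and blood pressure).
features = [72, 130]
shared = [share(v) for v in features]

# The sum is computed on shares and revealed only after reconstruction.
total = add_shares(shared[0], shared[1])
assert reconstruct(total) == sum(features)
```

Each individual share is uniformly random, so it reveals nothing about the underlying value on its own. Scoring a decision tree additionally requires secure comparison and multiplication, which need interaction between the parties and are not covered by this sketch.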

Original language: English
Title of host publication: Proceedings - 2018 IEEE International Conference on Big Data
Editors: Naoki Abe, Huan Liu, Calton Pu, Xiaohua Hu, Nesreen Ahmed, Mu Qiao, Yang Song, Donald Kossmann, Bing Liu, Kisung Lee, Jiliang Tang, Jingrui He, Jeffrey Saltz
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 2413-2422
Number of pages: 10
ISBN (Electronic): 9781538650356, 9781538650349
ISBN (Print): 9781538650363
Publication status: Published - 2019
Externally published: Yes
Event: IEEE International Conference on Big Data (Big Data) 2018 - Seattle, United States of America
Duration: 10 Dec 2018 to 13 Dec 2018
https://ieeexplore.ieee.org/xpl/conhome/8610059/proceeding (Proceedings)
https://cci.drexel.edu/bigdata/bigdata2018/ (Website)

Conference

Conference: IEEE International Conference on Big Data (Big Data) 2018
Abbreviated title: IEEE BigData 2018
Country: United States of America
City: Seattle
Period: 10/12/18 to 13/12/18

Keywords

  • boosted decision trees
  • encryption
  • healthcare
  • privacy-preserving machine learning
  • random forest
  • secure multiparty computation
