How resilient is privacy-preserving machine learning towards data-driven policy? Jakarta COVID-19 patient study case

Bahrul Ilmi Nasution, Yudhistira Nugraha, Irfan Dwiki Bhaswara, Muhamad Erza Aminanto

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

Abstract

With the rise of personal data laws in various countries, data privacy has recently become an essential issue. One well-known technique for addressing privacy concerns during analysis is differential privacy. However, many studies have shown that differential privacy decreases machine learning model performance. This is problematic for organizations such as governments, which need to draw policy from accurate insights into citizen statistics while maintaining citizen privacy. This study reviews differential privacy in machine learning algorithms and evaluates its performance on real COVID-19 patient data, using Jakarta, Indonesia as a case study. We also validate our study with two additional datasets: the public Adult dataset from the University of California, Irvine, and an Indonesian socioeconomic dataset. We find that using differential privacy tends to reduce accuracy and may lead to model failure on imbalanced data, particularly in more complex models such as random forests. The findings emphasize that using differential privacy in government is practical for building a trustworthy government, but comes with distinct challenges. We discuss limitations and recommendations for any organization that works with personal data to leverage differential privacy in the future.

Original language: English
Title of host publication: Proceedings of the 2023 Workshop on Recent Advances in Resilient and Trustworthy ML Systems in Autonomous Networks
Editors: Gregory Blanc, Takeshi Takahashi, Zonghua Zhang
Place of Publication: New York, NY, USA
Publisher: Association for Computing Machinery (ACM)
Pages: 5-10
Number of pages: 6
ISBN (Electronic): 9798400702655
Publication status: Published - 2023
Event: Workshop on Recent Advances in Resilient and Trustworthy ML Systems in Autonomous Networks 2023: co-located with ACM CCS 2023 - Copenhagen, Denmark
Duration: 30 Nov 2023 - 30 Nov 2023
Conference number: 1st
https://dl.acm.org/doi/proceedings/10.1145/3605772 (Proceedings)

Conference

Conference: Workshop on Recent Advances in Resilient and Trustworthy ML Systems in Autonomous Networks 2023
Abbreviated title: ARTMAN 2023
Country/Territory: Denmark
City: Copenhagen
Period: 30/11/23 - 30/11/23

Keywords

  • covid-19
  • data-driven policy
  • machine learning
  • privacy-preserving
