AN ADDITIVE INSTANCE-WISE APPROACH TO MULTICLASS MODEL INTERPRETATION

Vy Vo, Van Nguyen, Trung Le, Quan Hung Tran, Gholamreza Haffari, Seyit Camtepe, Dinh Phung

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

1 Citation (Scopus)

Abstract

Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system. A large number of interpreting methods focus on identifying explanatory input features, which generally fall into two main categories: attribution and selection. A popular attribution-based approach is to exploit local neighborhoods for learning instance-specific explainers in an additive manner. The process is thus inefficient and susceptible to poorly-conditioned samples. Meanwhile, many selection-based methods directly optimize local feature distributions in an instance-wise training framework, thereby being capable of leveraging global information from other inputs. However, they can only interpret single-class predictions and many suffer from inconsistency across different settings, due to a strict reliance on a pre-defined number of features selected. This work exploits the strengths of both methods and proposes a framework for learning local explanations simultaneously for multiple target classes. Our model explainer significantly outperforms additive and instance-wise counterparts on faithfulness with more compact and comprehensible explanations. We also demonstrate the capacity to select stable and important features through extensive experiments on various data sets and black-box model architectures.

Original language: English
Title of host publication: The Eleventh International Conference on Learning Representations
Editors: Maximilian Nickel, Mengdi Wang, Nancy F Chen, Vukosi Marivate
Place of Publication: Portland OR USA
Publisher: OpenReview
Number of pages: 32
Publication status: Published - 2023
Event: International Conference on Learning Representations 2023 - Kigali, Rwanda
Duration: 1 May 2023 to 5 May 2023
Conference number: 11th
https://iclr.cc/Conferences/2023 (Website)
https://openreview.net/group?id=ICLR.cc (Proceedings)

Conference

Conference: International Conference on Learning Representations 2023
Abbreviated title: ICLR 2023
Country/Territory: Rwanda
City: Kigali
Period: 1/05/23 to 5/05/23