TY - JOUR
T1 - Interactive graphics for visually diagnosing forest classifiers in R
AU - da Silva, Natalia
AU - Cook, Dianne
AU - Lee, Eun-Kyung
N1 - Publisher Copyright: © 2023, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
PY - 2023
Y1 - 2023
N2 - This article describes structuring data and constructing plots to explore forest classification models interactively. A forest classifier is an example of an ensemble since it is produced by bagging multiple trees. The process of bagging and combining results from multiple trees produces numerous diagnostics which, with interactive graphics, can provide a lot of insight into class structure in high dimensions. Various aspects of models are explored in this article, to assess model complexity, individual model contributions, variable importance and dimension reduction, and uncertainty in prediction associated with individual observations. The ideas are applied to the random forest algorithm and projection pursuit forest but could be more broadly applied to other bagged ensembles helping in the interpretability deficit of these methods. Interactive graphics are built in R using the ggplot2, plotly, and shiny packages.
AB - This article describes structuring data and constructing plots to explore forest classification models interactively. A forest classifier is an example of an ensemble since it is produced by bagging multiple trees. The process of bagging and combining results from multiple trees produces numerous diagnostics which, with interactive graphics, can provide a lot of insight into class structure in high dimensions. Various aspects of models are explored in this article, to assess model complexity, individual model contributions, variable importance and dimension reduction, and uncertainty in prediction associated with individual observations. The ideas are applied to the random forest algorithm and projection pursuit forest but could be more broadly applied to other bagged ensembles helping in the interpretability deficit of these methods. Interactive graphics are built in R using the ggplot2, plotly, and shiny packages.
KW - Ensemble model
KW - Interactive visualization
KW - Interpretable machine learning
KW - Statistical visualization
UR - http://www.scopus.com/inward/record.url?scp=85146153972&partnerID=8YFLogxK
U2 - 10.1007/s00180-023-01323-x
DO - 10.1007/s00180-023-01323-x
M3 - Article
AN - SCOPUS:85146153972
SN - 0943-4062
JO - Computational Statistics
JF - Computational Statistics
ER -