Predicting the whole distribution with methods for depth data analysis demonstrated on a colorectal cancer treatment study

D. Vicendese, L. Te Marvelde, P. D. McNair, K. Whitfield, D. R. English, S. Ben Taieb, R. J. Hyndman, R. Thomas

Research output: Chapter in Book/Report/Conference proceeding, Conference Paper, Other, peer-review

Abstract

We demonstrate the utility of predicting the whole distribution of an outcome rather than a marginal change. We overcome inconsistent data modelling techniques in a real-world problem. A model based on additive quantile regression and boosting was used to predict the whole distribution of length of hospital stay (LOS) following colorectal cancer surgery. The model also assessed the association of hospital and patient characteristics over the whole distribution of LOS. The model recovered the empirical LOS distribution. A counterfactual simulation quantified the change in LOS over the whole distribution if an important associated predictor were to be varied. The model showed that important hospital and patient characteristics were differentially associated across the distribution of LOS. Model insights were much richer than those obtained by focusing on a marginal change alone. This method is novel for public health and epidemiological studies and could be applied in other fields of research.
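
The idea of boosting a model for several quantile levels and then comparing predicted quantiles under a counterfactual change can be sketched in a few lines. This is a minimal illustration only, not the authors' model: the data are synthetic, the single binary predictor `x` is hypothetical, and `boost_quantile` is an illustrative helper rather than a function from any library. It uses gradient boosting on the pinball (quantile) loss, one quantile level at a time.

```python
import random


def boost_quantile(x, y, tau, n_rounds=1000, learning_rate=0.5):
    """Boost one quantile level tau using a single binary base learner.

    Returns {group: fitted tau-quantile of y within that group}.
    """
    groups = {0: [], 1: []}
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    # Common starting value for both groups: the overall median.
    start = sorted(y)[len(y) // 2]
    fit = {0: start, 1: start}
    for _ in range(n_rounds):
        for g, ys in groups.items():
            above = sum(1 for yi in ys if yi > fit[g])
            # Mean negative gradient of the pinball loss:
            # tau for points above the fit, tau - 1 for the rest.
            mean_grad = (above * tau + (len(ys) - above) * (tau - 1.0)) / len(ys)
            # For a binary feature, the least-squares base learner fitted to
            # the gradient is simply this group mean.
            fit[g] += learning_rate * mean_grad
    return fit


# Synthetic, right-skewed "length of stay" data (assumption, not study data):
# the hypothetical predictor x = 1 stretches the upper tail more than the centre.
rng = random.Random(0)
x = [i % 2 for i in range(1000)]
y = [5.0 + rng.expovariate(1 / (5.0 if xi else 2.0)) for xi in x]

taus = [0.1, 0.5, 0.9]
fits = {tau: boost_quantile(x, y, tau) for tau in taus}

# Counterfactual contrast: change in each predicted quantile if x went 0 -> 1.
effect = {tau: fits[tau][1] - fits[tau][0] for tau in taus}
for tau in taus:
    print(f"tau={tau:.1f}  x=0: {fits[tau][0]:.2f}  x=1: {fits[tau][1]:.2f}  "
          f"change: {effect[tau]:.2f}")
```

In this toy setting the change differs by quantile level, which is the kind of differential association across the distribution that a single marginal (mean) effect would hide. A real analysis would use smooth additive terms and many predictors, and would fit a finer grid of quantile levels.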

Original language: English
Title of host publication: Statistics and Data Science
Subtitle of host publication: Research School on Statistics and Data Science, RSSDS 2019 Proceedings
Editors: Hien Nguyen
Place of Publication: Singapore, Singapore
Publisher: Springer
Pages: 162-182
Number of pages: 21
Edition: 1st
ISBN (Electronic): 9789811519604
ISBN (Print): 9789811519598
Publication status: Published - 2019
Event: Research School on Statistics and Data Science, RSSDS 2019 - La Trobe University, Melbourne, Australia
Duration: 24 Jul 2019 to 26 Jul 2019
Conference number: 3rd
https://sites.google.com/view/rssds2019/home

Publication series

Name: Communications in Computer and Information Science
Volume: 1150
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: Research School on Statistics and Data Science, RSSDS 2019
Abbreviated title: RSSDS 2019
Country: Australia
City: Melbourne
Period: 24/07/19 to 26/07/19

Keywords

  • Additive quantile regression
  • Boosting
  • Density forecast
  • Machine learning
