Predicting the whole distribution with methods for depth data analysis demonstrated on a colorectal cancer treatment study

D. Vicendese, L. Te Marvelde, P. D. McNair, K. Whitfield, D. R. English, S. Ben Taieb, R. J. Hyndman, R. Thomas

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Other › peer-review


We demonstrate the utility of predicting the whole distribution of an outcome rather than a marginal change, and we overcome inconsistent data modelling techniques in a real-world problem. A model based on additive quantile regression and boosting was used to predict the whole distribution of length of hospital stay (LOS) following colorectal cancer surgery. The model also assessed the association of hospital and patient characteristics with LOS across its whole distribution. The model recovered the empirical LOS distribution. A counterfactual simulation quantified the change in LOS over the whole distribution if an important associated predictor were varied. The model showed that important hospital and patient characteristics were differentially associated across the distribution of LOS. These insights were much richer than those obtained by focusing on a marginal change alone. This method is novel for public health and epidemiological studies and could be applied in other fields of research.
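As a rough illustration of the general idea of boosted quantile regression, the following Python sketch fits one model per quantile level by component-wise gradient boosting on the pinball (quantile) loss. It is not the authors' implementation: the data are synthetic, the base learners are restricted to an intercept and a single linear term (a simple special case of an additive model), and the names `boost_quantile`, `pinball_neg_gradient` and `empirical_quantile` are hypothetical.

```python
import random


def pinball_neg_gradient(y, f, tau):
    """Negative (sub)gradient of the pinball loss at the current fit f."""
    return [tau if yi > fi else tau - 1.0 for yi, fi in zip(y, f)]


def empirical_quantile(values, tau):
    """Crude empirical quantile, used only as the boosting offset."""
    s = sorted(values)
    return s[min(len(s) - 1, int(tau * len(s)))]


def boost_quantile(x, y, tau, n_iter=1000, nu=0.2):
    """Component-wise boosting with two base learners: intercept and linear."""
    n = len(y)
    x_mean = sum(x) / n
    xc = [xi - x_mean for xi in x]          # centred predictor
    sxx = sum(v * v for v in xc)
    intercept = empirical_quantile(y, tau)  # start from the marginal quantile
    slope = 0.0
    for _ in range(n_iter):
        f = [intercept + slope * v for v in xc]
        u = pinball_neg_gradient(y, f, tau)
        # Least-squares fit of each base learner to the negative gradient.
        a = sum(u) / n
        b = sum(v * ui for v, ui in zip(xc, u)) / sxx
        rss_a = sum((ui - a) ** 2 for ui in u)
        rss_b = sum((ui - b * v) ** 2 for ui, v in zip(u, xc))
        # Update only the best-fitting learner, shrunk by the step size nu.
        if rss_a <= rss_b:
            intercept += nu * a
        else:
            slope += nu * b
    return lambda x_new: intercept + slope * (x_new - x_mean)


# Synthetic, heteroscedastic data (an assumption for illustration only).
rng = random.Random(0)
n = 200
x = [rng.random() for _ in range(n)]
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 0.5 + 1.5 * xi) for xi in x]

# One model per quantile level; a finer grid approximates the whole distribution.
taus = (0.1, 0.5, 0.9)
models = {tau: boost_quantile(x, y, tau) for tau in taus}

# In-sample share of observations at or below each fitted quantile curve.
coverage = {
    tau: sum(yi <= models[tau](xi) for xi, yi in zip(x, y)) / n for tau in taus
}
for tau in taus:
    print(tau, round(models[tau](0.5), 2), round(coverage[tau], 3))
```

Fitting the same model over a denser grid of quantile levels gives an estimate of the conditional distribution at any predictor value, and comparing the fitted slopes across levels shows whether a predictor is associated differently with the lower and upper parts of the outcome distribution.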

Original language: English
Title of host publication: Statistics and Data Science
Subtitle of host publication: Research School on Statistics and Data Science, RSSDS 2019 Proceedings
Editors: Hien Nguyen
Place of Publication: Singapore
Number of pages: 21
ISBN (Electronic): 9789811519604
ISBN (Print): 9789811519598
Publication status: Published - 2019
Event: Research School on Statistics and Data Science, RSSDS 2019 - La Trobe University, Melbourne, Australia
Duration: 24 Jul 2019 to 26 Jul 2019
Conference number: 3rd

Publication series

Name: Communications in Computer and Information Science
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937


Conference: Research School on Statistics and Data Science, RSSDS 2019
Abbreviated title: RSSDS 2019


  • Additive quantile regression
  • Boosting
  • Density forecast
  • Machine learning
