There is a growing appetite for large, complex databases that integrate a range of personal, socio-demographic, health, genetic and financial information on individuals. It has been argued that ‘Big Data’ will provide the necessary catalyst to advance both biomedical research and health economics and outcomes research. However, it is important that we do not become data rich but information poor. This paper discusses the benefits and challenges of building Big Data, analysing Big Data and making appropriate inferences in order to advance cancer care, using Cancer 2015 (a prospective, longitudinal, genomic cohort study in Victoria, Australia) as a case study. Cancer 2015 has been linked to State and Commonwealth reimbursement databases that have known limitations. These limitations partly reflect the funding arrangements in Australia, a country with both public and private provision, including public funding of private healthcare, and partly the legislative frameworks that govern data linkage. Additionally, linkage involves time delays and, as such, achieving a contemporaneous database is challenging. Despite these limitations, there is clear value in using linked data and creating Big Data. This paper describes the linked Cancer 2015 dataset, discusses estimation issues given the nature of the data and presents panel regression results that allow us to draw tentative inferences about which patient, disease, genomic and treatment characteristics explain variation in health expenditure.