Missing data are a common problem in self-report quality-of-life questionnaires. Despite considerable care and effort to prevent it, some level of missing data is unavoidable, and it can have a detrimental impact on data analysis. In this paper, a novel approach to imputing missing data in quality-of-life questionnaires is proposed, based on matrix and tensor decompositions. To illustrate and assess these methods, two datasets are considered: the first contains the responses of 100 patients to a systemic lupus erythematosus-specific quality-of-life questionnaire; the second contains the responses of 43 patients to a rhino-conjunctivitis quality-of-life questionnaire. Both datasets contain almost no missing data, so for testing purposes entries are removed at random to simulate data that are missing completely at random. Several proportions of missing values are considered, and for each, the imputation error is assessed through k-fold cross-validation. We also evaluate different imputation methods for missing at random and missing not at random data. The numerical results demonstrate that the proposed tensor factorization-based methods outperform standard methods in terms of root mean square error, with an improvement of at least 4%, while the bias and variance are similar.
- Health information management
- medical information systems
- missing data imputation
- quality-of-life questionnaires
- tensor decomposition
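
The evaluation idea summarized in the abstract (hide entries completely at random, impute them, and score only the hidden entries by root mean square error) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic rather than questionnaire responses, the imputer is a generic iterative truncated-SVD matrix completion rather than the tensor factorization methods proposed in the paper, a single random mask stands in for k-fold cross-validation, and the names `mask_mcar`, `impute_low_rank`, and `rmse` are hypothetical.

```python
import numpy as np


def mask_mcar(X, frac, rng):
    """Boolean mask (True = observed) hiding about `frac` of entries at random."""
    return rng.random(X.shape) >= frac


def impute_low_rank(X, observed, rank=2, n_iter=100):
    """Iterative truncated-SVD imputation (generic stand-in, not the paper's method)."""
    # Initialise missing cells with the mean of the observed entries per column.
    counts = np.maximum(observed.sum(axis=0), 1)
    col_mean = np.where(observed, X, 0.0).sum(axis=0) / counts
    filled = np.where(observed, X, col_mean)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Keep observed answers; overwrite only the missing cells.
        filled = np.where(observed, X, low_rank)
    return filled


def rmse(X_true, X_hat, observed):
    """Root mean square error computed on the hidden entries only."""
    hidden = ~observed
    return float(np.sqrt(np.mean((X_true[hidden] - X_hat[hidden]) ** 2)))


# Synthetic stand-in: 100 respondents x 20 items with rank-2 structure.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) @ rng.normal(size=(2, 20))
observed = mask_mcar(X, frac=0.1, rng=rng)

X_hat = impute_low_rank(X, observed, rank=2)
error = rmse(X, X_hat, observed)

# Baseline for comparison: plain column-mean imputation.
col_mean = np.where(observed, X, 0.0).sum(axis=0) / np.maximum(observed.sum(axis=0), 1)
baseline = rmse(X, np.where(observed, X, col_mean), observed)
print(f"low-rank RMSE: {error:.4f}, column-mean RMSE: {baseline:.4f}")
```

Scoring only the hidden entries matters because the observed entries are copied through unchanged, so including them would understate the imputation error. The same masking and scoring functions could be repeated over several missing-data proportions to mirror the comparison described in the abstract.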