Background. Cost-utility analysis permits the comparison of disparate health services by measuring outcomes in a common unit, the quality-adjusted life-year (QALY), which equals life-years multiplied by the utility of the health state. However, comparability is compromised when different utility instruments predict different utilities for the same health state. The present paper measures the extent of, and reasons for, differences between the utilities predicted by the EQ-5D-5L, SF-6D, HUI 3, 15D, QWB, and AQoL-8D. Methods. Data were obtained from patients in seven disease areas and members of the healthy public in six countries. Differences between public and patient utilities were estimated using each of the instruments. To explain discrepancies between the estimates, the measurement scales and content of the instruments were compared. The sensitivity of the instruments to independently measured health dimensions was assessed in pairwise comparisons of all combinations of the instruments. Results. The difference between public and patient utilities varied with the choice of instrument by more than 50% for every disease group and by more than 100% in four of the seven groups. Discrepancies were associated with differences in both the content of the instruments and their measurement scales. Pairwise comparisons of instruments found that variation in sensitivity to the physical and psychosocial dimensions of health closely reflected the items in the instruments' descriptive systems. Discussion. Results indicate that the instruments measure related but different constructs. They imply that commonly used instruments systematically discriminate against some classes of services, most notably mental health services. Differences in the instrument scales imply the need for transformations between the instruments to increase the comparability of measurement.
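
The arithmetic behind the abstract can be illustrated with a minimal sketch. It shows the QALY definition (life-years times utility) and one possible way to express how much the public-patient utility difference varies with the choice of instrument. The function names, the placeholder instrument labels, and all numeric values are hypothetical and are not taken from the study; the variation measure used here (how far the largest difference exceeds the smallest, in percent) is an assumption and may not match the paper's exact definition.

```python
def qalys(life_years: float, utility: float) -> float:
    """QALYs gained: life-years multiplied by the utility of the health state."""
    return life_years * utility


def utility_gap(public_utility: float, patient_utility: float) -> float:
    """Difference between mean public and mean patient utility for one instrument."""
    return public_utility - patient_utility


def gap_variation_percent(gaps: dict[str, float]) -> float:
    """Percentage by which the largest gap exceeds the smallest gap.

    Assumed definition, for illustration only.
    """
    smallest = min(gaps.values())
    largest = max(gaps.values())
    return 100.0 * (largest - smallest) / smallest


# Hypothetical mean utilities (public, patient) for two placeholder instruments.
hypothetical_means = {
    "instrument_A": (0.85, 0.65),
    "instrument_B": (0.80, 0.50),
}

gaps = {
    name: utility_gap(public, patient)
    for name, (public, patient) in hypothetical_means.items()
}

if __name__ == "__main__":
    print(f"QALYs for 10 years at utility 0.5: {qalys(10, 0.5):.2f}")
    for name, gap in gaps.items():
        print(f"{name}: public-patient gap = {gap:.2f}")
    print(f"Variation across instruments: {gap_variation_percent(gaps):.0f}%")
```

With these invented numbers, the same patient group appears to lose 0.20 utility on one instrument and 0.30 on the other, so the estimated benefit of restoring patients to public health levels would differ by half depending only on the instrument chosen.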