Background: Continuous quality improvement (CQI) methods are widely used in healthcare; however, the effectiveness of the methods is variable, and evidence about the extent to which contextual and other factors modify effects is limited. Investigating the relationship between these factors and CQI outcomes poses challenges for those evaluating CQI, among the most complex of which relate to the measurement of modifying factors. We aimed to provide guidance to support the selection of measurement instruments by systematically collating, categorising, and reviewing quantitative self-report instruments.Methods: Data sources: We searched MEDLINE, PsycINFO, and Health and Psychosocial Instruments, reference lists of systematic reviews, and citations and references of the main report of instruments. Study selection: The scope of the review was determined by a conceptual framework developed to capture factors relevant to evaluating CQI in primary care (the InQuIRe framework). Papers reporting development or use of an instrument measuring a construct encompassed by the framework were included. Data extracted included instrument purpose; theoretical basis, constructs measured and definitions; development methods and assessment of measurement properties. Analysis and synthesis: We used qualitative analysis of instrument content and our initial framework to develop a taxonomy for summarising and comparing instruments. Instrument content was categorised using the taxonomy, illustrating coverage of the InQuIRe framework. Methods of development and evidence of measurement properties were reviewed for instruments with potential for use in primary care.Results: We identified 186 potentially relevant instruments, 152 of which were analysed to develop the taxonomy. Eighty-four instruments measured constructs relevant to primary care, with content measuring CQI implementation and use (19 instruments), organizational context (51 instruments), and individual factors (21 instruments). Forty-one instruments were included for full review. Development methods were often pragmatic, rather than systematic and theory-based, and evidence supporting measurement properties was limited.Conclusions: Many instruments are available for evaluating CQI, but most require further use and testing to establish their measurement properties. Further development and use of these measures in evaluations should increase the contribution made by individual studies to our understanding of CQI and enhance our ability to synthesise evidence for informing policy and practice.