“Critical appraisal is the process of carefully and systematically examining research to judge its trustworthiness, and its value and relevance in a particular context,” (Burls, 2009).
-- Amanda Burls, Director of Postgraduate Programmes in Evidence-Based Health Care, University of Oxford
Critical appraisal, or risk of bias assessment, is an integral part of the systematic review methodology.
Bias can be introduced at any point in the research process, from study design to publication, so it takes many different forms. (For descriptions and examples of forms of bias, see the University of Oxford’s Biases Archive.) Bias, or systematic error, can lead to inaccurate or incomplete conclusions. Therefore, it is imperative to assess possible sources of bias in the research included in your review.
Hundreds of critical appraisal tools (CATs) have been developed to help you do so. Rather than providing a comprehensive list, this page provides a short list of CATs recommended by expert review groups and health technology assessment organizations. The list is organized by study design.
Critical appraisal includes the assessment of several related features: risk of bias, quality of reporting, precision, and external validity.
Critical appraisal has always been a defining feature of the systematic review methodology. However, early critical appraisal tools were structured as 'scales' which rolled risk of bias, quality of reporting, precision, and external validity into one combined score. More recently, consensus has emerged within the health sciences that, in the case of systematic reviews of interventions, critical appraisal should focus on risk of bias alone. Cochrane's risk of bias tools, RoB 2 and ROBINS-I, were developed for this purpose.
Due to the evolution of critical appraisal within the systematic review methodology, you may hear folks use the terms "critical appraisal" and "risk of bias" interchangeably. It is useful to recall the differences between these terms and other related terms.
Critical appraisal (also called: critical assessment or quality assessment) includes the assessment of several related features: risk of bias, quality of reporting, precision, and external validity.
Risk of bias is equivalent to internal validity.
Internal validity can be defined as "the extent to which the observed results represent the truth in the population we are studying and, thus, are not due to methodological errors," (Patino & Ferreira, 2018).
Quality of reporting refers to how accurately and thoroughly the study's methodology was reported.
Precision refers to random error. "Precision depends on the number of participants and (for dichotomous outcomes) the number of events in a study, and is reflected in the confidence interval around the intervention effect estimate from each study," (Cochrane Handbook).
External validity refers to generalizability; "the extent to which the results of a study can be generalized to other populations and settings," (Cochrane Handbook).
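To make the definition of precision more concrete, the sketch below computes a 95% confidence interval for a risk ratio in two hypothetical trials that have the same event rates but different numbers of participants. The trial counts and the function name `risk_ratio_ci` are illustrative assumptions rather than anything drawn from the sources cited on this page, and the interval uses the common log-based (Wald) approximation, which is one of several possible methods.

```python
import math


def risk_ratio_ci(events_tx, n_tx, events_ctrl, n_ctrl, z=1.96):
    """Return (risk ratio, lower bound, upper bound) using a log-based Wald interval.

    Hypothetical helper for illustration only; assumes at least one event
    in each arm so that the logarithm and reciprocals are defined.
    """
    rr = (events_tx / n_tx) / (events_ctrl / n_ctrl)
    # Standard error of log(RR) depends on both events and participants.
    se_log_rr = math.sqrt(
        1 / events_tx - 1 / n_tx + 1 / events_ctrl - 1 / n_ctrl
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Two made-up trials with identical event rates (20% vs. 40%).
small = risk_ratio_ci(10, 50, 20, 50)       # 50 participants per arm
large = risk_ratio_ci(100, 500, 200, 500)   # 500 participants per arm

for label, (rr, lo, hi) in [("small trial", small), ("large trial", large)]:
    print(f"{label}: RR = {rr:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

Both trials estimate the same risk ratio, but the larger trial yields a narrower interval, which is what is meant by a more precise estimate. Precision says nothing about whether that estimate is biased, which is why it is assessed separately from risk of bias.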
This section lists articles that have reviewed or inventoried CATs. These articles can serve as more comprehensive catalogs of previously developed CATs.
Buccheri RK, Sharifi C. Critical Appraisal Tools and Reporting Guidelines for Evidence-Based Practice. Worldviews Evid Based Nurs. 2017;14(6):463-472. doi:10.1111/wvn.12258
Munthe-Kaas HM, Glenton C, Booth A, Noyes J, Lewin S. Systematic mapping of existing tools to appraise methodological strengths and limitations of qualitative research: first stage in the development of the CAMELOT tool. BMC Med Res Methodol. 2019;19(1):113. doi:10.1186/s12874-019-0728-6