GRADE guidelines: 18. How ROBINS-I and other tools to assess risk of bias in nonrandomized studies should be used to rate the certainty of a body of evidence

doi:10.1016/j.jclinepi.2018.01.012

Journal of Clinical Epidemiology

Volume 111, July 2019, Pages 105-114

https://doi.org/10.1016/j.jclinepi.2018.01.012

Abstract

Objective

To provide guidance on how systematic review authors, guideline developers, and health technology assessment practitioners should approach the use of the risk of bias in nonrandomized studies of interventions (ROBINS-I) tool as a part of GRADE's certainty rating process.

Study Design and Setting

The guidance was developed through iterative discussions, testing in systematic reviews, and presentations at GRADE working group meetings, with feedback from the GRADE working group.

Results

We describe where to start the initial assessment of a body of evidence with the use of ROBINS-I and where one would anticipate the final rating would end up. GRADE accounts for issues that mitigate concerns about confounding and selection bias through its upgrading domains: large effects, dose-effect relations, and situations in which plausible residual confounders or other biases increase certainty. These domains will need to be considered in an assessment of a body of evidence when using ROBINS-I.

Conclusions

The use of ROBINS-I in GRADE assessments may allow for a better comparison of evidence from randomized controlled trials (RCTs) and nonrandomized studies (NRSs) because they are placed on a common metric for risk of bias. Challenges remain, including appropriate presentation of evidence from RCTs and NRSs for decision-making and how to optimally integrate RCTs and NRSs in an evidence assessment.

Section snippets

GRADE's approach to rate the certainty of the evidence from observational studies

The GRADE working group has developed a widely accepted approach to rate the certainty of a body of evidence (also known as quality of evidence or confidence in evidence) in the contexts of systematic reviews, developing health-care recommendations, and supporting decisions. GRADE's approach to rating the certainty of the evidence is based on a four-level system: high, moderate, low, and very low (Table 1). This is the 18th in the ongoing series of articles describing the GRADE approach in the

Rating risk of bias in individual observational studies

Consider now the assessment of risk of bias in individual observational studies, which in the GRADE approach might lead to further rating down quality from low to very low. Investigators have developed many assessment tools for rating risk of bias in observational studies. Most of the instruments address a specific type of observational or nonrandomized design (e.g., cohort or case control) [18] and seek to determine how well, relative to a perfect observational study of that particular design,

ROBINS-I and GRADE

The arrival of ROBINS-I presents a number of opportunities for the GRADE approach. First, it offers an alternative terminology: establishing NRS rather than observational studies. Although not different in intended meaning in the GRADE approach, substituting NRS for observational studies will lead to a more transparent separation of studies based on their design. For instance, some have struggled with the classification of certain types of studies, such as nonrandomized before-after studies, as

Concerns about GRADE's approach to start an NRS at low certainty

Despite GRADE's broad acceptance in the evidence synthesis community, GRADE's initial certainty rating of outcome data from NRS as low has led to challenges for some GRADE users. First, users of GRADE may inappropriately double count the risk of confounding and selection bias, initially by starting a body of evidence from NRS as low certainty of the evidence followed by again rating down for unknown confounders (although rating down additionally for failure to accurately measure known

Certainty of evidence for a body of evidence from NRS when using ROBINS-I for assessing risk of bias in individual NRSs

Here, we provide general guidance for the use of GRADE in the context of ROBINS-I. ROBINS-I compares an assessment of an individual NRS against a target RCT. The initial description of the underlying study design, such as cohort, case-control, case series, or cross-sectional study, is not considered as a risk of bias feature in ROBINS-I. Thus, when using ROBINS-I for assessing risk of bias in NRS, given that assessment of selection bias and confounding is an integral part of the ROBINS-I tool,

What makes us confident in results of NRS and does GRADE already account for this?

At the end of the previous section, we noted how, within current GRADE thinking, a body of evidence from NRSs may emerge from the rating exercise as moderate or high-quality evidence. We now discuss these issues.
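As a rough illustration of how a body of evidence can move across GRADE's four levels during the rating exercise, consider the Python sketch below. The function name `rate_certainty`, the integer counts of rating-down and rating-up steps, and the simple additive arithmetic are illustrative assumptions made here, not part of GRADE guidance. Actual assessments rest on judgment across domains rather than on counting steps.

```python
# Illustrative sketch only: GRADE ratings are structured judgments, not arithmetic.
# The four levels come from the GRADE approach; everything else is an assumption.

GRADE_LEVELS = ["very low", "low", "moderate", "high"]


def rate_certainty(start: str, downgrades: int = 0, upgrades: int = 0) -> str:
    """Return the final certainty level for a body of evidence.

    start      -- initial level, e.g. "low" for NRS under the traditional approach
    downgrades -- hypothetical count of levels rated down (e.g. for risk of bias)
    upgrades   -- hypothetical count of levels rated up (large effect,
                  dose-effect relation, plausible residual confounding)
    """
    if start not in GRADE_LEVELS:
        raise ValueError(f"unknown starting level: {start!r}")
    if downgrades < 0 or upgrades < 0:
        raise ValueError("downgrades and upgrades must be non-negative")

    index = GRADE_LEVELS.index(start) - downgrades + upgrades
    # Clamp to the four-level scale: nothing below "very low" or above "high".
    index = max(0, min(index, len(GRADE_LEVELS) - 1))
    return GRADE_LEVELS[index]


if __name__ == "__main__":
    # NRS starting at low certainty and rated up once for a large effect.
    print(rate_certainty("low", upgrades=1))
    # A body of evidence starting at high and rated down two levels.
    print(rate_certainty("high", downgrades=2))
```

In this sketch, a body of evidence from NRSs that starts at low and is rated up by one or two levels ends at moderate or high, which mirrors the outcome described in the paragraph above.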

Advantages

Among other features, ROBINS-I allows review authors to assess how failure to use randomization in individual studies has affected risk of bias. For example, ROBINS-I allows categorization of the magnitude of bias from lack of randomization through the selection and confounding bias domains, application of this assessment across risk of bias domains, and evaluation of how this differs across individual studies that address different health-care questions. Furthermore, ROBINS-I will

Unresolved issues

GRADE recognizes that there are a number of unresolved issues related to the arrival of ROBINS-I. The GRADE working group is aiming to address those in the near future. The unresolved issues are as follows:

1. If systematic review authors use ROBINS-I, should the results from NRSs and RCTs be considered together, including potentially in a meta-analysis (Fig. 4)? If RCTs and NRSs are indeed considered together, when should they be combined? Should NRSs be used to provide more precise estimates in

Summary and next steps

Risk of bias is best mitigated by a well-conducted RCT that balances known and unknown confounders, with risk of bias assessed using the Cochrane RoB 2.0 tool or similar assessment tools for RCTs. For situations in which NRSs are used instead of or in addition to RCTs, the arrival of ROBINS-I poses a number of opportunities and challenges to summarizing RoB in GRADE and raises a need for clarification about how ROBINS-I and GRADE are used together. Given the inherent limitations of studies that do

Acknowledgments

H.J.S., C.C., E.A.A., R.A.M., K.T., R.L.M., J.J.M., J.P.T.H., and G.G. are members of the GRADE working group who contributed to writing this article. The authors would like to acknowledge the GRADE working group for input on the work.

Article history: Slides presented at GRADE meetings in Barcelona (2014), Amsterdam (2015), Philadelphia (2016), and Seoul (2016); Approved at GRADE meeting May 2017.

Authors' contributions: H.J.S. conceived and designed the article and wrote the first draft of

References (25)

G. Guyatt et al. GRADE guidelines: 1. Introduction-GRADE evidence profiles and summary of findings tables. J Clin Epidemiol (2011)
G.H. Guyatt et al. GRADE guidelines: a new series of articles in the Journal of Clinical Epidemiology. J Clin Epidemiol (2011)
G. Guyatt et al. GRADE guidelines: 11. Making an overall rating of confidence in effect estimates for a single outcome and for all outcomes. J Clin Epidemiol (2013)
H.J. Schunemann et al. GRADE Guidelines: 16. GRADE evidence to decision frameworks for tests in clinical practice and public health. J Clin Epidemiol (2016)
K.A. Thayer et al. Using GRADE to respond to health questions with different levels of urgency. Environ Int (2016)
G.H. Guyatt et al. GRADE guidelines 17: assessing the risk of bias associated with missing participant outcome data in a body of evidence. J Clin Epidemiol (2017)
R.L. Morgan et al. GRADE: assessing the quality of evidence in environmental and occupational health. Environ Int (2016)
G.H. Guyatt et al. GRADE guidelines: 9. Rating up the quality of evidence. J Clin Epidemiol (2011)
M. Hultcrantz et al. The GRADE Working Group clarifies the construct of certainty of evidence. J Clin Epidemiol (2017)
M.A. Puhan et al. A GRADE Working Group approach for rating the quality of treatment effect estimates from network meta-analysis. BMJ (2014)
H.J. Schunemann et al. Letters, numbers, symbols and words: how to communicate grades of evidence and recommendations. CMAJ (2003)
H.J. Schunemann et al. GRADE: assessing the quality of evidence for diagnostic recommendations. ACP J Club (2008)

Conflict of interest: H.J.S. has no direct financial conflict of interest and other authors have not declared financial conflicts of interest. Part of the work has been presented at scientific conferences and at GRADE working group meetings. This article has been officially endorsed by the GRADE working group.
