Assessments of risk of bias in systematic reviews of observational nutritional epidemiologic studies are often not appropriate or comprehensive: a methodological study
  1. Dena Zeraatkar1,2,
  2. Alana Kohut3,
  3. Arrti Bhasin2,
  4. Rita E Morassut4,
  5. Isabella Churchill2,
  6. Arnav Gupta5,
  7. Daeria Lawson2,
  8. Anna Miroshnychenko2,
  9. Emily Sirotich2,
  10. Komal Aryal2,
  11. Maria Azab3,
  12. Joseph Beyene2 and
  13. Russell J de Souza2,6
  1. 1Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
  2. 2Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
  3. 3McMaster University, Hamilton, Ontario, Canada
  4. 4Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
  5. 5Department of Medicine, University of Ottawa, Ottawa, Ontario, Canada
  6. 6Population Health Research Institute, Hamilton Health Sciences Corporation, Hamilton, Ontario, Canada
  1. Correspondence to Dr Russell J de Souza, Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON L8S 4L8, Canada; desouzrj{at}mcmaster.ca

Abstract

Background An essential component of systematic reviews is the assessment of risk of bias. To date, there has been no investigation of how reviews of non-randomised studies of nutritional exposures (called ‘nutritional epidemiologic studies’) assess risk of bias.

Objective To describe methods for the assessment of risk of bias in reviews of nutritional epidemiologic studies.

Methods We searched MEDLINE, EMBASE and the Cochrane Database of Systematic Reviews (Jan 2018–Aug 2019) and sampled 150 systematic reviews of nutritional epidemiologic studies.

Results Most reviews (n=131/150; 87.3%) attempted to assess risk of bias. Commonly used tools neglected to address all important sources of bias, such as selective reporting (n=25/28; 89.3%), and frequently included constructs unrelated to risk of bias, such as reporting (n=14/28; 50.0%). Most reviews (n=66/101; 65.3%) did not incorporate risk of bias in the synthesis. While more than half of reviews considered biases due to confounding and misclassification of the exposure in their interpretation of findings, other biases, such as selective reporting, were rarely considered (n=1/150; 0.7%).

Conclusion Reviews of nutritional epidemiologic studies have important limitations in their assessment of risk of bias.

  • nutritional treatment
  • nutrition assessment
  • dietary patterns

Data availability statement

Data are available in a public, open access repository (https://osf.io/WYQHE/).

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

What this study adds

  • An essential component of systematic reviews is the assessment of risk of bias.

  • To date, there has been no empirical assessment of how systematic reviews of nutritional epidemiologic studies assess risk of bias.

  • We show that reviews of nutritional epidemiologic studies have important limitations in their assessment of risk of bias and produce recommendations for review authors.

Background

Due to the challenges of conducting randomised controlled trials (RCTs) of dietary interventions, most of the evidence in nutrition comes from non-randomised, observational studies of nutritional exposures, hereon referred to as ‘nutritional epidemiologic studies’.1–4 Clinicians, guideline developers, policymakers and researchers use systematic reviews of these studies to advise patients on optimal dietary habits, formulate recommendations and policies, and plan future research.2 5 6

Bias may arise in nutritional epidemiologic studies, and other non-randomised studies, due to confounding, inappropriate criteria for selection of participants, error in the measurement of the exposure or outcome, departures from the intended exposure, missing outcome data and selective reporting.7–10 The assessment of the validity of studies included in a systematic review and the extent to which they might overestimate or underestimate the true effects (called risk of bias) is a critical component of the systematic review process.11 12 The assessment of risk of bias informs the evaluation of the certainty of evidence and the interpretation of review findings, and failure to consider risk of bias using appropriate criteria may lead to erroneous conclusions.13–18 Prevailing guidance calls for systematic reviews to present a rigorous and comprehensive assessment of the risk of bias of primary studies and to incorporate risk of bias assessments in the synthesis and interpretation of findings.11

While methods for the assessment of risk of bias of RCTs have been well established, criteria for the assessment of risk of bias in non-randomised studies are less clear.17–21 Further, there are unique and complex challenges to assessing the risk of bias of nutritional epidemiologic studies, such as making judgments about the validity and reliability of dietary measures.

The objective of this study was to describe and evaluate methods for the assessment of risk of bias in systematic reviews of nutritional epidemiologic studies and to propose guidance addressing major limitations. This study capitalises on the methods and data of our previously published meta-epidemiological study of systematic reviews of nutritional epidemiologic studies.6

Methods

We registered the protocol for this study at the Open Science Framework (https://osf.io/wr6uy).

Search strategy

With the help of an experienced research librarian, we searched MEDLINE and EMBASE from January 2018 to August 2019 and the Cochrane Database of Systematic Reviews from January 2018 up to February 2019 for systematic reviews of nutritional epidemiologic studies (online supplemental material 1).6

Study selection

We included systematic reviews if they investigated the association between one or more nutritional exposures and health outcomes and reported on one or more epidemiologic studies.6 We defined systematic reviews as studies that explicitly described a search strategy (including at minimum databases searched) and eligibility criteria (including at minimum the exposure(s) and health outcome(s) of interest)1; epidemiologic studies as non-randomised, non-experimental studies (eg, cohort studies) that include a minimum of 500 participants2; nutritional exposures as macronutrients, micronutrients, bioactive compounds, foods, beverages or dietary patterns; and health outcomes as measures of morbidity, mortality and quality of life.6 We did not restrict eligibility based on the language of publication.6 We excluded scoping and narrative reviews, reviews of acute postprandial studies, and reviews of supplements and chemicals involuntarily consumed through the diet.6

Reviewers performed screening independently and in duplicate following calibration exercises. We resolved disagreements by discussion or by third-party adjudication. We estimated that 150 reviews would allow estimation of the prevalence of even uncommon review characteristics (ie, prevalence ∼5% of studies) with acceptable precision (ie, ±3.5%).6 18 Our sample of 150 eligible reviews was selected using a computer-generated random number sequence.
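The stated precision can be reproduced approximately with a short calculation. The sketch below assumes a normal-approximation (Wald) 95% confidence interval for a proportion; the text does not state which interval method was used, so the function name and the choice of method are illustrative assumptions.

```python
import math

def wald_half_width(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation (Wald) CI for a proportion.

    Assumption: the Wald interval is used here for illustration only;
    the source does not name the method behind the +/-3.5% figure.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Prevalence of about 5% estimated from a sample of 150 reviews.
half_width = wald_half_width(p=0.05, n=150)
print(f"+/-{100 * half_width:.1f} percentage points")  # +/-3.5 percentage points
```

Under this assumption, a characteristic present in about 5% of reviews would be estimated to within roughly 3.5 percentage points, which matches the figure quoted above.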

Data collection

Following calibration exercises, reviewers, working independently and in duplicate, extracted the following information from each review using a standardised and pilot-tested data collection form: research question; eligibility criteria; methods and criteria used for the assessment of risk of bias; presentation and reporting of risk of bias; details related to how assessments of risk of bias were incorporated in the analysis and the interpretation of findings. Items of the data collection form were drawn from authoritative sources that had published guidance on optimal practices for assessing risk of bias in systematic reviews, data collection forms of previous studies, and literature on methodological issues relevant to the assessment of risk of bias in non-randomised studies and nutritional epidemiologic studies.12 22–27

We collected information on any tools or criteria that included one or more items or domains that addressed the internal validity of studies or the likelihood of bias or were interpreted by review authors as indicators of bias or internal validity. In order to evaluate both appropriate and inappropriate methods by which reviews assessed risk of bias, we collected information on tools and criteria regardless of whether they were originally designed to address risk of bias or whether they were valid indicators of risk of bias. For example, some reviews applied and interpreted reporting checklists. In such cases, we still collected information on the reporting checklist if it was interpreted by the review authors as an indicator of internal validity or risk of bias. We did this because we were also interested in estimating the proportion of reviews that assess risk of bias using inappropriate methods. For reviews that also included RCTs or other experimental designs in addition to nutritional epidemiologic studies, we only collected data on the tools that were used to assess the risk of bias of nutritional epidemiologic studies.

We reviewed tools and ad hoc criteria and categorised their items and domains according to the type of biases that they addressed. We used the domains of the Cochrane ROBINS-I tool as a framework for categorisation and created additional categories as necessary.7 We classified risk of bias criteria as ad hoc when a study developed a set of criteria de novo to assess risk of bias. We classified risk of bias tools as scales if each item was assigned a numerical score and the tool yielded an overall summary score, as checklists if judgements for each item were presented individually and not aggregated with other items, and domain based if judgements were presented across domains with at least one domain composed of more than one item.28

Data synthesis and analysis

To synthesise the data, we present frequencies and percentages for dichotomous outcomes and medians and IQRs for continuous outcomes.
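As a minimal sketch of these summaries, the example below recomputes the percentages reported in the abstract from their counts and shows a median and IQR for a small set of values. The `studies_per_review` list is hypothetical and only illustrates the calculation; the quartile method (`inclusive`) is likewise an assumption, since the text does not specify one.

```python
import statistics

def percentage(count: int, total: int) -> float:
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * count / total, 1)

# Counts taken from the abstract.
attempted_rob = percentage(131, 150)      # reviews that attempted to assess risk of bias
no_selective = percentage(25, 28)         # tools not addressing selective reporting
not_in_synthesis = percentage(66, 101)    # reviews not incorporating risk of bias in synthesis
print(attempted_rob, no_selective, not_in_synthesis)  # 87.3 89.3 65.3

# Hypothetical continuous outcome (number of studies per review), for illustration only.
studies_per_review = [8, 12, 15, 21, 30]
median = statistics.median(studies_per_review)
q1, _, q3 = statistics.quantiles(studies_per_review, n=4, method="inclusive")
print(f"median {median} (IQR {q1} to {q3})")
```

Different quartile conventions can give slightly different IQR bounds for small samples, so the method should be reported alongside the result.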

Results

Online supplemental material 2 presents details of the selection of systematic reviews. We retrieved a total of 4267 unique records and screened a random sample of 2273 titles and abstracts and 184 full-text articles to identify a sample of 150 eligible reviews.

General characteristics of systematic reviews

Table 1 presents general characteristics of systematic reviews. Reviews were most frequently published in general nutrition journals by authors from Europe or Asia. Only a small minority of reviews were conducted to inform a particular guideline or policy decision or to fulfil the needs of a specific evidence user. A very small minority of reviews were funded by marketing/advocacy organisations or food companies and most were funded by either government agencies or institutions. Reviews most frequently reported on cancer morbidity and mortality, foods or beverages, and included a median of 15 studies and 200 000 participants. Three quarters of reviews conducted meta-analysis. Nearly all reviews included cohort studies and more than half included case-control studies. More than three quarters of reviews attempted to assess the risk of bias of included studies.

Table 1

General characteristics of systematic reviews

Risk of bias methods and reporting

Table 2 presents details on the methods by which risk of bias was assessed and reported in reviews. The most commonly used tool was the Newcastle-Ottawa scale. Among reviews that used modified versions of published tools, nearly all used modifications of the Newcastle-Ottawa scale, which was modified either to include alternative response options or to be applicable to cross-sectional studies (the original Newcastle-Ottawa scale is designed only for cohort and case-control studies).29 Three reviews used modified versions of the Critical Appraisal Skills Programme (CASP) cohort study checklist,30 the NIH Quality Assessment Tool for Observational Cohort and Cross-Sectional Studies,31 and the American Diabetes Association Quality Criteria Checklist32; the former two were modified to include only a subset of the original items,33 34 and the latter included a subset of the original items as well as several additional items.35 Nearly all risk of bias tools were scales and only a minority were checklists or domain based. Only half of all reviews reported assessing risk of bias in duplicate.

Table 2

Risk of bias methods and reporting

Nearly a quarter of reviews only reported the range of risk of bias ratings across studies (eg, “The [Newcastle-Ottawa scale] scores ranged from 6 to 9.”36) rather than risk of bias ratings for each study. More than a third of reviews failed to report judgements for each risk of bias item or domain. Among reviews that reported on more than one outcome across which risk of bias could conceivably differ, risk of bias was seldom assessed separately for each outcome.

Characteristics of risk of bias tools and ad hoc criteria

Table 3 presents characteristics of the tools and ad hoc criteria that were used to assess risk of bias. One review reported using the ‘cross-sectional study quality assessment criteria’ but did not provide a reference to the tool or report any other details on the tool and so it was excluded from our analysis.37 The majority of tools addressed biases due to confounding, classification of the exposure and measurement of the outcome. Biases due to selection of the participants and missing data were addressed by approximately half of the tools and biases due to departures from the intended exposure and selection of the reported results were rarely addressed. Nearly all tools included one or more constructs unrelated to risk of bias, such as reporting, generalisability (external validity), or precision.

Table 3

Characteristics of risk of bias tools and ad hoc criteria

Incorporation of risk of bias in the synthesis of results

Table 4 presents details on how reviews incorporated assessments of risk of bias in the synthesis of results. Two reviews excluded studies at high risk of bias from meta-analysis.13 14 Less than half of reviews explored potential differences in results between studies at higher versus lower risk of bias. Among those that did, nearly all reviews conducted either subgroup analyses or meta-regressions based on the overall rating of study risk of bias. Reviews rarely detected any statistically significant differences between the results of studies at lower versus higher risk of bias. None of the reviews weighted studies in meta-analyses based on risk of bias, implemented credibility ceilings, or attempted to adjust the results of studies for bias.

Table 4

Incorporation of risk of bias in the synthesis of results

Incorporation of risk of bias in the interpretation of review findings

Table 5 presents details related to how risk of bias assessments informed the interpretation of findings. Less than one-fifth of reviews described the overall risk of bias of the body of evidence. While reviews frequently considered biases due to confounding and misclassification of the exposure as potential limitations, biases due to the selection of the participants in the study, departures from the intended exposure, missing outcome data, measurement of the outcome and selective reporting were rarely considered. Reviews rarely described or hypothesised the direction in which results may have been biased. Among the few that did, most hypothesised that results for studies at high risk of bias are likely to have been biased towards the null. Sixteen reviews evaluated the certainty of evidence using a formal system, all of which included considerations related to risk of bias, among which five downgraded the certainty of evidence due to risk of bias.

Table 5

Incorporation of risk of bias in the interpretation of review findings

Discussion

Main findings

Our investigation provides a comprehensive summary of how systematic reviews of nutritional epidemiologic studies assess and report risk of bias and how risk of bias assessments inform the synthesis of results and the interpretation of review findings.

We found that while most reviews attempted to assess risk of bias, the tools and criteria which were used often had serious limitations. For example, commonly used tools frequently neglect to address biases related to departures from the intended exposure and selective reporting and often conflate risk of bias with other study characteristics, such as reporting quality, generalisability and precision. Tools that conflate other study characteristics with risk of bias—often referred to as study quality tools—are poorly suited for the assessment of risk of bias.27 Some reviews even used reporting checklists, like the STROBE checklist, but interpreted these measures as indicators of internal validity or risk of bias. Furthermore, tools often only partially addressed certain biases. The Newcastle-Ottawa scale, for example, includes items related to the selection of participants, but it does not address all potential issues that may arise due to the suboptimal selection of participants (eg, immortal time bias, inception bias).38 39

We also found that existing tools did not provide sufficient guidance to facilitate application, particularly for nutritional epidemiologic studies. While many tools, for example, addressed bias due to classification of the exposure, none provided sufficient guidance for reviewers to make judgements regarding whether tools for measuring dietary exposures are sufficiently valid and reliable, which highlights the need for additional nutrition-specific guidance for applying risk of bias tools.

We identified serious limitations related to how reviews assessed risk of bias. Despite the possibility for risk of bias to vary across outcomes, reviews seldom assessed risk of bias for each outcome individually.7 11 Further, most reviews assigned a numerical rating of risk of bias to each study—a practice that is discouraged because it requires arbitrary assumptions about the relative weights of risk of bias items and domains.40–42

We often found the assessment of risk of bias in reviews to be of questionable validity. For example, reviews rated a median of only 0% (IQR 0% to 25.9%) of studies at high risk of bias. This finding is consistent with previous evidence suggesting that common risk of bias tools, such as the Newcastle-Ottawa Scale, poorly discriminate between studies at lower versus higher risk of bias43 but is striking since risk of bias issues are ubiquitous in nutritional epidemiology.44–46 Commonly used dietary measures in nutritional epidemiologic studies, for example, have very serious limitations.46–49 Furthermore, nutritional epidemiologic studies are usually at risk of selective reporting bias due to the virtual absence of standard practices for the registration of protocols and statistical analysis plans.44 50–52 Our findings also suggest that review authors may disregard biases that they consider to be inherent to the design of nutritional epidemiologic studies.

We identified important deficiencies related to the reporting of risk of bias. Among reviews that assessed risk of bias, for example, nearly half did not report risk of bias judgements for each item or domain of the risk of bias tool. Further, reviews rarely described the criteria that were used to judge each risk of bias item or domain. For example, while almost all tools included an item or domain addressing risk of bias related to the measurement of the exposure, criteria for classifying measures as sufficiently valid and reliable and at low risk of bias were seldom described. Such deficiencies in reporting prevent evidence users from understanding the nature and extent of biases in studies.

We found most reviews did not sufficiently address risk of bias in their synthesis of results or interpretation of findings. Only half of reviews, for example, incorporated risk of bias assessments in statistical analyses, which is important to detect potential differences in results between studies at higher versus lower risk of bias.16 While review authors often discussed the possibility of confounding and misclassification of the exposure, other important biases, such as biases due to missing data and selective reporting, were rarely discussed. Finally, review authors often neglected to make a judgement regarding the overall risk of bias of the body of the evidence, which is a critical step in evaluating the overall certainty of evidence.14

We hypothesise that reviews of RCTs addressing nutrition interventions also have limitations related to the assessment and interpretation of risk of bias. We restricted the scope of this research to reviews of nutritional epidemiologic studies because methods for the assessment of risk of bias of RCTs are better established than for non-randomised studies and there are unique challenges to assessing the risk of bias of nutritional epidemiologic studies, such as assessing the validity and reliability of dietary measures.

Relation to previous work

To our knowledge, our study is the first to evaluate methods for the assessment of risk of bias in systematic reviews of nutritional epidemiologic studies. Previous studies that have addressed the assessment of risk of bias in general biomedical reviews have also found that reviews use a range of different tools to assess the risk of bias of non-randomised studies,53–56 existing risk of bias tools do not address all important sources of bias in non-randomised studies and often include constructs that are unrelated to risk of bias,54 56 and that reviews often fail to incorporate assessments of risk of bias in the synthesis of results and interpretation of findings.57–59 Our findings add to the body of evidence that suggests advancements in methods for the assessment of risk of bias—both in nutritional epidemiology and in other fields comprised primarily of non-randomised studies—are urgently needed.

Implications and recommendations

Evidence users should be aware that risk of bias assessments in reviews of nutritional epidemiologic studies often have important limitations, which may make the findings of such reviews misleading.13–18 We have compiled a list of recommendations for review authors that describe optimal methods for the assessment, reporting and interpretation of risk of bias in reviews of nutritional epidemiologic studies (box 1). We acknowledge, however, that there is great uncertainty in optimal tools and methods for the assessment of risk of bias in nutritional epidemiology. Our recommendations provide guidance on accepted best practice in the interim until further advancements.

Box 1

Recommendations for authors of systematic reviews addressing the assessment of risk of bias of nutritional epidemiologic studies

1. Assess the risk of bias of included studies using an appropriate tool or set of criteria.

Review authors should assess the risk of bias of all included studies. Our investigation shows that there is currently no consensus among review authors on the optimal tool for the assessment of risk of bias of nutritional epidemiologic studies and that many commonly used tools have important limitations. Review authors should select a tool that addresses all potential sources of bias in non-randomised studies, including biases due to confounding, inappropriate criteria for selection of participants, error in the measurement of the exposure and outcome, departures from the intended exposure, missing outcome data and selective reporting, and that does not include constructs unrelated to risk of bias, such as reporting, precision or generalisability.11 12 Review authors should avoid using quality tools that combine risk of bias with other constructs since such tools are poor indicators of risk of bias.

The selected tool should assign studies a qualitative category, and not a quantitative score, representing the degree of risk of bias (eg, ‘low risk’, ‘moderate risk’, ‘serious risk’ and ‘critical risk’).42 The overall study risk of bias should reflect the highest rated risk of bias item or domain (ie, a single limitation in a crucial aspect of the study should be considered sufficient to put the study at high risk of bias).
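The rule that the overall rating reflects the most severe item or domain can be sketched as follows. The category labels follow the examples given in the text; the domain names and judgements in `example` are hypothetical, and this simple maximum is an illustration of the stated principle rather than the full decision rules of any particular tool.

```python
# Ordered from least to most concerning; labels follow the example categories in the text.
LEVELS = ["low", "moderate", "serious", "critical"]

def overall_risk_of_bias(domain_judgements: dict[str, str]) -> str:
    """Return the most severe judgement across all items or domains."""
    return max(domain_judgements.values(), key=LEVELS.index)

# Hypothetical judgements for a single study result (illustrative only).
example = {
    "confounding": "serious",
    "selection of participants": "low",
    "classification of exposure": "moderate",
    "departures from intended exposure": "low",
    "missing data": "moderate",
    "measurement of outcome": "low",
    "selective reporting": "moderate",
}
overall = overall_risk_of_bias(example)
print(overall)  # serious
```

Because no numerical scores are summed, a single serious limitation determines the overall rating regardless of how many other domains are rated at low risk.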

We direct review authors to the Cochrane-endorsed ROBINS-I tool, which addresses all established sources of bias in non-randomised studies, does not include unrelated constructs and is accompanied by an additional guidance document for its implementation.7 A similar tool, called the ROBINS-E, for the assessment of risk of bias of non-randomised studies of exposures, modelled after the ROBINS-I, is currently under development.83 A preliminary version of the ROBINS-E tool is available for piloting (https://www.bristol.ac.uk/population-health-sciences/centres/cresyda/barr/riskofbias/robins-e/) and shares much of the same structure and guiding questions as the ROBINS-I. Despite concerns about their complexity and low inter-rater reliability, the ROBINS-E and ROBINS-I tools appear to be the most rigorous and comprehensive tools available for the assessment of risk of bias of non-randomised studies.84 85

Researchers in nutrition and environmental sciences are often concerned that evidence from non-randomised studies may be discounted in favour of randomised trials despite feasibility concerns with conducting rigorous trials in these fields.86 ROBINS-I, however, may be the least likely among available risk of bias tools for non-randomised studies to discount evidence from non-randomised studies since it accommodates situations in which non-randomised studies may provide high or moderate certainty evidence, similar to RCTs.87

ROBINS-I’s consideration of the magnitude of bias as ranging from low risk to critical risk may be considered another advantage over other tools that simply classify studies as being at low or high risk of bias. We note, however, that judgements related to the magnitude or importance of bias are complex, most often not justified by empirical evidence, and difficult for users to make.40–42

2. Assess risk of bias in comparison to a ‘target’ RCT.

Bias can arise due to the actions of study investigators (eg, failure to follow up all study participants) or may be unavoidable due to constraints on how studies addressing a particular question can be designed.11 Our findings suggest that the latter category of bias may often be neglected by review authors. The assessment of risk of bias relative to a target RCT—a hypothetical RCT that may or may not be feasible, which addresses the question of interest without any features putting it at risk of bias—provides a benchmark against which risk of bias can be assessed and can ensure that biases that are inherent to the design of studies addressing particular questions are also accounted for.7 39 88 This approach is also incorporated in the ROBINS-I tool.7

Some nutritional and environmental science researchers have expressed concern with this approach since a trial of nutritional or environmental exposures may not be feasible.86 89 We emphasise that the ‘target’ RCT need not be feasible or practical. This is important because ‘target’ RCTs will not be limited by typical limitations of dietary trials such as poor adherence and attrition due to the need for long follow-up.

3. Report all criteria that were used to judge each risk of bias item or domain.

Review authors will need to make judgements regarding which study design features sufficiently protect against bias and which design features may lead to bias. For example, review authors will need to identify factors that may act as confounders for the question being addressed and will need to determine which methods for the measurement of the exposure and outcome of interest are valid and reliable. Ideally, review authors should develop criteria to make these judgements a priori to prevent risk of bias assessments from being influenced by the results of studies.

Authors of nutritional epidemiology reviews may find making judgements related to biases due to confounding and the classification of the exposure to be particularly challenging. When deciding on the list of potential confounders that should be controlled for a study to be considered at low risk of confounding bias, review authors should consider the evidence on prognostic factors of the outcome of interest and correlates of the exposure. The list of confounders should not be generated solely on the basis of confounders considered in primary studies (at least, not without some form of independent confirmation).90 We refer the reader to other sources that describe optimal methods for the selection of confounders.90–93

In making judgements regarding the risk of bias associated with the classification of the exposure, review authors can typically rate well-established biomarkers of nutritional exposures at lowest risk of bias (eg, 24-hour urinary sodium excretion for sodium intake).94 Dietary records and diaries can usually be considered more valid than recall-based methods, although all self-reported methods suffer from serious limitations.47 48 95 96 The validity of food frequency questionnaires and other recall-based methods also depends on the results of validation studies.48 A questionnaire may be sufficiently valid for some exposures and may not have been validated or may not be valid for other exposures and so review authors should look for results of validation studies specific to the exposure being investigated. Review authors should also consider food composition databases that are used to derive nutrient intake levels from food intake data. For these databases to be considered valid, they should represent the nutrient composition of the foods at the time they were consumed and account for variations.97 For example, there are wide variations in the nutrient content of different brands of fruit juices and breakfast cereals and food composition databases that do not include brand-specific information may not yield accurate nutrient intake data from these foods.

4. Conduct risk of bias assessments in duplicate.

To reduce the risk for errors, review authors should conduct risk of bias assessments in duplicate.98 99 Review authors should resolve discrepancies by discussion or, when discussion is insufficient, adjudication by a third party with expertise in methodology and nutritional epidemiology.

5. Conduct risk of bias assessments for each result used in the synthesis.

A single non-randomised study may report several numerical results representing the effect of a single nutritional exposure on a health outcome. For example, a study may report the association between multiple eligible measures of the exposure and outcome, at multiple eligible timepoints, or using several analytical specifications, which may all vary in their risk of bias. Hence, review authors should perform risk of bias assessments separately for each numerical result that is extracted and used in the synthesis.11

6. Report risk of bias judgements for all items or domains of the risk of bias tool.

For all studies, review authors should report risk of bias judgements for all items or domains of the risk of bias tool. One way in which this information can be presented is through traffic light plots that use a colour-coded system to represent risk of bias ratings across items or domains.11 Traffic light plots may also be presented adjacent to forest plots to allow evidence users to simultaneously visualise the results from studies, their relative contributions in the meta-analysis and their risk of bias. This approach allows evidence users to identify the risk of bias of the most influential studies and to identify variations in the results of studies based on risk of bias ratings.
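The layout of such a plot can be sketched in plain text, with symbols standing in for colours. The studies, domains, ratings and symbol mapping below are hypothetical, and a published plot would normally be produced with dedicated plotting software.

```python
# Assumed mapping from rating to symbol (a stand-in for green, yellow and red).
SYMBOLS = {"low": "+", "some concerns": "?", "high": "-"}

# Hypothetical domains and ratings.
domains = ["confounding", "exposure classification", "selective reporting"]
ratings = {
    "Study 1": {
        "confounding": "high",
        "exposure classification": "low",
        "selective reporting": "some concerns",
    },
    "Study 2": {
        "confounding": "low",
        "exposure classification": "low",
        "selective reporting": "low",
    },
}


def traffic_light_rows(ratings, domains):
    """Build one text row per study, with one symbol per domain in a fixed order."""
    rows = []
    for study, by_domain in ratings.items():
        cells = " ".join(SYMBOLS[by_domain[domain]] for domain in domains)
        rows.append(f"{study:<10} {cells}")
    return rows


rows = traffic_light_rows(ratings, domains)

# Print a legend for the columns, then the table itself.
for index, domain in enumerate(domains, start=1):
    print(f"D{index}: {domain}")
for row in rows:
    print(row)
```

Every study has an entry for every domain, so no judgement is left unreported.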

7. Incorporate risk of bias in the synthesis of results.

Authors should address risk of bias in their synthesis of results using one or more of the following methods: (1) restricting the eligibility of studies for inclusion in the synthesis to only those that are at low risk of bias and conducting a sensitivity analysis including all studies (typically when there is sufficient evidence available from studies at low risk of bias); (2) performing subgroup analyses or meta-regressions to explore differences in results of studies at lower or higher risk of bias; or (3) reporting results based on all available studies, along with a description of the risk of bias and how and to what degree bias may have influenced the results. The third strategy is the least informative of the three but is the only plausible approach when there is no variability in risk of bias across studies.11 Review authors should use either the first or third method when the synthesis is narrative rather than quantitative.
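The first two methods can be illustrated with a minimal sketch. It assumes a fixed-effect, inverse-variance model on the log relative risk scale, which is our choice for brevity and not a model prescribed here; the study labels, effect estimates, standard errors and ratings are hypothetical.

```python
import math

# Hypothetical results: (label, log relative risk, standard error, risk of bias).
studies = [
    ("Study 1", 0.10, 0.10, "low"),
    ("Study 2", 0.05, 0.20, "low"),
    ("Study 3", 0.40, 0.10, "high"),
    ("Study 4", 0.30, 0.20, "high"),
]


def pool(rows):
    """Fixed-effect inverse-variance pooled estimate and its standard error."""
    weights = [1 / se**2 for _, _, se, _ in rows]
    weighted_sum = sum(w * estimate for w, (_, estimate, _, _) in zip(weights, rows))
    pooled = weighted_sum / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se


def subset(rows, rating):
    """Keep only the results with the given risk of bias rating."""
    return [row for row in rows if row[3] == rating]


# Method 1: primary analysis restricted to low risk of bias,
# with a sensitivity analysis that includes all studies.
primary = pool(subset(studies, "low"))
sensitivity = pool(studies)

# Method 2: subgroup estimates by risk of bias rating.
by_rating = {rating: pool(subset(studies, rating)) for rating in ("low", "high")}

print(f"Primary (low risk of bias only): RR = {math.exp(primary[0]):.2f}")
print(f"Sensitivity (all studies):       RR = {math.exp(sensitivity[0]):.2f}")
for rating, (estimate, _) in by_rating.items():
    print(f"Subgroup, {rating} risk of bias: RR = {math.exp(estimate):.2f}")
```

A random-effects model or a meta-regression would follow the same logic, with the risk of bias rating used to restrict, stratify or explain the results.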

Review authors may also attempt to adjust the results of studies to remove bias or to incorporate the uncertainty of results from non-randomised studies as part of the study precision.100 The operationalisation of these approaches, however, is difficult because it requires review authors to make assumptions about the direction and/or magnitude of biases, either based on expert opinion, which is often arbitrary, or based on empirical evidence, which is limited.

8. Incorporate risk of bias in the interpretation of review findings.

Review authors should make a judgement regarding the overall risk of bias across the body of evidence to inform the evaluation of the overall certainty of evidence. For example, the application of the GRADE approach, the most widely endorsed system for evaluating the certainty of evidence, requires review authors to make a judgement regarding whether the certainty of evidence should be downgraded for risk of bias.15 In making this judgement, review authors should consider the relative contribution of studies at higher versus lower risk of bias (ie, larger studies or studies with a greater number of events contribute more than smaller studies or studies with few events) and whether there are appreciable differences in the results of studies at higher versus lower risk of bias.101 When studies at high risk of bias contribute substantially and when there is insufficient evidence from studies at low risk of bias, review authors should express less certainty in the effect estimate. Alternatively, when studies at high risk of bias contribute little or when studies at high and low risk of bias report consistent results, review authors may not need to be concerned about risk of bias. When there are appreciable differences in the results of studies at higher versus lower risk of bias, estimates from studies at lower risk of bias may be considered more credible and review conclusions may be primarily based on these lower risk of bias studies. Review authors should justify their decision whether to rate down the certainty of evidence for risk of bias and justify their concerns, if any, related to risk of bias. Review authors must also be cognisant of personal biases, including perceptions of ‘landmark’ studies by established names in the field as automatically highly credible.
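The relative contribution of studies at higher versus lower risk of bias could be quantified as in the sketch below. It assumes that contribution is measured by inverse-variance weight, which is one possible operationalisation rather than the only one, and the studies, standard errors and ratings are hypothetical.

```python
# Hypothetical results: (label, standard error of the estimate, risk of bias).
studies = [
    ("Study 1", 0.10, "low"),
    ("Study 2", 0.20, "high"),
    ("Study 3", 0.20, "high"),
]


def weight_share_by_rating(rows):
    """Share of the total inverse-variance weight held by each risk of bias rating."""
    weights = {}
    for _, se, rating in rows:
        # More precise results (smaller standard errors) receive more weight.
        weights[rating] = weights.get(rating, 0.0) + 1 / se**2
    total = sum(weights.values())
    return {rating: weight / total for rating, weight in weights.items()}


shares = weight_share_by_rating(studies)
for rating, share in shares.items():
    print(f"{rating} risk of bias: {share:.0%} of the total weight")
```

Here two of the three hypothetical studies are at high risk of bias, yet they hold only a minority of the weight because they are less precise. A count of studies alone would overstate their contribution.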

9. Involve researchers with substantive knowledge of the topic.

Researchers with substantive knowledge of the topic can suggest important criteria for the assessment of each risk of bias domain (eg, the validity and reliability of dietary measures) and should be consulted in the assessment of risk of bias.

Strengths and limitations

This study summarises current methods for the assessment of risk of bias in reviews of nutritional epidemiologic studies and presents recommendations for review authors to improve risk of bias assessments in future reviews. Other strengths of this study include the duplicate assessment of review eligibility and data collection, which reduces the risk for errors.

This study also has limitations. While we identified many deficiencies and errors in the assessment of risk of bias in reviews, the extent to which such issues may have affected the conclusions drawn by reviewers and the implementation of evidence is unclear. Empirical evidence suggests that the failure to appropriately consider risk of bias can reduce the interpretability of review findings and even lead to misleading conclusions.13–18 It is possible, however, that evidence users rely on the most rigorous systematic reviews (those that apply rigorous methods for the assessment of risk of bias), in which case the impact of the issues described in our study may be negligible.

Our analysis is also limited by the possibility that review authors did not address certain biases in the assessment of risk of bias or in the interpretation of review findings because they were deemed to not be a concern in the included studies. However, failure to use comprehensive risk of bias tools combined with failure to adequately address risk of bias in the interpretation of findings leaves evidence users unable to gauge the overall risk of bias of the evidence.

Conclusion

Systematic reviews of nutritional epidemiologic studies often have important limitations related to their assessment of risk of bias. Review authors can improve risk of bias assessments in future reviews by using tools that address all potential sources of bias in non-randomised studies without including unrelated constructs, by transparently reporting all risk of bias judgements, and by incorporating risk of bias assessments in the synthesis of results and the interpretation of findings. Additional guidance for applying existing risk of bias tools to non-randomised studies, particularly nutritional epidemiologic studies, is needed.

Data availability statement

Data are available in a public, open access repository (https://osf.io/WYQHE/).

Footnotes

  • Twitter @dena.zera

  • Contributors DZ, JB and RJdS designed the study. DZ, AK, AB, REM, IC, AG, DL, AM, ES, KA and MA collected data. DZ and AK analysed data. DZ, AK, JB and RJdS interpreted the data. DZ produced the first draft of the article. DZ, JB and RJdS provided critical revision of the article for important intellectual content. All authors approved the final version of the article. DZ and RJdS are the guarantors.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests RJ de Souza has served as an external resource person to the World Health Organization’s Nutrition Guidelines Advisory Group on trans fats, saturated fats, and polyunsaturated fats. The WHO paid for his travel and accommodation to attend meetings from 2012-2017 to present and discuss this work. He has also done contract research for the Canadian Institutes of Health Research’s Institute of Nutrition, Metabolism, and Diabetes, Health Canada, and the World Health Organization for which he received remuneration. He has received speaker’s fees from the University of Toronto, and McMaster Children’s Hospital. He has held grants from the Canadian Institutes of Health Research, Canadian Foundation for Dietetic Research, Population Health Research Institute, and Hamilton Health Sciences Corporation as a principal investigator, and is a co-investigator on several funded team grants from the Canadian Institutes of Health Research. He serves as a member of the Nutrition Science Advisory Committee to Health Canada (Government of Canada), a co-opted member of the Scientific Advisory Committee on Nutrition (SACN) Subgroup on the Framework for the Evaluation of Evidence (Public Health England), and as an independent director of the Helderleigh Foundation (Canada).

  • Provenance and peer review Not commissioned; externally peer reviewed.
