Abstract
The purpose of this article, part 1 of 2 on randomised controlled trials (RCTs), is to provide readers of the nutrition literature (eg, clinicians, patients, health service and policy decision-makers) with structured guidance on interpreting RCTs. Evaluation of a given RCT involves several considerations, including the potential for risk of bias, the assessment of estimates of effect and their corresponding precision, and the applicability of the evidence to one’s patient. Risk of bias refers to flaws in the design or conduct of a study that may lead to a deviation from measuring the underlying true effect of an intervention. Risk of bias is assessed on a continuum from very low to very high (ie, definitely low to definitely high) risk of yielding estimates that do not represent true treatment-related effects; when appraising a study, judgement involves some degree of subjectivity. Specifically, when evaluating the risk of bias, one must first consider whether patient baseline characteristics (eg, age, smoking) are balanced between groups at randomisation, referred to as prognostic balance, and whether this balance is maintained throughout the study. While randomisation in sufficiently large trials may ensure prognostic balance between study arms at baseline, concealment of randomisation and blinding of participants, healthcare providers, data collectors, outcome adjudicators and data analysts to treatment allocation are needed to maintain prognostic balance between study arms after a trial begins. The status of each participant with respect to outcomes of interest must be known at the conclusion of a trial; when this is not the case, missing (lost) participant outcome data increases the likelihood that prognostic balance was not maintained at study completion. In addition, analysis of participants in the groups to which they were initially randomised (ie, intention-to-treat analysis) offers a reliable method to maintain prognostic balance.
Finally, trials terminated early risk overestimating the treatment effect, especially when sample size is limited or stopping boundaries are not defined a priori.
- Dietary patterns
- Nutritional treatment
- Medical education
- Evidence based practice
- Critical appraisal
Data availability statement
Data sharing not applicable as no datasets generated and/or analysed for this study.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Clinical scenario
You are a registered dietitian, working with a family doctor following a 62-year-old Hispanic man with a history of hypertension and dyslipidaemia. Your patient also has a family history of cardiovascular-related mortality and is taking two medications (thiazide diuretic and a statin). Recently, he lost a sibling to a fatal myocardial infarction. Given his medical history and the unexpected death of his relative, he is concerned about his own risk of a myocardial infarction and related cardiovascular outcomes. Your patient follows a Western-style diet that he considers reasonably healthy because of weekly fish and avocado intake, but he is interested in making dietary changes. He asks for your opinion on the Mediterranean-style diet, a regimen his friend recently adopted with the intention of improving his heart health. You recall having read a recent article about the various study design methods used to assess Mediterranean-style dietary interventions and you let your patient know that you will review the research conducted on this topic before his follow-up appointment with you in 2 weeks.
Step 1: finding the evidence
As a practitioner aiming to be competent in evidence-based nutrition practice,1 as a first step, you formulate the relevant question for this individual: in a patient with cardiovascular risk factors consuming a Western-style diet, does switching to a Mediterranean-style diet (henceforth referred to as Mediterranean diet) reduce the risk of cardiovascular events (including myocardial infarction)? You then consider whether any recent systematic reviews or clinical practice guidelines have addressed your question (the focus of subsequent Nutrition Users’ Guides2 3). To rapidly identify a synopsis or best available evidence, you are aware of three relevant databases: UpToDate, Evidence Analysis Library and Practice-based Evidence in Nutrition (PEN) (online evidence-based clinical practice resources to inform clinical decisions). You choose PEN and type in ‘Mediterranean diet’, which reveals an article titled: ‘Diet composition: Mediterranean diet summary of recommendations and evidence’. Scrolling through the section, you identify a summary of evidence regarding the Mediterranean diet for primary prevention. As you read through, you notice the focus on the largest, landmark randomised controlled trial (RCT) assessing the Mediterranean diet for major cardiovascular outcomes, PREDIMED4 (Prevención con Dieta Mediterránea), and you download the freely available article.
The PREDIMED trial included 7447 participants aged 55–80 years with either type 2 diabetes mellitus or three or more cardiovascular risk factors, but no history of cardiovascular disease. Participants were randomly allocated to a multicomponent behavioural intervention aimed at encouraging participants to consume a Mediterranean diet either enriched in extravirgin olive oil (Med+EVOO) or mixed nuts (Med+nuts), facilitated by the provision of foods to participants, or to a control group that received dietary advice aimed at achieving a low-fat dietary pattern. The median follow-up was 4.8 years. A priori, the authors specified a primary composite outcome (myocardial infarction, stroke and cardiovascular death) and secondary outcomes of each component of the composite as well as all-cause mortality.4
Step 2: using an RCT of a nutritional intervention to guide dietary choices
Having decided an article is relevant, you proceed to evaluate the related risk of bias, consider the estimates of effect and their precision and the applicability of the evidence. This article draws from the JAMA Users’ Guides series5 and outlines a structured approach for evaluating the risk of bias in RCTs of nutritional interventions (box 1). Bias is defined as a systematic deviation from the underlying truth because of a feature of the design or conduct of a research study (eg, overestimation of a treatment effect because of a failure to randomise).5
Box 1 Questions to assess the potential for risk of bias
1. Did intervention and control groups start with the same prognosis?
1a. Was randomisation (allocation) concealed?
1b. At baseline, were participants in the study groups similar with respect to known prognostic factors?
2. Was prognostic balance maintained as the study progressed?
2a. To what extent was the study blinded?
3. Were the groups balanced prognostically at study completion?
3a. Was follow-up complete?
3b. Were participants analysed in the groups to which they were randomised (intention-to-treat analysis)?
3c. Was the trial stopped early?
Adopted from JAMA Users’ Guides to the Medical Literature (2015).
1. Did intervention and control groups start with the same prognosis?
1a. Was randomisation concealed?
When those responsible for enrolling participants are aware of the arm to which the next participant will be randomised and the schedule (randomisation list) is vulnerable to manipulation, randomisation is deemed unconcealed.6–8
Consider a member of the research staff who is eager to maximise recruitment in a trial. They become aware that, according to the randomisation schedule, the next participant enrolled (participant A) will receive a usual (control) diet and the one following (participant B) will receive the Mediterranean diet. The next person approached is willing to participate, but only if they receive the Mediterranean diet. The eager team member enrols this person as participant B rather than A, thinking they are helping the trial by making a small change. Unfortunately, such decisions compromise the randomisation schedule and consequently might introduce imbalance in the prognostic factors between study arms.
You may ask, what are prognostic factors and why is it important to achieve prognostic balance between groups by randomisation? A case in point is the prophylactic use of antioxidants such as vitamin C. Scientists believed that a diet rich in vitamin C decreases the risk of total and cardiovascular mortality through its antioxidant effects. Their hypothesis appeared to be supported by observational studies reporting that people who ingested larger quantities of fruits and vegetables high in vitamin C, as well as vitamin C supplement users, experienced lower mortality rates compared with those who ingested smaller quantities.9 However, the largest RCT failed to demonstrate that vitamin C supplementation reduced mortality events,10 a finding further confirmed by a Cochrane meta-analysis of antioxidant compounds and mortality that included 78 RCTs involving 296 707 participants.11 People who consumed larger quantities of antioxidants did have lower cardiovascular risk in observational studies, but RCTs showed that this lower risk was not attributable to their antioxidant intake. In the end, something other than antioxidant supplementation (perhaps overall dietary patterns, exercise, comorbidities or genetic background) was responsible for the risk difference initially detected in observational studies.11–13 We call these determinants of outcomes ‘prognostic factors’.
With a large sample size and an appropriate allocation concealment process, randomisation makes it less likely that prognostic factors are imbalanced between trial arms. Note, by p<0.05 standards, we would still expect that 5% of all baseline variables (prognostic factors) would be imbalanced, even in large trials.14 Unfortunately, if investigators fail to use optimal methods to ensure concealment (eg, using non-opaque or unsealed envelopes), prognostic factors may be imbalanced between the trial arms. The best strategy to ensure concealment is central (remote) randomisation by a third unbiased party with no further involvement with the trial (nowadays by a computerised system). In other words, individuals recruiting participants have no decision-making control over the arm to which the participant is assigned. Sealed opaque envelopes offer a less secure, but acceptable, alternative.
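The 5% figure above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming that baseline variables are statistically independent (real baseline characteristics rarely are) and using an illustrative count of 30 baseline variables; the function name is hypothetical.

```python
def chance_imbalance(n_variables, alpha=0.05):
    """Under perfect randomisation, return the expected number of baseline
    variables flagged as 'imbalanced' at significance level `alpha`, and the
    probability that at least one is flagged.
    Assumption: the variables are independent (a simplification)."""
    expected_flagged = n_variables * alpha
    p_at_least_one = 1 - (1 - alpha) ** n_variables
    return expected_flagged, p_at_least_one


# Illustrative count of baseline variables, not taken from a specific trial table
expected, p_any = chance_imbalance(30)
print(f"Expected number flagged by chance: {expected:.1f}")
print(f"Probability at least one is flagged: {p_any:.2f}")
```

Under these assumptions, a statistically significant baseline difference in one or two of 30 variables is what chance alone would produce, so such a difference is not, by itself, evidence that randomisation failed.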
Randomisation (allocation) concealment and PREDIMED trial
The PREDIMED trial was retracted and republished in 2018 with reanalysed data after a published report acknowledged that randomisation appeared to have been subverted for 1588 of 7447 participants.15 The 2018 reanalysis reported deviations from the original randomisation plan, including (1) assignment according to household rather than individual across multiple sites (n=425), (2) assignment according to clinic at one site (n=617) and (3) improper use of a randomisation table at one site (n=546).4 In the latter instance, it is possible that allocation was not concealed and investigators were aware of the next assignment based on the randomisation list. Although ‘closed envelopes’ were used to conceal randomisation in the pilot phase, authors reported that envelopes were not used after the pilot trial. Allocation concealment was probably at high risk of bias.
1b. At baseline, were participants in the study groups similar with respect to known prognostic factors?
Randomisation may fail to ensure prognostic balance when sample sizes are small. Imagine a small RCT testing a Mediterranean diet with only eight participants: four women and four men. One would not be surprised if, by chance, all women were allocated to the Mediterranean diet arm and all men to the control (usual diet) arm. In this case, if sex is a powerful prognostic factor for a particular outcome, the trial results would be biased: an apparent effect of the Mediterranean diet could simply reflect women doing better than men, or vice versa. Were the trial to enrol 2000 participants, one would not expect randomisation to allocate all 1000 women to one arm and all 1000 men to the other, making serious confounding by biological sex very unlikely.
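The contrast between 8 and 2000 participants can be quantified. The sketch below assumes equal numbers of women and men and an allocation scheme that places exactly half of the participants in each arm, with every such allocation equally likely; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb, log10


def p_complete_separation(n_women, n_men):
    """Probability that all women land in one arm and all men in the other
    (in either direction).
    Assumptions: equal numbers of women and men, exactly half of the
    participants per arm, all such allocations equally likely."""
    if n_women != n_men:
        raise ValueError("this sketch assumes equal numbers of women and men")
    total_allocations = comb(n_women + n_men, n_women)
    return Fraction(2, total_allocations)


small = p_complete_separation(4, 4)
large = p_complete_separation(1000, 1000)

print(small)  # 1/35
# The large-trial probability is too small for a float, so report log10 instead
log10_large = log10(large.numerator) - log10(large.denominator)
print(f"log10 of the probability with 2000 participants: {log10_large:.1f}")
```

With eight participants, complete separation by sex occurs in 1 of every 35 allocations (1 in 70 for the specific direction described above), which is far from negligible. With 2000 participants the probability is on the order of 10 to the power of minus 600.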
Typically, articles that report the findings from RCTs include a table (often, table 1) describing the baseline characteristics of the participants randomised to the intervention group(s) and the control group(s). This allows readers to assess, among other things, the extent to which randomisation facilitated balance of known prognostic factors by comparing the baseline characteristics of the groups. For most clinical questions evaluated in RCTs, well-known prognostic factors include smoking and socioeconomic status. In well-designed and conducted nutrition trials, known prognostic factors should also include baseline dietary intakes (while noting the limitations of current methods for determining diet intake) and, when possible or relevant to the intervention, indicators of baseline nutrient status if potentially valid biomarkers exist (eg, red blood cell omega-3 fatty acid status, or 25-hydroxyvitamin D status, in an omega-3 or vitamin D intervention study). Prognostic factors should mostly be balanced by randomisation; however, in small studies, prognostic imbalance can bias effect estimates.
Several strategies can be applied to explore an imbalance in prognostic factors. For example, investigators can conduct the analysis within prognostic strata (eg, comparing older participants in intervention and control groups to one another, comparing younger participants in the two groups to one another and pooling the two results), an approach known as adjusted or stratified analysis.16 Investigators can also evaluate whether prognostic factors influence observed treatment effects using independent subgroup analyses (eg, subgroups based on age to evaluate whether effects differ between older and younger patients). When considering subgroup analyses, investigators should keep in mind that examining numerous prognostic factors via subgroup analysis may result in spurious and misleading evidence of effect modification, and that criteria exist for assessing the credibility of subgroup effects (ie, the likelihood of claims being true and not spurious),17 discussed in part II on RCTs.18
It also should be noted that adjusted or subgroup analyses can only address known and measured prognostic factors, whereas proper randomisation helps ensure balance of all prognostic factors, both known and unknown.
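A stratified analysis of the kind described above can be sketched with entirely hypothetical numbers. The example below pools stratum-specific risk ratios using Mantel-Haenszel weights, which is one common pooling method and not necessarily the one used in any trial cited here. The data are deliberately exaggerated: the true risk ratio is 0.8 in both age strata, but the intervention arm contains far fewer older (higher risk) participants.

```python
def crude_risk_ratio(strata):
    """Risk ratio ignoring strata. Each stratum is a tuple of
    (events_trt, n_trt, events_ctl, n_ctl)."""
    events_trt = sum(s[0] for s in strata)
    n_trt = sum(s[1] for s in strata)
    events_ctl = sum(s[2] for s in strata)
    n_ctl = sum(s[3] for s in strata)
    return (events_trt / n_trt) / (events_ctl / n_ctl)


def mantel_haenszel_rr(strata):
    """Pooled risk ratio across strata using Mantel-Haenszel weights."""
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return numerator / denominator


# Hypothetical data: (events_trt, n_trt, events_ctl, n_ctl)
strata = [
    (24, 300, 10, 100),  # younger: risks of 8% vs 10%
    (24, 100, 90, 300),  # older: risks of 24% vs 30%
]

crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_rr(strata)
print(f"Crude RR: {crude:.2f}")
print(f"Stratified (Mantel-Haenszel) RR: {adjusted:.2f}")
```

In this made-up example the crude risk ratio (0.48) greatly overstates the benefit, while the stratified estimate recovers the risk ratio of 0.80 that holds within each stratum. The adjustment works only because age was measured; an unmeasured prognostic factor could not be handled this way.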
Baseline prognostic factors
In the PREDIMED RCT, after authors discovered that randomisation was subverted in 1588 of 7447 participants randomised (see above), investigators conducted a propensity score analysis (a type of adjusted analysis) that considered 30 baseline participant characteristics as potential prognostic factors. Adjusted and unadjusted hazard ratios (HRs) were calculated and revealed very similar results, providing reassurance that problems with lack of concealment and potential manipulation of randomisation were unlikely to have biased the results.4 16 19 Baseline prognostic factors were probably at low risk of bias.
2. Was prognostic balance maintained as the study progressed?
2a. To what extent was the study blinded?
While successful randomisation will ensure prognostic balance at baseline, it does not guarantee that the groups will remain balanced through to study conclusion. Blinding represents the optimal strategy to maintain prognostic balance after the trial begins. It helps ensure fair comparisons between study groups by preventing knowledge of the intervention from influencing participant beliefs and adherence, the care participants receive (eg, additional cointerventions) or the handling of study data.
Table 1 describes five groups of individuals who should, ideally, remain blind to treatment allocation. Keeping study participants unaware of the group to which they have been allocated is important. If individuals believe they are receiving an effective intervention, they may fare better, or simply feel better, than those with no such belief, irrespective of whether the intervention has a biological effect. Participants may also be more inclined to adhere to an intervention and less inclined to discontinue participation when they believe their assigned intervention is effective. Similarly, blinding those caring for participants, as well as those collecting, evaluating and analysing data, reduces bias (table 1).
Investigators conducting trials that focus on modifying specific foods or dietary patterns (eg, Mediterranean diet) in real-world clinical settings can seldom, if ever, blind participants or clinicians to group assignment. In dietary interventions, the inability to blind participants or clinicians to the group assignment opens the potential that participants modify their actions or behaviours in a way unique to their group assignment, beyond the prescribed dietary changes. This introduces bias and compromises the ability to ascribe observed effects to the intended intervention (postrandomisation confounding). Study investigators may be able to blind data collectors if, for instance, participants’ medical records include no information regarding participants’ diets. Further, they can blind research team members who decide whether a participant has had an outcome of interest such as myocardial infarction (ie, outcome adjudicators) and organise data analysis by labelling groups as A and B while concealing what diets A and B represent.
In the PREDIMED trial, all endpoints were examined by an external adjudication committee whose members were unaware of the group assignments. Overall, the trial was definitely at high risk of bias for unblinding of most study groups, particularly the patients and healthcare providers.
3. Were the groups prognostically balanced at the study’s completion?
3a. Was follow-up complete?
At the conclusion of an RCT, investigators should be aware of the status of each participant with respect to the outcomes of interest. The greater the number of participants for whom such information is unavailable (we call such participants ‘lost to follow-up’ or ‘missing participant outcome data’), the greater the likelihood of biased results.20
Nutrition studies often face challenges with missing outcome data because of the long durations of follow-up that are needed to observe effects on important health outcomes, durations that can be burdensome to participants. For example, RCTs of dietary interventions for weight loss often report loss to follow-up levels that exceed 20%,21 while intensive dietary interventions for preventing adenoma and carcinoma of the colon have documented levels greater than 50%.22 High levels of loss to follow-up seriously undermine the credibility of study results, particularly when the missing data are not a random subset of all observations and so may differ systematically from the available data, because it becomes less likely that prognostic factors remain evenly distributed between groups.23
The larger the number of participants lost to follow-up in relation to the number of outcome events, the greater the risk of bias. For instance, a 5% loss to follow-up might raise limited concern in an RCT in which an outcome occurs 20% of the time. However, a 5% loss to follow-up will be much more concerning if the event rate is only 5%. In such cases, relatively small differences in the fate of those with complete follow-up and those lost to follow-up may seriously bias results. For example, an RCT randomising 653 children to either probiotic agents (326 participants) or placebo (327 participants) to prevent antibiotic-associated diarrhoea showed a 70% relative risk reduction when considering only the participants followed to the study conclusion (relative risk (RR) 0.30 (95% CI 0.17 to 0.54)).24 The trial, however, suffered from substantial missing outcome data: 82 participants (25.1%) in the probiotic arm and 105 (32.1%) in the placebo arm were lost to follow-up, a number substantially greater than the number of participants that experienced the primary outcome (14 in the probiotic group and 42 in the placebo group). In a worst-case scenario (if all those lost in the intervention group experienced diarrhoea and none of those lost in the control group had diarrhoea), the results would show a twofold increase in diarrhoea, rather than a decrease, with probiotics (RR 2.29 (95% CI 1.65 to 3.18)). Sensitivity analyses that make more plausible assumptions about how participants lost to follow-up fared relative to those retained in the study (eg, those lost had twice the risk of the event vs those followed) may reduce the uncertainty related to missing data.20 23 If the complete case and ‘plausible’ sensitivity analysis show similar estimates of effect, readers can be reassured of the validity of the results.
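The complete-case and worst-case estimates quoted for the probiotic trial can be reproduced from the counts given above. The sketch below uses a standard Wald confidence interval on the log risk ratio scale. The third analysis is only one illustrative 'plausible' assumption (participants lost from the probiotic arm have twice the event risk of those followed, while those lost from the placebo arm have the same risk as those followed) and is not taken from the trial report; function names are hypothetical.

```python
from math import exp, log, sqrt


def risk_ratio_ci(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Risk ratio with a Wald 95% confidence interval on the log scale."""
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    se = sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    return rr, exp(log(rr) - z * se), exp(log(rr) + z * se)


# Counts reported in the text for the probiotic trial
n_pro, n_pla = 326, 327            # randomised per arm
lost_pro, lost_pla = 82, 105       # lost to follow-up per arm
events_pro, events_pla = 14, 42    # observed cases of diarrhoea per arm

# Complete case: only participants with known outcomes
complete_case = risk_ratio_ci(events_pro, n_pro - lost_pro,
                              events_pla, n_pla - lost_pla)

# Worst case: all lost probiotic participants had the event, no lost placebo ones did
worst_case = risk_ratio_ci(events_pro + lost_pro, n_pro, events_pla, n_pla)


def rr_with_assumed_risk(k_trt, k_ctl):
    """Point estimate assuming participants lost from each arm have k times
    the event risk of those followed in the same arm (illustrative only;
    expected event counts may be fractional)."""
    def arm_risk(events, n, lost, k):
        observed_risk = events / (n - lost)
        return (events + lost * min(1.0, k * observed_risk)) / n
    return (arm_risk(events_pro, n_pro, lost_pro, k_trt)
            / arm_risk(events_pla, n_pla, lost_pla, k_ctl))


plausible = rr_with_assumed_risk(k_trt=2, k_ctl=1)

print("Complete case RR %.2f (95%% CI %.2f to %.2f)" % complete_case)
print("Worst case    RR %.2f (95%% CI %.2f to %.2f)" % worst_case)
print("Illustrative plausible-assumption RR %.2f" % plausible)
```

The first two lines of output match the RR of 0.30 (0.17 to 0.54) and 2.29 (1.65 to 3.18) given in the text. Under the illustrative assumption, the point estimate moves towards the null but still favours probiotics, which is the kind of comparison a reader would look for in a reported sensitivity analysis.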
You might also consider the number of participants lost to follow-up between groups, where differences may introduce important bias. While drug interventions may be more likely to be associated with the risk of adverse events and result in more losses to follow-up in the experimental group (ie, participants discontinue because of side effects), differential loss to follow-up between groups in nutrition trials may more likely be a result of the intervention being more intensive than ‘usual diet’ or other controls. Participants lost to follow-up are typically more likely to suffer the target outcome of interest25 26 so clinicians using RCTs to guide practice should be aware of the potential of bias related to loss to follow-up.
3b. Were participants analysed in the groups to which they were randomised?
Investigators will undermine randomisation if they omit participants from the analysis who did not receive their assigned intervention (a ‘per-protocol’ analysis) or, worse yet, attribute events that occur in non-adherent intervention group participants to the comparison group (an ‘as-treated’ analysis). Per-protocol and as-treated analyses can undermine the prognostic balance provided by randomisation, thus potentially seriously biasing the results. Analysing participants in the groups to which they were randomised, known as ‘intention-to-treat’ analysis, offers a reliable method to maintain prognostic balance.20 27 28 It is important to note that intention-to-treat analyses do not help address potential confounding that can occur if loss to follow-up has resulted in an unequal distribution of prognostic factors, an issue that is more likely when substantive loss to follow-up has occurred.20 23 29 30
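The difference between the three analysis populations can be illustrated with a small, entirely hypothetical data set. In the sketch below, 20 of 100 participants assigned to the Mediterranean diet do not adhere and instead follow the control diet, and these non-adherent participants are assumed to have a worse prognosis (6 events among 20). All names and counts are invented for illustration.

```python
from collections import namedtuple

Participant = namedtuple("Participant", "assigned received event")


def make_group(assigned, received, n, events):
    """Build n hypothetical participants, the first `events` of whom have the outcome."""
    return [Participant(assigned, received, i < events) for i in range(n)]


participants = (
    make_group("med", "med", 80, 8)                # adherent intervention participants
    + make_group("med", "control", 20, 6)          # non-adherent, poorer prognosis
    + make_group("control", "control", 100, 20)    # control participants
)


def risk(group):
    group = list(group)
    return sum(p.event for p in group) / len(group)


def risk_ratio(treated, control):
    return risk(treated) / risk(control)


# Intention-to-treat: analyse by assigned arm, whatever diet was actually followed
itt = risk_ratio((p for p in participants if p.assigned == "med"),
                 (p for p in participants if p.assigned == "control"))

# Per-protocol: drop participants who did not receive their assigned diet
per_protocol = risk_ratio(
    (p for p in participants if p.assigned == p.received == "med"),
    (p for p in participants if p.assigned == p.received == "control"))

# As-treated: analyse by diet received, ignoring the randomised assignment
as_treated = risk_ratio((p for p in participants if p.received == "med"),
                        (p for p in participants if p.received == "control"))

print(f"Intention-to-treat RR: {itt:.2f}")
print(f"Per-protocol RR:       {per_protocol:.2f}")
print(f"As-treated RR:         {as_treated:.2f}")
```

With these invented counts, the intention-to-treat risk ratio is 0.70, whereas the per-protocol (0.50) and as-treated (0.46) analyses make the diet look more effective, simply because the participants with a poorer prognosis were removed from, or moved out of, the intervention group.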
Our focus thus far has been on dichotomous outcomes (ie, yes/no variables such as stroke or myocardial infarction). Optimal analytical methods for dealing with missing data differ for continuous outcomes (eg, body weight, blood pressure). Simulation studies have shown that sophisticated statistical strategies (eg, multiple imputation, mixed models) do a better job of addressing missing data than either complete-case analysis or less sophisticated strategies such as assuming that study participants’ last known observation represents their status had they completed the study (last observation carried forward).30 31
Loss to follow-up and prognostic balance (intention-to-treat)
The PREDIMED trial primarily measured dichotomous outcomes, had loss to follow-up rates of 4.9% for the treatment group and 11.3% for the control group and applied intention-to-treat analyses. Participants were analysed in the groups to which they were randomised, with plausible sensitivity analyses to account for loss to follow-up. Results of the PREDIMED primary analyses were similar to those of the adjusted analysis, including adjustment for propensity scores, baseline participant characteristics, Framingham risk scores and when omitting participants suspected to have been assigned without individual randomisation. PREDIMED performed well in terms of maintaining prognostic balance related to both loss to follow-up at study completion as well as subverted randomisation, with plausible sensitivity analyses confirming the results from the primary analysis.4 For completeness of follow-up, the trial is probably at low risk of bias, while for prognostic balance related to intention-to-treat, the trial was definitely at low risk of bias.
3c. Was the trial stopped too early?
Trials that are stopped too early (ie, before enrolling the planned sample size) are at risk of substantially overestimating treatment effects. Particularly problematic are RCTs with a small number of participants and events that are stopped early when investigators observe a large benefit, or where a stopping boundary is not clearly predefined.32 33
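Why early stopping tends to exaggerate benefit can be seen in a small simulation. This is a rough sketch under stated assumptions, not a model of any real trial: the event risks (30% control, 24% intervention), the sample sizes, the single interim look and the fixed z threshold of -2.5 are all hypothetical, and real trials use formal group-sequential boundaries.

```python
import random

TRUE_P_CTL, TRUE_P_TRT = 0.30, 0.24  # hypothetical true event risks


def simulate_trial(rng, n_final=400, n_interim=100, z_stop=-2.5):
    """Simulate one two-arm trial with a single interim look.
    Returns (stopped_early, estimated_risk_difference)."""
    trt = [rng.random() < TRUE_P_TRT for _ in range(n_final)]
    ctl = [rng.random() < TRUE_P_CTL for _ in range(n_final)]

    # Interim analysis on the first n_interim participants per arm
    r_trt = sum(trt[:n_interim]) / n_interim
    r_ctl = sum(ctl[:n_interim]) / n_interim
    se = (r_trt * (1 - r_trt) / n_interim + r_ctl * (1 - r_ctl) / n_interim) ** 0.5
    if se > 0 and (r_trt - r_ctl) / se < z_stop:
        return True, r_trt - r_ctl  # stopped early for apparent benefit

    # Otherwise the trial runs to completion
    return False, sum(trt) / n_final - sum(ctl) / n_final


rng = random.Random(2024)  # fixed seed so the sketch is reproducible
results = [simulate_trial(rng) for _ in range(2000)]
stopped = [diff for early, diff in results if early]
completed = [diff for early, diff in results if not early]

true_difference = TRUE_P_TRT - TRUE_P_CTL
mean_stopped = sum(stopped) / len(stopped)
mean_completed = sum(completed) / len(completed)

print(f"True risk difference: {true_difference:.3f}")
print(f"Trials stopped early: {len(stopped)} of {len(results)}")
print(f"Mean estimate in trials stopped early: {mean_stopped:.3f}")
print(f"Mean estimate in completed trials:     {mean_completed:.3f}")
```

Because a trial can cross the boundary at a small interim sample only when the observed difference happens to be large, the estimates from the trials that stop early are, on average, considerably more extreme than the true difference of -0.06.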
Trials stopped early for benefit
The PREDIMED trial was stopped after a median follow-up of 4.8 years on the basis of a prespecified interim analysis, where a predefined stopping boundary for benefit was marginally crossed. Interestingly, the reanalysis, which excluded non-properly randomised participants, did not meet the p value boundary required for early stopping for each arm. This draws into question whether the effects reported for the Mediterranean diet arms in the study were inflated and whether benefit would still be present if full follow-up for the study was completed and reported, particularly based on those properly randomised.4 PREDIMED is definitely at high risk of bias for being stopped early.
Using this guide to interpret risk of bias for a trial
Returning to our opening clinical scenario, did the experimental and control groups begin and end the study with a similar prognosis? PREDIMED randomised 7447 participants and 7% (4.9% in treatment and 11.3% in control) were lost to follow-up (523 participants).4 The investigators followed the intention-to-treat principle, including all participants they had followed up in the arm to which they were randomised, and reported that the Mediterranean diet supplemented with nuts or extravirgin olive oil reduced the incidence of major cardiovascular events. The 2013 report19 demonstrated small between-group differences in some baseline characteristics, including fewer participants with type 2 diabetes and of female gender in the Mediterranean diet group. Given findings of deviation from true randomisation,15 the 2018 reanalysis of PREDIMED4 data provided an adjusted analysis for the baseline differences demonstrating very similar, statistically significant, results to the original 2013 analysis for the primary endpoint. Adjusted analyses for secondary endpoints were not presented fully in the 2018 reanalysis. Intention-to-treat and per-protocol analyses were presented and multiple robust methods of dealing with missing outcome data were used. It is possible that effect sizes were inflated due to early stopping based on interim analyses; of note, the reanalysis excluding non-properly randomised participants did not satisfy the p value boundary required for early stopping for each intervention arm.
The final risk of bias assessment represents a continuum from very low to very high (ie, definitely low to definitely high) risk of yielding biased estimates of effect. Inevitably, some degree of subjectivity in judgement must be accepted when evaluating studies within this spectrum. With respect to the six appraisal questions in box 1, your judgements are split (three judgements suggesting a low risk of bias and three judgements suggesting a high risk of bias).
Now that you have explored the potential for risk of bias related to your chosen RCT, in part 2 of our structured guide to interpreting RCTs,18 we will explore the magnitude of effects and the applicability of the study results to our clinical scenario.
Acknowledgments
We would like to thank Robin W.M. Vernooij for commenting on an early draft of the manuscript.
Footnotes
X @valli_claudia, @methodsnerd
Contributors BCJ and GHG conceived the paper, AA, GHG and BCJ drafted the paper, all authors reviewed and provided critical feedback and provided examples from nutrition literature that illustrate risk of bias issues. All authors reviewed and approved the final manuscript.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests BCJ has received a start-up grant from Texas A&M AgriLife Research to fund investigator-initiated research related to saturated and polyunsaturated fats. The grant was from Texas A&M AgriLife institutional funds from interest and investment earnings, not a sponsoring organisation, industry or company. BCJ also holds National Institute of Diabetes, Digestive and Kidney Diseases (NIDDK) R25 funds to support training in evidence-based nutrition practice and policy. Other authors claim no disclosures.
Provenance and peer review Not commissioned; externally peer reviewed.