Appendix D: Methodology checklist: cohort studies

The guidelines manual: appendices B–I

Checklist

Study identification

Include author, title, reference, year of publication

Guideline topic:

Review question no:

Checklist completed by:

Circle or highlight one option for each question:

A. Selection bias (systematic differences between the comparison groups)

A1. The method of allocation to treatment groups was unrelated to potential confounding factors (that is, the reason for participant allocation to treatment groups is not expected to affect the outcome[s] under study)

Yes | Unclear | N/A

A2. Attempts were made within the design or analysis to balance the comparison groups for potential confounders

Yes | Unclear | N/A

A3. The groups were comparable at baseline, including all major confounding and prognostic factors

Yes | Unclear | N/A

Based on your answers to the above, in your opinion was selection bias present? If so, what is the likely direction of its effect?

Low risk of bias | Unclear/unknown risk | High risk of bias

Likely direction of effect:

B. Performance bias (systematic differences between groups in the care provided, apart from the intervention under investigation)

B1. The comparison groups received the same care apart from the intervention(s) studied

Yes | Unclear | N/A

B2. Participants receiving care were kept 'blind' to treatment allocation

Yes | Unclear | N/A

B3. Individuals administering care were kept 'blind' to treatment allocation

Yes | Unclear | N/A

Based on your answers to the above, in your opinion was performance bias present? If so, what is the likely direction of its effect?

Low risk of bias | Unclear/unknown risk | High risk of bias

Likely direction of effect:

C. Attrition bias (systematic differences between the comparison groups with respect to loss of participants)

C1. All groups were followed up for an equal length of time (or analysis was adjusted to allow for differences in length of follow-up)

Yes | Unclear | N/A

C2a. How many participants did not complete treatment in each group?

C2b. The groups were comparable for treatment completion (that is, there were no important or systematic differences between groups in terms of those who did not complete treatment)

Yes | Unclear | N/A

C3a. For how many participants in each group were no outcome data available?

C3b. The groups were comparable with respect to the availability of outcome data (that is, there were no important or systematic differences between groups in terms of those for whom outcome data were not available)

Yes | Unclear | N/A

Based on your answers to the above, in your opinion was attrition bias present? If so, what is the likely direction of its effect?

Low risk of bias | Unclear/unknown risk | High risk of bias

Likely direction of effect:

D. Detection bias (bias in how outcomes are ascertained, diagnosed or verified)

D1. The study had an appropriate length of follow-up

Yes | Unclear | N/A

D2. The study used a precise definition of outcome

Yes | Unclear | N/A

D3. A valid and reliable method was used to determine the outcome

Yes | Unclear | N/A

D4. Investigators were kept 'blind' to participants' exposure to the intervention

Yes | Unclear | N/A

D5. Investigators were kept 'blind' to other important confounding and prognostic factors

Yes | Unclear | N/A

Based on your answers to the above, in your opinion was detection bias present? If so, what is the likely direction of its effect?

Low risk of bias | Unclear/unknown risk | High risk of bias

Likely direction of effect:

Notes on use of Methodology checklist: cohort studies

Cohort studies are designed to answer questions about the relative effects of interventions, such as drugs, psychological therapies, operations or placebos. Such studies can include comparisons of 'test and treat strategies' involving a diagnostic test and subsequent management. This checklist does not cover comparisons of diagnostic test accuracy or questions about prognosis.

Some of the items on this checklist may need to be filled in individually for different outcomes reported by the study. It is therefore important that the systematic reviewer has a clear idea of what the important outcomes are before appraising a study. You are likely to need input from the Guideline Development Group in defining the important outcomes.

Checklist items are worded so that a 'yes' response always indicates that the study has been designed/conducted in such a way as to minimise the risk of bias for that item. An 'unclear' response to a question may arise when the item is not reported or is not reported clearly. 'N/A' should be used when a cohort study cannot give an answer of 'yes' no matter how well it has been done.

This checklist is designed to assess the internal validity of the study; that is, whether the study provides an unbiased estimate of what it claims to show. Internal validity implies that the differences observed between groups of participants allocated to different interventions may (apart from the possibility of random error) be attributed to the intervention under investigation. Biases are characteristics that are likely to make estimates of effect differ systematically from the truth.

Recording the presence and direction of bias

This checklist contains four sections (A–D), each of which addresses a potential source of bias relating to internal validity. At the end of each section you are asked to give your opinion on whether bias is present, and to estimate the likely direction of this bias – whether you think it will have increased or decreased the effect size reported by the study. It will not always be possible to determine the direction of bias, but thinking this through can help greatly in interpreting results.

A: Selection bias

Selection bias can be introduced into a study when there are systematic differences between the participants in the different treatment groups. As a result, the differences in the outcome observed may be explained by pre-existing differences between the groups rather than by the treatment itself. For example, if the people in one group are in poorer health, then they are more likely to have a bad outcome than those in the other group, regardless of the effect of the treatment. The treatment groups should be similar at the start of the study – the only difference between the groups should be in terms of the intervention received.

The main difference between randomised trials and non-randomised studies is the potential susceptibility of the latter to selection bias. Randomisation should ensure that, apart from the intervention received, the treatment groups differ only because of random variation. However, care needs to be taken in the design and analysis of non-randomised studies to take account of potential confounding factors. There are two main ways of accounting for potential confounding factors in non-randomised studies. Firstly, participants can be allocated to treatment groups to ensure that the groups are equal with respect to the known confounders. Secondly, statistical techniques can be used within the analysis to take into account known differences between groups. Neither of these approaches is able to address unknown or unmeasurable confounding factors, and it is important to remember that measurement of known confounders is subject to error. It can rarely, if ever, be assumed that all important factors relevant to prognosis and responsiveness to treatment are known. Hence, considerable judgement is needed to assess the internal validity of non-randomised studies; clinical input may be needed to identify potential confounding factors that should be taken into consideration.

A1. The method of allocation to treatment groups was unrelated to potential confounding factors

In non-randomised studies, there will usually be a reason why participants are allocated to the treatment groups (often as a result of clinician and/or patient choice). If this reason is linked to the outcome under study, this can result in confounding by indication (where the decision to treat is influenced by some factor that is related in turn to the treatment outcome). For example, if the participants who are the most ill are selected for the treatment, then the treatment group may experience worse outcomes because of this difference between the groups at baseline. It will not always be possible to determine from the report of a study which factors influenced the allocation of participants to treatment groups.

A2. Attempts were made within the design or analysis to balance the comparison groups for potential confounders

Balancing within the design means attempting, when the study is designed, to ensure that the groups are similar in terms of known confounding or prognostic factors, in order to optimise comparability between the treatment groups. For example, in a matched design, the controls are deliberately chosen to be equivalent to the treatment group for any potential confounding variables, such as age and sex.

An alternative approach is to use statistical techniques to adjust for known confounding factors in the analysis.
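As an illustration of what adjustment in the analysis can look like, the sketch below compares a crude risk ratio with one adjusted by stratification. Everything in it is an assumption made for the example: the counts are invented, disease severity is taken to be the only confounder, and the Mantel-Haenszel risk ratio is simply one common adjustment technique, not a method prescribed by this checklist.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    """Counts for one level of the confounder (hypothetical data)."""
    treated_events: int
    treated_total: int
    control_events: int
    control_total: int


def crude_risk_ratio(strata):
    """Risk ratio that ignores the confounder by pooling all strata."""
    treated_events = sum(s.treated_events for s in strata)
    treated_total = sum(s.treated_total for s in strata)
    control_events = sum(s.control_events for s in strata)
    control_total = sum(s.control_total for s in strata)
    return (treated_events / treated_total) / (control_events / control_total)


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel risk ratio: a weighted average of stratum-specific risk ratios."""
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        n = s.treated_total + s.control_total
        numerator += s.treated_events * s.control_total / n
        denominator += s.control_events * s.treated_total / n
    return numerator / denominator


# Invented counts: sicker participants were more likely to receive the treatment,
# but within each severity level the risk is the same in both groups.
strata = [
    Stratum(treated_events=10, treated_total=100, control_events=40, control_total=400),   # mild
    Stratum(treated_events=160, treated_total=400, control_events=40, control_total=100),  # severe
]

crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_risk_ratio(strata)
print(f"Crude risk ratio: {crude:.3f}")
print(f"Adjusted risk ratio: {adjusted:.3f}")
```

With these invented counts the crude comparison suggests that treatment roughly doubles the risk, whereas the stratified estimate shows no difference. Adjustment of this kind can only deal with confounders that were measured, and only as accurately as they were measured.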

A3. The groups were comparable at baseline, including all major confounding and prognostic factors

Studies may report the distributions of potential prognostic and confounding factors in the comparison groups, or important differences in these factors may be noted.

Formal tests comparing the groups are problematic – failure to detect a difference does not mean that a difference does not exist, and multiple comparisons of factors may falsely detect some differences that are not real.
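The multiple-comparisons problem can be made concrete with a short calculation. The sketch below assumes that the groups are truly identical, that each baseline factor is tested independently, and that a significance level of 0.05 is used; these are simplifying assumptions for illustration, and the function name is hypothetical.

```python
def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance that at least one of n independent tests is 'significant'
    when there is no real difference between the groups."""
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    return 1 - (1 - alpha) ** n_tests


for n_tests in (1, 5, 10, 20):
    p = prob_at_least_one_false_positive(n_tests)
    print(f"{n_tests:>2} baseline factors tested: {p:.2f}")
```

Under these assumptions, testing 10 baseline factors gives about a 40% chance of at least one spurious 'significant' difference, and testing 20 gives about 64%.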

Clinical input may be needed to determine whether all likely confounders have been considered. Confounding factors may differ according to outcome, so you will need to consider potential confounding factors for each of the outcomes that are of interest to your review.

B: Performance bias

Performance bias refers to systematic differences in the care provided to the participants in the comparison groups, other than the intervention under investigation.

This may consist of additional treatment, advice or counselling, rather than a physical intervention, or even simply a belief about the effects of an intervention. If performance bias is present, it can be difficult to attribute any observed effect to the experimental treatment rather than to the other factors.

Performance bias can be more difficult to determine in non-randomised studies than in randomised studies, because the latter are likely to have been better planned and executed according to strict treatment protocols that specify standardised interventions and care. It may be particularly difficult to determine performance bias for retrospective studies, where there is usually no control over standardisation.

B1. The comparison groups received the same care apart from the intervention(s) studied

There should be no differences between the treatment groups apart from the intervention received. If some participants received additional treatment (known as 'co-intervention'), this treatment is a potential confounding factor that may compromise the results.

Blinding

Blinding (also known as masking) refers to the process of withholding information about treatment allocation or exposure status from those involved in the study who could potentially be influenced by this information. This can include participants, investigators, those administering care and those involved in data collection and analysis. If people are aware of the treatment allocation or exposure status ('unblinded'), this can bias the results of studies, either intentionally or unintentionally, through the use of other effective co-interventions, decisions about withdrawal, differential reporting of symptoms or influencing concordance with treatment. Blinding of those assessing outcomes is covered in section D on detection bias.

Blinding of participants and carers is not always possible, particularly in studies of non-drug interventions, and so performance bias may be a particular issue in these studies. It is important to think about the likely size and direction of bias caused by failure to blind.

The terms 'single blind', 'double blind' and even 'triple blind' are sometimes used in studies. Unfortunately, they are not always used consistently. Commonly, when a study is described as 'single blind', only the participants are blind to their group allocation. When both participants and investigators are blind to group allocation the study is often described as 'double blind'. It is preferable to record exactly who was blinded, if reported, to avoid misunderstanding.

B2. Participants receiving care were kept 'blind' to treatment allocation

The knowledge of assignment to a particular treatment group may affect outcomes such as a study participant's reporting of symptoms, self-use of other known interventions or even dropping out of the study.

B3. Individuals administering care were kept 'blind' to treatment allocation

If individuals who are administering the intervention and/or other care to the participant are aware of treatment allocation, they may treat participants receiving one treatment differently from those receiving the comparison treatment; for example, by offering additional co-interventions.

C: Attrition bias

Attrition refers to the loss of participants during the course of a study. Attrition bias occurs when there are systematic differences between the comparison groups with respect to participants lost, or differences between the participants lost to the study and those who remain. Attrition can occur at any point after participants have been allocated to their treatment groups. As such, it includes participants who are excluded after allocation (which may indicate a violation of eligibility criteria), those who do not complete treatment (whether or not they continue measurement) and those who do not complete outcome measurement (regardless of whether or not treatment was completed). Consideration should be given to why participants dropped out, as well as how many. Participants who dropped out of a study may differ in some significant way from those who remained as part of the study throughout. Drop-out rates and reasons for dropping out should be similar across all treatment groups. The proportion of participants excluded after allocation should be stated in the study report and the possibility of attrition bias considered within the analysis; however, these are not always reported.

C1. All groups were followed up for an equal length of time (or analysis was adjusted to allow for differences in length of follow-up)

If the comparison groups are followed up for different lengths of time, then more events are likely to occur in the group followed up for longer, distorting the comparison. This may be overcome by adjusting the denominator to take the length of follow-up into account; for example, by using person-years.
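A minimal sketch of the person-years adjustment follows. The figures are invented, and the example assumes for simplicity that every participant contributes the full follow-up period for their group; in a real analysis each participant's follow-up would normally end at the event or at loss to follow-up.

```python
def incidence_rate(events, person_years):
    """Events per person-year of follow-up."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return events / person_years


# Hypothetical groups of equal size followed up for different lengths of time.
group_a = {"participants": 100, "years_each": 5, "events": 30}
group_b = {"participants": 100, "years_each": 2, "events": 15}

# Unadjusted comparison: proportion of participants with an event.
risk_a = group_a["events"] / group_a["participants"]
risk_b = group_b["events"] / group_b["participants"]

# Adjusted comparison: the denominator is person-years rather than participants.
rate_a = incidence_rate(group_a["events"], group_a["participants"] * group_a["years_each"])
rate_b = incidence_rate(group_b["events"], group_b["participants"] * group_b["years_each"])

print(f"Proportion with event: A = {risk_a:.2f}, B = {risk_b:.2f}")
print(f"Events per person-year: A = {rate_a:.3f}, B = {rate_b:.3f}")
```

In this invented example, group A appears to have twice as many events as group B when follow-up time is ignored, but the rate per person-year is actually lower in group A.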

C2a. How many participants did not complete treatment in each group?

A very high number of participants dropping out of a study should give cause for concern. The drop-out rate may be expected to be higher in studies conducted over a longer period of time. The drop-out rate includes people who did not even start treatment; that is, they were excluded from the study after allocation to treatment groups.

C2b. The groups were comparable for treatment completion (that is, there were no important or systematic differences between groups in terms of those who did not complete treatment)

If there are systematic differences between groups in terms of those who did not complete treatment, consider both why participants dropped out and whether any systematic differences in those who dropped out may be related to the outcome under study, such as potential confounders. Systematic differences between groups in terms of those who dropped out may also result in treatment groups that are no longer comparable with respect to potential confounding factors.

C3a. For how many participants in each group were no outcome data available?

A very high number of participants for whom no outcome data were available should give cause for concern.

C3b. The groups were comparable with respect to the availability of outcome data (that is, there were no important or systematic differences between groups in terms of those for whom outcome data were not available)

If there are systematic differences between groups in terms of those for whom no outcome data were available, consider both why the outcome data were not available and whether there are any systematic differences between participants for whom outcome data were and were not available.
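When a study reports the numbers allocated and the numbers with outcome data, the answers to C3a and C3b can be tabulated quickly. The sketch below uses invented counts and hypothetical function names. The bounds it calculates (assuming that none, or all, of the participants with missing data had the event) are a simple illustrative sensitivity check and are not part of this checklist.

```python
def proportion_missing(allocated, with_outcome):
    """Proportion of allocated participants with no outcome data."""
    if allocated <= 0 or not 0 <= with_outcome <= allocated:
        raise ValueError("counts are inconsistent")
    return (allocated - with_outcome) / allocated


def risk_bounds(events_observed, with_outcome, allocated):
    """Lowest and highest possible risk if the missing outcomes were known."""
    missing = allocated - with_outcome
    lower = events_observed / allocated                # no missing participant had the event
    upper = (events_observed + missing) / allocated    # every missing participant had the event
    return lower, upper


# Hypothetical study: the observed risk is 0.20 in both groups
# (30/150 and 38/190), but far more outcome data are missing in one group.
treatment = {"allocated": 200, "with_outcome": 150, "events": 30}
control = {"allocated": 200, "with_outcome": 190, "events": 38}

missing_treatment = proportion_missing(treatment["allocated"], treatment["with_outcome"])
missing_control = proportion_missing(control["allocated"], control["with_outcome"])
bounds_treatment = risk_bounds(treatment["events"], treatment["with_outcome"], treatment["allocated"])
bounds_control = risk_bounds(control["events"], control["with_outcome"], control["allocated"])

print(f"Missing outcome data: treatment {missing_treatment:.0%}, control {missing_control:.0%}")
print(f"Possible risk range, treatment: {bounds_treatment[0]:.2f} to {bounds_treatment[1]:.2f}")
print(f"Possible risk range, control: {bounds_control[0]:.2f} to {bounds_control[1]:.2f}")
```

Here the much wider range in the treatment group shows how differential loss of outcome data could hide a real difference between the groups, or create an apparent one.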

D: Detection bias (this section should be completed individually for each important relevant outcome)

The way outcomes are assessed needs to be standardised for the comparison groups; failure to 'blind' people who are assessing the outcomes can also lead to bias, particularly with subjective outcomes. Most studies report results for more than one outcome, and it is possible that detection bias may be present for some, but not all, outcomes. It is therefore recommended that this section is completed individually for each important outcome that is relevant to the guideline review question under study. To avoid biasing your review, you should identify the relevant outcomes before considering the results of the study. Clinical input may be required to identify the most important outcomes for a review.

D1. The study had an appropriate length of follow-up

The follow-up of participants after treatment should be of an adequate length to identify the outcome of interest. This is particularly important when different outcomes of interest occur early and late after an intervention. For example, after surgical interventions there is usually early harm because of side effects, with benefits apparent later on. A study that is too short will give an unbalanced assessment of the intervention.

For events occurring later, a short study will give an imprecise estimate of the effect, which may or may not also be biased. For example, a late-occurring side effect will not be detected in the treatment arm if the study is too short.

D2. The study used a precise definition of outcome

D3. A valid and reliable method was used to determine the outcome

The outcome under study should be well defined and it should be clear how the investigators determined whether participants experienced, or did not experience, the outcome. The same methods for defining and measuring outcomes should be used for all participants in the study. Often there may be more than one way of measuring an outcome (for example, physical or laboratory tests, questionnaire, reporting of symptoms). The method of measurement should be valid (that is, it measures what it claims to measure) and reliable (that is, it measures something consistently).
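Studies sometimes support a claim of reliability by reporting agreement between two assessors. One widely used statistic for this is Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The sketch below is an illustration only: the ratings are invented, and kappa is one of several possible reliability measures, not one required by this checklist.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two assessors rating the same participants."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)

    # Observed agreement: proportion of participants given the same rating by both.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Expected agreement by chance, from each assessor's overall rating frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in counts_a)

    if expected == 1:
        raise ValueError("kappa is undefined when both assessors use a single category")
    return (observed - expected) / (1 - expected)


# Hypothetical data: 1 = outcome judged present, 0 = outcome judged absent.
assessor_1 = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
assessor_2 = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

kappa = cohens_kappa(assessor_1, assessor_2)
print(f"Cohen's kappa: {kappa:.2f}")
```

In this example the assessors agree on 8 of 10 participants, but half of that agreement would be expected by chance, giving a kappa of about 0.6.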

D4. Investigators were kept 'blind' to participants' exposure to the intervention

D5. Investigators were kept 'blind' to other important confounding and prognostic factors

In this context the 'investigators' are the individuals who are involved in making the decision about whether a participant has experienced the outcome under study. This can include those responsible for taking physical measurements and recording symptoms, even if they are not ultimately responsible for determining the outcome. Investigators can introduce bias through differences in measurement and recording of outcomes, and by making biased assessments of a participant's outcome based on the collected data. The degree to which lack of blinding can introduce bias will vary depending on the method of measuring an outcome, but will be greater for more subjective outcomes, such as reporting of pain.

Physical separation of the assessment from the participant (for example, sending samples off to a laboratory) can often be considered as blind if it can be assumed that the laboratory staff are unaware of the treatment assignment.

This page was last updated: 30 November 2012