3 The company's submission

The Appraisal Committee (section 7) considered evidence submitted by Takeda and a review of this submission by the Evidence Review Group (ERG; section 8).

Clinical effectiveness

3.1 The company presented evidence from GEMINI I, a study in adults with moderately to severely active ulcerative colitis whose disease had an inadequate response to, or had lost response to, immunosuppressants, corticosteroids or TNF‑alpha inhibitors, or who were intolerant to them. It was carried out at 211 centres in 34 countries, including 63 centres in the USA and 2 centres in the UK. The study consisted of separate induction and maintenance trials:

  • Induction trial (double‑blind cohort): the induction trial included 374 people randomised (3:2) to have double‑blind vedolizumab (300 mg) or placebo, intravenously at weeks 0 and 2, at the same time as conventional therapy. People were assessed for clinical response (the primary outcome) at 6 weeks. Clinical response was measured using the Mayo score, which included assessment of stool frequency, rectal bleeding, an endoscopic assessment and a global assessment by a clinician. Clinical response was defined as a reduction in the Mayo score of at least 3 points and a decrease of at least 30% from baseline, with an accompanying decrease in the rectal bleeding subscore of at least 1 point or an overall rectal bleeding subscore of 1 point or less. Secondary outcomes included clinical remission (Mayo score of up to 2 points and no individual subscore greater than 1 point) and mucosal healing (defined as an endoscopic subscore of 1 point or less).

  • Induction (open‑label cohort): an additional 521 people had open‑label vedolizumab (300 mg) at weeks 0 and 2. People were assessed for clinical response (as defined above) at 6 weeks.

  • Maintenance trial: people who had taken vedolizumab and had a clinical response at week 6, from either induction cohort, could progress to the maintenance trial. There were 373 people randomised (1:1:1) to have vedolizumab every 8 weeks (n=122), every 4 weeks (n=125), or placebo every 4 weeks (n=126), for up to 52 weeks. The primary outcome for the maintenance trial was clinical remission at week 52 (remission defined as above). Secondary outcome measures included durable clinical response (response at weeks 6 and 52), durable clinical remission (remission at weeks 6 and 52), mucosal healing at week 52 and glucocorticoid‑free remission at week 52 in patients having glucocorticoids at baseline.

    Additionally, data collection continued for people who did not have a clinical response at 6 weeks in the induction study, and from the induction open‑label cohort. These people continued on their assigned study drug (vedolizumab or placebo) and were followed up until week 52.
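
The response and remission definitions used in GEMINI I can be written as a short rule. The following is only an illustrative sketch of the definitions as worded above; the function and argument names are hypothetical and are not taken from the company's submission.

```python
def is_clinical_response(baseline_total, week6_total,
                         baseline_bleeding, week6_bleeding):
    """Apply the clinical response definition to one person's Mayo scores."""
    drop = baseline_total - week6_total
    # Reduction of at least 3 points AND at least 30% from baseline
    # (integer arithmetic avoids floating-point rounding at the boundary).
    enough_drop = drop >= 3 and 10 * drop >= 3 * baseline_total
    # Rectal bleeding subscore falls by at least 1 point,
    # or is 1 point or less in absolute terms.
    bleeding_ok = (baseline_bleeding - week6_bleeding) >= 1 or week6_bleeding <= 1
    return enough_drop and bleeding_ok


def is_clinical_remission(total, subscores):
    """Mayo score of up to 2 points and no individual subscore above 1."""
    return total <= 2 and all(s <= 1 for s in subscores)


# Hypothetical person: total score falls from 9 to 5, bleeding from 2 to 1.
print(is_clinical_response(9, 5, 2, 1))
```

Note that a 3-point drop from a baseline of 12 is only a 25% reduction, so it would not count as a response under this definition.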

3.2 GEMINI I included people who had moderately to severely active ulcerative colitis at baseline (Mayo score of 6 to 12). People in GEMINI I had disease that had an inadequate response to, or they could not tolerate, at least 1 of the following over the previous 5 years: an immunosuppressant (oral azathioprine or mercaptopurine), a TNF‑alpha inhibitor (infliximab), or a corticosteroid (prednisone). During the trial, people in both treatment arms could take mesalazine, up to 30 mg prednisone daily (or equivalent) and immunosuppressants. People taking corticosteroids had a reduced dose after week 6. Across all study groups, mean age was 40.3 years, mean disease duration was 6.9 years, mean baseline Mayo score was 8.6, 48.2% of people had used TNF‑alpha inhibitors before study enrolment, and in 41% of people treatment with a TNF‑alpha inhibitor had failed.

3.3 A number of people discontinued treatment before the end of the induction trial: 7 (3%) of those who had vedolizumab and 14 (9%) of those who had placebo. The main reason for discontinuation was lack of efficacy. During the maintenance phase 45 (37%) people having vedolizumab every 8 weeks, 41 (33%) people having vedolizumab every 4 weeks and 78 (62%) people having placebo discontinued prematurely, mostly due to lack of efficacy or disease‑related adverse events.

3.4 The company presented results for the intention‑to‑treat population, and for subgroups based on previous treatment with TNF‑alpha inhibitors (see section 3.5). In the intention‑to‑treat population, 106 (47.1%) people in the vedolizumab arm and 38 (25.5%) people in the placebo arm had a response at week 6 (percentage difference 21.7, 95% confidence interval [CI] 11.6 to 31.7, p<0.001). At week 6, 38 (16.9%) people in the vedolizumab arm and 8 (5.4%) in the placebo arm were in remission (percentage difference 11.5, 95% CI 4.7 to 18.3, p=0.001). During the maintenance phase of GEMINI I a similar proportion of people were in remission at week 52 in the 8‑weekly vedolizumab arm and 4‑weekly vedolizumab arm (51 [41.8%] people and 56 [44.8%] people respectively). Statistically significantly fewer people (20 [15.9%]) in the placebo arm were in remission at week 52 (p<0.001) compared with the vedolizumab arms. In total 69 (56.6%) people in the 8‑weekly vedolizumab arm, 65 (52.0%) people in the 4‑weekly vedolizumab arm and 30 (23.8%) people in the placebo arm had a durable clinical response (a clinical response at both week 6 and 52). The p value for the percentage difference between each dosing regimen and placebo was <0.001. Twenty‑five people (20.5%) in the 8‑weekly vedolizumab arm, 30 (24.0%) people in the 4‑weekly vedolizumab arm and 11 (8.7%) people in the placebo arm had durable clinical remission (remission at both week 6 and 52). The p value for the percentage difference between 8‑weekly vedolizumab and placebo was 0.008 and between 4‑weekly vedolizumab and placebo was 0.001.
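
As a rough check on the week 6 response result, the percentage difference and an approximate confidence interval can be recomputed from the counts. This sketch uses a simple unadjusted Wald interval, which is an assumption and not necessarily the method used in the trial analysis. The arm sizes (225 and 149) are not stated in the text; they are inferred here from the reported counts and percentages.

```python
import math


def risk_difference(events_a, n_a, events_b, n_b, z=1.96):
    """Unadjusted difference in proportions with a Wald confidence interval."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se


# Week 6 clinical response: 106 of an inferred 225 (vedolizumab)
# versus 38 of an inferred 149 (placebo).
diff, low, high = risk_difference(106, 225, 38, 149)
print(f"difference {100 * diff:.1f} points, 95% CI {100 * low:.1f} to {100 * high:.1f}")
```

The unadjusted result is about 21.6 percentage points (roughly 12.0 to 31.2), close to but not identical to the reported 21.7 (11.6 to 31.7). A small discrepancy of this kind would be expected if the reported analysis used a different or adjusted method, which this sketch does not attempt to reproduce.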

3.5 The company presented the results for the 60% of people in the maintenance trial who had not had a TNF‑alpha inhibitor before and for the 32% of people in whom a TNF‑alpha inhibitor had failed. In the population who had not had a TNF‑alpha inhibitor before, 46% of people having 8‑weekly vedolizumab and 19% of people having placebo had clinical remission (percentage difference 26.8, 95% CI 12.4 to 41.2). In the population in whom treatment with a TNF‑alpha inhibitor had failed, 37% of people having 8‑weekly vedolizumab and 5.3% of people having placebo had remission (percentage difference 31.9, 95% CI 10.3 to 51.4).

3.6 The company presented exploratory analyses to assess delayed response among people whose disease had not responded to treatment at week 6 and who remained in the study having vedolizumab or placebo every 4 weeks. Clinical response was assessed by the partial Mayo score (that is, the Mayo score without the sigmoidoscopy subscore). Response was defined as a reduction of at least 2 points and a decrease of at least 25% from baseline, with an accompanying decrease in the rectal bleeding subscore of at least 1 point or an absolute rectal bleeding subscore of up to 1 point. Response was achieved at week 10 and week 14 by greater proportions of people who had vedolizumab (32% [102/322] and 39% [126/322] respectively) than placebo (15% [12/82] and 21% [17/82] respectively). The recommendation in the summary of product characteristics for vedolizumab, that continued therapy for people with ulcerative colitis should be carefully reconsidered if no evidence of therapeutic benefit is observed by week 10, is based on these analyses.

3.7 Health‑related quality of life was measured in GEMINI I at week 6 in the induction trial and at weeks 30 and 52 in the maintenance trial using a variety of measures (Inflammatory Bowel Disease Questionnaire [IBDQ] total score, the EQ‑5D and the EQ‑5D visual analogue scale [VAS] scores and SF‑36). Improvements in quality of life from baseline were greater with vedolizumab than placebo at all time points, across all instruments, in the intention‑to‑treat population.

3.8 The company carried out a network meta‑analysis to estimate the relative treatment effect and safety of vedolizumab compared with the biological therapies, infliximab, adalimumab and golimumab. The studies used in the meta‑analysis included:

  • ULTRA 1, ULTRA 2 and Suzuki et al., which compared adalimumab with placebo

  • ACT 1 and ACT 2, which compared infliximab with placebo

  • PURSUIT‑SC/M, which compared golimumab with placebo, and

  • GEMINI I, which compared vedolizumab with placebo.

    The company noted that there were differences between the studies in duration, previous treatment with TNF‑alpha inhibitors and randomisation after the induction phase. The duration of the studies varied between 6 and 8 weeks for the induction phase and between 52 and 54 weeks for the maintenance phase of treatment. The only studies that included people who had previously had TNF‑alpha inhibitors were GEMINI I and ULTRA 2, and the inclusion criteria differed between these studies. GEMINI I included people in whom treatment with infliximab had failed, whereas ULTRA 2 included people whose disease had lost response to, or who could not tolerate, another TNF‑alpha inhibitor before starting adalimumab. The company commented that people in whom prior treatment with a TNF‑alpha inhibitor had failed may be less likely to have a successful response to subsequent treatment than people whose disease had lost response to, or who could not tolerate, a TNF‑alpha inhibitor. Another difference between the trials was how people were randomised after the induction phase. In GEMINI I and PURSUIT‑M, people whose disease responded to treatment during the induction phase were re‑randomised before entering the maintenance phase of the trial. In all the other trials people were randomised at baseline (before induction treatment) only. They continued to be followed during the maintenance phase in their assigned study arm regardless of whether their disease responded to treatment in the induction phase.

3.9 The induction phase and maintenance phase data were synthesised separately by the company. The company presented data from a fixed‑effect model for a population who had not previously had a TNF‑alpha inhibitor, a population who had taken a TNF‑alpha inhibitor that had failed, and the whole population (using data from the intention‑to‑treat population in GEMINI I for vedolizumab). The company stated that golimumab and infliximab were not included in the meta‑analysis for the population in whom a TNF‑alpha inhibitor had failed because no data were available for the efficacy of these comparators in this population. The company stated that its primary analyses were the subgroup analyses. This was because the patient populations differed between the studies and the proportion of people who had and had not had previous treatment with TNF‑alpha inhibitors may affect the results.

3.10 The company presented the odds ratios, estimated from the mixed treatment comparison, for vedolizumab compared with placebo, and 2 dosing regimens for adalimumab, golimumab and infliximab, for the population who had not had a TNF‑alpha inhibitor. The odds of a clinical response and clinical remission during induction treatment were higher with vedolizumab than with adalimumab used at its marketing authorisation dose (160 mg at week 0, 80 mg at week 2, 40 mg every other week thereafter; odds ratio [OR] for clinical response 1.48, 95% credible interval [CrI] 0.90 to 2.50; OR for clinical remission 2.09, 95% CrI 0.88 to 5.7). The odds of a clinical response and clinical remission during induction were higher with vedolizumab than with golimumab used at its marketing authorisation dose (200 mg at week 0, 100 mg at week 2, 50 or 100 mg every 4 weeks thereafter; OR for response 1.04, 95% CrI 0.58 to 1.80; OR for remission 1.05, 95% CrI 0.39 to 3.1), but the credible intervals surrounding the odds ratios crossed 1. The odds of clinical response and clinical remission during induction treatment were lower with vedolizumab than with infliximab used at its marketing authorisation dose (5 mg/kg at weeks 0, 2, 6 and every 8 weeks thereafter; OR for response 0.64, 95% CrI 0.36 to 1.2; OR for remission 0.72, 95% CrI 0.29 to 1.9). During the maintenance phase of treatment, vedolizumab taken 8‑weekly had higher odds of clinical remission than adalimumab, golimumab and infliximab (OR 2.14, 95% CrI 0.81 to 5.82; OR 2.1, 95% CrI 0.9 to 5.32; and OR 2.93, 95% CrI 1.03 to 8.46 respectively). The credible intervals for the comparisons with adalimumab and golimumab crossed 1; the reported interval for the comparison with infliximab did not.

3.11 The company presented adverse event data from:

  • GEMINI I

  • 2 further placebo‑controlled clinical trials of vedolizumab in people with Crohn's disease (GEMINI II and III) and

  • interim safety data from a single‑arm extension study evaluating the long‑term safety of vedolizumab, in people with ulcerative colitis or Crohn's disease, beyond 12 months of treatment.

    The safety population in GEMINI I was defined as people who had at least 1 dose of the study drug. Drug‑related adverse events across the trials of vedolizumab were similar between people with ulcerative colitis and people with Crohn's disease, with the most common being headache (6%), nasopharyngitis (4%), nausea (4%), arthralgia (4%), upper respiratory infection (3%) and fatigue (3%). The most common serious adverse events in people with ulcerative colitis were worsening of ulcerative colitis and abdominal pain. No cases of progressive multifocal leukoencephalopathy were reported across all trials of vedolizumab.

Evidence Review Group comments

3.12 The ERG commented on the baseline characteristics of GEMINI I, and stated that there were no relevant differences between the treatment arms during the induction or maintenance phases. However, there were differences in entry criteria (in the USA failure of an immunomodulator or TNF‑alpha inhibitor was a requirement, whereas elsewhere corticosteroid failure was sufficient for entry), and the protocol for concomitant immunosuppressant use during the study (immunosuppressant use was discontinued at week 6 in the USA but continued elsewhere). The ERG commented that it was unclear how these differences may affect the results of the trial.

3.13 The ERG considered the proportion of people who discontinued the trial. It noted that discontinuation during the induction phase was 6% and during the maintenance phase it was 44%. The ERG noted that the company had presented an intention‑to‑treat analysis and assumed that all people who discontinued treatment had not met the primary end point. The ERG stated that, in general, the validity of the study may be threatened if the proportion of people who discontinued is over 20%, and considered that the disproportionate discontinuation rates seen in the maintenance phase were a serious threat to the validity of GEMINI I.
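
The overall discontinuation figures quoted by the ERG can be reproduced from the per-arm counts in section 3.3. This is a small arithmetic sketch that assumes the ERG pooled discontinuations across arms and used the randomised totals (374 and 373) as denominators; the variable names are illustrative only.

```python
# Induction trial: discontinuations in the vedolizumab and placebo arms.
induction_discontinued = 7 + 14
induction_randomised = 374

# Maintenance trial: 8-weekly vedolizumab, 4-weekly vedolizumab, placebo.
maintenance_discontinued = 45 + 41 + 78
maintenance_randomised = 373

induction_pct = 100 * induction_discontinued / induction_randomised
maintenance_pct = 100 * maintenance_discontinued / maintenance_randomised

print(f"induction {induction_pct:.1f}%, maintenance {maintenance_pct:.1f}%")
```

Under these assumptions the pooled rates are about 5.6% and 44.0%, which round to the 6% and 44% cited by the ERG.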

3.14 The ERG commented that the long‑term efficacy and safety of vedolizumab and the optimum duration of therapy remained unclear. This is because, in GEMINI I, people only had vedolizumab for up to 52 weeks, and the extension study to GEMINI I is ongoing. The ERG commented that there are no data on strategies for withdrawal of vedolizumab in people having it to maintain response or remission.

3.15 The ERG noted that the company had presented data for the subgroups of people in the maintenance trial who had and had not had previous treatment with a TNF‑alpha inhibitor. However, the company had not presented the results for these subgroups for the induction trial. The ERG obtained from the clinical study report for GEMINI I the results for the 55% of people in the induction trial population who had not previously had treatment with a TNF‑alpha inhibitor and the 39% of people in whom treatment with a TNF‑alpha inhibitor had failed. In the population who had not had a TNF‑alpha inhibitor before, 69 (53.1%) people had a clinical response with vedolizumab and 20 (26.3%) people had a clinical response with placebo. In the population in whom treatment with a TNF‑alpha inhibitor had failed, 32 (39%) people had a clinical response with vedolizumab and 13 (20.6%) had a clinical response with placebo. The ERG commented that the results of all subgroup analyses should be interpreted with caution because the numbers of people in each subgroup were small and the study was not powered for these assessments. This included comparing the 4‑weekly and 8‑weekly doses of vedolizumab, and the subgroup analyses relating to prior use of TNF‑alpha inhibitors. The ERG commented that the additional post hoc delayed response analysis should also be interpreted with caution. This was because dosing frequency was increased if a clinical response was not seen by week 6 and the people who continued were not a random sample from the original induction study cohorts.

3.16 The ERG stated that the results from the network meta‑analyses were based on a fixed‑effect model rather than a random‑effects model (a fixed‑effect model assumes that the average result from each trial should be the same; a random‑effects model assumes that the average result from each trial may differ, but the average of the trial results would be the true result). The ERG highlighted that there were considerable differences between the trials included in the network meta‑analysis and that a random‑effects model would explicitly model these differences and capture the uncertainty in the true treatment effect, whereas a fixed‑effect model would underestimate the uncertainty.
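
The ERG's point about uncertainty can be illustrated with a simple pairwise pooling example. This sketch uses frequentist inverse-variance pooling with a DerSimonian-Laird between-trial variance, which is a simplification chosen for illustration and not the Bayesian network meta-analysis used by the company. The log odds ratios and variances are hypothetical values, not trial data.

```python
import math


def pool_fixed(effects, variances):
    """Inverse-variance fixed-effect pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))


def pool_random_dl(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird estimate of tau^2."""
    weights = [1.0 / v for v in variances]
    fixed, _ = pool_fixed(effects, variances)
    # Cochran's Q measures how far the trials scatter around the fixed estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    # Between-trial variance is added to each trial's own variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    return pooled, math.sqrt(1.0 / sum(re_weights)), tau2


# Hypothetical, deliberately heterogeneous log odds ratios.
log_ors = [0.10, 0.90, 0.40, 1.30]
variances = [0.04, 0.05, 0.03, 0.06]

fixed_est, fixed_se = pool_fixed(log_ors, variances)
random_est, random_se, tau2 = pool_random_dl(log_ors, variances)
print(f"fixed SE {fixed_se:.3f}, random SE {random_se:.3f}, tau^2 {tau2:.3f}")
```

With heterogeneous inputs such as these, the random-effects standard error is larger than the fixed-effect one, which is the sense in which a fixed-effect model may understate uncertainty.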

3.17 The ERG noted that the trials in the network meta‑analysis had different follow‑up times, and different study designs. The ERG agreed with the company that the difference in study duration during the maintenance phase would not have a large effect on the results. The ERG considered the difference in the study designs. It noted that GEMINI I and PURSUIT‑M only included people whose disease had responded to induction treatment in the maintenance phase of the trials, and that they were re‑randomised at the start of the maintenance phase. It noted that, to allow comparison with adalimumab and infliximab, the company had accounted for this by adjusting the results of the other trials (ULTRA 2, Suzuki et al. and ACT 1), to assume that people whose disease responded at the end of the induction phase were the same as those whose disease responded at the end of the maintenance phase. The ERG stated that the people whose disease had not responded to treatment at the end of the induction phase may have a response during the maintenance phase. Therefore, using the proportion of people whose disease responded at the end of the maintenance phase may be an overestimate. The ERG considered that the effect of this was likely to be different between treatment arms. Therefore, the impact on relative treatment effect was unclear. The ERG stated that it was not clear whether the results in GEMINI I or PURSUIT‑M over‑ or underestimated the treatment effect of vedolizumab relative to the comparators in the maintenance phase.

3.18 The ERG noted that the company had presented separate network meta‑analyses for people who had and had not taken treatment with TNF‑alpha inhibitors before, without providing a full rationale for this approach. The ERG stated that the disadvantage of doing separate analyses by subgroup is that the possibility of an interaction between treatment and subgroup cannot be explored, and that this should be explored using meta‑regression. The company stated, in response to clarification questions, that performing a meta‑regression was not appropriate because there was an insufficient number of trials included in the networks. The ERG stated that without a meta‑regression analysis the company should present the predictive distribution of mean treatment effect, which incorporates extra uncertainty due to potential differences between studies.

Cost effectiveness

3.19 The company developed a new model of the induction and the maintenance phases of treatment with vedolizumab and its comparators. A decision tree structure was used to model the induction phase of treatment. The induction phase was assumed to be 6 weeks. The criterion for response was a drop in Mayo score of 3 or more. People whose disease responded remained on their assigned treatment in the maintenance phase. People whose disease did not respond, or who discontinued a biological treatment (vedolizumab, adalimumab, infliximab or golimumab) because of an adverse event, were assumed to have conventional therapy in the maintenance phase. The maintenance phase of the model had a Markov structure, similar to that in NICE's technology appraisal guidance on infliximab for subacute manifestations of ulcerative colitis, and a published cost–utility analysis of infliximab compared with conventional therapy. People entered the maintenance phase in one of 3 disease severity health states (defined according to Mayo scores: 'remission' [Mayo score of 0 to 2]; 'mild' [Mayo score of 3 to 5]; and 'moderate to severe' [Mayo score of 6 to 12]), or the 'surgery' health state, depending on response at the end of the induction phase. In addition, the model included the health states 'post‑surgical remission', 'post‑surgical complications', 'people who had discontinued treatment' and 'death'. The model considered the costs and health benefits from the perspective of the NHS and these were discounted by 3.5% per year over a time horizon of 10 years. The cycle length for the maintenance phase was 8 weeks, which the company stated was likely to be sufficient time for the Mayo scores to be relatively stable.
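
The effect of discounting at 3.5% per year over 8‑week cycles can be sketched as follows. The convention used here (discounting each cycle by the time elapsed at its start, with 52 weeks per year) is an assumption for illustration; the text does not describe the exact convention used in the company's model.

```python
def discount_factor(weeks_elapsed, annual_rate=0.035, weeks_per_year=52):
    """Discount factor for a cost or benefit arising after weeks_elapsed."""
    years = weeks_elapsed / weeks_per_year
    return 1.0 / (1.0 + annual_rate) ** years


def discounted_total(per_cycle_values, cycle_weeks=8, annual_rate=0.035):
    """Sum per-cycle values, each discounted from the start of its cycle."""
    return sum(
        value * discount_factor(i * cycle_weeks, annual_rate)
        for i, value in enumerate(per_cycle_values)
    )


# A 10-year horizon is 520 weeks, that is, 65 cycles of 8 weeks.
cycles = 520 // 8
total = discounted_total([1.0] * cycles)
print(cycles, round(total, 2))
```

A constant value of 1 per cycle sums to less than 65 once discounted, and anything arising at year 10 is weighted at roughly 0.71 of its undiscounted value.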

3.20 The company's analysis was presented for 3 populations:

  • The whole population, including people who had TNF‑alpha inhibitor therapy and those who had not.

  • People who had not had TNF‑alpha inhibitor therapy.

  • People in whom TNF‑alpha inhibitor treatment had failed (that is, the disease had not responded to, or had stopped responding to, a TNF‑alpha inhibitor, or the person could not tolerate a TNF‑alpha inhibitor).

    For all 3 analyses, the comparators included conventional therapies (a combination of aminosalicylates, immunomodulators and corticosteroids) and surgery. TNF‑alpha inhibitors (infliximab, adalimumab and golimumab) were only included as comparators for the subgroup of people who had not had TNF‑alpha inhibitors before. Efficacy data from the intention‑to‑treat population in GEMINI I were used to model the costs and benefits of vedolizumab in the whole population. For the population who had not had TNF‑alpha inhibitors before, data from the company's network meta‑analysis were used. Efficacy data from the population in whom TNF‑alpha inhibitors had failed in GEMINI I were used to model costs and benefits of vedolizumab in this population.

3.21 In the model it was assumed that response to induction treatment would be assessed at 6 weeks, based on when it was assessed in GEMINI I. The company noted that the trials for infliximab and adalimumab measured response at week 8, but for the purposes of the modelling it was assumed that response at week 6 would be equivalent to that seen at week 8. The number of doses people had during the induction phase was also assumed to be the same as in the clinical trials on which the efficacy estimates were based. This meant that people having vedolizumab or golimumab had 2 doses (at weeks 0 and 2), people having adalimumab had 4 doses (at weeks 0, 2, 4 and 6) and people having infliximab had 3 doses (at weeks 0, 2 and 6) during the induction period. The company tested a scenario in which response was assessed at week 10 after 3 doses of vedolizumab. The company stated that this may reflect clinical practice where the decision to continue with treatment is made later (see section 3.6). It was assumed that people having vedolizumab or TNF‑alpha inhibitors were treated with conventional therapy at the same time, but at a lower dosage and at half the cost incurred when conventional therapy was the only treatment.

3.22 To obtain the probability of moving between health states, or remaining in the remission, mild or moderate to severe health states during the maintenance phase, the company used a calibration approach. The company used data from GEMINI I on the proportion of people in remission, or with moderate to severe ulcerative colitis at the end of the induction treatment (6 weeks), and the proportion of people whose disease responded, or were in remission, at the end of the maintenance period (52 weeks) to estimate the probability of moving between the health states during the first year of maintenance treatment. These transition probabilities were assumed to remain constant over time and were applied to each subsequent year in the model. To calculate the estimates and calibrate the model the company applied the following constraints:

  • No more than 99.5% of people would remain in remission in each 8‑weekly cycle.

  • No more than 20% of people with mild disease would enter remission.

  • More people would remain in the mild health state than enter the moderate to severe health state; more people would remain in the moderate to severe health state than move to the mild health state.

  • People would not move directly from remission to the moderate to severe health state and vice versa.

  • The sum of the transition probabilities would equal 1.
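
The calibration constraints listed above can be expressed as checks on a candidate transition matrix. This is a minimal sketch restricted to the 3 disease severity states (it omits surgery, discontinuation and death), and the example matrix is hypothetical; it is not the company's calibrated matrix.

```python
import math

STATES = ("remission", "mild", "moderate_to_severe")


def violated_constraints(matrix):
    """Return a list of the calibration constraints a matrix breaks.

    matrix[i][j] is the per-cycle probability of moving from STATES[i]
    to STATES[j].
    """
    rem, mild, sev = matrix
    problems = []
    if rem[0] > 0.995:
        problems.append("more than 99.5% remain in remission")
    if mild[0] > 0.20:
        problems.append("more than 20% move from mild to remission")
    if not mild[1] > mild[2]:
        problems.append("mild: staying is not more likely than worsening")
    if not sev[2] > sev[1]:
        problems.append("moderate to severe: staying is not more likely than improving")
    if rem[2] != 0 or sev[0] != 0:
        problems.append("direct move between remission and moderate to severe")
    for name, row in zip(STATES, matrix):
        if not math.isclose(sum(row), 1.0, abs_tol=1e-9):
            problems.append(f"{name}: probabilities do not sum to 1")
    return problems


# Hypothetical matrix that satisfies every constraint.
example = [
    [0.90, 0.10, 0.00],
    [0.15, 0.60, 0.25],
    [0.00, 0.20, 0.80],
]
print(violated_constraints(example))
```

Many different matrices pass these checks, so the constraints narrow the set of candidate matrices without identifying a single one.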

3.23 In the model, people could progress to have surgery if their disease did not respond to induction treatment, or if they had moderate to severe ulcerative colitis during the maintenance phase. Once in the surgery and post‑surgery health states, treatment was discontinued for the rest of the person's lifetime. It was assumed that 40% of people having surgery would have a proctocolectomy with ileostomy (to create a surgical opening of the digestive tract [stoma] in the abdomen to bypass the rectum) and 60% would have subtotal proctocolectomy with pouch formation with or without loop ileostomy. After surgery, some people had complications, needed additional surgeries or remained in post‑surgical remission. The company obtained the transition probabilities from surgery and the post‑surgery health states from a review of published literature.

3.24 In the model it was assumed that some people would discontinue treatment with vedolizumab, adalimumab, infliximab or golimumab. Treatment was discontinued if people had not had a response by the end of the induction phase or if there were adverse events at any time. The data for discontinuation and for adverse event rates were obtained from the relevant clinical trials for each treatment. For people who continued treatment, the treatment with biological therapy (vedolizumab, infliximab, adalimumab or golimumab) was assumed to be at most 1 year, after which people switched to conventional therapy. People who had conventional therapy were assumed to only discontinue treatment if they needed surgery. The Markov model did not include the option of discontinuing treatment temporarily, because of a lack of data on treatment breaks for all comparators. The company stated that the clinical trial results would capture the effect of any temporary discontinuation.

3.25 During the maintenance phase of the model people could die while in any health state at any time. The probability of dying was estimated using age‑ and sex‑specific all‑cause mortality from the UK (Office for National Statistics, 2011). This was adjusted for disease severity, surgery, and post‑surgery remission and complications, to incorporate an increased risk of mortality associated with moderate to severe disease and surgery.

3.26 To estimate utility values for the health states in the model the company did a post‑hoc analysis of EQ‑5D data from the maintenance phase of GEMINI I. It used the combined data from people who had vedolizumab or placebo, and from all time points at which data were collected. The scores were grouped according to whether they were in remission (Mayo score 0–2), had mild disease (Mayo score 3–5) or had moderate to severe disease (Mayo score 6–12). Surgery outcomes were not assessed in GEMINI I, and utility values associated with the surgery and post‑surgery health states were taken from a study by Punekar and Hawkins. This study reported EQ‑5D data collected from UK patients with UK tariffs applied to the EQ‑5D scores. The utility values used in the company's base case for each health state were 0.86 'remission'; 0.80 'mild'; 0.68 'moderate to severe'; 0.42 'surgery'; 0.60 'post‑surgery remission'; and 0.42 'post‑surgery complications'. The company also assigned utility decrements for certain adverse events. The rates of adverse events were obtained from the clinical trials.
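
To show how health state utilities translate into QALYs in a cohort model, the sketch below weights the base‑case utility values by the share of the cohort in each state for one 8‑week cycle. The cohort distribution is hypothetical, and the calculation ignores discounting and the adverse event decrements mentioned above.

```python
# Base-case utility values as reported by the company.
UTILITIES = {
    "remission": 0.86,
    "mild": 0.80,
    "moderate_to_severe": 0.68,
    "surgery": 0.42,
    "post_surgery_remission": 0.60,
    "post_surgery_complications": 0.42,
}


def cycle_qalys(occupancy, cycle_weeks=8, weeks_per_year=52):
    """QALYs per person accrued in one cycle, given state occupancy shares."""
    if abs(sum(occupancy.values()) - 1.0) > 1e-9:
        raise ValueError("occupancy shares must sum to 1")
    mean_utility = sum(UTILITIES[state] * share for state, share in occupancy.items())
    # Utility is a per-year weight, so scale by the fraction of a year.
    return mean_utility * cycle_weeks / weeks_per_year


# Hypothetical distribution of the cohort across states in one cycle.
cohort = {"remission": 0.5, "mild": 0.3, "moderate_to_severe": 0.2}
print(round(cycle_qalys(cohort), 4))
```

Because the gap between 'remission' (0.86) and 'moderate to severe' (0.68) is fairly small, shifting people between these states changes QALYs only modestly, which is consistent with the model's sensitivity to alternative utility sources.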

3.27 The model used the NHS list price for adalimumab, golimumab and infliximab and the discounted patient access scheme price of vedolizumab. The company estimated a weighted average cost of conventional therapy including a combination of aminosalicylates, corticosteroids and immunosuppressants (azathioprine, mercaptopurine and methotrexate). The proportion of each drug used was based on clinical expert opinion. The cost of conventional therapies was based on costs and dosing regimens in the 'British national formulary' (BNF, December 2013) and was £204.80 for 8 weeks of treatment. The company assumed that the costs of conventional therapy would be halved if taken with vedolizumab, adalimumab, infliximab or golimumab rather than if conventional therapies were the only treatment taken by a person.

3.28 Resource costs in the model included the costs of consultant visits, blood tests, and elective and emergency endoscopy, which were based on NHS reference costs 2011–12. The cost of surgery was assumed to be £13,577.27. The frequency of resource use in each health state was based on a study by Tsai and others. An additional cost of £308 for intravenous infusion was applied to vedolizumab and infliximab at each administration visit (payment by results tariff 2012–13).

3.29 The company presented deterministic base‑case results for the 3 populations it modelled (see section 3.20). The company presented deterministic pairwise incremental cost‑effectiveness ratios (ICERs) for vedolizumab compared with each comparator separately. It did not present a fully incremental analysis, nor did it present probabilistic ICERs.

  • For the whole population, vedolizumab dominated surgery (it was less costly and more effective). The ICER for vedolizumab compared with conventional therapy was £33,297 per quality‑adjusted life year (QALY) gained.

  • In the population who had not had TNF‑alpha inhibitors before, vedolizumab dominated infliximab, golimumab and surgery. Vedolizumab was associated with an ICER of £6634 per QALY gained when compared with adalimumab, and £4862 per QALY gained when compared with conventional therapy.

  • In the population in whom TNF‑alpha inhibitors had failed, vedolizumab dominated surgery and was associated with an ICER of £64,999 per QALY gained when compared with conventional therapy.
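
The terms 'dominated' and 'ICER' in the results above follow from a simple pairwise comparison of total costs and QALYs. The sketch below shows that logic; the function name and the cost and QALY figures are hypothetical and do not come from the company's model.

```python
def compare(cost_new, qaly_new, cost_old, qaly_old):
    """Pairwise comparison of a new treatment against a comparator.

    Returns a (status, icer) pair; icer is None unless a ratio is meaningful.
    """
    d_cost = cost_new - cost_old
    d_qaly = qaly_new - qaly_old
    if d_cost == 0 and d_qaly == 0:
        return "equal", None
    if d_cost <= 0 and d_qaly >= 0:
        # No more costly and no less effective: the new treatment dominates.
        return "dominant", None
    if d_cost >= 0 and d_qaly <= 0:
        return "dominated", None
    # Either more costly and more effective, or less costly and less
    # effective; in the second case the ratio is a saving per QALY lost.
    return "icer", d_cost / d_qaly


# Hypothetical totals: new treatment cheaper and more effective.
print(compare(10_000, 5.0, 12_000, 4.5))
# Hypothetical totals: new treatment dearer and more effective.
print(compare(12_000, 5.5, 10_000, 5.0))
```

A pairwise comparison of this kind does not replace a fully incremental analysis, in which all options are ranked together and compared with the next best non-dominated alternative.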

3.30 The company presented 5 scenario analyses that included:

  • altering the model time horizon (lifetime and 1 year, rather than 10 years)

  • using alternative sources of utility values (in which the utility associated with moderately to severely active disease was lower [0.3–0.4] than its base‑case estimate [0.68])

  • excluding the excess mortality risk for ulcerative colitis

  • using 10‑week response data rather than 6‑week response data and

  • extending the maximum duration of biological treatment from 1 year to 3 years.

    The model was sensitive to the time horizon, with longer time horizons reducing the ICER in all populations. Using the alternative utility values also reduced the ICER for vedolizumab compared with conventional therapies or the other biological treatments. Increasing the maximum time a person could have biological treatment increased the ICER for vedolizumab in the whole population and in the population who had not had TNF‑alpha inhibitors before. The company noted that, in the base case, all people who had a biological treatment were assumed in the model to switch to conventional therapy after 1 year. Therefore, the long‑term effectiveness of vedolizumab was determined by the effect of vedolizumab treatment over 1 year on the distribution of people across the health states at the end of that year.

3.31 The ERG noted that a 10‑year time horizon was used for the company's base case, but it was not clear whether all relevant health gains and costs would be captured within that time. The ERG stated that running the model over a lifetime time horizon was preferable. However, it noted that the clinical trial data only assessed outcomes up to 52 weeks and that extrapolating data to a lifetime horizon would be subject to considerable uncertainty.

3.32 The ERG commented on the company's use of a calibration approach to estimate transition probabilities in the maintenance phase. It noted that patient‑level data for people with remission, mild, or moderate to severe disease during the maintenance phase would be available from GEMINI I. However, such data may not be available to the company for the adalimumab, golimumab and infliximab trials included in the network meta‑analysis. It commented that the assumptions and constraints used in the calibration calculations, including using a different starting matrix for biological therapies and conventional therapies, were arbitrary. It commented that using a calibration process to fit 7 unknown parameters to 2 known data points meant that overfitting may have occurred. The ERG commented that there would be many possible combinations of transition probabilities that could fit the 1‑year data points for response and remission. It also noted that the calibration process did not account for people whose disease responded but whose symptoms remained moderate to severe.

3.33 The ERG commented on the plausibility of the assumptions about the transition probabilities between the surgery and post‑surgery health states. It noted that the company had converted 6‑month estimates for repeat surgery and complications following surgery to an 8‑weekly probability (assuming a constant rate), and had then applied these probabilities for the full 10‑year time horizon. However, the ERG stated that the probability of repeat surgery and complications would be expected to be greater in the first 12 months after surgery, rather than remaining constant indefinitely. It also noted that the company's estimate of entering remission after having a post‑surgery complication was based on an estimate for 1 type of complication only (pouch leaks) and it was unclear how the probability related to annual risk. Overall, the ERG considered that the company's assumptions would overestimate the probability of having surgical procedures and the time spent in the post‑surgical complications state, which would result in increased costs and reduced health gains associated with surgery.
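The conversion of a 6‑month probability to an 8‑weekly probability under a constant rate can be sketched as follows. This is a generic illustration of the standard rate‑based conversion, not the company's actual calculation: the 10% 6‑month probability is a hypothetical input, 6 months is assumed to be 26 weeks, and the function names are illustrative.

```python
import math

WEEKS_PER_6_MONTHS = 26.0  # assumption: 6 months taken as 26 weeks
CYCLE_WEEKS = 8.0          # 8-week model cycle, as described in the text


def rescale_probability(p: float, from_weeks: float, to_weeks: float) -> float:
    """Convert a probability over `from_weeks` to one over `to_weeks`,
    assuming a constant underlying event rate."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    rate = -math.log(1.0 - p) / from_weeks  # events per week
    return 1.0 - math.exp(-rate * to_weeks)


def cumulative_probability(p_cycle: float, n_cycles: float) -> float:
    """Probability of at least one event over `n_cycles` model cycles."""
    return 1.0 - (1.0 - p_cycle) ** n_cycles


p_6_month = 0.10  # hypothetical 6-month probability, for illustration only
p_8_week = rescale_probability(p_6_month, WEEKS_PER_6_MONTHS, CYCLE_WEEKS)

# Applying the same per-cycle probability for the full 10-year horizon.
cycles_in_10_years = 10 * 52 / CYCLE_WEEKS
p_10_year = cumulative_probability(p_8_week, cycles_in_10_years)

print(f"8-weekly probability: {p_8_week:.4f}")
print(f"Cumulative 10-year probability: {p_10_year:.4f}")
```

Holding the per‑cycle probability constant means that the cumulative probability keeps rising over the whole horizon. This is the behaviour the ERG questioned, given that risk would be expected to be concentrated in the period soon after surgery.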

3.34 The ERG commented that the marketing authorisations for vedolizumab, infliximab, golimumab or adalimumab do not stipulate if or when people whose disease responds to therapy should stop treatment. It noted that the company assumed that people who were responding to these biological treatments would have them for 1 year and then switch to conventional therapy. The ERG stated that it was unclear whether in clinical practice biological therapy would be stopped when a patient is gaining clinical benefit from it. The ERG commented that it was also assumed in the model that people would continue to have biological maintenance therapy for up to 1 year, even if response to treatment was lost after the induction period. It stated that this 'continuation rule' was unlikely to be clinically realistic.

3.35 The ERG considered that it was appropriate to use EQ‑5D data from GEMINI I to determine the utility associated with the disease severity health states in the model. However, the ERG noted that this approach did not differentiate between the treatments that people were having in the trial, or between people who did and did not have a response to treatment.

3.36 The ERG noted that in the company's model it was assumed that the utility value for post‑surgical remission was lower (0.60) than the utility value for moderate to severe ulcerative colitis (0.68), reflecting worse quality of life. The ERG considered that the utility value for post‑surgical remission was not plausible because it does not represent any benefit from surgery. The ERG was unable to verify that utility values for surgery, post‑surgery remission, and post‑surgery complications from Punekar and Hawkins were for people with ulcerative colitis. The ERG commented that the Punekar and Hawkins paper, which was cited as the source of utility values, was a study of the epidemiology and costs of Crohn's disease. The ERG identified a different health utility study of people with ulcerative colitis, reporting utility values for remission, response, moderate to severe ulcerative colitis and post‑surgery (Woehl et al.). It noted that the values for people who had surgery in Woehl et al. were much higher than those reported in Punekar and Hawkins. In addition, the values for the pre‑surgery states were slightly different. The ERG considered that the company's assumptions about surgery and post‑surgery health state utility values would underestimate the health gains for people having surgery and would favour drug therapies over surgery.

3.37 The ERG commented on the probability of having an adverse event in the model. The ERG noted that the estimates of adverse events with conventional therapy were derived from a pooled analysis of the placebo arms of trials of vedolizumab, adalimumab, infliximab and golimumab. It noted that in these trials people in the placebo arm had a placebo infusion or injection, which would not normally be given as part of conventional therapy. The ERG stated that it was not clear whether skin reactions with conventional therapy may be infusion site rashes as a result of placebo delivery rather than a reaction to the conventional therapy itself.

3.38 The ERG noted that the costs in the model for endoscopy, consultant visits, blood tests, and hospitalisations were based on 2006–7 NHS reference costs (cited in Tsai et al., and uplifted to current prices) rather than 2012–13 NHS reference costs, as stated by the company. The ERG commented that the actual 2012–13 NHS reference costs were much lower (with the exception of consultant visit costs). The ERG stated that this resulted in the model overestimating costs in the post‑surgical complication health state.

3.39 The ERG commented on the costs included in the post‑surgery health states. It stated that it was not clear whether costs associated with stoma care, which would include nurse visits and consumables, were included. The ERG noted that the costs of stoma care would be approximately £630 per year based on a study by Buchanan.

3.40 The ERG noted that, in the company's model, the costs associated with conventional therapies in people who were also having biological treatments were half of those incurred by people having only conventional therapies, and that this assumption was not justified. The ERG also stated that the company's model included the cost of topical rather than oral prednisolone. It noted that replacing the cost of topical prednisolone with that for oral prednisolone reduced the overall cost of conventional therapy, but that this did not have a large impact on the ICER for vedolizumab.

3.41 The ERG carried out the following exploratory analyses:

  • Scenario 1: correction of an error in the model in which baseline values for infliximab, rather than conventional therapy, were used in the maintenance model, for people who had not had TNF‑alpha inhibitors and who were having conventional therapy.

  • Scenario 2: utility values from a study by Woehl et al. were used in the model for each health state ('remission' 0.87; 'mild' 0.76; 'moderate to severe' 0.41; 'surgery' 0.41; 'post‑surgery remission' 0.71; and 'post‑surgery complications' 0.54).

  • Scenario 3: utility values from a study by Swinburn et al. were used in the model ('remission' 0.91; 'mild' 0.8; 'moderate to severe' 0.55; 'surgery' 0.55; 'post‑surgery remission' 0.59; and 'post‑surgery complications' 0.42).

  • Scenario 4: different assumptions were applied to estimate the transition probabilities between the surgery and post‑surgery health states. It was assumed that:

    • people would not have repeat surgery (because the cost estimates for surgery already included the cost of repeat surgery)

    • people leaving the surgery health state would remain in the post‑surgery complications state or remission state for the remainder of the modelled time horizon

    • the probability of having late complications was based on the probability of chronic pouchitis reported in Arai et al.

  • Scenario 5: people can continue to have biological therapies beyond 1 year if their disease responds or they are in remission on those therapies.

  • Scenario 6: costs of conventional therapies are the same if they are taken at the same time as a biological therapy or if conventional therapy is the only treatment a person has.

  • Scenario 7: using NHS 2012–13 reference costs for health state resource cost estimates rather than the estimates reported in Tsai et al.

  • Scenario 8: costs of stoma care were included in the post‑surgery health states for the 40% of people whose surgical procedure was assumed to have been an ileostomy. Over a 6‑month period people were assumed to have 1.5 nurse visits at a cost of £136.88 and need consumables costing £178.09.

  • In scenarios 2 and 3 it was assumed that the utility associated with surgery was the same as having moderate to severe ulcerative colitis. It was also assumed that people with post‑surgery complications would have a utility decrement of 0.17 relative to people in post‑surgery remission, to account for the complications (the 0.17 utility decrement was based on Arseneau et al.).

  • In all scenarios, except for scenario 1, the ERG also assumed a lifetime time horizon rather than a 10‑year time horizon. The corrections in scenario 1 were also applied in all scenarios.
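The arithmetic behind scenario 8 and the utility decrement used in scenarios 2 and 3 can be checked with a short sketch. It assumes that £136.88 is the total cost of the 1.5 nurse visits over 6 months (the text could also be read as a per‑visit cost) and that a year is 2 such periods. The function names are illustrative and are not taken from the ERG's model.

```python
# Figures as stated in scenario 8 (per 6-month period).
NURSE_COST_6_MONTHS = 136.88        # assumed to be the total for 1.5 visits
CONSUMABLES_COST_6_MONTHS = 178.09
ILEOSTOMY_SHARE = 0.40              # share of procedures assumed to be ileostomy

# Decrement applied to post-surgery remission in scenarios 2 and 3.
COMPLICATION_DECREMENT = 0.17


def stoma_cost_per_year() -> float:
    """Annual stoma care cost for a person with an ileostomy."""
    return 2 * (NURSE_COST_6_MONTHS + CONSUMABLES_COST_6_MONTHS)


def average_stoma_cost_per_year() -> float:
    """Annual stoma care cost averaged over everyone in the post-surgery states."""
    return ILEOSTOMY_SHARE * stoma_cost_per_year()


def complications_utility(post_surgery_remission: float) -> float:
    """Utility for post-surgery complications, rounded to 2 decimal places."""
    return round(post_surgery_remission - COMPLICATION_DECREMENT, 2)


print(f"Stoma care per person with ileostomy: £{stoma_cost_per_year():.2f} per year")
print(f"Averaged over post-surgery states: £{average_stoma_cost_per_year():.2f} per year")
print("Woehl et al. complications utility:", complications_utility(0.71))
print("Swinburn et al. complications utility:", complications_utility(0.59))
```

Under these assumptions, the derived complication utilities agree with the values listed for scenarios 2 and 3 (0.54 and 0.42).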

3.42 The ERG presented fully incremental results for the company's base case and the ERG's scenarios. The effect of these scenarios was as follows:

  • In all scenarios, except scenario 2, vedolizumab was the most effective option (it had the greatest modelled QALYs).

  • In scenario 2, in which utility values from Woehl et al. were used, surgery became the most effective option, and vedolizumab was less effective and less costly than surgery in all 3 modelled populations.

  • In the whole population, scenarios 3, 6, 7 and 8 resulted in an ICER for vedolizumab compared with the next most effective treatment option (conventional therapy) that was lower than the company's base case. Scenarios 4 and 5 resulted in an ICER for vedolizumab compared with conventional therapy that was greater than the company's base‑case ICER.

  • In the population who had not had a TNF‑alpha inhibitor before, when scenarios 3, 6 and 8 were applied vedolizumab dominated all treatment options. Scenario 7 resulted in the ICER for vedolizumab compared with the next most effective treatment option, adalimumab, reducing from £6634 (in the company base case) to £759 per QALY gained. Scenario 4 had a different impact on the ICER depending on the comparison. When vedolizumab was compared with conventional therapy or the TNF‑alpha inhibitors, scenario 4 resulted in vedolizumab dominating or extendedly dominating these treatment options. When vedolizumab was compared with surgery, scenario 4 resulted in an ICER of £20,449 per QALY gained rather than vedolizumab dominating surgery (as in the company's base case). Scenario 5 resulted in the ICER for vedolizumab compared with adalimumab increasing from £6634 per QALY gained in the company's base case to £3,807,239 per QALY gained. However, the modelled QALY difference between these 2 treatments in the ERG scenario was minimal.

  • In the population in whom a TNF‑alpha inhibitor failed, ERG scenarios 3, 5, 6, 7 and 8 resulted in ICERs for vedolizumab compared with conventional therapies that were lower than the company's base case. Scenario 4 resulted in the ICER for vedolizumab compared with conventional therapy increasing from £64,999 per QALY gained in the company's base case to £73,931 per QALY gained.
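A fully incremental analysis, including the removal of dominated and extendedly dominated options, can be sketched as follows. This is a generic textbook procedure and is not the ERG's implementation. All costs and QALYs are hypothetical placeholders, and `fully_incremental` is an illustrative name.

```python
from typing import List, Optional, Tuple

Option = Tuple[str, float, float]  # (name, total cost in GBP, total QALYs)


def fully_incremental(options: List[Option]) -> List[Tuple[str, Optional[float]]]:
    """Return the non-dominated options in order of increasing cost, each
    with its ICER against the previous option on the frontier."""
    # Sort by cost; for equal costs, put the more effective option first.
    ranked = sorted(options, key=lambda o: (o[1], -o[2]))

    # Strict dominance: keep an option only if it yields more QALYs
    # than every cheaper option kept so far.
    frontier: List[Option] = []
    for opt in ranked:
        if not frontier or opt[2] > frontier[-1][2]:
            frontier.append(opt)

    # Extended dominance: remove any option whose ICER against the previous
    # option exceeds the ICER of the next option against it.
    changed = True
    while changed:
        changed = False
        for i in range(1, len(frontier) - 1):
            prev, cur, nxt = frontier[i - 1], frontier[i], frontier[i + 1]
            icer_in = (cur[1] - prev[1]) / (cur[2] - prev[2])
            icer_out = (nxt[1] - cur[1]) / (nxt[2] - cur[2])
            if icer_in > icer_out:
                del frontier[i]
                changed = True
                break

    results: List[Tuple[str, Optional[float]]] = [(frontier[0][0], None)]
    for prev, cur in zip(frontier, frontier[1:]):
        results.append((cur[0], (cur[1] - prev[1]) / (cur[2] - prev[2])))
    return results


# Hypothetical inputs, for illustration only.
example = [
    ("conventional therapy", 10_000, 5.00),
    ("adalimumab", 14_000, 5.05),
    ("vedolizumab", 16_000, 5.20),
    ("surgery", 18_000, 4.90),
    ("infliximab", 20_000, 5.08),
]
for name, icer in fully_incremental(example):
    print(name, "reference" if icer is None else f"£{icer:,.0f} per QALY gained")
```

Because the ICER is a ratio, a QALY difference close to zero in the denominator can produce a very large value. This is consistent with the ERG's remark that the QALY difference in scenario 5 was minimal.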

3.43 The ERG combined all of its scenarios, except scenario 3 (utility values from Swinburn et al.), in its exploratory base case. The results are presented for a lifetime time horizon. In all 3 populations all options are dominated by surgery (surgery is more effective and less costly). The ERG noted that surgery may not be an acceptable treatment option for all people. The ERG stated that if surgery is not an acceptable option:

  • In the whole population, the ICER for vedolizumab compared with conventional therapy is £53,084 per QALY gained.

  • In the population who have not had prior treatment with TNF‑alpha inhibitors, vedolizumab is dominated by adalimumab.

  • In the population in whom treatment with a prior TNF‑alpha inhibitor has failed, the ICER for vedolizumab compared with conventional therapy is £48,205 per QALY gained.

3.44 Following consultation on the appraisal consultation document, the company submitted revised cost‑effectiveness estimates for the subgroup of people in whom a TNF‑alpha inhibitor had failed. These estimates were based on a revised patient access scheme, and incorporated all of the ERG's suggested revisions to the company's model (see section 3.41) with the following exceptions: scenario 5 (people can continue to have biological therapies beyond 1 year if their disease responds) was not applied; the transition matrix for rates of surgery was not amended; and the ERG's cost estimate for stoma care was not used. The resulting ICERs for vedolizumab compared with conventional therapy were £37,086 per QALY gained using the company's base‑case utility estimates, £27,515 per QALY gained using the Swinburn et al. utility estimates and £30,878 per QALY gained using the Woehl et al. utility estimates.

3.45 Full details of all the evidence are available.