3 The company's submission

The Appraisal Committee (section 7) considered evidence submitted by Janssen and a review of this submission by the Evidence Review Group (ERG; section 8).

Clinical effectiveness

3.1 The company's systematic literature review identified 1 randomised controlled trial (RCT) investigating the clinical efficacy and safety of bortezomib in combination with rituximab, cyclophosphamide, doxorubicin and prednisone (VR‑CAP) in adult patients with previously untreated mantle cell lymphoma. The LYM‑3002 trial was a randomised, open-label, multicentre study that compared VR‑CAP against rituximab with cyclophosphamide, doxorubicin, vincristine and prednisone (R‑CHOP). The study involved 128 sites worldwide, and people were randomised in a 1:1 ratio based on the International Prognostic Index and the stage of disease at diagnosis.

3.2 In total, 487 people were randomised; 243 to bortezomib and 244 to R‑CHOP. The median age was 69 years. People were given 6 to 8 cycles (18 to 24 weeks) of treatment depending upon the response documented at the cycle‑6 assessment. Approximately 80% of people in both groups completed treatment. The total study duration from randomisation of the first patient until the last progression‑free survival event needed for the final analysis was expected to be approximately 42 months (24 months for enrolment and 18 months for follow-up). Average treatment duration was 17.6 weeks in the bortezomib treatment group and 16.1 weeks in the R‑CHOP group. Treatment discontinuation was comparable between the 2 groups (18% and 19% respectively). The majority of people had at least 6 cycles of treatment: 84% of people randomised to VR‑CAP, and 83% of people randomised to R‑CHOP.

3.3 The trial included 80 patients (16.4%; 38/243 in VR‑CAP arm, 42/244 in R‑CHOP arm) who were suitable for haematopoietic stem cell transplantation from a medical perspective but for whom access was prevented by availability or socio-economic reasons. These patients were included because of a protocol amendment part way through the LYM-3002 trial, under which patients who were ineligible or not considered for haematopoietic stem cell transplantation were enrolled. However, concerns over the heterogeneity and interpretability of the study results led to a further amendment that realigned the protocol to the original eligibility criteria. After this, only patients who were not eligible for haematopoietic stem cell transplantation, as assessed by the treating physician, were enrolled.

3.4 The primary outcome of the study was progression‑free survival in the intention-to-treat (ITT) population, based on independent review committee assessment of progression. Median progression‑free survival was 751 days (24.7 months) in people randomised to VR‑CAP compared with 437 days (14.4 months) in people randomised to R‑CHOP (hazard ratio [HR]=0.63, p<0.001).

3.5 The company presented results for a number of secondary clinical endpoints:

  • Based on independent review committee assessment of progression in the ITT population, median time to progression was 929 days (30.5 months) in people randomised to VR‑CAP compared with 490 days (16.1 months) in people randomised to R‑CHOP (HR=0.58; p<0.001).

  • In the ITT population, median time to next anti-lymphoma treatment was 1353 days (44.5 months) for people randomised to the VR‑CAP group compared with 756 days (24.8 months) for those randomised to the R‑CHOP group (HR=0.50; p<0.001).

  • Median treatment-free interval in the safety analysis set was 1236 days (40.6 months) for people randomised to VR‑CAP compared with 624 days (20.5 months) for those randomised to R‑CHOP (HR=0.50; p<0.001).

  • Based on independent review committee assessment, complete response rates (complete response plus complete response unconfirmed) were 53.3% in the VR‑CAP group compared with 41.7% in the R‑CHOP group (odds ratio [OR]=1.688; p=0.007), and the median duration of complete response was 42.1 months for people treated with VR‑CAP compared with 18.0 months for people treated with R‑CHOP. The median time to initial response based on independent review committee assessment was 42 days (1.4 months) in people randomised to VR‑CAP compared with 50 days (1.6 months) in people randomised to R‑CHOP (HR=1.54; p<0.001).
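The paired day and month figures quoted for the time-to-event endpoints above are consistent with dividing days by an average month length of 365.25/12 (about 30.44 days). That divisor is an assumption inferred from the reported pairs; the submission does not state the conversion used. A minimal Python sketch of the check:

```python
# Assumed conversion factor: average month = 365.25 / 12 days.
# This is inferred from the reported figures, not stated in the source.
DAYS_PER_MONTH = 365.25 / 12


def days_to_months(days: float) -> float:
    """Convert days to months, rounded to 1 decimal place as in the text."""
    return round(days / DAYS_PER_MONTH, 1)


# Median values in days as reported for LYM-3002: (VR-CAP, R-CHOP).
reported_days = {
    "progression-free survival": (751, 437),
    "time to progression": (929, 490),
    "time to next anti-lymphoma treatment": (1353, 756),
    "treatment-free interval": (1236, 624),
    "time to initial response": (42, 50),
}

for endpoint, (vr_cap, r_chop) in reported_days.items():
    print(f"{endpoint}: VR-CAP {days_to_months(vr_cap)} months, "
          f"R-CHOP {days_to_months(r_chop)} months")
```

Under this assumption, each converted value matches the month figure reported alongside it.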

3.6 At the time of the company's submission, overall survival data were not mature in the LYM‑3002 trial. In an interim analysis based on a median duration of 40 months' follow-up (in which 158 deaths had been observed: 71 in the VR‑CAP group [29%] and 87 in the R‑CHOP group [36%]), the estimated hazard ratio for death was 0.80 (95% confidence interval [CI] 0.59 to 1.10, in favour of VR‑CAP).

3.7 Three different patient-reported outcome tools were used to assess health-related quality of life in the LYM‑3002 trial: the European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire (EORTC QLQ‑C30); the Brief Fatigue Inventory; and the EuroQol Dimension Questionnaire (EQ-5D). The company reported that utility values, translated from the EQ-5D taken at day 1 of every treatment cycle and at the end-of-treatment visit, were not statistically significantly different between treatment groups at baseline and throughout the LYM‑3002 treatment. The company highlighted that as the study design did not include patient-reported outcome collection after the end of treatment, it was not possible to assess the dimension of clinical benefit derived by people from the prolonged progression‑free survival and prolonged disease control provided by VR‑CAP. However, the company stated that such improvement in long-term prognosis would be likely to positively affect patient health-related quality of life in practice.

Subgroup analysis

3.8 In subgroup analyses based on region, the North America subgroup was combined with the European Union subgroup post hoc as the former had very few people, most of whom had a progression‑free survival event (5 people with a progression‑free survival event out of 8 enrolled into the R‑CHOP group, and 4 people out of 6 enrolled into the VR‑CAP group) that resulted in a very large CI (0.44 to 41.96) for the estimated HR (which was greater than 1).

3.9 In the pre-specified North American and Western European subgroup, median progression‑free survival for the VR‑CAP group was 19.4 months compared with 14.4 months for the R‑CHOP group (HR=0.77, 95% CI 0.43 to 1.38).

ERG comments on the clinical effectiveness data

3.10 The ERG highlighted that no patients from the UK were included in the LYM‑3002 trial with approximately 30% of the people recruited in the European Union and North America. The other two thirds were from the 'rest of the world', in particular Russia and China. Given the different prevalence of mantle cell lymphoma depending on the geographic region and potential differences in clinical standards (for example, concomitant care), the ERG stated that this brings into question the generalisability of the trial to clinical practice in the UK.

3.11 The ERG noted that the inclusion criteria in the LYM‑3002 trial were narrower than those defined in the NICE scope. The ERG noted that the population in the final scope (people with previously untreated mantle cell lymphoma, who are not going to have a stem cell transplant) might include people who would not have been eligible for inclusion in the LYM‑3002 trial.

Indirect comparison

3.12 The company highlighted that the induction therapy regimens listed in the final appraisal scope (rituximab with fludarabine and cyclophosphamide [R‑FC] and rituximab with bendamustine [R‑bendamustine]) are not considered to be relevant comparators for VR‑CAP as these are generally reserved for patients who cannot tolerate R‑CHOP and, therefore, VR‑CAP. However, the company did indirect comparison analyses to alternative rituximab-chemotherapy induction regimens where possible. The company emphasised the limitations of the indirect comparison and considered that these analyses are not robust because of important differences between LYM‑3002 and the comparator studies. There were also methodological limitations in the comparator studies.

ERG comments

3.13 The ERG agreed with the company that the indirect analysis should be treated with caution because of the lack of similarity between the 3 included trials. The wide confidence intervals reported above could partly be explained by this heterogeneity. The ERG noted that the 3 trials included in the indirect analyses are linked to a high risk of bias.

Adverse effects of treatment

3.14 The company reported that both VR‑CAP and R‑CHOP induction regimens were generally well tolerated, with discontinuation rates of 8.8% and 7.0% respectively because of adverse events. Adverse-event-related deaths were 7.0% in both treatment groups. Almost all people in both treatment groups experienced a treatment-emergent adverse event, although VR‑CAP was associated with a slightly higher rate of grade 3 or higher adverse events and serious adverse events. In both treatment groups, the most commonly reported grade 3 or higher adverse events were haematological (blood and lymphatic system) disorders. For adverse events of clinical interest, peripheral neuropathy was the most commonly reported and it was similar in the 2 treatment groups (30% for VR‑CAP and 29% for R‑CHOP).

ERG comments

3.15 The ERG agreed with the company's view that both chemotherapy induction regimens were generally well tolerated, with low rates of discontinuation because of adverse events and low rates of treatment‑related deaths in both groups. However, the ERG highlighted that more serious adverse events were observed for VR‑CAP (37.5%) compared to R‑CHOP (29.8%) and the serious adverse events were usually of higher severity in VR‑CAP. While more study drug‑related discontinuations were reported for VR‑CAP (7.9%) compared to R‑CHOP (5.8%), there were more reported deaths related to R‑CHOP (3.0%) compared to VR‑CAP (2.0%). The ERG highlighted that this was similar to the outcomes for the Western Europe subgroup. More comprehensive results will be available with the final analysis in 2017.

Cost effectiveness

3.16 A de novo cost‑effectiveness model was developed by the company to assess the cost effectiveness of VR‑CAP in England and Wales. The model included 5 states: progression‑free survival from first‑line treatment; progressed from first‑line treatment; progression‑free survival from second‑line treatment; progressed from second‑line treatment; and death.
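As an illustration only, the 5 health states listed above can be sketched as a simple cohort trace. The source does not describe how transitions between states were computed, so the per-cycle transition probabilities below are invented placeholders (not values from the company's model), and the constant-probability Markov structure is itself an assumption made to show the mechanics:

```python
# Health states named in the text, in model order.
STATES = [
    "PFS first line",
    "Progressed first line",
    "PFS second line",
    "Progressed second line",
    "Death",
]

# HYPOTHETICAL per-cycle transition probabilities (each row sums to 1).
# These are placeholders for illustration, not figures from the submission.
P = [
    [0.95, 0.04, 0.00, 0.00, 0.01],
    [0.00, 0.70, 0.25, 0.00, 0.05],
    [0.00, 0.00, 0.88, 0.10, 0.02],
    [0.00, 0.00, 0.00, 0.90, 0.10],
    [0.00, 0.00, 0.00, 0.00, 1.00],
]


def step(dist, matrix):
    """Advance the cohort distribution by one model cycle."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def trace(start, matrix, n_cycles):
    """Return the cohort distribution at cycle 0..n_cycles."""
    rows = [list(start)]
    for _ in range(n_cycles):
        rows.append(step(rows[-1], matrix))
    return rows


# Everyone starts progression free on first-line treatment.
cohort = trace([1.0, 0.0, 0.0, 0.0, 0.0], P, 10)
for cycle, row in enumerate(cohort):
    print(cycle, [round(x, 3) for x in row])
```

Each row of the trace gives the share of the cohort in each state at that cycle; costs and utilities would then be attached to state occupancy.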

3.17 The company's base-case model time horizon was 20 years. The company considered this to be essentially a lifetime time horizon for patients, given that the mean age assumed in the model was 69 years. Both costs and health outcomes were discounted at an annual rate of 3.5%. The company stated that costs were based on 2013/14 figures (NHS Reference Costs and Personal Social Services Research Unit) as these were the most recent cost data available at the time the model was developed.
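A short sketch of how an annual discount rate of 3.5% is typically applied to a stream of yearly costs or QALYs. The assumption that values accrue at whole-year time points starting at year 0 is made here for illustration; the source does not describe the timing convention used in the model:

```python
ANNUAL_RATE = 0.035  # 3.5% per year, as stated in the text


def discount_factor(year: float, rate: float = ANNUAL_RATE) -> float:
    """Present-value weight for an amount accruing `year` years from now."""
    return 1.0 / (1.0 + rate) ** year


def present_value(stream, rate: float = ANNUAL_RATE) -> float:
    """Discounted total of yearly amounts; stream[0] accrues at year 0."""
    return sum(v * discount_factor(t, rate) for t, v in enumerate(stream))


# Illustrative stream (hypothetical amounts): 1 unit per year for 20 years.
undiscounted = 20 * 1.0
discounted = present_value([1.0] * 20)
print(undiscounted, round(discounted, 2))
```

Because the weight falls to roughly half by year 20, outcomes accruing late in a 20‑year horizon contribute much less to the discounted totals than early ones.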

3.18 The key clinical data used within the economic model were taken from the LYM‑3002 trial. The ITT population of the LYM‑3002 trial was used to assess the effectiveness and safety of VR‑CAP compared with R‑CHOP in the de novo cost-effectiveness model. Based on advice from UK haematologists, the company considered that people included in the LYM‑3002 trial were similar to those expected to be seen in UK clinical practice. However, baseline demographics from only the Western European and North American subgroup were used in the model because the company considered that subgroup to be more similar to people in UK clinical practice in terms of age and weight.

3.19 Economic comparison was conducted primarily with R‑CHOP because the company were of the opinion that R‑CHOP induction therapy is the established standard of care for patients with previously untreated mantle cell lymphoma (for whom haematopoietic stem cell transplantation is unsuitable). The company stated that no maintenance treatment with rituximab was assumed in the model base case because it was not identified as a comparator in the decision problem. However, the company highlighted that as R‑maintenance is used in clinical practice in people with a response to induction, the potential impact of induction therapy with VR‑CAP compared with R‑CHOP followed by R‑maintenance was investigated in exploratory analyses.

ERG comments

3.20 The ERG commented that the company's model followed a logical structure with respect to the nature of the disease. The ERG agreed that the discount rate and perspective are in line with the NICE reference case. The ERG noted that considering the average age of 69 years in the LYM‑3002 trial and that the median survival is less than 5 years, a time horizon of 20 years is considered adequate and similar to a lifetime perspective. The ERG identified 2 possible concerns: the exclusion of the half‑cycle correction and the exclusion of any additional treatment lines after second‑line treatment. The company highlighted that it had implemented the half‑cycle correction in its revised analysis in response to clarification; however, the ERG disagreed with how it was done. Therefore, the ERG made its own correction for the new ERG base case (see sections 3.42–3.46). The ERG commented that the exclusion of any additional treatment lines seemed reasonable considering the lack of evidence of treatment efficacy and the minority of patients having a third treatment line.
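For readers unfamiliar with the term, one common form of half‑cycle correction averages state occupancy at the start and end of each cycle, instead of counting everyone present at the start for the full cycle. The source does not say which method the company or the ERG used, so the sketch below is a generic illustration with hypothetical occupancy values:

```python
def half_cycle_corrected(occupancy):
    """Average start-of-cycle and end-of-cycle occupancy for each cycle.

    `occupancy` holds the proportion of the cohort in a state at
    cycle boundaries 0, 1, ..., n, giving n corrected cycle values.
    """
    return [(a + b) / 2.0 for a, b in zip(occupancy, occupancy[1:])]


# HYPOTHETICAL proportion alive at the start of cycles 0 to 3.
occupancy = [1.0, 0.8, 0.6, 0.5]

uncorrected = sum(occupancy[:-1])              # start-of-cycle counting
corrected = sum(half_cycle_corrected(occupancy))

print(round(uncorrected, 2), round(corrected, 2))
```

With a declining cohort, start-of-cycle counting overstates time spent in the state, so the corrected total is lower.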

Model details

3.21 Instead of using the primary outcome of the LYM‑3002 trial, progression‑free survival assessed by an independent review committee, the company chose to use an alternative assessment (by an independent review committee member) in the base case of the model because it was felt this reflected clinical practice while retaining the blinded assessment.

3.22 The company fitted the following parametric models to estimate progression‑free survival in the 2 treatment groups: exponential; Weibull; lognormal; log‑logistic; gamma; and Gompertz. The company used the log‑logistic model in the base case based on the goodness of fit of the progression‑free survival curves (that is, using the Akaike information criterion and the Bayesian information criterion, and visual fit and long‑term fit).
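To illustrate the statistical part of this comparison, the Akaike and Bayesian information criteria can be computed from each fitted model's maximised log-likelihood; lower values indicate a better trade-off between fit and complexity. The log-likelihoods and sample size below are hypothetical placeholders, not results from the LYM‑3002 analysis:

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# HYPOTHETICAL fits: model -> (maximised log-likelihood, number of parameters).
fits = {
    "exponential": (-812.4, 1),
    "Weibull": (-810.9, 2),
    "log-logistic": (-808.2, 2),
}
N_OBS = 200  # hypothetical number of patients contributing to the fit

ranked = sorted(fits, key=lambda m: aic(*fits[m]))
for model in ranked:
    ll, k = fits[model]
    print(f"{model}: AIC={aic(ll, k):.1f}, BIC={bic(ll, k, N_OBS):.1f}")
```

These criteria only rank fit to the observed period; visual fit and the plausibility of the long‑term extrapolation still need separate judgement, as the text notes.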

Survival was modelled with parametric models fitted using the LYM‑3002 patient-level data for people having VR‑CAP and R‑CHOP. However, overall survival data from the LYM‑3002 trial are still immature; median overall survival for VR‑CAP has not been reached. Because of the wide range of potential outcomes when attempting to fit survival curves directly to the overall survival data, the company modelled survival using progression as a surrogate marker for overall survival.

3.23 For the base case, parametric curves were fitted for 3 categories of patients: all patients who progressed from VR‑CAP or R‑CHOP during the trial, all patients who did not progress from VR‑CAP, and all patients who did not progress from R‑CHOP. This method assumed that patients who progressed had the same survival regardless of what treatment they had in first line (that is, post‑progression survival was the same, regardless of the first‑line therapy received).

3.24 The company added non‑disease‑specific mortality, based on age and sex, to the model to better capture long‑term survival (using UK life tables from the Office for National Statistics). It was assumed that all deaths in the pre‑progression survival curves (before adjustment for background mortality) in the trial were deaths from mantle cell lymphoma.

3.25 The mean duration of second‑line treatment and progression‑free survival from second‑line treatment were derived from the LYM‑3002 trial. In the company base case, model treatment duration (90 days) and progression‑free survival (231 days) were assumed to be the same for both groups, using data from both LYM‑3002 trial groups combined.

3.26 The company highlighted that there were limited data available for the other comparators included in the scope. The company stated that the indirect comparison to R‑bendamustine was too unreliable given the heterogeneity described previously, particularly for progression‑free survival, to be used to assess comparative efficacy within the cost‑effectiveness model. Instead, the company assumed equal efficacy (progression‑free survival and overall survival) to R‑CHOP, which was based on clinician feedback. Similarly, the limitations of the R‑FC indirect comparison meant that an assumption of equal efficacy with R‑CHOP was also made for R‑FC.

3.27 The company used EQ‑5D data from the LYM‑3002 trial for health‑related quality‑of‑life estimates during and on progression from first‑line treatment. Utility decrements for adverse events were included in addition to the health‑state utilities while patients were on treatment, based upon LYM‑3002 trial data. No long‑term utility values were available from the LYM‑3002 trial. Instead, the company assumed equal utility while progression free during first- and second‑line treatments (based on UK clinician feedback and previous non‑Hodgkin's lymphoma modelling). Utility associated with post‑progression from second‑line treatment was taken from the most relevant source related to aggressive non‑Hodgkin's lymphoma, which the company stated was the most similar condition to mantle cell lymphoma in terms of expected effect on health status.

Costs

3.28 The company's model assumed that patients had only whole vials and that there was no vial sharing. On dosing regimens, cycle lengths for both VR‑CAP and R‑CHOP were 21 days with a maximum number of 6 cycles or 8 cycles if first response happens in cycle 6. The company also presented drug acquisition and administration costs associated with VR‑CAP, R‑CHOP, other comparators and second‑line treatments.
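The whole-vial, no-sharing assumption amounts to rounding each administered dose up to a whole number of vials, with the remainder treated as wastage. A minimal sketch follows; the dose per square metre, body surface area and vial size are illustrative placeholders and are not taken from the company's submission:

```python
import math


def whole_vials(dose_mg: float, vial_size_mg: float) -> int:
    """Number of vials opened when no vial sharing is allowed."""
    return math.ceil(dose_mg / vial_size_mg)


def wastage_mg(dose_mg: float, vial_size_mg: float) -> float:
    """Drug discarded after the dose is drawn from whole vials."""
    return whole_vials(dose_mg, vial_size_mg) * vial_size_mg - dose_mg


# ILLUSTRATIVE inputs only (assumptions, not values from the source).
dose_per_m2_mg = 1.3    # assumed dose per square metre of body surface area
body_surface_m2 = 1.8   # assumed body surface area
vial_size_mg = 3.5      # assumed vial size

dose_mg = dose_per_m2_mg * body_surface_m2
print(whole_vials(dose_mg, vial_size_mg),
      round(wastage_mg(dose_mg, vial_size_mg), 2))
```

Under this assumption the drug cost per administration depends on the vial count and not on the milligrams actually given, which is why patient weight and body surface area inputs matter to the cost estimates.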

3.29 In the model, the number of patients having treatment per cycle was informed by the LYM‑3002 trial and reduced with each cycle going from 100% in cycle 1 down to 13.3% for VR‑CAP and 17.4% for R‑CHOP by cycle 8.

3.30 In addition to the cost of hospital visits to treat adverse events, drug acquisition costs associated with concomitant medications were also included in the model (those used in the trial but unavailable in the UK were excluded). Costs for red blood cell and platelet transfusions were included in the company's model.

3.31 Adverse event costs were based on NHS Reference Costs 2013/14. Weekly costs attributable to adverse events produced cycle costs of £26.41 for VR‑CAP and £28.81 for R‑CHOP.

Company's base-case results and sensitivity analysis

3.32 In the company's base-case deterministic analysis, VR‑CAP was estimated to generate 0.75 incremental life years, 0.80 incremental quality-adjusted life years (QALYs) and an incremental cost of £16,213 compared with R‑CHOP, leading to an incremental cost-effectiveness ratio (ICER) of £20,362 per QALY gained. In the probabilistic analyses the ICERs for VR‑CAP ranged between £13,725 (compared with R‑bendamustine) and £20,264 (compared with R‑CHOP) per QALY gained.
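The ICER is the incremental cost divided by the incremental QALYs. The sketch below applies that definition to the rounded base-case figures quoted above. Using rounded inputs gives a value close to, but not exactly, the reported £20,362; the gap is presumably due to rounding of the reported QALY gain, which is an assumption and not something the source states:

```python
def icer(incremental_cost: float, incremental_qalys: float) -> float:
    """Incremental cost-effectiveness ratio in cost per QALY gained."""
    if incremental_qalys <= 0:
        raise ValueError("ICER is not meaningful without a positive QALY gain")
    return incremental_cost / incremental_qalys


# Rounded base-case figures for VR-CAP versus R-CHOP, as quoted in the text.
incremental_cost = 16213.0   # pounds
incremental_qalys = 0.80

approx_icer = icer(incremental_cost, incremental_qalys)
print(f"Approximate ICER from rounded inputs: £{approx_icer:,.0f} per QALY gained")
```

The small denominator makes the ratio sensitive to any change in the estimated QALY gain.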

3.33 Cost‑effectiveness acceptability curves (generated by the ERG from the company model) showed that R‑CHOP has the highest probability of being cost‑effective (51.3%) followed by VR‑CAP (48.7%). The probabilities of being cost‑effective for R‑FC and R‑bendamustine were 0.0%. VR‑CAP has the highest probability (86.5%) of being cost‑effective at a maximum acceptable ICER of £30,000, followed by R‑CHOP (13.5%), R‑FC (0.0%) and R‑bendamustine (0.0%).
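A cost‑effectiveness acceptability curve reports, for each maximum acceptable ICER, the share of probabilistic simulations in which a strategy has the highest net monetary benefit (threshold × QALYs − cost). The sketch below shows that calculation on 4 invented simulation draws for 2 generically named strategies; none of the numbers come from the company or ERG analyses:

```python
def prob_cost_effective(draws, threshold):
    """Share of draws in which each strategy has the highest net benefit.

    `draws` is a list of dicts mapping strategy name -> (cost, qalys).
    """
    wins = {}
    for draw in draws:
        best = max(draw, key=lambda s: threshold * draw[s][1] - draw[s][0])
        wins[best] = wins.get(best, 0) + 1
    return {s: wins.get(s, 0) / len(draws) for s in draws[0]}


# HYPOTHETICAL probabilistic draws: (total cost in pounds, total QALYs).
draws = [
    {"comparator": (20000, 2.0), "new": (36000, 2.9)},
    {"comparator": (21000, 2.1), "new": (37000, 2.7)},
    {"comparator": (19000, 1.9), "new": (35000, 2.8)},
    {"comparator": (20000, 2.0), "new": (38000, 2.5)},
]

for threshold in (20000, 30000):
    print(threshold, prob_cost_effective(draws, threshold))
```

Raising the threshold places more value on each QALY gained, so a more effective but more costly strategy wins in a larger share of draws.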

3.34 The ICERs were most sensitive to the survival functions used to model progression‑free survival and overall survival, the utility value for patients progressed from second‑line treatment, intravenous administration costs and the duration of second‑line treatment.

3.35 The company performed a large number of scenario analyses for the comparison between VR‑CAP and R‑CHOP. The most influential scenario analyses were those incorporating different parametric distributions for progression‑free survival; using Weibull, gamma and Gompertz distributions increased the ICER from £20,362 to £25,849, £27,697 and £30,452 respectively. Changing the utility value for patients progressed from second‑line treatment to 0.693 (equal to patients progressing from first‑line treatment) increased the ICER to £26,241 per QALY gained. Changing all health state utility values to correspond with those from Doorduijn et al. 2005 (that is, 0.61 for progression free in the first and second line and 0.45 for progressed patients in the first and second line) increased the ICER to £28,746 per QALY gained. The company stated that cost‑effectiveness results were generally robust under the sensitivity and scenario analyses conducted, with no scenarios bringing the ICER of VR‑CAP compared with R‑CHOP above £30,000 per QALY gained.

ERG comments

3.36 The ERG did not agree with the company using the ITT population of the LYM‑3002 trial to assess the effectiveness of VR‑CAP compared with R‑CHOP. The ERG preferred the use of data from the European Union subgroup.

3.37 The ERG noted that the log‑logistic distribution was selected for both treatment groups, for progression‑free survival, based upon clinical expert opinion. However, the exponential distribution showed the best statistical fit for the VR‑CAP group (based on Akaike information criterion and Bayesian information criterion). The ERG also questioned the different survival curves based on progression status and the assumption that survival for patients without progression differed between treatment groups.

3.38 The ERG agreed with the company submission that immature data may bias the extrapolation of survival data; however, this was not explained further by the company. The ERG suggested that if data are too immature to model overall survival for all patients, it is questionable whether sufficient data are available to separately estimate long‑term survival for patients with and without progression. This distinction would reduce the total number of patients at risk, and may increase the uncertainty about the long‑term survival. The company justified the use of different survival for patients with and without progression by referencing 1 study in mantle cell lymphoma and 1 study in non‑Hodgkin's lymphoma in which better progression‑free survival is associated with better overall survival. Another concern raised by the ERG on the modelling of survival was the assumption that survival for patients without progression differs between treatment groups. The ERG suggested that as a result of using immature data, it is not feasible to identify any differences in overall survival between treatment groups.

3.39 The ERG did not agree with the company using a utility value of 0.45 for progression from second‑line treatment because the study from which it was sourced (Doorduijn et al. 2005) was based on a small number of observations (n=26). The ERG estimated utility for progression from second‑line treatment by subtracting the average disutility (from 2 different groups of people) reported in Doorduijn et al. (2005) from the baseline utility in the LYM‑3002 trial for progression‑free survival from first‑line treatment. Therefore, the ERG used a utility of 0.624, instead of the company's value of 0.45.

3.40 The ERG did not agree with the dose reduction applied to the drug costs for VR‑CAP and R‑CHOP because it is questionable whether the dose reduction observed in the LYM‑3002 trial is representative for UK clinical practice. Concomitant medication costs and costs for pegfilgrastim were amended for R‑CHOP.

3.41 The company did not provide a subgroup analysis for the European Union or European Union/North American region subgroup. As the treatment effectiveness appears lower for the European Union subgroup, the relative treatment effect for progression‑free survival was conservatively adjusted to reflect the European Union subgroup in the ERG base case.

ERG exploratory analyses

3.42 In light of a number of issues highlighted in the ERG report, the ERG made a number of amendments. The ERG corrected a number of errors and changed a number of assumptions in the company's model as follows:

  • 1. Corrected the unit prices that were different in the reference price list.

  • 2. Corrected an error in the calculation of adverse events.

  • 3. Corrected calculation of costs of concomitant medication.

  • 4. Inclusion of half‑cycle correction.

  • 5. Age, weight and unit prices were made fixed instead of being stochastic (that is, instead of having distributions applied to them).

  • 6. Proportion of patients having treatment during a cycle and proportion of patients having concomitant medication were made stochastic to reflect second order uncertainty.

  • 7. Adjusted progression‑free survival according to the HR of the European Union population.

  • 8. Start second‑line treatment at time of progression.

  • 9. Utility for progression from second‑line treatment is calculated by subtracting the disutility as found in Doorduijn et al. (2005) from the baseline utility in the LYM‑3002 trial for progression‑free survival from first line treatment. Therefore, the ERG used a utility of 0.624, instead of the company's value of 0.45.

  • 10. Excluded end‑of‑life costs.

  • 11. Used per‑protocol dosage instead of observed dosage reductions because it is unknown whether the dosage reduction is applicable to UK patients.

  • 12. The primary assessment of progression is used instead of the alternative assessment.

  • 13. Indirect treatment comparison is used for the effectiveness of R‑FC and R‑bendamustine instead of assuming equal effectiveness as R‑CHOP.

  • 14. Overall survival is not differentiated between patients with and without progression, but between treatments instead.

  • 15. Excluded all‑cause mortality as this is already incorporated in the overall survival estimate.

  • 16. The exponential distribution is used for the extrapolation of progression‑free survival in the VR‑CAP group and the log‑logistic distribution is used for the extrapolation of progression‑free survival in the R‑CHOP group.

3.43 The ERG stated that the ICERs compared to R‑FC and R‑bendamustine were minimally influenced by the ERG changes and so the results presented focused on the comparison with R‑CHOP. Including all of the ERG's amendments at the same time increased the company's base-case ICER for VR‑CAP compared with R‑CHOP by approximately £14,000, to £34,039 per QALY gained. The large difference between the company base case and the ERG's ICER was caused mainly by changing the distribution for progression‑free survival in the VR‑CAP group to the exponential distribution, while keeping the log‑logistic distribution for R‑CHOP progression‑free survival.

3.44 The ERG performed probabilistic sensitivity analyses for all comparators to capture the uncertainty in the estimation of input parameters in their additional analyses. The probability that VR‑CAP is cost effective at thresholds of £20,000 and £30,000 is smaller in the ERG analyses than in the company's base case (11% versus 49% at £20,000, and 39% versus 89% at £30,000). As in the company's base case, the probability that R‑FC or R‑bendamustine is cost effective at the usual NICE thresholds is negligible.

3.45 The ERG did some additional exploratory analyses that looked at the effect of removing some assumptions from its preferred cumulative ICER estimate of £34,039 per QALY gained. The ERG combined all their preferred assumptions together but removed the following:

  • Progression‑free survival adjustment for the European Union subgroup.

  • Distinguish survival for patients with and without progression.

  • Use the same progression‑free survival distribution (log‑logistic) for all treatment groups.

    As survival for patients with and without progression is distinguished in this additional analysis by the ERG, all‑cause mortality was added to pre‑progression survival (in other words, analysis 13 from section 3.42 was also removed). The result of removing these 4 assumptions from the ERG base case gave an ICER for VR‑CAP compared with R‑CHOP of £31,576 per QALY gained.

3.46 The ERG explored the effect of reverting to the company's original utility value (0.45) and the exclusion of assumption 7 and assumptions 12–14 as in section 3.45. The result of this analysis (that is, the ERG base case excluding assumptions 7, 9 and 12–14) gave an ICER of £26,647 per QALY gained for VR‑CAP compared with R‑CHOP.

3.47 Full details of all the evidence are in the committee papers.