3 Committee discussion

The evaluation committee considered evidence submitted by Bayer, a review of this submission by the external assessment group (EAG), and responses from stakeholders. See the committee papers for full details of the evidence.

The condition

Details of condition

3.1 Colorectal cancer is a malignant tumour arising from the lining of the large intestine (colon and rectum). Metastatic colorectal cancer (mCRC) refers to cancer that has spread beyond the large intestine and nearby lymph nodes to other parts of the body, such as the lungs and liver.

Unmet need and impact on quality of life

3.2 The patient expert explained that mCRC can have a significant impact on quality of life, especially for those diagnosed at later stages, when survival rates are poor (fewer than 10% of those diagnosed at stage 4 survive beyond 5 years). They noted that there are limited effective treatment options available, so treatments that give small improvements in quality of life and extensions to length of life are important. Both the patient and clinical experts highlighted that, as with some current treatments, regorafenib is administered orally (by mouth), meaning people may be able to take the medicine at home rather than in a hospital. The committee agreed that people with mCRC who have had previous treatments have an unmet clinical need and would welcome new treatment options.

Clinical management

Treatment options

3.3 The aim of treatment for mCRC is to prolong survival and improve quality of life. The treatment options for mCRC include:

Comparators

3.4 The company's proposed decision problem was narrower than regorafenib's marketing authorisation, proposing that the committee should consider regorafenib specifically as an alternative treatment option to trifluridine–tipiracil. The committee was aware that the marketing authorisation for regorafenib is very similar to that of trifluridine–tipiracil, that is, that it should be used after 'available therapies'. Clinical experts noted that trifluridine–tipiracil and regorafenib would be used at the same position in the treatment pathway. There would be shared decision making between patients and clinicians, based on previous response to therapies, tolerance and patient choice. The committee concluded that regorafenib will be used in the treatment pathway as an alternative treatment option to trifluridine–tipiracil. Clinical experts noted that the number of people who are well enough to have treatment reduces after each line of therapy. But they suggested that around 30% of people would be well enough to have active treatment after trifluridine–tipiracil. Currently, the only option for these people is best supportive care or treatment through a clinical trial. The clinical experts explained that trifluridine–tipiracil and regorafenib could be used in sequence because they have different mechanisms of action, and that this had been done in several observational studies. The committee agreed that, in principle, it was appropriate to consider that regorafenib could be used in sequence. When used after trifluridine–tipiracil, regorafenib would be used instead of best supportive care. So, the committee concluded that best supportive care was also an appropriate comparator.

Clinical effectiveness

Data sources and results

3.5 The company submitted clinical evidence from 2 randomised, double-blind, phase 3 clinical trials (CORRECT and CONCUR). These compared regorafenib with placebo in adults with metastatic colorectal cancer whose cancer had progressed within 3 months on approved standard treatment. Standard treatment included fluoropyrimidine, oxaliplatin, irinotecan, bevacizumab, cetuximab, and panitumumab. The primary completion years for CORRECT and CONCUR were 2011 and 2013, respectively. Overall survival was the primary outcome for both trials. Clinical experts explained that the trials included different populations and these differences could confound the results. CORRECT was a global study that included people from 15 countries. These people had heavier pre-treatment with targeted biological therapies and were more likely to have a KRAS mutation and an Eastern Cooperative Oncology Group (ECOG) score of 0. CONCUR only included people in Asia (China, Hong Kong, South Korea, Taiwan and Vietnam), and included more people who had metastatic disease for a shorter time than CORRECT. In CORRECT, the median overall survival was 6.4 months for people having regorafenib and 5 months for people having placebo (hazard ratio 0.77, 95% confidence interval 0.64 to 0.94). In CONCUR, the median overall survival was 8.8 months for people having regorafenib and 6.3 months for people having placebo (hazard ratio 0.55, 95% confidence interval 0.40 to 0.77). The company pooled both datasets to get an overall survival hazard ratio of 0.68 (95% confidence interval 0.59 to 0.79), which it used for its clinical-effectiveness analyses. The committee acknowledged that there were differences in the trial populations and agreed to consider both trials and their pooled results in its decision making.
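As an illustration of how 2 trial-level hazard ratios can be combined, the sketch below applies a fixed effect inverse-variance method to the published CORRECT and CONCUR results. This is an assumption made for illustration only: the company pooled the trial datasets, so this summary-level calculation need not reproduce its estimate of 0.68. The function names are hypothetical, and the standard errors are recovered from the reported confidence intervals on the log scale.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def log_hr_and_se(hr, lower, upper):
    """Recover the log hazard ratio and its standard error from a 95% interval."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return math.log(hr), se

def pool_fixed_effect(trials):
    """Inverse-variance fixed effect pooling of (hr, lower, upper) tuples."""
    total_weight = 0.0
    weighted_sum = 0.0
    for hr, lower, upper in trials:
        log_hr, se = log_hr_and_se(hr, lower, upper)
        weight = 1.0 / se ** 2          # more precise trials get more weight
        total_weight += weight
        weighted_sum += weight * log_hr
    pooled_log_hr = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log_hr),
        math.exp(pooled_log_hr - Z_95 * pooled_se),
        math.exp(pooled_log_hr + Z_95 * pooled_se),
    )

# Overall survival hazard ratios reported for CORRECT and CONCUR
TRIALS = [(0.77, 0.64, 0.94), (0.55, 0.40, 0.77)]
pooled, lower, upper = pool_fixed_effect(TRIALS)
```

Because CORRECT has the narrower confidence interval, it carries more weight, so the pooled estimate sits closer to 0.77 than to 0.55.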

Generalisability of the regorafenib clinical trials

3.6 The committee explored whether the differences in populations between regorafenib trials were likely to have an impact on the efficacy results. That is, if any of the differences could be effect modifiers, and if so, what their impact may be on the efficacy results. The clinical experts noted that it is difficult to fully differentiate between people with an ECOG status of 0 and 1, so the impact of ECOG status on the clinical trial results may be unclear. One of the clinical experts explained that in the UK there is no evidence of differences in treatment effect in people with mCRC by ethnicity, so they would not expect ethnicity to be an effect modifier for regorafenib. But using anti‑VEGF treatment may influence the clinical effectiveness of regorafenib. Bevacizumab (an anti‑VEGF) is not recommended by NICE for treating mCRC, but was used by everyone in CORRECT and by about 60% of people in CONCUR, before entering the clinical trials. Clinical experts explained that because of their similar mechanism of action, regorafenib may be less effective in people who have had previous treatment with bevacizumab. The committee understood that the differences in baseline characteristics are complex, which increases uncertainty in the results across both trials. The committee considered the subgroup analyses presented by the company and whether any of these better represented people in UK clinical practice, but was aware that the trials were not adequately powered to detect differences in these populations. The committee acknowledged the differences in the baseline characteristics of people in the 2 regorafenib trials, and noted the uncertainty associated with these. But in the absence of further data, it concluded that the pooled results were likely to be generalisable and reflective of NHS clinical practice.

Indirect treatment comparison

3.7 There are no head-to-head randomised controlled trials comparing regorafenib with trifluridine–tipiracil. So, the company did an indirect treatment comparison to estimate the relative efficacy of the 2 treatments. It included the regorafenib trials (CORRECT and CONCUR), as well as 3 randomised controlled trials comparing trifluridine–tipiracil with best supportive care: RECOURSE, TERRA, and Yoshino (2012). The company reported similar efficacy for regorafenib and trifluridine–tipiracil using a fixed effect network meta-analysis model (overall survival hazard ratio 0.99, 95% confidence interval 0.84 to 1.17). It also reported similar results for an anchored matching-adjusted indirect comparison (MAIC) in which potential efficacy modifiers (sex, age, and previous use of biological treatment) were weighted based on the baseline characteristics of people in the relevant regorafenib and trifluridine–tipiracil trials. The EAG raised concerns about the differences in the clinical trials included in the indirect treatment comparison. It noted that there was a statistically significant difference in the progression-free survival estimates in studies that include only people in Asia (CONCUR, TERRA and Yoshino [2012]). It recalled that, as with the regorafenib trials (see section 3.6), many (45% to 99%) of the people in the trifluridine–tipiracil trials had had previous anti‑VEGF treatment. The EAG also highlighted the uncertainty around the number of previous treatments, noting that fewer people in the regorafenib trials than in the trifluridine–tipiracil trials had more than 3 previous lines of treatment for mCRC. The committee acknowledged the concerns raised by the EAG and noted that in the absence of adequate subgroup power to detect the impact of the differences in the clinical trials, the uncertainties remain. The committee recalled its conclusion about the generalisability of the regorafenib trials (see section 3.6). 
It understood that all the trials had similar designs, and that in all the trials the disease characteristics at baseline varied. But the clinical experts explained that, because of the different mechanisms of action, effect modifiers may differ between treatments. They noted that there was no biological reason for the effect of trifluridine–tipiracil to differ in people who did or did not have biological treatments. But they advised that evidence from studies in Japan suggests there may be increased efficacy for people with an East Asian family background having fluorouracil (a chemotherapy), because of pharmacokinetic differences in this population. The clinical experts agreed that this may impact the efficacy of treatments and, because of the similar mechanism of action, explained that there may also be additional benefit when using trifluridine–tipiracil in this population. The committee concluded that the indirect treatment comparison was associated with uncertainty because of the heterogeneity between trial populations. It concluded that regorafenib is likely to provide similar benefits in terms of progression-free and overall survival compared with trifluridine–tipiracil.

Observational evidence

3.8 Because there were no head-to-head randomised controlled trials, the company and the EAG considered observational studies that directly compared regorafenib with trifluridine–tipiracil. Four of the 5 studies considered relevant supported the company's position of equal efficacy between regorafenib and trifluridine–tipiracil. But the study with the best balance in baseline characteristics (Nakashima 2020) did not. The study reported higher overall survival for people who had treatment with trifluridine–tipiracil compared with regorafenib (10.2 months compared with 6.4 months). The company noted that observational studies have a high risk of bias. The clinical expert also highlighted that the full details of the observational studies were not reported and the impact of this on the clinical effectiveness is uncertain. The committee agreed that the observational study had a high risk of bias. The committee was aware that the Nakashima (2020) study compared 4 groups. These were people who had: regorafenib only, trifluridine–tipiracil only, regorafenib then trifluridine–tipiracil, and trifluridine–tipiracil then regorafenib. The EAG reported results for people who only received regorafenib or trifluridine–tipiracil. The committee noted that this was likely to represent people with the poorest outcomes. In addition, clinical-effectiveness estimates for people who had both treatments would be prone to immortal time bias, because only people who lived long enough had both treatments. The committee concluded that given the difference in baseline characteristics inherent within the regorafenib and trifluridine–tipiracil trials, the observational study likely compounds the risk of bias. It preferred clinical-effectiveness estimates using the indirect treatment comparison.

Economic model

Company's modelling approach

3.9 The company submitted a 3‑state (progression-free, progressed disease and death) partitioned survival model to estimate the cost effectiveness of regorafenib. The model took the perspective of the NHS and Personal Social Services. It had a time horizon of 10 years, a cycle length of 1 week, and discounted costs and quality adjusted life-years (QALYs) at a rate of 3.5% per year. The committee concluded that the model structure was appropriate for decision making.
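The structure described above can be sketched as follows. This is a minimal illustration, not the company's model: it assumes exponential survival curves with placeholder medians (the 2.0-month progression-free median in particular is hypothetical), whereas the curves used in the appraisal were fitted to trial data. The 10-year horizon, weekly cycle and 3.5% discount rate are taken from the text, and the utilities are those quoted in section 3.11. All function names are hypothetical.

```python
import math

WEEKS_PER_YEAR = 52      # assumption: 52 weekly cycles per year
HORIZON_YEARS = 10       # time horizon stated in the text
DISCOUNT_RATE = 0.035    # annual discount rate stated in the text

def exponential_survival(median_months):
    """Return S(t), t in weeks, for an exponential curve with the given median."""
    median_weeks = median_months * WEEKS_PER_YEAR / 12
    rate = math.log(2) / median_weeks
    return lambda t: math.exp(-rate * t)

def partitioned_survival_qalys(os_curve, pfs_curve, u_pf, u_pd):
    """Discounted QALYs from a 3-state partitioned survival model."""
    total = 0.0
    for week in range(HORIZON_YEARS * WEEKS_PER_YEAR):
        alive = os_curve(week)
        progression_free = min(pfs_curve(week), alive)  # cannot exceed proportion alive
        progressed = alive - progression_free           # alive but progressed
        discount = (1 + DISCOUNT_RATE) ** -(week / WEEKS_PER_YEAR)
        # Each cycle contributes one week (1/52 of a year) of utility-weighted time
        total += (progression_free * u_pf + progressed * u_pd) * discount / WEEKS_PER_YEAR
    return total

qalys = partitioned_survival_qalys(
    exponential_survival(6.4),  # placeholder overall survival median (months)
    exponential_survival(2.0),  # hypothetical progression-free survival median (months)
    u_pf=0.72,
    u_pd=0.59,
)
```

State occupancy is read directly from the 2 survival curves rather than from transition probabilities, which is what distinguishes a partitioned survival model from a Markov model.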

Modelling assumptions

3.10 For modelling overall survival, the company fitted fully parametric survival models to pooled CORRECT and CONCUR data. For time-on-treatment and progression-free survival estimates for regorafenib and best supportive care, it used Kaplan–Meier data then applied parametric models from the point at which Kaplan–Meier data was unavailable. The company noted the maturity of its data and explained that this approach reflected clinical practice because assessment for disease progression occurred every 8 weeks both in the clinical trials and in clinical practice. The EAG highlighted that the stepped nature of Kaplan–Meier curves could result in overfitting, so it preferred fully parametric models for the base case, which aligns with NICE's Decision Support Unit technical support document 14. The committee noted that both the Kaplan–Meier data and the fully parametric model appeared reasonable for extrapolating short-term survival. But it also noted that for modelling overall survival, the company and EAG had only fitted parametric curves to the trial data, whereas safety data provided by the company in response to a clarification request showed extended survival data for up to 5 years. The committee preferred to use the long-term data. It concluded that generalised gamma was the best visual fit to the company's long-term overall survival data for regorafenib and best supportive care, and this should be used for the cost-effectiveness estimates.

Utility values

Source of utility values

3.11 Utility values used in the company's model were obtained from pooling EQ‑5D‑3L results collected in CORRECT and CONCUR. In general, there was no difference in quality-of-life results between regorafenib and best supportive care. But the EAG had some concerns about the plausibility of pooled end-of-treatment results being used to derive the post-progression health state. The clinical trials considered for trifluridine–tipiracil did not report quality-of-life results. So, the company assumed pre-progression (0.72) and post-progression (0.59) utility values to be equal for trifluridine–tipiracil and regorafenib. The committee noted the large difference in pre- and post-progression utility. Clinical experts explained that management of disease progression on best supportive care was difficult and could have a severe effect on quality of life. The committee was aware that the utility values from CORRECT were applied for the NICE technology appraisal guidance on trifluridine–tipiracil and concluded that it was appropriate to use pooled estimates from the clinical trials.

Adverse events

3.12 The company explained that regorafenib has a different adverse event profile to trifluridine–tipiracil. Myelosuppression, which can cause dose delays, dose reductions or stopping of treatment, is less common with regorafenib, but diarrhoea and fatigue are more common. A network meta-analysis showed both treatments had a similar likelihood of stopping treatment because of adverse events (odds ratio 1.10, 95% confidence interval 0.53 to 2.24) and similar grade 3 and 4 (severe and life-threatening) adverse events (odds ratio 0.90, 95% confidence interval 0.55 to 1.47). But regorafenib showed a higher likelihood of all treatment-emergent adverse events (odds ratio 1.94, 95% confidence interval 1.20 to 3.17), that is, including grade 1 and 2 (mild and moderate) adverse events. Only grade 3 and 4 adverse events, seen in over 2% of people in the trial, were captured in the company's economic model. Adverse event results from the network meta-analysis were not captured. The company noted that this was in line with previous modelling experience. The clinical experts explained that grade 2 adverse events, by definition, have an impact on activities of daily living. But less severe adverse events can typically be managed in the short term. They explained that adverse events may not happen independently, and some may not be symptomatic for patients. This means that although a single grade 3 event may be counted as severe, multiple grade 1 and 2 events may have a greater impact on a person's quality of life. The company presented a scenario analysis which included grade 1 and 2 adverse events, applying a fixed cost of £5 per adverse event and a disutility of 0.01, but this had negligible impact on the cost-effectiveness estimates. The committee concluded that grade 1 and 2 adverse events should be included in the economic model.

Costs

Relative dose intensity

3.13 The company modelled relative dose intensity (RDI) differently for regorafenib than for trifluridine–tipiracil. RDI for regorafenib was based on the mean dose used in CORRECT and CONCUR. For trifluridine–tipiracil, data from NICE's technology appraisal on trifluridine–tipiracil was used to model cycle delay and dose reduction separately. The company noted that this approach is similar to how adverse events would be managed in clinical practice, that is, regorafenib would tend to be managed by dose reduction whereas trifluridine–tipiracil would be managed by dose delay. The clinical experts noted that both dose delay and dose reduction are used to manage adverse events in clinical practice. One clinical expert explained that in their NHS trust, granulocyte colony-stimulating factor (GCSF) can be used pre-emptively to manage adverse events. This can prevent dose delay or dose reduction for those on trifluridine–tipiracil. But the NHS England Cancer Drugs Fund lead explained that not all patients have equal access to GCSF. The committee agreed and noted that there would be cost implications associated with GCSF, if it were used. The EAG highlighted that the mean dose from CORRECT and CONCUR includes both dose delay and dose reduction. It also noted that real-world evidence (Nakashima 2020) directly comparing regorafenib and trifluridine–tipiracil suggests a similar dose reduction for both treatments (54% and 48%, respectively). Based on this, the EAG's preference was to apply equal RDI for both treatments. The committee concluded that both dose delay and dose reduction should be used for estimating the RDI, and preferred the EAG's approach of applying equal RDI to trifluridine–tipiracil and regorafenib.

Post-progression treatment costs

3.14 Post-progression treatment costs were not included in the company's base case. The company stated that advice from clinical experts suggests fewer than 10% of people having regorafenib or trifluridine–tipiracil would have post-progression treatment. But 26% of people who had regorafenib in CORRECT, and 31% in CONCUR, had post-progression treatment. The committee recalled that around 30% of people would be offered post-progression treatment after regorafenib or trifluridine–tipiracil has failed (see section 3.4). The company did a scenario analysis in which the post-progression treatment cost reported in NICE's technology appraisal on trifluridine–tipiracil was inflated to 2021 values (£1,633.18) and applied as a single cost to people having either regorafenib or trifluridine–tipiracil. The committee heard that the scenario analysis showed post-progression treatment was not a key driver of the cost-effectiveness estimates. But the committee concluded that subsequent treatment should be included in the economic model. It recalled that both trifluridine–tipiracil and regorafenib could be used in sequence (that is, trifluridine–tipiracil after regorafenib has been used, and vice versa). But it heard from the clinical experts that no difference in efficacy would be expected if either trifluridine–tipiracil or regorafenib were used at fourth line. This is because of the different mechanisms of action of the 2 technologies, and because those well enough to benefit from subsequent therapy are similar to the populations in the indirect treatment comparison. Clinical experts noted that maintained efficacy at later lines is also supported by observational data. The committee concluded that subsequent treatments should be included in the cost-effectiveness modelling, to reflect the clinical trials and clinical expert opinion. 
In the absence of additional evidence, it concluded that there is likely to be no difference in cost effectiveness if trifluridine–tipiracil is used as a further active treatment (that is, at fourth line) instead of best supportive care. This would likely reflect the evidence already considered in the NICE technology appraisal guidance on trifluridine–tipiracil.

Severity

General population QALYs

3.15 NICE's health technology evaluations manual notes that when considering overall benefits, the committee can consider decision-making modifiers. In its submission, the company provided evidence that mCRC that has been previously treated (including with fluoropyrimidine-based chemotherapy, anti‑VEGF, and anti‑EGFR treatments) or which is not suitable for treatment with the listed treatment options, is a severe condition. The severity modifier allows the committee to give more weight to health benefits in the most severe conditions. The company provided absolute and proportional QALY shortfall estimates in line with NICE's health technology evaluations manual. The company calculated the QALYs of people without the condition over their remaining lifetime. It used general population life expectancy estimates based on 2017 to 2019 national life tables, and utility estimates from Health Survey for England 2017 and 2018 data. These were matched to the same age and sex distribution as those with the condition, based on the baseline characteristics of people in the randomised controlled trials. The company and EAG agreed that the population in the trial had a mean age of 60, and 56% were women. The committee agreed that the methods used by the company and the EAG to estimate the remaining lifetime QALYs for the general population and for people living with the condition were appropriate.

Estimating QALY shortfall

3.16 QALY shortfall is calculated by estimating the difference between the number of QALYs generated for an individual in the general population and an individual who has metastatic colorectal cancer that has progressed after first-line chemotherapy or biological treatments, and who is well enough for further active treatment. The committee recalled that both trifluridine–tipiracil and best supportive care are potential comparators for regorafenib (see section 3.4), so the QALY shortfall when using both treatments would need to be considered. The company used its economic model to estimate the expected remaining lifetime QALYs for both comparators. This meant pooled CONCUR and CORRECT data was used. To estimate survival and progression through the model for people who had trifluridine–tipiracil, a hazard ratio from the indirect treatment comparison was applied to the survival curves of regorafenib. The committee noted that the QALY shortfall estimates for trifluridine–tipiracil and best supportive care were similar and felt this similarity might lack face validity (that is, the results are unexpected). The trials underpinning the model had their primary completion between 2011 and 2013 (see section 3.5). The clinical experts advised that there had been advancements in the management and treatment of advanced colorectal cancer in the decade since these trials took place. Clinical experts also explained that the COVID‑19 pandemic had impacted outcomes because of delays in diagnosis. They noted that there was likely to be real-world evidence available for people who have active cancer treatments in the NHS from the national cancer registry or Systemic Anti-Cancer Therapy (SACT) dataset, which would better reflect clinical practice. The committee agreed that using real-world evidence as a reference group would have more accurately reflected the QALY estimates. 
The committee would have preferred using a real-world dataset to derive the absolute event estimates for trifluridine–tipiracil. It would also have preferred using the hazard ratio from the network meta-analysis to estimate the relative effect for survival for those who had best supportive care.
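Applying a hazard ratio to a reference survival curve, as described above, can be sketched as follows. The sketch assumes proportional hazards, under which the comparator curve is the reference curve raised to the power of the hazard ratio. The exponential reference curve, the hazard ratio value and the function names are all hypothetical placeholders, not values from the company's model.

```python
import math

def apply_hazard_ratio(reference_survival, hazard_ratio):
    """Under proportional hazards, S_comparator(t) = S_reference(t) ** HR."""
    return lambda t: reference_survival(t) ** hazard_ratio

def reference_os(t_months):
    """Placeholder reference curve: exponential with a 6.4-month median."""
    return math.exp(-math.log(2) / 6.4 * t_months)

HYPOTHETICAL_HR = 1.3  # illustrative value only (comparator versus reference)
comparator_os = apply_hazard_ratio(reference_os, HYPOTHETICAL_HR)

# A hazard ratio above 1 lowers survival at every time point after baseline
survival_gap_at_median = reference_os(6.4) - comparator_os(6.4)
```

A hazard ratio of exactly 1 returns the reference curve unchanged, which is why a relative effect close to 1 produces near-identical survival, and so similar QALY estimates, for the 2 arms.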

QALY weighting

3.17 The committee considered 2 measures of QALY shortfall: absolute QALY shortfall and proportional QALY shortfall. Absolute QALY shortfall is the total health lost because of the condition. That is, the difference between the expected future health of people living without the condition and the expected future health of people living with the condition over their remaining lifetimes. Proportional QALY shortfall represents the fraction of health lost because of the condition. That is, the absolute QALY shortfall as a proportion of the expected future health of people living without the condition. Both the company's and EAG's estimates for proportional shortfall were above 0.95 for trifluridine–tipiracil and best supportive care, so the company considered that the 1.7 QALY weight should apply. The exact QALYs, and the absolute and proportional QALY shortfalls are confidential so cannot be reported here. The committee considered that there was uncertainty around the data used to estimate the QALYs for people living with the condition who had trifluridine–tipiracil. It would have preferred to see estimates using real-world data. The committee noted that the modifier for disease severity was not convincingly met for this population, and without additional data the committee would not be able to apply the 1.7 weighting. The committee then considered the estimates for people who would have best supportive care. Although it would have preferred to see QALY shortfall based on real-world data, it understood that people who had best supportive care in clinical practice would likely be less well, and have a worse prognosis, suggesting the estimates had greater face validity. The committee noted that all sensitivity analyses included estimates for a proportional shortfall above 0.95. 
The committee concluded that the modifier for disease severity for those who would have best supportive care was met and a weighting of 1.7 should be applied to the health benefits for regorafenib compared with best supportive care.
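The 2 shortfall measures and the resulting weight can be sketched as follows. The input QALYs are hypothetical, because the actual values are confidential. The weighting thresholds are an assumption based on a reading of NICE's health technology evaluations manual (1.7 for a proportional shortfall of at least 0.95 or an absolute shortfall of at least 18, and 1.2 for 0.85 to 0.95 or 12 to 18) and should be checked against the manual. The function names are hypothetical.

```python
def qaly_shortfall(general_population_qalys, condition_qalys):
    """Return (absolute, proportional) QALY shortfall."""
    absolute = general_population_qalys - condition_qalys
    proportional = absolute / general_population_qalys
    return absolute, proportional

def severity_weight(absolute, proportional):
    """QALY weight implied by the shortfall (thresholds assumed, see lead-in)."""
    if proportional >= 0.95 or absolute >= 18:
        return 1.7
    if proportional >= 0.85 or absolute >= 12:
        return 1.2
    return 1.0

# Hypothetical inputs: remaining lifetime QALYs without and with the condition
absolute, proportional = qaly_shortfall(12.0, 0.5)
weight = severity_weight(absolute, proportional)
```

With these placeholder inputs the absolute shortfall is below 12, but the proportional shortfall exceeds 0.95, so the higher weight applies on the proportional measure alone.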

Cost-effectiveness estimates

Company and EAG cost-effectiveness estimates

3.18 The company's probabilistic incremental cost-effectiveness ratio (ICER) for regorafenib compared with trifluridine–tipiracil (including the commercial discount for regorafenib) is within what NICE normally considers an acceptable use of NHS resources, regardless of what severity weighting is applied. The committee recalled that a 1.7 weighting should be applied to health gains in the comparison with best supportive care. When the high severity weighting was applied, the company's cost-effectiveness estimates of regorafenib compared with best supportive care were also within the range NICE considers a cost-effective use of NHS resources. But the company's base case did not reflect the committee's preferred assumptions, which were:

  • fully parametric survival models for overall survival estimates for regorafenib and best supportive care (generalised gamma)

  • fully parametric survival models for progression-free survival estimates for regorafenib and best supportive care (log-logistic)

  • a fully parametric model for regorafenib time-on-treatment estimates (log-logistic)

  • an equal RDI for regorafenib and trifluridine–tipiracil

  • inclusion of grade 1 and 2 adverse events

  • inclusion of subsequent treatments.

    The committee concluded that applying all its preferred assumptions impacted the cost-effectiveness estimates for regorafenib compared with best supportive care but had negligible impact on the estimates for regorafenib compared with trifluridine–tipiracil.

Other factors

Equality issues

3.19 The committee heard that no equality concerns were raised by the stakeholders and did not consider any equality issues to have an impact on its decision making about treatment of mCRC with regorafenib.

Conclusion

Recommendation

3.20 The committee concluded that regorafenib was likely to be cost effective compared with both trifluridine–tipiracil and best supportive care. But there remain some uncertainties in the cost-effectiveness estimates of regorafenib compared with best supportive care. The committee considered input from clinical experts, which suggested that treatment sequencing (that is, which treatment to use first, either regorafenib or trifluridine–tipiracil) would be based on individual preference and potential for response. It noted that clinicians would also consider toxicities from previous lines of treatment as well as previous treatment response and patient choice. So, the committee concluded that having regorafenib as an additional treatment option is appropriate, because this allows clinicians to make the most appropriate decision for each patient. Regorafenib is therefore recommended for previously treated metastatic colorectal cancer.