4 Economic evaluation

4.1 Introduction

4.1.1 This section details methods for assembling and synthesising evidence on the technology in an economic evaluation. This is needed to estimate the technology's relative clinical effectiveness and value for money compared with established practice in the NHS. NICE promotes high-quality analysis and encourages consistency in analytical approaches, but also acknowledges the need to report studies in other ways to reflect particular circumstances.

4.2 The reference case: framework

The concept of the reference case

4.2.1 NICE makes decisions across different technologies and disease areas. So, it is crucial that analyses done to inform the economic evaluation are consistent. NICE has defined a reference case that specifies the methods that are appropriate for the committee's purpose. Economic evaluations considered by NICE should include an analysis of results using these reference-case methods. This does not prevent additional analyses being presented when 1 or more aspects of methods differ from the reference case. However, these must be justified and clearly distinguished from the reference case.

4.2.2 Although the reference case specifies the methods preferred by NICE, it does not prevent the committee's consideration of non-reference-case analyses if appropriate. The key elements of analysis using the reference case are summarised in table 4.1.

Table 4.1 Summary of the reference case

Element of health technology assessment	Reference case	Section providing details
Defining the decision problem	The scope developed by NICE	4.1.4 to 4.1.6
Comparator(s)	As listed in the scope developed by NICE	2.2.4 to 2.2.6, 4.1.6, 4.1.14
Perspective on outcomes	All health effects, whether for patients or, when relevant, carers	4.1.7, 4.1.8
Perspective on costs	NHS and personal social services (PSS)	4.1.9 and 4.1.10
Types of economic evaluation	Cost-utility analysis with fully incremental analysis; cost-comparison analysis	4.1.11 to 4.1.14; 4.1.18 to 4.1.22
Time horizon	Long enough to reflect all important differences in costs or outcomes between the technologies being compared	4.1.15 to 4.1.17
Synthesis of evidence on health effects	Based on systematic review	4.2
Measuring and valuing health effects*	Health effects should be expressed in quality-adjusted life years (QALYs). The EQ‑5D is the preferred measure of health-related quality of life in adults	4.3.1
Source of data for measurement of health-related quality of life*	Reported directly by patients or carers, or both	4.3.3
Source of preference data for valuation of changes in health-related quality of life*	Representative sample of the UK population	4.3.4
Equity considerations*	An additional QALY has the same weight regardless of the other characteristics of the individuals receiving the health benefit, except in specific circumstances	4.4.1
Evidence on resource use and costs	Costs should relate to NHS and PSS resources and should be valued using the prices relevant to the NHS and PSS	4.5.1
Discounting	The same annual rate for both costs and health effects (currently 3.5%)	4.6.1

*Elements of health technology assessment relevant to cost-utility analysis and not cost-comparison analysis.

4.2.3 Clearly specify and justify reasons for not applying reference-case methods and quantify the likely implications. The committee will discuss the weight it attaches to the results of such a non-reference-case analysis.

Defining the decision problem

4.2.4 The economic evaluation should start with a clear statement of the decision problem that defines the technologies being compared and the relevant patient groups. The decision problem should be consistent with the scope for the evaluation; any differences must be justified.

4.2.5 The main technologies of interest, their expected place in the care pathway, the comparator(s) and the relevant patient groups will be defined in the scope developed by NICE (see section 2).

4.2.6 Consider the scope (see section 2), and the evidence available for the technology under evaluation and its comparator(s) to allow a robust economic evaluation.

Perspective

4.2.7 For the reference case, the perspective on outcomes should be all relevant health effects, whether for patients or, when relevant, other people (mainly carers). The perspective adopted on costs should be that of the NHS and PSS.

4.2.8 Some features of healthcare delivery (often referred to as process characteristics) may indirectly affect health. For example, the way a technology is used might affect effectiveness, or a diagnostic technology may improve the speed of correct diagnosis. The value of these benefits should be quantified if possible, and the nature of these characteristics should be clearly explained. These characteristics may include convenience and the level of information available for patients.

4.2.9 NICE does not set the budget for the NHS. The objective of NICE's evaluations is to offer guidance that represents an efficient use of available NHS and PSS resources. For these reasons, the reference-case perspective on costs is that of the NHS and PSS. Productivity costs should not be included.

4.2.10 Some technologies may have substantial benefits to other government bodies (for example, treatments to reduce drug misuse may also reduce crime). These issues should be identified during the scoping stage of an evaluation. Evaluations that consider benefits to the government outside of the NHS and PSS will be agreed with the Department of Health and Social Care and other relevant government bodies as appropriate. They will be detailed in the remit from the Department of Health and Social Care and the final scope. For these non-reference-case analyses, the benefits and costs (or cost savings) should be presented in a disaggregated format and separately from the reference-case analysis.

Type of economic evaluation

4.2.11 Two forms of economic evaluation are available for guidance-producing programmes in the Centre for Health Technology Evaluation.

4.2.12 A cost-utility analysis is used when a full analysis of costs and health benefits is needed. It is used to establish the level of health benefit and costs of the technology(s) compared with relevant comparator(s).

4.2.13 A cost-comparison analysis is for technologies that are likely to provide similar or greater health benefits at similar or lower cost than the relevant comparator(s). For technologies evaluated using cost-comparison analysis in the technology appraisal programme, relevant comparators are those recommended in published NICE guidance for the same population.

Cost-utility analysis

4.2.14 Cost-effectiveness (specifically cost-utility) analysis is used to determine if differences in expected costs between technologies can be justified in terms of changes in expected health effects. Health effects should be expressed in terms of quality-adjusted life years (QALYs).

4.2.15 Using cost-effectiveness analysis is justified by NICE's focus on maximising health gains from a fixed NHS and PSS budget. QALYs are the most appropriate generic measure of health benefit that reflects both mortality and health-related quality-of-life effects. If the assumptions that underlie the QALY (for example, constant proportional trade-off and additive independence between health states) are inappropriate in a particular case, then evidence of this should be produced. Analyses using alternative measures may be presented as an additional non-reference-case analysis.

4.2.16 Follow standard decision rules when combining costs and QALYs. When appropriate, these should reflect when dominance or extended dominance exists, presented through incremental cost-utility analysis. Incremental cost-effectiveness ratios (ICERs) reported must be the ratio of expected additional total cost to expected additional QALYs compared with alternative technologies. As well as ICERs, expected net health benefits should be presented; this may be particularly informative when applying decision-making modifiers, if there are several technologies or comparators, when the differences in costs or QALYs between technologies are small, or when technologies provide less health benefit at lower costs. Net health benefits should be presented using values placed on a QALY gain of £20,000 and £30,000 (see section 4.10.8). Net monetary benefits can also be shown alongside ICERs and net health benefits.
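
The quantities named in 4.2.16 can be illustrated with a minimal sketch. The function names and the incremental cost and QALY figures are hypothetical and are not taken from this manual; only the £20,000 and £30,000 values per QALY come from the text. The sketch assumes the usual definitions: incremental net health benefit is incremental QALYs minus incremental cost divided by the value placed on a QALY, and incremental net monetary benefit is incremental QALYs multiplied by that value minus incremental cost.

```python
def icer(delta_cost, delta_qalys):
    """Expected additional cost per expected additional QALY."""
    if delta_qalys == 0:
        raise ValueError("ICER is undefined when incremental QALYs are zero")
    return delta_cost / delta_qalys


def net_health_benefit(delta_cost, delta_qalys, value_per_qaly):
    """Incremental QALYs minus the QALYs the extra cost could fund elsewhere."""
    return delta_qalys - delta_cost / value_per_qaly


def net_monetary_benefit(delta_cost, delta_qalys, value_per_qaly):
    """Incremental QALYs valued in pounds, minus the extra cost."""
    return delta_qalys * value_per_qaly - delta_cost


# Hypothetical increments for a technology versus its comparator.
delta_cost = 12_000.0  # pounds
delta_qalys = 0.5

print(f"ICER: £{icer(delta_cost, delta_qalys):,.0f} per QALY gained")
for value in (20_000, 30_000):  # values per QALY named in 4.2.16
    nhb = net_health_benefit(delta_cost, delta_qalys, value)
    nmb = net_monetary_benefit(delta_cost, delta_qalys, value)
    print(f"At £{value:,} per QALY: NHB = {nhb:.2f} QALYs, NMB = £{nmb:,.0f}")
```

With these illustrative figures the ICER falls between the two values per QALY, so the net health benefit is negative at £20,000 and positive at £30,000. A negative ICER is ambiguous on its own, because it can arise from lower cost with more QALYs or from higher cost with fewer QALYs. This is one reason net benefits can be easier to interpret.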

4.2.17 In exceptional circumstances, if the technologies form part of a class of treatments, and evidence is available to support their clinical equivalence, estimates of QALYs gained for the class as a whole can be shown.

Cost-comparison analysis

4.2.18 Cost-comparison analysis comprises an analysis of the costs and resource use associated with the technology compared with that of the comparator(s). This type of analysis is usually used when developing medical technologies guidance or a cost-comparison technology appraisal.

4.2.19 The costs associated with differing health outcomes and resource consequences from the technology and the comparator(s) should be captured in the cost-comparison analysis (for example, managing adverse events or impacts on the care pathway), when relevant.

4.2.20 Cost-comparison analyses in a technology appraisal should be used for technologies likely to provide similar health benefits at similar or lower cost than comparator(s) that are recommended in published NICE guidance for the same population. For these analyses, the effects of the intervention and comparator(s) on health outcomes are captured in the clinical-effectiveness evidence and are not included in the cost-comparison analysis. Substantial differences between technologies in costs directly relating to health outcomes (such as adverse events) indicate that the technology and comparator(s) may not provide similar overall health benefits, so any such cost differences must be clearly justified. Whenever possible and appropriate, cost data and data sources should be consistent with any corresponding data and sources that were considered appropriate in the published NICE guidance for the comparator(s) for the same population.

4.2.21 Some technologies may have only a healthcare system benefit, for example, a test that rules out disease more quickly than the existing, slower test but has similar diagnostic performance. If there is evidence that existing approaches are similar, the evaluation may concentrate on the health and social care system outcomes.

Time horizon

4.2.22 The time horizon for estimating clinical effectiveness and value for money should be long enough to reflect all important differences in costs or outcomes between the technologies being compared.

4.2.23 Many technologies have effects on costs and outcomes over a patient's lifetime. In these circumstances, a lifetime time horizon is usually appropriate. A lifetime time horizon is needed when alternative technologies lead to differences in survival or benefits that last for the remainder of a person's life.

4.2.24 For a lifetime time horizon, it is often necessary to extrapolate data beyond the duration of the clinical trials, observational studies or other available evidence and to consider the associated uncertainty. When the effect of technologies is estimated beyond the results of the clinical studies, analyses that compare several alternative scenarios reflecting different assumptions about future effects using different statistical models are desirable (see section 4.7). These should include assuming the technology does not provide further benefit beyond the period of its use, as well as more optimistic assumptions. Analyses that limit the time horizon to periods shorter than the expected effect of the technology do not usually provide the best estimates of benefits and costs.

4.2.25 A time horizon shorter than a patient's lifetime could be justified if there is no differential mortality effect between technologies and the differences in costs and clinical outcomes relate to a relatively short period.

4.3 Measuring and valuing health effects in cost-utility analyses

4.3.1 Express health effects in QALYs for cost-effectiveness analyses. For the reference case, report the measurement of changes in health-related quality of life directly from patients. The utility of these changes should be based on public preferences using a choice-based method.

4.3.2 A QALY combines both quality of life and life expectancy into a single index. In calculating QALYs, each of the health states experienced within the time horizon of the model is given a utility reflecting the health-related quality of life associated with that health state. The time spent in each health state is multiplied by the utility. Deriving the utility for a particular health state usually comprises 2 elements: measuring health-related quality of life in people who are in the relevant health state and valuing it according to preferences for that health state relative to other states (usually perfect health and death).
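
As an illustration of the calculation described in 4.3.2, the sketch below multiplies the time spent in each health state by its utility and sums the results. The function name, the health-state durations and the utility values are hypothetical, and discounting is deliberately left out to keep the arithmetic visible.

```python
def total_qalys(health_states):
    """Sum of (years in state x utility) over the health states experienced.

    health_states: iterable of (years_in_state, utility) pairs.
    """
    return sum(years * utility for years, utility in health_states)


# Hypothetical profile: 2 years at utility 0.8, 3 years at 0.6, 1 year at 0.3.
profile = [(2.0, 0.8), (3.0, 0.6), (1.0, 0.3)]
print(f"Undiscounted QALYs: {total_qalys(profile):.2f}")
```

Here 6 years of life yield about 3.7 QALYs, because time spent in poorer health states contributes less than time spent in better ones.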

4.3.3 Health-related quality of life, or changes in health-related quality of life, should be measured directly by patients. When it is not possible to get measurements of health-related quality of life directly from patients, these should come from the person who acts as their carer rather than healthcare professionals.

4.3.4 The valuation of health-related quality of life measured by patients (or their carers) should be based on a valuation of public preferences from a representative sample of the UK population using a choice-based method. This valuation leads to the calculation of utility values.

4.3.5 Different methods used to measure health-related quality of life produce different utility values. Therefore, results from different methods or instruments cannot always be compared.

4.3.6 Given the need for consistency across evaluations, the EQ‑5D measurement method is preferred to measure health-related quality of life in adults. Preference values from the EQ‑5D should be applied to measurements of health-related quality of life to generate health-related utility values.

4.3.7 In some circumstances adjustments to utility values may be needed, for example for age or comorbidities. If baseline utility values are extrapolated over long time horizons, they should be adjusted to reflect decreases in health-related quality of life seen in the general population and to make sure that they do not exceed general population values at a given age. Adjustment should be based on a recent and robust source of population health-related quality of life. If this is not considered appropriate for a particular model, the supporting rationale should be provided. A multiplicative approach is generally preferred. Clearly document the methods used for adjusting utility values.
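
One possible reading of the multiplicative approach in 4.3.7 is sketched below. It assumes that the ratio of the baseline utility to the general population utility at the baseline age is held constant and applied to the general population utility at later ages, with the result capped at the general population value. The function name and the general population values are hypothetical; a real analysis would take them from a recent and robust published source.

```python
def age_adjusted_utility(baseline_utility, baseline_age, age, population_utility):
    """Scale a baseline utility in line with general population decline by age.

    population_utility: mapping from age to general population utility
    (hypothetical values in this sketch).
    """
    multiplier = baseline_utility / population_utility[baseline_age]
    adjusted = multiplier * population_utility[age]
    # Never let the adjusted value exceed the general population value.
    return min(adjusted, population_utility[age])


# Hypothetical general population utilities by age.
population_utility = {50: 0.85, 60: 0.80, 70: 0.75}

for age in (50, 60, 70):
    value = age_adjusted_utility(0.68, 50, age, population_utility)
    print(f"Age {age}: adjusted utility {value:.3f}")
```

In this sketch a baseline utility of 0.68 at age 50 falls to about 0.60 by age 70, mirroring the proportional decline in the assumed general population values.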

4.3.8 If not available in the relevant clinical trials, EQ‑5D data can be sourced from the literature. When taken from the literature, the methods for identifying the data should be systematic and transparent. Clearly explain the justification for choosing a particular data set. When more than 1 plausible set of EQ‑5D data is available, sensitivity analyses should be done to show the effect of the alternative utility values.

4.3.9 When EQ‑5D data is not available, this data can be estimated by mapping other health-related quality-of-life measures or health-related benefits seen in the relevant clinical trials to EQ‑5D. This is considered to be a departure from the reference case. The mapping function chosen should be based on data sets containing both health-related quality-of-life measures. Its statistical properties should be fully described, its choice justified, and it should be adequately shown how well the function fits the data. Present sensitivity analyses to explore how variation in the mapping algorithm used affects the outputs.

4.3.10 In some circumstances the EQ‑5D may not be the most appropriate measure. To make a case that the EQ‑5D is inappropriate, provide qualitative empirical evidence on the lack of content validity for the EQ‑5D, showing that key dimensions of health are missing. This should be supported by evidence that shows that EQ‑5D performs poorly on tests of construct validity (that is, it does not perform as would be expected) and responsiveness in a particular patient population. This evidence should be derived from a synthesis of peer-reviewed literature. In these circumstances alternative health-related quality-of-life measures may be used. These must be accompanied by a carefully detailed account of the methods used to generate the data, their validity, and how these methods affect the utility values.

4.3.11 In circumstances when evidence generation is difficult (for example, for rare diseases), when there is insufficient data to assess whether the EQ‑5D adequately reflects changes in quality of life, evidence other than psychometric measures may be presented and considered to establish whether the EQ‑5D is appropriate.

4.3.12 A hierarchy of preferred health-related quality-of-life methods is presented in figure 4.1. Use figure 4.1 for guidance when the EQ‑5D is not available or not appropriate.

Figure 4.1 Hierarchy of preferred health-related quality-of-life methods

4.3.13 For evaluations in which the population includes children and young people (that is, people aged under 18) consider alternative measures of health-related quality of life for children.

4.3.14 NICE does not recommend specific measures of health-related quality of life in children and young people. A generic measure that has been shown to have good psychometric performance in the relevant age ranges should be used. Not all paediatric health-related quality-of-life instruments have a UK value set, and there are methodological challenges when developing value sets for children and young people. Nonetheless, generic measures give valuable descriptive information about the effect of the condition and technology on children and young people's health-related quality of life. If data from a paediatric health-related quality-of-life instrument are used to generate utility values, explain how this was done. If there is evidence that generic measures are unsuitable for the condition or technology, refer to the hierarchy of preferred sources for health-related quality of life. A report by the Decision Support Unit summarises the psychometric performance of several preference-based measures.

4.3.15 Report if measures of health-related quality of life were completed by adults with the condition, children and young people themselves, or on their behalf (for example, by parents, carers or clinicians). Report the age of the children and young people. If multiple data sources are available, report what data was used in the economic model and the rationale behind this choice.

4.3.16 The EQ‑5D‑5L is a new version of the EQ‑5D, with 5 response levels. NICE does not recommend using the EQ‑5D‑5L value set for England published by Devlin et al. (2018). Companies, academic groups and others preparing evidence submissions for NICE should use the 3L value set for reference-case analyses. If data was gathered using the EQ‑5D‑5L descriptive system, utility values in reference-case analyses should be calculated by mapping the 5L descriptive system data onto the 3L value set. If analyses use data gathered using both EQ‑5D‑3L and EQ‑5D‑5L descriptive systems, the 3L value set should be used to derive all utility values, with 5L mapped onto 3L when needed. The mapping function developed by the Decision Support Unit (Hernández Alava et al. 2017), using the 'EEPRU dataset' (Hernández Alava et al. 2020), should be used for reference-case analyses. We support sponsors of prospective clinical studies continuing to use the 5L version of the EQ‑5D descriptive system to collect data on quality of life.

4.3.17 Evaluations should consider all health effects for patients, and, when relevant, carers. When presenting health effects for carers, evidence should be provided to show that the condition is associated with a substantial effect on carers' health-related quality of life and how the technology affects carers.

4.3.18 For diagnostics evaluations, linked-evidence modelling is usually needed to measure and value health effects, because 'end-to-end' controlled trials with follow up through the care pathway are uncommon (see section 4.6.14).

4.3.19 The analysis should include all relevant patient outcomes that change in the care pathway because of the diagnostic test or sequence of tests. The nature, severity, time and frequency of occurrence, and the duration of the outcome may all be important in determining the effect on quality of life and should be considered as part of the modelling process.

4.4 Evidence on resource use and costs

NHS and PSS costs

4.4.1 For the reference case, costs should relate to resources that are under the control of the NHS and PSS. Value these resources using the prices relevant to the NHS and PSS. Present evidence to show that resource use and cost data have been identified systematically.

4.4.2 Estimates of resource use should include the comparative costs or savings of the technologies and changes in infrastructure, use and maintenance. If appropriate, staff training costs should be included.

4.4.3 Estimates of resource use may also include the comparative value of healthcare service use outcomes (such as length of hospital stay, number of hospitalisations, outpatient or primary care consultations) associated with the technology or its comparators.

4.4.4 For all evaluations, reference-case analyses should be based on prices that reflect as closely as possible the prices that are paid in the NHS. Analyses should be based on price reductions when it is known that some form of price reduction is available across the NHS. Sources of prices may include: patient access schemes, commercial access agreements, NHS Supply Chain prices, the Drugs and Pharmaceutical electronic Market Information Tool (eMIT), the drugs tariff, or negotiated contracts such as those of the Commercial Medicines Unit (CMU). When judgement on the appropriate price is needed, the committee should consider the limitations around the price source in its deliberations. This should include considering transparency to the NHS and the period for which the prices are guaranteed. Any uncertainty should be acknowledged and explored. If the acquisition price paid for a resource varies substantially (for example, the diagnostic technology or consumables may be sold at reduced prices to NHS institutions) the reference-case analysis should be based on costs that reflect as closely as possible the prices that are paid in the NHS. Any uncertainty in price may be incorporated into the modelling and should follow the same approach as for other uncertain or variable parameters.

4.4.5 When contracts are awarded by the CMU, the prices for a medicine can differ between regions. This means that, although a discounted price is available across the NHS, there is no single price that is universally available across the NHS. When CMU prices are considered most appropriate for an evaluation, the committee should be aware that prices may not be consistently available across the NHS. The committee should consider analyses based on both the lowest and the highest available CMU prices in its decision making. For pragmatism, sensitivity and scenario analyses for other parameters should use the midpoint (the value halfway between the highest and lowest CMU prices).
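
The three price points described in 4.4.5 can be derived as in the minimal sketch below. The function name and the regional prices are hypothetical. Note that the midpoint is halfway between the lowest and highest prices, which is not necessarily the mean or median of all regional prices.

```python
def cmu_price_scenarios(regional_prices):
    """Return the lowest, highest and midpoint of a set of regional CMU prices."""
    if not regional_prices:
        raise ValueError("at least one regional price is needed")
    lowest = min(regional_prices)
    highest = max(regional_prices)
    midpoint = (lowest + highest) / 2
    return {"lowest": lowest, "highest": highest, "midpoint": midpoint}


# Hypothetical regional contract prices for one pack of a medicine (pounds).
prices = [10.0, 12.5, 14.0]
scenarios = cmu_price_scenarios(prices)
print(scenarios)
```

In this example the midpoint is £12.00, whereas the mean of the 3 regional prices would be about £12.17.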

4.4.6 When eMIT or confidential CMU prices are used by the committee, it will be aware that those prices are not guaranteed for the duration of the guidance.

4.4.7 For medicines that are mainly prescribed in primary care, base prices on the drugs tariff.

4.4.8 When there is no form of price reduction available across the NHS, or a price agreed by a national institution for the technology(s) (as may be the case for some devices and diagnostic technologies), analyses may use the list price or the price that is generally available to the NHS as submitted by the company (if it is reported transparently).

4.4.9 Healthcare resource groups (HRGs) are a valuable source of information for estimating resource use. HRGs are standard groupings of clinically similar treatments that use common levels of healthcare resources. The national average unit cost of an HRG is reported as part of the annual mandatory collection of reference costs from all NHS organisations in England. Using these costs can reduce the need for local micro-costing (costing of each individual component of care related to a technology). Carefully consider all relevant HRGs. For example, the cost of hospital admission for a serious condition may not account for time spent in critical care, which is captured and costed as a separate HRG. It may also be necessary to consider other costs that are unbundled and not included in the core HRG.

4.4.10 Data based on HRGs may not be appropriate in all circumstances, for example, when the new technology and the comparator both fall under the same HRG, or when the mean cost does not reflect resource use in relation to the new technology under evaluation. In such cases, other sources of evidence, such as micro-costing studies, may be more appropriate. In all cases, include all relevant costs such as the costs of the test, follow up, treatment, monitoring, staffing, facilities, training and any other modifications needed. When cost data is taken from literature, the methods used to identify sources of costs and resource use should be defined (preferably through systematic review). When multiple or alternative sources are available, the choice for the base case should be justified, the discrepancies between the sources should be explained and, when appropriate, sensitivity analyses should explore the implications for results of using alternative data sources.

4.4.11 Include costs related to the condition of interest and incurred in additional years of life gained because of the technology in the reference-case analysis. Exclude costs that are unrelated to the condition or technology of interest. For diagnostic technologies, if the prognostic information generated increases the cost or allows cost savings in unrelated conditions, include these changes in a non-reference-case analysis but explain and justify them.

4.4.12 In cases when current costs are not available, costs from previous years should be adjusted to present value using inflation indices appropriate to the cost perspective, such as the NHS cost inflation index and the PSS pay and prices index, available from the PSS Research Unit report on unit costs of health and social care or the Office for National Statistics consumer price index.
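
The adjustment in 4.4.12 is typically done by scaling the historical cost by the ratio of the index in the target year to the index in the year the cost was recorded. The sketch below assumes that approach; the function name and the index values are hypothetical and are not taken from any published index.

```python
def inflate_cost(cost, index_in_cost_year, index_in_target_year):
    """Uprate a historical cost using the ratio of two inflation index values."""
    if index_in_cost_year <= 0:
        raise ValueError("index values must be positive")
    return cost * index_in_target_year / index_in_cost_year


# Hypothetical example: a cost of £100 recorded when the index stood at 100.0,
# uprated to a year in which the same index stands at 110.0.
print(inflate_cost(100.0, 100.0, 110.0))
```

The same index series should be used for both years, and it should match the cost perspective (for example, an NHS index for NHS costs).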

4.4.13 Whenever possible, costs relevant to the UK healthcare system should be used. However, in cases when only costs from other countries are available these should be converted to Pounds Sterling using an exchange rate from an appropriate and current source (such as HM Revenue and Customs or Organisation for Economic Co-operation and Development).

4.4.14 The reference case should include the full additional costs associated with introducing a technology.

4.4.15 The committee should consider the specific circumstances and context of the evaluation. It should consider alongside the reference-case analysis a non-reference-case analysis in which a particular cost is apportioned or adjusted when:

there is an established plan to change practice or service delivery in the NHS
there is a formal arrangement with relevant stakeholders that the full costs should not be attributed to the new technology
the technology has multiple uses beyond the indication under evaluation
introducing the new technology will lead to identifiable benefits that are not captured in health technology evaluations.

4.4.16 In cases where a technology increases survival in people for whom the NHS is currently providing care that is expensive or would not be considered cost effective at NICE's normal levels, the committee may consider alongside the reference-case analysis a non-reference-case analysis with the background care costs removed. The committee will consider in its decision making both the reference-case and non-reference-case analyses, taking into account the nature of the specific circumstances of the evaluation including the population, care pathway and technology, as well as:

the extent to which the cost effectiveness of the technology is driven by factors outside its direct costs and benefits
if the NHS is already providing care that would not be considered cost effective at NICE's normal levels
if the high-cost care is separate from direct, intrinsic consequences of the technology (such as a side effect or administration cost)
the extent to which commercial solutions would address the issue.

4.4.17 When developing technology appraisal guidance, if a technology is administered in combination with another technology, the company may propose commercial solutions.

4.4.18 When a group of related technologies is being evaluated as part of a 'class', an analysis using the individual unit costs specific to each technology should normally be presented in the reference case. Exceptionally, if there is a very wide range of technologies and costs to be considered, then present analyses using the weighted mean cost and the highest and lowest cost estimates.
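
For the exceptional case in 4.4.18, the weighted mean, highest and lowest costs could be summarised as in the sketch below. The function name, the unit costs and the weights are hypothetical. The manual does not say what the weights should be; expected patient numbers are used here purely as an assumption.

```python
def class_cost_summary(unit_costs, weights):
    """Weighted mean, lowest and highest unit cost across a class of technologies."""
    if len(unit_costs) != len(weights) or not unit_costs:
        raise ValueError("unit_costs and weights must be non-empty and equal in length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_mean = sum(c * w for c, w in zip(unit_costs, weights)) / total_weight
    return {
        "weighted_mean": weighted_mean,
        "lowest": min(unit_costs),
        "highest": max(unit_costs),
    }


# Hypothetical unit costs (pounds) and expected patient numbers for 3 technologies.
unit_costs = [100.0, 200.0, 400.0]
patients = [50, 30, 20]
print(class_cost_summary(unit_costs, patients))
```

With these figures the weighted mean is £190, compared with an unweighted mean of about £233, which shows how much the choice of weights can matter.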

4.4.19 Exclude value added tax (VAT) from all economic evaluations but include it in the calculation of the budgetary impact when the resources in question are liable for this tax.

4.4.20 For technologies with multiple uses that are already being used in the NHS, for example diagnostic tests that could identify multiple markers, and when not all of their uses are being evaluated, the average cost should initially be identified based on the expected use or throughput of the device for only the uses being evaluated. In some cases, if a technology is already recommended for another purpose and enough spare capacity exists to allow the use for the condition in the current evaluation, an analysis using marginal costs may be supplied in addition to the analysis based on average costs.

4.4.21 Additional sensitivity analyses may be done using average costs computed through assigning some of the fixed costs to other uses of the technology, if there is evidence that the other uses also provide good value for money.

Non-NHS and non-PSS costs

4.4.22 Some technologies may have a substantial effect on the costs (or cost savings) to government bodies other than the NHS. Exceptionally, these costs may be included if specifically agreed with the Department of Health and Social Care. When non-reference-case analyses include these broader costs, explicit methods of valuation are needed. In all cases, these costs should be reported separately from NHS and PSS costs, and not included in the reference-case analysis.

4.4.23 Costs paid by patients may be included when they are reimbursed by the NHS or PSS. When the rate of reimbursement varies between patients or geographical regions, such costs should be averaged across all patients. When there are costs paid by patients that are not reimbursed by the NHS and PSS, these may be presented separately. Productivity costs should be excluded from the reference case. They can be presented separately, as additional information for the committee, if such costs may be a critical component of the value of the technology.

4.4.24 When care by family members, friends or a partner might otherwise have been provided by the NHS or PSS, it may be appropriate to consider the cost of the time of providing this care, even when adopting an NHS or PSS perspective. All analyses including the time spent by family members providing care should be shown separately. A range of valuation methods exists to cost this type of care. Methods chosen should be clearly described and sensitivity analyses using other methods should be presented. PSS savings should also be included.

4.5 Discounting

4.5.1 Cost-effectiveness results should reflect the present value of the stream of costs and benefits accruing over the time horizon of the analysis. For the reference case, costs and health effects should be discounted at the same rate of 3.5% per year.

4.5.2 Alternative analyses using rates of 1.5% for both costs and health effects may be presented alongside the reference-case analysis, in specific circumstances.
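As an illustration of 4.5.1 and 4.5.2, the following Python sketch discounts a yearly stream of costs and QALYs at 3.5% and at 1.5%. The function name, the example streams and the convention that values accruing in the first year are left undiscounted are assumptions made for illustration only; they are not specified by the reference case.

```python
def present_value(stream, rate=0.035):
    """Discount a yearly stream of values to present value.

    Assumption: stream[0] accrues in year 0 and is not discounted;
    stream[t] is divided by (1 + rate) ** t.
    """
    return sum(value / (1.0 + rate) ** year for year, value in enumerate(stream))


# Hypothetical 10-year streams for one strategy.
costs = [1000.0] * 10
qalys = [0.8] * 10

for rate in (0.035, 0.015):
    # The same rate is applied to both costs and health effects.
    print(rate, round(present_value(costs, rate), 2), round(present_value(qalys, rate), 3))
```

Because the lower rate shrinks later years less, the 1.5% analysis gives more weight to long-term costs and benefits than the 3.5% reference case.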

Non-reference-case discounting

4.5.3 The committee may consider analyses using a non-reference-case discount rate of 1.5% per year for both costs and health effects, if, in the committee's considerations, all of the following criteria are met:

The technology is for people who would otherwise die or have a very severely impaired life.
It is likely to restore them to full or near-full health.
The benefits are likely to be sustained over a very long period.

4.5.4 When considering analyses using a 1.5% discount rate, the committee must take account of plausible long-term health benefits in its discussions. The committee will need to be confident that there is a highly plausible case for the maintenance of benefits over time when using a 1.5% discount rate.

4.5.5 Further, the committee will need to be satisfied that any irrecoverable costs associated with the technology (including, for example, its acquisition costs and any associated service design or delivery costs) have been appropriately captured in the economic model or mitigated through commercial arrangements.

4.6 Modelling methods

4.6.1 The models used to generate estimates of clinical and cost effectiveness and cost comparison should follow accepted guidelines. Provide full documentation and justification of structural assumptions and data inputs. When there are alternative plausible assumptions and inputs, do sensitivity analyses of their effects on model outputs.

4.6.2 Modelling provides an important framework for synthesising available evidence and generating estimates of clinical and cost effectiveness, and cost comparison, in a format relevant to the committee's decision-making process. Models are needed for most evaluations.

4.6.3 Providing an all-encompassing definition of what constitutes a high-quality model is not possible. In general, estimates of technology performance should be based on the results of the systematic review and modelling when appropriate. Structural assumptions should be fully justified, and data inputs should be clearly documented and justified in the context of a valid review of the alternatives. The conceptual model development process used to inform the choice of model structure should be transparent and justified. This should include details of expert involvement in this process (for example, number of experts, details of their involvement, how they were chosen). It is not enough to state that the chosen model structure has previously been used in published model reports or accepted in submissions to NICE. The chosen type of model (for example, Markov cohort model, individual patient simulation) and model structure should be justified for each new decision problem.

4.6.4 Detail the methods of quality assurance used in the development of the model and provide the methods and results of model validation. Also, present the results from the analysis in a disaggregated format and include a table of key results. For cost-utility analyses, this should include estimates of life years gained, mortality rates (at separate time points if appropriate) and the frequency of selected outputs predicted by the model.

4.6.5 For cost-utility analyses, clinical end points that reflect how a patient feels, functions, or how long a patient lives are considered more informative than surrogate outcomes. When using 'final' clinical end points is not possible and data on other outcomes are used to infer the effect of the technology on mortality and health-related quality of life, evidence supporting the outcome relationship must be provided together with an explanation of how the relationship is quantified for use in modelling.

4.6.6 Three levels of evidence for surrogate relationships can be considered in decision making (Ciani et al. 2017):

Level 3: biological plausibility of relation between surrogate end point and final outcomes.
Level 2: consistent association between surrogate end point and final outcomes. This would usually be derived from epidemiological or observational studies.
Level 1: the technology's effect on the surrogate end point corresponds to commensurate effect on the final outcome as shown in randomised controlled trials (RCTs).

4.6.7 For a surrogate end point to be considered validated, there needs to be good evidence that the relative effect of a technology on the surrogate end point is predictive of its relative effect on the final outcome. This evidence preferably comes from a meta-analysis of level 1 evidence (that is, RCTs) that reported both the surrogate and the final outcomes, using the recommended meta-analytic methods outlined in technical support document 20 (bivariate meta-analytic methods). Show biological plausibility for all surrogate end points, but committees will reach decisions about the acceptability of the evidence according to the decision context. For example, for certain technologies indicated for rare conditions, and some diagnostic technologies and medical devices, the level of evidence might not be as high.

4.6.8 The validation of a surrogate outcome is specific to the population and technology type under consideration.

4.6.9 Thoroughly justify extrapolating a surrogate-to-final relationship to a different population, or to a technology of a different class or with a different mechanism of action.

4.6.10 Extrapolation should be done using the recommended meta-analytic methods that allow borrowing of information from similar enough classes of technologies, populations, and settings, as outlined in technical support document 20. Existing relevant meta-analytical models may be used. However, when historical models are based on data collected in a different setting, then development of a new model using appropriate meta-analytic techniques is recommended. This may include network meta-analysis or hierarchical methods reflecting differences in mechanism of action between classes of technologies or for first-in-class scenarios.

4.6.11 In cost-utility analyses, the usefulness of the surrogate end point for estimating QALYs will be greatest when there is strong evidence that it predicts health-related quality of life or survival. In all cases, the uncertainty associated with the relationship between the surrogate end points and the final outcomes should be quantified and presented. It should also be included through probabilistic sensitivity analysis and can be further explored in scenario analysis.

4.6.12 Diagnostics evaluations may include intermediate outcomes. Diagnostic test accuracy statistics are intermediate measures, and when incorporated into models, can be used as predictors of future health outcomes of patients. Other intermediate measures include radiation exposure from an imaging test or pathogenicity of specific genetic mutations identified by a genetic test. In all cases, the uncertainty associated with the intermediate measure should be quantified and presented.

4.6.13 The scientific literature for diagnostics largely consists of studies of analytical and clinical validity. Data on the impact of diagnostic technologies on final patient outcomes is limited. The benefits from diagnostic testing generally arise from the results of treatment or prevention efforts that take place based on the testing. There may be some direct benefits from the knowledge gained and some direct harm from the testing, but most of the outcomes are indirect and come downstream. To assess these outcomes, consider not only the diagnostic process itself, but also treatment and monitoring. A new diagnostic technology can affect the care pathway in 2 major ways. The first is how the test is used in the diagnostic process. The second is the impact of changed diagnostic information on subsequent disease management. A new technology can be a like-for-like replacement for an existing test or test sequence, or it can be an addition to an existing test or test sequence. New diagnostics can be integrated together with parts of the existing diagnostic process to create a new sequence. Once the diagnostic process options are defined, the health outcomes from identified technologies or changes in technology based on test results should be assessed. Often the technology may be some form of treatment. The diagnostic technology may result in treatment being started, modified or stopped. Ensure the populations assessed in the studies of diagnostic test accuracy are comparable with those in the evaluation of the technology.

4.6.14 If direct data on the impact of a diagnostic technology on final outcomes is not available, it may be necessary to combine evidence from different sources. A linked-evidence modelling approach should be used. Specify the links used, such as between diagnosis, treatment and final outcomes. Obtain and review the relevant data about those links.
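A linked-evidence approach of the kind described in 4.6.14 can be sketched as a simple decision tree that links test accuracy to downstream outcomes. The Python example below is a minimal sketch under stated assumptions: the function name, the 4 outcome branches and all input values are hypothetical, and a real evaluation would take each link from the reviewed evidence.

```python
def linked_evidence_qalys(prevalence, sensitivity, specificity,
                          qaly_tp, qaly_fn, qaly_tn, qaly_fp):
    """Expected QALYs per person tested, linking diagnosis to outcomes.

    Each branch probability comes from prevalence and test accuracy;
    each branch payoff is the expected QALYs after the management
    that follows that test result (hypothetical inputs).
    """
    p_tp = prevalence * sensitivity                  # disease, test positive
    p_fn = prevalence * (1.0 - sensitivity)          # disease, test negative
    p_tn = (1.0 - prevalence) * specificity          # no disease, test negative
    p_fp = (1.0 - prevalence) * (1.0 - specificity)  # no disease, test positive
    return p_tp * qaly_tp + p_fn * qaly_fn + p_tn * qaly_tn + p_fp * qaly_fp


# Hypothetical comparison of an existing test with a new test.
existing = linked_evidence_qalys(0.10, 0.80, 0.90, 9.0, 6.0, 10.0, 9.8)
new = linked_evidence_qalys(0.10, 0.90, 0.90, 9.0, 6.0, 10.0, 9.8)
print(round(new - existing, 4))
```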

4.6.15 Clinical trial data generated to estimate treatment effects may not quantify the risk of some health outcomes or events for the population of interest well enough or may not provide estimates over a sufficient duration for the economic analysis. The methods used to identify and critically evaluate sources of data for economic models should be stated and the choice of particular data sets should be justified with reference to their suitability to the population of interest in the evaluation.

4.6.16 Quantifying the baseline risk of health outcomes and how the condition would naturally progress with the comparator(s) can be a useful step when estimating absolute health outcomes in the economic analysis. This can be informed by observational studies. Relative treatment effects seen in randomised trials may then be applied to data on the baseline risk of health outcomes for the populations or subgroups of interest. State and justify the methods used to identify and critically evaluate sources of data for these estimates.
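The step described in 4.6.16 can be illustrated with a short Python sketch that applies a relative treatment effect to a baseline risk. It assumes the relative effect is a hazard ratio and that the hazard is constant within a model cycle, so the baseline probability is converted to a rate before the hazard ratio is applied. The function name and example values are hypothetical.

```python
import math


def treated_probability(baseline_prob, hazard_ratio, cycle_length=1.0):
    """Event probability per cycle for the technology arm.

    Assumption: constant hazard within the cycle, so
    rate = -ln(1 - p) / cycle_length, and the hazard ratio
    multiplies the rate rather than the probability.
    """
    baseline_rate = -math.log(1.0 - baseline_prob) / cycle_length
    treated_rate = baseline_rate * hazard_ratio
    return 1.0 - math.exp(-treated_rate * cycle_length)


# Hypothetical inputs: 20% yearly baseline risk from an observational
# source and a hazard ratio of 0.5 from a randomised trial.
print(round(treated_probability(0.20, 0.5), 4))
```

Multiplying the probability directly by the hazard ratio would give a different answer, and the difference grows as the baseline risk increases.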

4.6.17 When outcomes are known to be related, a joint synthesis of structurally related outcomes is recommended whenever possible, to increase precision and robustness of decision making.

4.6.18 Models used for cost-utility analyses should be informed by knowledge of the natural history of the disease and checked for clinical plausibility. The underlying assumptions should be checked statistically whenever possible.

4.6.19 Assumptions included in models should, when appropriate, be validated by a user of the technology who has experience of using it in the NHS or a user with appropriate expertise that can be applied to the technology. This is particularly relevant for the evaluation of medical devices.

4.6.20 Modelling is often needed to extrapolate costs and health benefits over an extended time horizon. Assumptions used to extrapolate the treatment effect over the relevant time horizon should have both external and internal validity and be reported transparently. The external validity of the extrapolation should be assessed by considering both clinical and biological plausibility of the inferred outcome as well as its coherence with external data sources, such as historical cohort data sets or other relevant studies. Internal validity should be explored and when statistical measures are used to assess the internal validity of alternative models of extrapolation based on their relative fit to the observed trial data, the limitations of these statistical measures should be documented. Alternative scenarios should also be routinely considered to compare the implications of different methods for extrapolation of the results. For example, for duration of treatment effects, scenarios in the extrapolated phase might include:

treatment effect stops or diminishes gradually over time
treatment effect is sustained for people who continue to have treatment
treatment effect (or some effect) is sustained beyond discontinuation for people who stop treatment, when it is clinically plausible for lasting benefit to remain.
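The scenarios above can be expressed as alternative profiles for the hazard ratio over the extrapolated period. The following Python sketch is illustrative only: the scenario labels, the time points at which the effect starts and finishes waning, and the choice to wane linearly on the log hazard ratio scale are all assumptions, not methods specified in this manual.

```python
import math


def hazard_ratio_at(t, hr, scenario, wane_start=5.0, wane_end=10.0):
    """Hazard ratio applied at time t (years) under a named scenario.

    'sustained': the trial hazard ratio applies throughout.
    'stops':     the effect is lost immediately after wane_start.
    'wanes':     the log hazard ratio shrinks linearly to 0
                 between wane_start and wane_end.
    """
    if scenario not in ("sustained", "stops", "wanes"):
        raise ValueError("unknown scenario: " + scenario)
    if scenario == "sustained" or t <= wane_start:
        return hr
    if scenario == "stops" or t >= wane_end:
        return 1.0
    fraction_left = (wane_end - t) / (wane_end - wane_start)
    return math.exp(fraction_left * math.log(hr))


for name in ("sustained", "stops", "wanes"):
    print(name, [round(hazard_ratio_at(t, 0.7, name), 3) for t in (3, 6, 8, 12)])
```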

4.6.21 Synthesis of survival outcomes needs individual patient level data. When this is not available, methods such as the Guyot et al. (2012) method can be used to reconstruct Kaplan–Meier data as referenced in technical support document 14.

4.6.22 Studies using survival outcomes, or time-to-event outcomes, often measure the relative effects of treatments using hazard ratios (HRs), which may either be constant over time (proportional hazards) or change over time. The proportional hazards assumption should always be assessed (see technical support document 14), preferably using:

log-cumulative hazard plots (as advised in technical support document 14)
visual inspection of the hazard plots or HRs over time, and
interpretation of tests for proportional hazards reported in the original trial publications.

4.6.23 If the proportional hazards assumption holds within the trial and is clinically plausible during extrapolation, then HRs may be pooled using standard code for treatment differences (see technical support document 2). Correlations need to be accounted for in trials with 3 or more arms.

4.6.24 If the proportional hazards assumption does not hold in some of the studies, then alternative methods should be considered, as described in technical support document 21.

4.6.25 When extrapolating time-to-event data, various standard (for example, parametric) and more flexible (for example, spline-based, cure) approaches are available. Their appropriateness and the validity of their extrapolations should routinely be considered. When comparing alternative models for extrapolating time-to-event data, the clinical plausibility of their underlying hazard functions should routinely be assessed. Uncertainty in the extrapolated portion of hazard functions should also be explored. Functions that display stable or decreasing variance over time are likely to underestimate the uncertainty in the extrapolation.

4.6.26 In RCTs, patients in the control group are sometimes allowed to switch treatment group and have the technology being investigated. In these circumstances, when intention-to-treat analysis is considered inappropriate, statistical methods that adjust for treatment switching can also be presented. Avoid simple adjustment methods such as censoring or excluding data from patients who cross over, because they are very susceptible to selection bias. Explore and justify the relative merits and limitations of the methods chosen to adjust for treatment switching, in relation to the specific characteristics of the data set in question. These characteristics include the mechanism of crossover used in the trial, the availability of data on baseline and time-dependent characteristics, expectations around the treatment effect if the patients had stayed on the treatment they were allocated and any residual effect from the previous treatment. When appropriate, the uncertainty associated with using a method to adjust for trial crossover should be explored and quantified.

4.6.27 In general, all model parameter values used in base-case, sensitivity, scenario and subgroup analyses should be both clinically plausible and should use methods that are consistent with the data. Results from analyses that do not meet these criteria will not usually be suitable for decision making.

4.6.28 Sometimes it may be difficult to define what is plausible and what is not, for example, in very rare conditions or for innovative medical technologies, when the evidence base may be less robust. In such situations, consider expert elicitation to identify a plausible distribution of values.

4.6.29 If threshold analysis is used, the parameter value at which a cost-effectiveness estimate reaches a given threshold may be implausible. In this case, it is still appropriate to present the results of the threshold analysis, alongside information on the plausible range for the parameter.

4.7 Exploring uncertainty

4.7.1 Present an overall assessment of uncertainty to committees to inform decision making. This should describe the relative effect of different types of uncertainty (for example, parameter, structural) on cost-effectiveness estimates, and an assessment of whether the uncertainties that can be included in the analyses have been adequately captured. It should also highlight the presence of uncertainties that are unlikely to be reduced by further evidence or expert input.

4.7.2 The model should quantify the decision uncertainty associated with a technology. That is, the probability that a different decision would be reached if the true cost effectiveness of each technology could be ascertained before making the decision.

4.7.3 Models are subject to uncertainty around the structural assumptions used in the analysis. Examples of structural uncertainty may include how different states of health are categorised and how different pathways of care are represented.

4.7.4 Clearly document these structural assumptions and provide the evidence and rationale to support them. Explore the effect of structural uncertainty on cost-effectiveness estimates using separate analyses of a representative range of plausible scenarios that are consistent with the evidence. Analyses based on demonstrably implausible scenarios are only useful if they are used to show that cost-effectiveness estimates are robust to a source of uncertainty. For example, if the resource use associated with a procedure is uncertain, a useful exploratory analysis might show that the implausible assumptions of no resource use and very large amounts of resources do not materially affect the cost-effectiveness conclusion. The purpose of such analyses should be clearly presented. This will allow a committee to focus on other key uncertainties in its decision making.

4.7.5 It may be possible to incorporate structural uncertainty within a probabilistic model (for example, by model averaging or assigning a probability distribution to alternative structural assumptions). If structural uncertainty is parameterised, consider the alternative assumptions and any probabilities used to 'weight' them. This should be transparently documented, including details of any expert advice.

4.7.6 Examples of when this type of scenario analysis should be done are:

if there is uncertainty about the most appropriate assumption to use for extrapolation of costs and outcomes beyond trial follow up
if there is uncertainty about how the care pathway is most appropriately represented in the analysis
if there may be economies of scale (for example, in evaluations of diagnostic technologies).

4.7.7 Uncertainty about the appropriateness of the methods used in the reference case can also be explored using sensitivity analysis, but these analyses should be presented separately.

4.7.8 A second type of uncertainty arises from the choice of data sources to provide values for the key parameters, such as different costs and utilities, relative-effectiveness estimates and their duration. Reflect the implications of different key parameter estimates in sensitivity analyses (for example, through the inclusion of alternative data sets). Fully justify inputs and uncertainty explored by sensitivity analysis using alternative input values.

4.7.9 The choice of data sources to include in an analysis may not be clear. In such cases, the analysis should be done again, using alternative data sources or excluding the study about which there is doubt. Report the results separately. Examples of when this type of sensitivity analysis should be done are:

if alternative sets of plausible data on the health-related utility associated with the condition or technology are available
if there is variability between hospitals in the cost of a particular resource or service, or the acquisition price of a particular technology
if there are doubts about the quality or relevance of a particular study in a meta-analysis or network meta-analysis.

4.7.10 A third source of uncertainty comes from parameter precision, once the most appropriate sources of information have been identified (that is, the uncertainty around the mean health and cost inputs in the model). Assign distributions to characterise the uncertainty associated with the (precision of) mean parameter values. The distributions chosen for probabilistic sensitivity analysis should not be chosen arbitrarily but chosen to represent the available evidence on the parameter of interest, and their use should be justified. Formal elicitation methods are available if there is a lack of data to inform the mean value and associated distribution of a parameter. If there are alternative plausible distributions that could be used to represent uncertainty in parameter values, explore using separate probabilistic analyses of these scenarios.
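One way of deriving distributions from the available evidence, as described in 4.7.10, is to match a distribution's parameters to a reported mean and standard error. The Python sketch below uses a beta distribution for a utility and a gamma distribution for a cost. These are common choices in practice but are an assumption here; the manual does not prescribe them. The function names and input values are hypothetical.

```python
import random


def beta_parameters(mean, se):
    """Method-of-moments beta parameters for a value bounded by 0 and 1."""
    common = mean * (1.0 - mean) / se ** 2 - 1.0
    if common <= 0:
        raise ValueError("standard error too large for a beta distribution")
    return mean * common, (1.0 - mean) * common


def gamma_parameters(mean, se):
    """Method-of-moments gamma (shape, scale) for a non-negative value."""
    return (mean / se) ** 2, se ** 2 / mean


rng = random.Random(1)  # fixed seed so the sketch is reproducible
alpha, beta = beta_parameters(0.70, 0.05)     # hypothetical utility
shape, scale = gamma_parameters(1200.0, 300.0)  # hypothetical cost

utility_draws = [rng.betavariate(alpha, beta) for _ in range(1000)]
cost_draws = [rng.gammavariate(shape, scale) for _ in range(1000)]
print(round(sum(utility_draws) / 1000, 3), round(sum(cost_draws) / 1000, 1))
```

The sample means of the draws should sit close to the input means, which is a quick check that the parameters have been derived as intended.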

4.7.11 When doing a probabilistic analysis, enough model simulations should be used to minimise the effect of Monte Carlo error. Reviewing the variance around probabilistic model outputs (net benefits or ICERs) as the number of simulations increases can provide a way of assessing whether the model has been run enough times or more runs are needed.
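The check described in 4.7.11 can be sketched by tracking the mean net benefit and its Monte Carlo standard error as the number of simulations grows. In the Python example below, the simulated net benefits are drawn from a normal distribution purely as a stand-in for model output; the function names, the checkpoints and the distribution are assumptions made for illustration.

```python
import math
import random
import statistics


def monte_carlo_se(net_benefits):
    """Standard error of the mean across probabilistic simulations."""
    return statistics.stdev(net_benefits) / math.sqrt(len(net_benefits))


def convergence_table(net_benefits, checkpoints):
    """Mean and Monte Carlo standard error using the first n simulations."""
    rows = []
    for n in checkpoints:
        subset = net_benefits[:n]
        rows.append((n, statistics.fmean(subset), monte_carlo_se(subset)))
    return rows


rng = random.Random(2024)
# Stand-in for incremental net benefit from each model run (hypothetical).
draws = [rng.gauss(1500.0, 4000.0) for _ in range(10000)]
table = convergence_table(draws, [100, 1000, 10000])
for n, mean_nb, se in table:
    print(n, round(mean_nb, 1), round(se, 1))
```

If the standard error is still large relative to the difference that would change the decision, more runs are likely to be needed.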

4.7.12 The committee's preferred cost-effectiveness estimate should be derived from a probabilistic analysis when possible unless the model is linear. If deterministic model results are used, this should be clearly justified, and the committee should take a view on whether the deterministic or probabilistic estimates are most appropriate. However, in general, uncertainty around individual parameters is not a reason to exclude them from probabilistic analyses; rather, that uncertainty should be captured in the analysis.

4.7.13 In general, scenario analyses should also be probabilistic. When only deterministic base-case or scenario analyses are provided, this should be justified. For example, it may be impractical to get probabilistic results for many plausible scenarios. This may be less influential for decision making if the base-case analysis is shown to be linear, or only moderately non-linear (when 'non-linear' means that there is not a straightforward linear relationship between changes in a model's inputs and outputs).

4.7.14 For evaluations based on cost-utility analyses, the committee's discussions should consider the spread of results.

4.7.15 Appropriate ways of presenting parameter uncertainty in cost-effectiveness data include confidence ellipses and scatter plots on the cost-effectiveness plane (when the comparison is restricted to 2 alternatives) and cost-effectiveness acceptability curves (a graph that plots a range of possible maximum acceptable ICERs on the horizontal axis against the probability (chance) that the intervention will be cost effective at that ICER on the vertical axis). The presentation of cost-effectiveness acceptability curves should include a representation and explanation of the cost-effectiveness acceptability frontier (a region on a plot that shows the probability that the technology with the highest expected net benefit is cost effective). Present results exploring uncertainty in a table, identifying parameters that have a substantial effect on the modelling results. As well as details of the expected mean results (costs, outcomes and ICERs), also present the probability that the treatment is cost effective at maximum acceptable ICERs of £20,000 to £30,000 per QALY gained and the error probability (that the treatment is not cost effective), particularly if there are more than 2 alternatives.
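The quantities described in 4.7.15 can be computed directly from probabilistic model output. The Python sketch below calculates, at a given maximum acceptable ICER, the probability that each strategy is cost effective (a point on the acceptability curve), the strategy with the highest expected net benefit (the frontier) and the associated error probability. The data structure, function names and the 3 simulated draws are hypothetical and far fewer than a real analysis would use.

```python
def net_benefit(cost, qalys, threshold):
    """Net monetary benefit at a maximum acceptable ICER (threshold)."""
    return threshold * qalys - cost


def acceptability(samples, threshold):
    """samples: one dict per simulation, strategy -> (cost, QALYs)."""
    strategies = list(samples[0])
    wins = {s: 0 for s in strategies}
    totals = {s: 0.0 for s in strategies}
    for draw in samples:
        nb = {s: net_benefit(draw[s][0], draw[s][1], threshold) for s in strategies}
        wins[max(nb, key=nb.get)] += 1  # best strategy in this simulation
        for s in strategies:
            totals[s] += nb[s]
    prob = {s: wins[s] / len(samples) for s in strategies}
    optimal = max(totals, key=totals.get)  # highest expected net benefit
    return prob, optimal, 1.0 - prob[optimal]


# Hypothetical probabilistic output for 2 strategies.
samples = [
    {"A": (1000.0, 1.00), "B": (5000.0, 1.10)},
    {"A": (1000.0, 1.00), "B": (5000.0, 1.30)},
    {"A": (1000.0, 1.00), "B": (5000.0, 1.25)},
]
for threshold in (20000.0, 30000.0):
    print(threshold, acceptability(samples, threshold))
```

Repeating the calculation over a range of thresholds gives the full curve for each strategy. The frontier follows whichever strategy has the highest expected net benefit at each threshold, which is not always the strategy with the highest probability of being cost effective.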

4.7.16 For evaluations based on cost-comparison analyses, the level of complexity of the sensitivity analysis should be appropriate for the model being considered in terms of the pathway complexity and available data. It is likely scenario-based sensitivity analysis will be important to help identify parameters that have a substantial effect on the modelling results. Threshold analysis is also useful to identify relevant parameter boundaries.

4.7.17 Deterministic sensitivity analyses exploring individual or multiple correlated parameters may be useful for identifying parameters to which the decision is most sensitive. 'Tornado' histograms may be a useful way to present these results. Deterministic threshold analysis might inform decision making when there are influential but highly uncertain parameters. However, if the model is non-linear, deterministic analysis will be less appropriate for decision making.

4.7.18 Accuracy parameters for diagnostic technologies (usually sensitivity and specificity) present a special case. Because sensitivity and specificity are usually correlated and may vary based on how a test is used or interpreted, point estimates with distributions are not usually appropriate.

4.7.19 Consider evidence about the extent of correlation between individual parameters and reflect this in the probabilistic analysis. When considering relationships between ordered parameters, consider approaches that neither artificially restrict distributions nor impose an unsupported assumption of perfect correlation. Clearly present assumptions made about the correlations.

4.7.20 The computational methods used to implement an appropriate model structure may occasionally present challenges in doing probabilistic sensitivity analysis. Clearly specify and justify using model structures that limit the feasibility of probabilistic sensitivity analysis. Models should always be fit for purpose and should allow thorough consideration of the decision uncertainty associated with the model structure and input parameters. The choice of a 'preferred' model structure or programming platform should not result in the failure to adequately characterise uncertainty.

4.7.21 Using univariate and best- or worst-case sensitivity analysis is an important way of identifying parameters that may have a substantial effect on the cost-effectiveness results and of explaining the key drivers of the model. However, such analyses become increasingly unhelpful in representing the combined effects of multiple sources of uncertainty as the number of parameters increases. Using probabilistic sensitivity analysis can allow a more comprehensive characterisation of the parameter uncertainty associated with all input parameters. Probabilistic univariate sensitivity analysis may be explored to incorporate the likelihood of a parameter taking upper and lower bound values, rather than just presenting the effect of it taking those values.

4.7.22 Threshold analysis can be used as an option to explore highly uncertain parameters when identifying a parameter 'switching value' may be informative to decision makers. A switching value is the value an input variable would need to take for a decision on whether the technology represents a good use of NHS resources for a given threshold (for example, £20,000 and £30,000 per QALY gained) to change. The threshold analysis should indicate how far the switching value is from the current best estimate of a parameter value.
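A switching value of the kind described in 4.7.22 can be found numerically by searching for the parameter value at which incremental net benefit is zero. The Python sketch below uses bisection on a deliberately simple, hypothetical model in which the uncertain parameter is the technology's price; the model, function names, search range and input values are assumptions made for illustration.

```python
def incremental_net_benefit(price, threshold, inc_qalys=0.5, other_inc_cost=2000.0):
    """Hypothetical model: incremental net benefit as a function of price."""
    return threshold * inc_qalys - (price + other_inc_cost)


def switching_value(f, low, high, tol=1e-6):
    """Bisection search for the input value at which f changes sign."""
    f_low, f_high = f(low), f(high)
    if f_low * f_high > 0:
        raise ValueError("no switching value in the search range")
    while high - low > tol:
        mid = (low + high) / 2.0
        f_mid = f(mid)
        if f_low * f_mid <= 0:
            high = mid
        else:
            low, f_low = mid, f_mid
    return (low + high) / 2.0


best_estimate = 6000.0  # hypothetical current best estimate of the price
for threshold in (20000.0, 30000.0):
    value = switching_value(lambda p: incremental_net_benefit(p, threshold), 0.0, 50000.0)
    # Report the switching value and its distance from the best estimate.
    print(threshold, round(value, 2), round(value - best_estimate, 2))
```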

4.7.23 Threshold analysis is not suitable for exploring uncertainty around parameters that are highly correlated with other influential parameters. Threshold analysis should also not be used to justify restricting the population of interest to a subgroup based on cost effectiveness.

4.7.24 The report should include descriptions and analysis about additional factors that are not part of the reference case and that may be relevant for decision making. These may include discussions of issues such as costings of long-term health states or health states associated with low health-related quality of life, incremental improvements, system and process improvements and patient convenience and cost improvements.

4.8 Companion diagnostics

4.8.1 Using a treatment may be conditional on the biological characteristics of a disease or the presence or absence of a predictive biomarker (for example a gene or a protein) that helps to assess the most likely response to a particular treatment for the individual patient. If a diagnostic test to identify patients or establish the presence or absence of a particular biomarker is not routinely used in the NHS but is introduced to support the treatment decision for the specific technology, include the associated costs of the diagnostic in the assessments of clinical and cost effectiveness. Provide a sensitivity analysis without the cost of the diagnostic test. When appropriate, examine the diagnostic accuracy of the test for the particular biomarker of treatment efficacy and, when appropriate, incorporate it in the economic evaluation.

4.8.2 The evaluation will consider any requirements of the regulatory approval, including tests to be completed and the definition of a positive test. In clinical practice in the NHS, it may be possible that an alternative diagnostic test procedure to that used in the clinical trials of the technology is used. When appropriate, the possibility that using an alternative test (which may differ in diagnostic accuracy from that used in the clinical trials) may affect selection of the patient population for treatment and the cost effectiveness of the treatment will be highlighted in the guidance.

4.8.3 It is expected that evaluations of multiple companion diagnostic test options will generally be done in the NICE diagnostics assessment programme.

4.9 Analysis of data for patient subgroups

4.9.1 For many technologies, the level of benefit will differ for patients with differing characteristics. In cost-utility analyses, explore this as part of the analysis by providing clinical- and cost-effectiveness estimates separately for each relevant subgroup of patients.

4.9.2 For evaluations using cost-comparison analyses, if a technology is found to affect more than 1 disease area or patient group, clearly present the assumptions and calculations used to calculate acquisition and infrastructure costs for different indications and uses of the technology.

4.9.3 The characteristics of patients in the subgroup should be clearly defined and should preferably be identified based on an expectation of differential clinical or cost effectiveness because of known, biologically plausible mechanisms, social characteristics or other clearly justified factors. When possible, potentially relevant subgroups will be identified at the scoping stage, considering the rationale for expecting a subgroup effect. However, this does not prevent the identification of subgroups later in the process; in particular, during the committee discussions.

4.9.4 Given NICE's focus on maximising health gain from limited resources, it is important to consider how clinical and cost effectiveness may differ because of differing characteristics of patient populations. Typically, the level of benefit will differ between patients, and this may also affect the subsequent cost of care. There should be a clear justification and, if appropriate, biological plausibility for the definition of the patient subgroup and the expectation of a differential effect. Avoid post hoc data 'dredging' in search of subgroup effects, because this will be viewed sceptically.

4.9.5 The estimate of the overall net treatment effect of a technology is determined by the baseline risk of a particular condition or event or the relative effects of the technology compared with the relevant comparators. The overall net treatment effect may also be determined by other features of the people comprising the population of interest. It is therefore likely that relevant subgroups may be identified in terms of differences in 1 or more contributors to absolute treatment effects.

4.9.6 For subgroups based on differences in baseline risk of specific health outcomes, systematic identification of data to quantify this is needed. It is important that the methods for identifying appropriate baseline data for the purpose of subgroup analysis are provided in enough detail to allow replication and critical appraisal.

4.9.7 Specify how subgroup analyses are done, including the choice of scale on which any effect modification is defined. Reflect the statistical precision of all subgroup estimates in the analysis of parameter uncertainty. Clearly specify the characteristics of the patients associated with the subgroups presented to allow the committee to determine the appropriateness of the analysis for the decision problem.

4.9.8 The standard subgroup analyses done in RCTs or systematic reviews seek to determine if there are differences in relative treatment effects between subgroups (through the analysis of interactions between the effectiveness of the technology and patient characteristics). Consider the high possibility of differences emerging by chance, particularly when multiple subgroups are reported. Pre-specification of a particular subgroup in the study or review protocol, with a clear rationale for anticipating a difference in efficacy and a prediction of the direction of the effect, will increase the credibility of a subgroup analysis.
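
To illustrate how quickly chance findings accumulate when multiple subgroups are reported, the sketch below calculates the probability of at least 1 nominally significant result when no true subgroup effect exists. It assumes independent tests and a significance level of 0.05; both are simplifying assumptions made for this example, and the function name is hypothetical.

```python
def prob_any_false_positive(n_subgroups: int, alpha: float = 0.05) -> float:
    """Probability of at least one chance 'significant' result across
    n_subgroups independent tests when there is no true effect."""
    if n_subgroups < 0:
        raise ValueError("n_subgroups must be non-negative")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be a probability between 0 and 1")
    return 1.0 - (1.0 - alpha) ** n_subgroups


for k in (1, 5, 10, 20):
    print(f"{k} subgroup tests: {prob_any_false_positive(k):.3f}")
```

Under these assumptions, the probability rises from 0.05 for a single test to about 0.40 for 10 tests, which is why pre-specification and a clear rationale carry weight.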

4.9.9 In considering subgroup analyses, the committee will take specific note of the biological or clinical plausibility of a subgroup effect as well as the strength of the evidence in favour of such an effect (for example, if it has a clear, pre-specified rationale and is consistent across studies). Fully document the evidence supporting biological or clinical plausibility for a subgroup effect, including details of statistical analysis. Consider using an established checklist (for example, the 10 credibility criteria by Sun et al. 2012) when differences in relative effects of the technology are identified.

4.9.10 Individual patient data is preferred, if available, for estimating subgroup-specific parameters. However, as for all evidence, the appropriateness of such data will always be assessed by considering factors such as the quality of the analysis, how representative the available evidence is of clinical practice and how relevant it is to the decision problem.

4.9.11 Consideration of subgroups based on differential cost may be appropriate in some circumstances, for example, if the cost of managing a particular complication of treatment is known to be different in a specific subgroup.

4.9.12 Subgroups are not considered relevant if they are based solely on:

differential costs for individuals according to their social characteristics
the costs of providing a technology in different geographical locations in the UK (for example, when the costs of facilities available for providing the technology vary according to location)
individual utilities for health states and patient preference.

4.9.13 'Treatment continuation rules', whereby cost effectiveness is maximised by continuing treatment only for people whose condition achieves a specified 'response' within a given time, should not be analysed as a separate subgroup. Rather, analyse the strategy involving the 'continuation rule' as a separate scenario, by considering it as an additional treatment strategy alongside the base-case interventions and comparators. This allows the costs and health consequences of factors such as any additional monitoring associated with the 'continuation rule' to be incorporated into the economic analysis. Additional considerations for continuation rules include:

the robustness and plausibility of the end point on which the rule is based
if the 'response' criteria defined in the rule can be reasonably achieved
the appropriateness and robustness of the time at which response is measured
if the rule can be incorporated into routine clinical practice
if the rule is likely to predict people for whom the technology is particularly cost effective
considerations of fairness about withdrawal of treatment for people whose condition does not respond.

4.10 Presentation of data and results

Presenting data

4.10.1 Presentation of results should be comprehensive and clear. All parameters used to estimate clinical and cost effectiveness should be presented in tables and include details of data sources. Evidence should be presented following the guidance in technical support document 1 for summaries of key characteristics and results of included studies. Data from the individual trials should be in tables and a narrative summary of the clinical evidence provided.

Model inputs

4.10.2 For the model, input data should be tabulated with the central value, measures of precision and sources. Details on how bias was assessed and addressed should be presented for each source used.

4.10.3 For cost-utility analyses, when presenting health-related quality of life, a table of each value, its source and the methodology (for example, EQ‑5D‑5L, EQ‑5D‑3L, standard gamble) used to derive it should be provided.

4.10.4 Present a table including:

disaggregated costs by health state and resource category
benefits, QALYs and life years by health state
decrements associated with further interventions and adverse events.

These results should be presented with and without discounting.
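
A minimal sketch of reporting totals with and without discounting is shown below. The yearly costs and QALYs are hypothetical, the 3.5% annual rate is an illustrative parameter (use the rate specified for the evaluation), and the timing convention (the first year is not discounted) is an assumption of this example.

```python
def discounted_total(yearly_values, annual_rate):
    """Sum yearly values, discounting the value in year t by (1 + rate) ** t.

    Year 0 is the first model year and is left undiscounted (an assumption).
    """
    return sum(v / (1.0 + annual_rate) ** t for t, v in enumerate(yearly_values))


# Hypothetical yearly results for one strategy.
yearly_costs = [10000.0, 2000.0, 2000.0]
yearly_qalys = [0.70, 0.65, 0.60]
rate = 0.035  # illustrative annual discount rate

results = {
    "costs": {
        "undiscounted": sum(yearly_costs),
        "discounted": discounted_total(yearly_costs, rate),
    },
    "QALYs": {
        "undiscounted": sum(yearly_qalys),
        "discounted": discounted_total(yearly_qalys, rate),
    },
}

for outcome, values in results.items():
    print(
        f"{outcome}: undiscounted = {values['undiscounted']:.2f}, "
        f"discounted = {values['discounted']:.2f}"
    )
```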

Survival estimates

4.10.5 For cost-utility analyses, Kaplan–Meier and parametric curves, and hazard plots based on observed data and model predictions should be presented using both graphs and tables. Survival analyses should be presented showing the number at risk for each Kaplan–Meier curve at each time point.
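
The sketch below shows one way to tabulate a Kaplan–Meier estimate together with the number at risk at each event time. It uses only the Python standard library. The follow-up times and event indicators are hypothetical, and the function name is an assumption of this example rather than part of any named package.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return rows of (time, number_at_risk, n_events, survival) for each
    distinct event time.

    times:  follow-up time for each patient
    events: 1 if the event was observed at that time, 0 if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    event_counts = Counter(t for t, e in zip(times, events) if e == 1)
    rows = []
    survival = 1.0
    for t in sorted(event_counts):
        # Patients still under observation just before time t,
        # including any censored at exactly t.
        at_risk = sum(1 for u in times if u >= t)
        n_events = event_counts[t]
        survival *= 1.0 - n_events / at_risk
        rows.append((t, at_risk, n_events, survival))
    return rows


# Hypothetical data for 8 patients (times in months).
times = [2, 3, 3, 5, 7, 8, 8, 12]
events = [1, 1, 0, 1, 0, 1, 1, 0]

km_table = kaplan_meier(times, events)
print("time  at_risk  events  survival")
for t, at_risk, n_events, survival in km_table:
    print(f"{t:>4}  {at_risk:>7}  {n_events:>6}  {survival:.3f}")
```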

Presenting expected cost-effectiveness results

4.10.6 Present the expected value of each component of cost and expected total costs. Detail expected QALYs for each option compared in the analysis in terms of their main contributing components. Calculate ICERs as appropriate.

4.10.7 Present separately the life-year component of QALYs as well as the costs and QALYs associated with different stages of the condition.

4.10.8 Economic evaluation results should be presented in a fully incremental analysis with technologies that are dominated (that is, more costly and less effective than another technology in the analysis) and technologies that are extendedly dominated (that is, a combination of 2 or more other technologies would be more cost effective) removed from the analysis. Pairwise comparisons may be presented when relevant and justified (for example, when the technology is expected to specifically displace individual comparators). Expected net health benefits should also be presented when appropriate, using values placed on a QALY gain of £20,000 and £30,000; net health benefits may be particularly informative when:

there are several interventions or comparators
the differences in costs or QALYs between comparators are small
there are subgroup considerations
technologies provide less health benefit at lower costs (that is, in the south-west quadrant of the cost-effectiveness plane).
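
The fully incremental analysis and net health benefit calculations described in 4.10.8 can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the 5 strategies and their costs and QALYs are hypothetical, the names Strategy, fully_incremental and net_health_benefit are invented for this example, and ties in cost or QALYs are treated as dominance for simplicity. Net health benefit is calculated as QALYs minus cost divided by the value placed on a QALY gain.

```python
from typing import NamedTuple


class Strategy(NamedTuple):
    name: str
    cost: float   # expected total cost (£)
    qalys: float  # expected total QALYs


def fully_incremental(strategies):
    """Return [(strategy, icer_vs_previous)] for strategies left after removing
    dominated and extendedly dominated options. The ICER is None for the
    least costly retained strategy."""
    if not strategies:
        return []
    ordered = sorted(strategies, key=lambda s: (s.cost, -s.qalys))

    # Step 1: remove dominated strategies (at least as costly and no more
    # effective than a cheaper retained strategy).
    frontier = []
    for s in ordered:
        if not frontier or s.qalys > frontier[-1].qalys:
            frontier.append(s)

    # Step 2: remove extendedly dominated strategies (ICER higher than that
    # of the next, more effective strategy), repeating until none remain.
    changed = True
    while changed:
        changed = False
        for i in range(1, len(frontier) - 1):
            prev, here, nxt = frontier[i - 1], frontier[i], frontier[i + 1]
            icer_here = (here.cost - prev.cost) / (here.qalys - prev.qalys)
            icer_next = (nxt.cost - here.cost) / (nxt.qalys - here.qalys)
            if icer_here > icer_next:
                del frontier[i]
                changed = True
                break

    # Step 3: ICER of each retained strategy against the previous one.
    results = [(frontier[0], None)]
    for prev, cur in zip(frontier, frontier[1:]):
        results.append((cur, (cur.cost - prev.cost) / (cur.qalys - prev.qalys)))
    return results


def net_health_benefit(strategy, value_per_qaly):
    """Net health benefit in QALYs at a given value placed on a QALY gain."""
    return strategy.qalys - strategy.cost / value_per_qaly


# Hypothetical strategies.
strategies = [
    Strategy("A", 10000.0, 5.0),
    Strategy("B", 16000.0, 5.2),  # extendedly dominated by A and C
    Strategy("C", 18000.0, 5.6),
    Strategy("D", 20000.0, 5.5),  # dominated by C
    Strategy("E", 30000.0, 6.0),
]

results = fully_incremental(strategies)
for s, icer in results:
    icer_text = "reference" if icer is None else f"£{icer:,.0f} per QALY gained"
    nhb_20k = net_health_benefit(s, 20000.0)
    nhb_30k = net_health_benefit(s, 30000.0)
    print(f"{s.name}: ICER {icer_text}; NHB {nhb_20k:.2f} at £20,000, {nhb_30k:.2f} at £30,000")
```

In this example, B is removed because its ICER against A is higher than the ICER of C against B, and D is removed because it is more costly and less effective than C. The ICERs for C and E are then recalculated against the previous retained strategy.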

Evidence over time

4.10.9 A graphical presentation of the evidence generation process for a technology over time, including planned future evidence generation, can be included in the submission or report. This should show the expected time points of interim and final data readouts from ongoing clinical studies and planned additional studies. It should also indicate the key sources of uncertainty that might be reduced at each evidence-generating time point. For example, a forthcoming readout for a clinical trial may inform all aspects of relative effectiveness, while a future single-arm extension study may inform long-term survival outcomes for the technology under evaluation.

4.11 Impact on the NHS

Implementation of NICE guidance

4.11.1 Information on the impact of the implementation of the technology on the NHS (and PSS, when appropriate) is needed. This should be appropriate to the context of the evaluation.

4.11.2 When possible, the information on NHS impact should include details on key epidemiological and clinical assumptions, resource units and costs with reference to a general England population, and patient or service base (for example, per 100,000 population or per region).

Implementation or uptake and population health impact

4.11.3 Use evidence-based estimates of the current baseline treatment rates and the expected appropriate implementation, uptake or treatment rates of the evaluated and comparator technologies in the NHS. Also, when appropriate, attempts should be made to estimate the resulting health impact (for example, QALYs or life years gained) in a given population. These estimates should take account of the condition's epidemiology and the appropriate levels of access to diagnosis and treatment in the NHS. Also highlight any key assumptions or uncertainties.
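
As a rough sketch of how uptake assumptions translate into a population health impact, the example below scales a per-patient QALY gain to a population of 100,000. Every input (prevalence, eligible fraction, treatment rates and QALY gain per patient) is a hypothetical placeholder, and the function name is invented for this illustration. An actual estimate would use evidence-based values and state its assumptions.

```python
def population_impact(
    population,
    prevalence,
    eligible_fraction,
    baseline_rate,
    expected_rate,
    qaly_gain_per_patient,
):
    """Estimate additional patients treated and QALYs gained in a population."""
    eligible = population * prevalence * eligible_fraction
    additional_treated = eligible * (expected_rate - baseline_rate)
    return {
        "eligible": eligible,
        "additional_treated": additional_treated,
        "qalys_gained": additional_treated * qaly_gain_per_patient,
    }


# Hypothetical inputs, expressed per 100,000 population.
impact = population_impact(
    population=100_000,
    prevalence=0.02,             # proportion with the condition
    eligible_fraction=0.5,       # proportion of those eligible for the technology
    baseline_rate=0.10,          # current treatment rate
    expected_rate=0.40,          # expected treatment rate after implementation
    qaly_gain_per_patient=0.25,  # incremental QALYs per additional patient treated
)

for key, value in impact.items():
    print(f"{key}: {value:.1f}")
```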

Resource impact

4.11.4 Implementation of a new technology will have direct implications for the provision of units of the evaluated and comparator technologies (for example, doses of drugs or theatre hours) by the NHS. Also, the technology may have a knock-on effect (increase or decrease) on other NHS and PSS resources, including alternative or avoided treatment and resources needed to support using the new technology. These might include:

staff numbers and hours
training and education
support services (for example, laboratory tests)
service capacity or facilities (for example, hospital beds, clinic sessions, diagnostic services and residential home places).

4.11.5 Highlight any likely constraints on the resources needed to support the implementation of the technology under evaluation, and comment on the effect this may have on the implementation timescale.

Costs

4.11.6 Provide estimates of net NHS (and PSS, when appropriate) costs of the expected resource impact to allow effective national and local financial planning. The costs should be disaggregated by appropriate generic organisational (for example, NHS, personal social services, hospital or primary care) and budgetary categories (for example, drugs, staffing, consumables or capital). When possible, this should be to the same level and detail as that adopted in resource unit information. If savings are anticipated, specify the extent to which these finances can be realised. Supplied costs should also specify whether VAT is included. The cost information should reflect as closely as possible the prices that are paid in the NHS, and should be based on published cost analyses, recognised publicly available databases, price lists, or when appropriate, confidential or known price reductions.

4.11.7 If implementing the technology could have substantial resource implications for other services, explore the effects on the submitted cost-effectiveness evidence for the technology.

4.11.8 NICE produces costing tools to allow individual NHS organisations and local health economies to quickly assess the effect guidance will have on local budgets. Details of how the costing tools are developed are available in NICE's assessing cost impact: methods guide.

4.11.9 Committees may consider budget impact analyses when exploring the level of decision-making uncertainty associated with the evaluation of the technology or technologies (see section 6.2.33).

NICE health technology evaluations: the manual