5 Outcomes

The Diagnostics Advisory Committee (section 9) considered evidence from several sources (section 10).

How outcomes were assessed

5.1 The assessment was performed by an External Assessment Group and consisted of a systematic review and development of a decision analytical model.

5.2 The systematic review was carried out to identify evidence on the equivalence of fractional exhaled nitric oxide (FeNO) devices (analytical validity), evidence of the diagnostic accuracy of FeNO testing for asthma diagnosis and evidence of the efficacy of FeNO‑guided asthma management.

5.3 A decision analytical model and a Markov model were developed to assess the cost effectiveness of measuring FeNO in the diagnosis and management of asthma.

Review of equivalence of FeNO devices

5.4 The External Assessment Group undertook this review to establish whether FeNO devices could be considered to be equivalent to one another in their measurements, and so whether studies that used other devices could helpfully inform this appraisal. Because there was insufficient evidence from primary research studies that used the mobile, hand-held FeNO electrochemical devices (NIOX MINO, NIOX VERO and NObreath), a review of equivalence to the precursory large, stationary FeNO chemiluminescent devices (including Niox, also made by Aerocrine) was conducted.

5.5 The review identified 27 studies that compared NIOX MINO, NIOX VERO and NObreath with other devices. The External Assessment Group undertook 3 main comparisons for this purpose. The first included comparisons of means, which compare the reported mean FeNO values as measured by each device in the same cohort. The second compared correlation coefficients, which show whether measurements by 2 devices are correlated but not whether the actual values produced are the same. The third compared the results of Bland–Altman analyses, which produce statistics that assess agreement between devices rather than just correlation.
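The difference between correlation and agreement described above can be illustrated with a minimal sketch. The paired readings are hypothetical (not taken from any included study), and the limits of agreement are assumed to be the conventional mean difference plus or minus 1.96 standard deviations of the differences.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def bland_altman(x, y):
    """Return (mean difference, lower limit, upper limit) of agreement."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # assumed conventional 95% limits
    return bias, bias - half_width, bias + half_width


# Hypothetical paired FeNO readings in ppb: device B always reads 20% higher.
device_a = [10.0, 20.0, 30.0, 40.0, 50.0]
device_b = [12.0, 24.0, 36.0, 48.0, 60.0]

r = pearson_r(device_a, device_b)
bias, lower, upper = bland_altman(device_a, device_b)
print(f"r = {r:.3f}, bias = {bias:.1f} ppb, limits = {lower:.1f} to {upper:.1f} ppb")
```

In this illustration the correlation is perfect, yet the readings differ by 6 ppb on average and the disagreement grows with the FeNO level, which is why a high correlation coefficient alone does not show that devices are interchangeable.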

NIOX MINO

5.6 Eight studies compared NIOX MINO with Niox in adults. Of these studies, 5 were exclusively in adults and 3 were in adults and other age groups. There was variability in correlation between the devices among the studies. While 5 studies showed largely similar mean values between NIOX MINO and Niox, 3 studies showed higher FeNO readings with NIOX MINO (ranging from 0.5 to 9 parts per billion [ppb]). Small (non-significant) differences in the mean FeNO readings were observed between the devices when the cohort mean FeNO values were below 30 ppb (as measured by Niox). When the mean FeNO values were above 35 ppb, the differences in cohort means were larger and statistically significant. Correlation coefficients ranged from 0.73 to 0.998. The results of 1 study suggested that there may also be some variation between NIOX MINO devices themselves, although a second study showed good agreement. Across the 8 studies, Bland–Altman analyses were not reported in a consistent way. Limits of agreement were 10 ppb above and below the mean in some cases, and the studies with the largest mean differences did not report Bland–Altman statistics.

5.7 Three studies comparing NIOX MINO with Niox included children. Of these, 2 studies reported statistically significantly higher mean FeNO values with NIOX MINO, while 1 study reported statistically significantly lower values. The latter study had low mean values (below 10 ppb). All studies reported good correlation between the devices, while Bland–Altman statistics reported in 2 studies showed that NIOX MINO gave higher readings (by 1.1 ppb [limits of agreement −4.4 to 6.7] and 3.9 ppb [limits of agreement −1.1 to 8.9] respectively).

5.8 Twelve studies compared NIOX MINO with stationary chemiluminescent devices other than Niox in adults and children. Of these, 6 studies were in adults, 3 in an unspecified group and 3 in children. The chemiluminescence devices used in each of the 12 studies were different. In the adults and the unspecified age group, correlation coefficients ranged from 0.876 to 0.96, indicating good correlation between devices. However, the mean FeNO levels and Bland–Altman statistics did not suggest such good agreement. In 4 studies, NIOX MINO gave higher readings than the comparator device, while 2 studies reported lower readings and 2 studies showed the devices to be comparable. Bland–Altman statistics, reported in 4 studies, suggested that mean differences were small, but the limits of agreement were much greater.

5.9 In children, correlation coefficients between NIOX MINO and other chemiluminescent devices ranged from 0.69 to 0.98, indicating variable correlation. The study with the poorer correlation reported higher mean FeNO levels, suggesting that poorer correlation is due to greater variability at higher FeNO values. However, the authors stated that correlation improved at higher values. One study noted that the direction of disagreement was different in children aged over and under 12 years. The back-transformed Bland–Altman statistics and range of ratios reported showed a wide range of agreement, suggesting that the devices are not interchangeable.

5.10 The External Assessment Group stated that the comparability of NIOX MINO to chemiluminescent devices appears to be influenced by several factors. These include variability between NIOX MINO devices themselves, a lack of comparability between other chemiluminescent devices (which leads to heterogeneity in estimates of comparability between these devices and NIOX MINO) and poorer equivalence between the devices at higher FeNO levels.

NIOX VERO

5.11 The manufacturer of NIOX MINO and NIOX VERO provided details of a study (commercial in confidence) that compared the technical performance and accuracy of the 2 technologies.

NObreath

5.12 Four studies compared NObreath with 3 chemiluminescent devices other than Niox. Bland–Altman analysis done in 1 study in a healthy cohort with low FeNO values showed a mean difference of −3.95 ppb in comparison with the chemiluminescent device. Limits of agreement in this study were wide (−10.98 to 4.08). Another study reported an absolute mean difference in FeNO measurements of −3.81 ppb. Comparisons with the third type of chemiluminescent device showed small differences between mean FeNO values for the cohort, with NObreath giving lower values in some cohorts.

5.13 Two studies that compared NObreath with NIOX MINO in adults found that NIOX MINO provided lower mean FeNO values than NObreath in most analyses. This contradicts the available evidence for comparisons of NIOX MINO with Niox and NObreath with Niox, which suggested that NIOX MINO should provide higher readings than NObreath. The 2 direct comparisons of NObreath and NIOX MINO included small numbers of patients, and only 1 included patients with asthma, but did not provide a Bland–Altman analysis to assess agreement.

5.14 The External Assessment Group stated that, based on available evidence, any differences in absolute values between results from NObreath and other devices are relatively small, although derived cut-offs and maximum sensitivity and specificity may differ.

Diagnostic accuracy of FeNO devices

5.15 No end-to-end studies were identified, and no cohort study compared use of FeNO testing within a sequence of tests with a suitable reference standard of the same sequence of tests without FeNO testing. The review identified 24 studies that met the inclusion criteria; 20 included adults of all ages and 4 included children. The studies were classified according to the position of the patients' asthma in the UK care pathway and the reference standards used.

FeNO testing in adults with asthma symptoms compared with most of, or all, the UK care pathway

5.16 The review identified 4 studies in this group. Cut-offs for the highest sum of sensitivity and specificity ranged from 20 ppb to 47 ppb in the 4 studies in this group. Sensitivities ranged from 32% to 88%, and specificities from 75% to 93%. Because of the heterogeneity in the results, study designs and the devices used, the External Assessment Group concluded that it is difficult to identify the optimal cut-off for sensitivity and specificity.

5.17 Cut-offs yielding the highest sensitivity ranged from 9 ppb to 15 ppb, with sensitivities ranging from 85% to 96% and specificities from 13% to 48%. Cut-offs yielding the highest specificity ranged from 47 ppb to 76 ppb, with sensitivities ranging from 13% to 56% and specificities from 88% to 100%.

5.18 Estimates of specificity consistently had a smaller range and higher values than estimates of sensitivity reported, suggesting that FeNO may be more reliable as a 'rule‑in' test than as a 'rule‑out' test. A rule‑in test implies that patients whose test is positive are assumed to have asthma and those testing negative go on to have further tests. However, the cost effectiveness of this balance will depend on the clinical and cost consequences of the correct or incorrect classification of patients.
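As an illustration of how a cut-off with the highest sum of sensitivity and specificity might be selected, the following is a minimal sketch. The FeNO values and diagnoses are hypothetical, and a test is assumed to be positive when the reading is at or above the cut-off. It is not the method used in any of the included studies.

```python
def sens_spec(values, has_asthma, cutoff):
    """Sensitivity and specificity when 'value >= cutoff' counts as positive."""
    tp = sum(1 for v, d in zip(values, has_asthma) if d and v >= cutoff)
    fn = sum(1 for v, d in zip(values, has_asthma) if d and v < cutoff)
    tn = sum(1 for v, d in zip(values, has_asthma) if not d and v < cutoff)
    fp = sum(1 for v, d in zip(values, has_asthma) if not d and v >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(values, has_asthma, cutoffs):
    """Cut-off giving the highest sum of sensitivity and specificity."""
    return max(cutoffs, key=lambda c: sum(sens_spec(values, has_asthma, c)))


# Hypothetical cohort: FeNO readings in ppb and reference-standard diagnosis.
feno = [15, 28, 35, 42, 60, 8, 12, 18, 22, 30]
asthma = [True] * 5 + [False] * 5
candidates = [10, 20, 25, 40, 50]

for c in candidates:
    sens, spec = sens_spec(feno, asthma, c)
    print(f"cut-off {c} ppb: sensitivity {sens:.0%}, specificity {spec:.0%}")
print("highest sum at", best_cutoff(feno, asthma, candidates), "ppb")
```

In this hypothetical cohort, low cut-offs give high sensitivity but poor specificity (a rule-out profile), and high cut-offs give the reverse (a rule-in profile), mirroring the pattern described in the paragraphs above.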

FeNO testing in patients with difficult-to-diagnose asthma compared with airway hyper-responsiveness

5.19 Three studies used some form of airway hyper-responsiveness as the sole reference standard. Estimates of sensitivity and specificity appeared comparable to those in the studies of patients presenting in primary care with symptoms of asthma. One study included a set of patients whose methacholine challenge tests were negative and compared FeNO with an adenosine challenge test. This study produced 100% sensitivity (29% specificity) at a cut-off of 30 ppb, making it likely to operate well as a rule‑out test.

5.20 The other 2 studies used methacholine challenge tests in people who had been found not to have asthma in previous tests. Cut-offs for the highest sum of sensitivity and specificity ranged from 34 ppb to 40 ppb when compared with a methacholine challenge test as a gold standard. Sensitivities ranged from 24% to 74%, and specificities from 73% to 99%, which is a similar range to the broader cohort. A range of cut-offs was not reported in these studies.

FeNO testing in patients with difficult-to-diagnose asthma with chronic cough compared with response to a trial of inhaled corticosteroids

5.21 Three studies included patients with chronic cough who had tested negative for other causes. All 3 studies used response to a trial of treatment with inhaled corticosteroids as a reference standard. Cut-offs for the highest sum of sensitivity and specificity were similar in all 3 studies. Accuracy was somewhat better in 2 studies at 90–95% sensitivity and 76–85% specificity.

FeNO testing in children with asthma symptoms compared with various reference standards

5.22 Four studies that included children were identified. The patients had a similar severity of asthma, and the studies used similar reference standards, to the adult cohorts. The cut-offs derived were generally lower than in adults, but the ranges of estimates of sensitivity and specificity were similar. There was a high degree of agreement between studies in the cut-off that produced the highest sum of sensitivity and specificity (between 19 ppb and 21 ppb), despite the heterogeneity in devices and reference standards. Estimates of sensitivity at these cut-off points were wide-ranging (49% to 86%) and of a similar range to those in the studies in adults.

5.23 When selecting the cut-off with the highest sensitivity, results were similar to those for adult cohorts. Cut-offs ranged from 5 ppb to 20 ppb, sensitivities from 89% to 94% and specificities from 14% to 70%. When selecting the cut-off with the highest specificity, results were also similar to adult cohorts. Cut-offs were a little lower again, and ranged from 30 ppb to 50 ppb. Sensitivities ranged from 20% to 50% and specificities from 92% to 100%.

5.24 The External Assessment Group did not conduct a meta-analysis in any group because of the high heterogeneity between studies. Estimates of cut-off points, sensitivity and specificity were not consistent within groups and ranged widely when used as a rule‑in or rule‑out test and when considering the highest sum of sensitivity and specificity. Because of this, the External Assessment Group found it difficult to estimate the relative diagnostic accuracy of FeNO testing in any situation and at any given cut-off point. However, there did not appear to be a difference in the relative diagnostic accuracy of FeNO testing in the 2 settings (primary and secondary care), either in comparison with the standard UK care pathway (entire or parts) or in comparison with airway hyper-responsiveness in patients whose asthma was difficult to diagnose. But the large variation in estimates within groups may obscure any true underlying differences in the accuracy of FeNO testing between groups and between different reference standards.

FeNO testing in population subgroups included in the scope

5.25 No cohort studies were found that provided evidence relating to the subgroups of pregnant women, older people, people who smoke or people exposed to environmental tobacco smoke, and therefore lower levels of evidence were consulted.

5.26 FeNO testing appeared to be able to distinguish people with asthma from people without asthma with similar accuracy in people who smoke and people who do not smoke or used to smoke. It seems likely that FeNO levels are generally lower in people who smoke, and it may be useful to consider a person's smoking status when interpreting results, or to select lower cut-off points for people who smoke. Limited data in children support the same conclusion as for adults.

5.27 There is limited and conflicting evidence for the benefit of FeNO testing in older people and, therefore, uncertainty as to whether FeNO testing is useful for diagnosing asthma in the older population.

5.28 A cross-sectional study suggested that pregnancy does not alter FeNO levels in women with or without asthma, and that FeNO testing can distinguish between pregnant women with asthma and healthy pregnant women without asthma.

Efficacy of FeNO-guided asthma management

5.29 The External Assessment Group reviewed evidence relating to outcomes in adults, children and subgroups of people as defined in the scope for this assessment. The outcomes included exacerbations, inhaled corticosteroid use and health-related quality of life.

FeNO-guided asthma management in adults

5.30 Four studies (based in the UK, New Zealand, Sweden and the USA) were included in this review. The quality of the 4 studies was assessed according to the Cochrane Library and Centre for Reviews and Dissemination (CRD) handbook. The External Assessment Group indicated that the study with the highest risk of bias was the study by Syk et al. (2013); this was because of the lack of blinding, incomplete outcome data and selective reporting.

5.31 All 4 studies were randomised controlled trials; 2 were single blind (Smith et al. 2005 and Shaw et al. 2007), 1 was open label (Syk et al. 2013) and 1 was described as 'multiply blinded' (Calhoun et al. 2012). There was a high degree of heterogeneity in all aspects of study design across the 4 studies. Three studies did not clearly report which device was used to measure FeNO levels.

5.32 The inclusion criteria, trial protocols and treatment doses varied across the studies. Only 1 study reported using the British guideline on the management of asthma (2012), hereafter referred to as the 'British guideline', in the comparator arm. The number of patients in the trials ranged from 94 to 229, and they were recruited from primary care in 3 studies. For 1 study, it was unclear what setting people were recruited from.

5.33 Exacerbations were reported in all 4 studies, although definitions varied and results were not always consistent across the studies. However, all 4 studies reported a fall in exacerbation rates per person year, although it appeared that this was mostly driven by mild and moderate exacerbations.

5.34 For severe exacerbations, the Syk et al. (2013) study reported higher rates of oral corticosteroid use in the intervention arm (although the difference was not statistically significant), while the composite outcome of moderate or severe exacerbations favoured the intervention arm. In the other studies, the difference in direction of effect between the outcome for oral corticosteroid use and the composite outcomes that included less severe exacerbations was not evident. Oral corticosteroid use and the composite outcomes of severe and less severe exacerbations decreased in intervention arms, although there was still an apparently greater effect in the composite outcomes. Rate ratios calculated by the External Assessment Group for major/severe exacerbations ranged from 0.79 (95% confidence interval [CI] 0.44 to 1.41) to 1.29 (95% CI 0.51 to 3.30), while rate ratios calculated by the External Assessment Group for composite outcomes covering exacerbations of all severities ranged from 0.52 (95% CI 0.30 to 0.91) to 0.63 (95% CI 0.40 to 0.98).

5.35 Despite the high level of between-study heterogeneity, an exploratory meta-analysis of the rates of major and severe exacerbations using fixed effects methods was conducted. The result showed no heterogeneity, with an I² statistic of 0%. The pooled estimate was 0.87 (95% CI 0.64 to 1.19, p=0.38). This indicates that there were fewer major exacerbations in the intervention arm, but the difference did not reach statistical significance.
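The relationship between the pooled estimate, its confidence interval and the reported p value can be checked with a short sketch. It assumes that the interval is symmetric on the log scale and based on a normal approximation, which is conventional for ratio measures but is not stated in the assessment.

```python
from math import erfc, log, sqrt

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def p_value_from_ratio_ci(estimate, lower, upper):
    """Two-sided p value for a ratio, recovered from its 95% CI.

    Assumes a normal approximation on the log scale (an assumption
    made for this sketch, not a detail reported in the assessment).
    """
    standard_error = (log(upper) - log(lower)) / (2 * Z_95)
    z = log(estimate) / standard_error
    return erfc(abs(z) / sqrt(2))


# Pooled rate ratio for major or severe exacerbations reported above.
p = p_value_from_ratio_ci(0.87, 0.64, 1.19)
print(f"p = {p:.2f}")
```

Under these assumptions, the recovered value is about 0.38, which agrees with the p value reported for the pooled estimate.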

5.36 A sensitivity analysis was done using the results of studies that reported the number of exacerbations resulting in oral corticosteroid use. The pooled risk ratio was 0.90 (95% CI 0.56 to 1.45), indicating a statistically non-significant difference for asthma management with FeNO measurement. However, the External Assessment Group noted that there were only 2 studies in this analysis. Both studies reported non-significant differences, but with risk ratios on opposite sides of the line of no effect. This could suggest that differences in study design, step-up and step-down protocols, and patient characteristics may account for differences in direction of effect.

5.37 When considering the composite outcome of all exacerbations and failure rates, 3 studies reported composite outcomes that the External Assessment Group considered to be broadly similar and to represent 'treatment failure'. In 2 studies, FeNO‑guided management groups showed numerically, but not statistically significantly, lower rates of failure. In the Syk et al. (2013) study, the improvement was statistically significant, with a rate of 0.22 in the intervention arm compared with 0.41 in the control arm (p=0.024). The rate ratio calculated by the External Assessment Group was 0.52 (95% CI 0.30 to 0.91). A meta-analysis of these rates was conducted despite the high level of heterogeneity between study characteristics. The result showed a statistically significant effect in favour of FeNO‑guided management in people with asthma for the composite outcome of all exacerbations and treatment failure rates, with a rate ratio of 0.58 (95% CI 0.43 to 0.77).

5.38 An additional study (Honkoop et al. 2013) was identified by the External Assessment Group. The study was a randomised controlled trial with a 12‑month follow-up period and dose titration at baseline and every 3 months thereafter. The number of people in the study was larger than in the other 4 studies and they were recruited from primary care. Outcome data were limited because this study was only reported in a conference abstract; however, a non-significant trend towards a reduction in courses of oral prednisolone was reported for the FeNO measurement group compared with the comparator arms. The External Assessment Group performed an additional meta-analysis that included the Honkoop et al. study, calculating the rate ratio for exacerbation as 0.69. Errors could not be calculated for this meta-analysis because the exact numbers of people and events were not reported. Results of the meta-analysis ranged from significant to non-significant in favour of FeNO measurement, depending on the error rate imputed.

5.39 All studies reported some data on inhaled corticosteroid use. Two studies reported inhaled corticosteroid use as a mean per day at the end of the study, with mean differences of −270 micrograms per day (95% CI −430 to −112, p=0.003) and −338 micrograms per day (95% CI −640 to −37, p=0.028) respectively, in favour of FeNO‑guided management. The Syk et al. (2013) study showed a small (non-significant) increase in inhaled corticosteroid use in the intervention arm (586 micrograms, standard error [SE] 454; compared with 540 micrograms, SE 317, in the control arm). One study reported means per month, although it is unclear if this was an average over the whole course of the study, or the means for the final month of the study. The means were very similar at 1617 micrograms per month in the intervention arm and 1610 micrograms per month in the control arm.

5.40 A meta-analysis used standardised mean difference analysis because outcomes were not reported in a standardised way. This showed an overall effect of −0.24 standard deviations in favour of FeNO‑guided management, although this did not reach statistical significance (95% CI −0.56 to 0.07, p=0.13).

5.41 Two studies used versions of the Asthma Quality of Life Questionnaire (AQLQ) to measure quality of life. Both showed no effect in the global score, but 1 investigated domains and found a statistically significant difference in the symptoms score. A meta-analysis of the overall scores showed no effect on quality of life, with a standardised mean difference of 0.00 (95% CI −0.20 to 0.20).

5.42 All 4 original studies (excluding Honkoop et al. 2013) reported data for asthma control. In 3 studies, asthma control did not change but in the Syk et al. (2013) study there was a statistically significant increase in asthma control between the 2 trial arms. Two studies (Smith et al. 2005 and Calhoun et al. 2012) reported no significant difference between groups for bronchodilator use. Syk et al. did not report the significance of the difference between the 2 arms, reporting a median of 1.56 (interquartile range [IQR] 0.06 to 5.18) uses per week in the intervention arm, and a median of 0.94 (IQR 0.03 to 2.81) in the control arm. No asthma-related adverse events or deaths were reported.

FeNO-guided asthma management in children

5.43 Five studies (based in Austria, the USA, Italy, the Netherlands and Australia) that included children (plus adolescents and young adults) and compared FeNO‑guided management with non-FeNO‑guided management were identified. The quality of the studies was assessed according to criteria proposed in the Cochrane Handbook and CRD Handbook. The study quality varied; no single study scored well in every item, and no item scored well in every study.

5.44 There was a high degree of heterogeneity in all aspects of study design across 4 studies. No study reported using the British guideline in the comparator arm. Two studies included patients who appeared to be poorly controlled. One study included patients who had mild to moderate persistent asthma and 1 study included patients who had received a stable dose of inhaled corticosteroids for the previous 3 months, suggesting that their asthma was reasonably well controlled.

5.45 All 5 studies reported some data on asthma exacerbations, although the definition of exacerbation was unclear in some cases. Two studies reported severe exacerbations in a way that allowed calculation of rates per person year. Both had lower rates in the intervention arm. In patients with uncontrolled asthma, the rate was 0.746 in the intervention arm and 0.950 in the control arm. In patients who had been on a stable dose of inhaled corticosteroids for 3 months, the rate was 0.21 in the intervention arm and 0.39 in the control arm. Both rates were calculated by the External Assessment Group and the statistical significance is unclear.

5.46 For all definitions of exacerbations, 4 studies reported outcomes that were not defined as either major or minor and had different definitions from one another. All the studies showed a trend in favour of fewer exacerbations in the intervention arm. The only study to report a significant between-group difference was a conference abstract, which showed that exacerbations (not clearly defined) occurred in 6 of the 31 patients in the intervention group (19.4%) and 15 of 32 in the control group (46.9%, p=0.021).
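The between-group comparison from the conference abstract can be reproduced approximately with a minimal sketch. The abstract's statistical test is not stated here, so a Pearson chi-squared test without continuity correction is assumed for illustration.

```python
from math import erfc, sqrt


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic and p value (1 degree of freedom)
    for the 2x2 table [[a, b], [c, d]], without continuity correction."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Rows: intervention, control. Columns: exacerbation, no exacerbation.
chi2, p_value = pearson_chi2_2x2(6, 31 - 6, 15, 32 - 15)
print(f"chi-squared = {chi2:.2f}, p = {p_value:.3f}")
```

Under this assumption, the statistic is about 5.37 and the p value about 0.021, which is consistent with the reported result. A continuity-corrected or exact test would give a somewhat larger p value.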

5.47 Overall, results showed that inhaled corticosteroid use increased in the intervention group compared with the comparator group, although there was variability between the studies. These differences could be attributed to the specifics of the step-up and step-down protocols or the characteristics of the patients selected. The 2 studies that included children whose asthma was hard-to-treat or uncontrolled (Szefler et al. 2009 and Fritsch et al. 2006) saw an increase in inhaled corticosteroid use, while the studies that did not include children with these characteristics saw no significant increase.

5.48 Health-related quality of life was only reported in 1 study in abstract form and using an unknown tool. The External Assessment Group was not able to draw a definite conclusion from these data. Four studies provided some data on asthma control, none of which demonstrated any statistically significant effects favouring either intervention or control. With respect to additional medication use, 3 studies provided data, but there did not appear to be a clear direction of effect within the data.

5.49 One study reported no difference in adverse events between groups and there were no deaths reported. The adverse events listed included gastrointestinal disorders, haematological disorders, infections, musculoskeletal symptoms and skin symptoms.

Cost effectiveness

5.50 The economic analysis done by the External Assessment Group compared the cost effectiveness of measuring FeNO using NIOX MINO, NIOX VERO and NObreath with current standard tests for diagnosing and managing asthma in England and Wales.

Review of existing economic analyses

5.51 The External Assessment Group did a review to identify existing economic analyses of FeNO testing and measurement (using NIOX MINO, NIOX VERO or NObreath) for diagnosing and managing asthma respectively. The review also sought to identify existing models and potentially relevant evidence sources to inform parameter values within the de novo economic models developed by the External Assessment Group.

5.52 Only 1 published UK cost-effectiveness model was identified for asthma diagnosis, and 1 for asthma management. Modified versions of these models were provided to NICE by the manufacturer of NIOX MINO and NIOX VERO. The wider review identified several economic analyses that the External Assessment Group described as including various methodological problems, questionable assumptions and weak evidence.

De novo cost-effectiveness model

5.53 The External Assessment Group developed 2 de novo models: 1 to assess the expected cost effectiveness of measuring FeNO in addition to, or in place of, standard tests for diagnosing asthma (the diagnostic model) and 1 to assess the expected cost effectiveness of FeNO plus the British guideline compared with the British guideline alone for managing people with diagnosed asthma (the management model). The 2 models, although distinct, shared several parameter values and assumptions.

5.54 The diagnostic model was structured in the form of a decision tree. The decision tree model was used to estimate the probability that a person with asthma will be correctly diagnosed (true positive) or incorrectly diagnosed (false negative); and the probability that a person without asthma will be correctly diagnosed (true negative) or incorrectly diagnosed (false positive) and the expected health outcomes and costs arising from this. The management model was in the form of a simple Markov model with 2 states: alive with diagnosed asthma and dead.
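The 2-state structure of the management model can be sketched as a simple cohort simulation. Everything numeric below is a hypothetical placeholder: the annual probability of death, the annual cost, the utility, the annual cycle length and the 3.5% discount rate are assumptions made for illustration and are not parameters reported by the External Assessment Group.

```python
def run_two_state_markov(cycles, annual_death_prob, annual_cost, utility,
                         discount_rate=0.035):
    """Cohort simulation with 2 states: alive with diagnosed asthma, and dead.

    Returns discounted cost, discounted QALYs and the proportion still alive
    after the final cycle. Annual cycles are assumed; dead is absorbing.
    """
    alive = 1.0
    total_cost = 0.0
    total_qalys = 0.0
    for year in range(cycles):
        discount = 1.0 / (1.0 + discount_rate) ** year
        total_cost += alive * annual_cost * discount
        total_qalys += alive * utility * discount
        alive *= 1.0 - annual_death_prob  # transition from alive to dead
    return total_cost, total_qalys, alive


# Hypothetical inputs, for illustration only.
cost, qalys, alive = run_two_state_markov(
    cycles=10, annual_death_prob=0.01, annual_cost=300.0, utility=0.85)
print(f"cost = {cost:.2f}, QALYs = {qalys:.3f}, alive = {alive:.3f}")
```

Running the same function once per strategy, with strategy-specific costs and utilities, would give the incremental cost and incremental QALYs needed for a cost-effectiveness ratio.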

5.55 Estimates of test accuracy for measuring FeNO were drawn from several separate studies based on the results of the systematic review for clinical effectiveness, while estimates of test accuracy for comparator tests were drawn from best available evidence. The economic analyses included estimates of the sensitivity and specificity of individual tests as well as combinations of FeNO devices plus other standard tests. One study (Schneider et al. 2013) that used the NIOX MINO device was used to inform estimates of the sensitivity and specificity of FeNO alone. The true pre-test probability of asthma in undiagnosed patients was estimated as a weighted mean of the numbers of cases of asthma and non-asthma in the studies used to inform the diagnostic test accuracy parameters. Across the included studies, 412 of 881 patients were diagnosed with asthma (probability 0.47).
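The 4 branches of the diagnostic decision tree follow directly from the pre-test probability and the test's sensitivity and specificity. The sketch below uses the pre-test probability reported above (412 of 881). The sensitivity and specificity values are hypothetical placeholders, not the estimates used in the model.

```python
def diagnostic_outcomes(prevalence, sensitivity, specificity):
    """Probability of each decision tree branch for a single test."""
    return {
        "true_positive": prevalence * sensitivity,
        "false_negative": prevalence * (1 - sensitivity),
        "true_negative": (1 - prevalence) * specificity,
        "false_positive": (1 - prevalence) * (1 - specificity),
    }


prevalence = 412 / 881  # pre-test probability of asthma from the included studies

# Hypothetical test accuracy, for illustration only.
outcomes = diagnostic_outcomes(prevalence, sensitivity=0.80, specificity=0.80)
for branch, probability in outcomes.items():
    print(f"{branch}: {probability:.3f}")
```

Each branch probability would then be multiplied by the costs and health outcomes attached to that branch to give expected values for the strategy.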

5.56 Health-related quality of life values for people without asthma were estimated using a general population EQ‑5D regression model. The values were common to all diagnostic comparator groups and did not therefore have any effect on the estimates of incremental health gain for the diagnostic tests included in the economic analysis. The disutility associated with asthma, estimated to be −0.0463, was taken from the catalogue of EQ‑5D values reported by Sullivan et al. (2011). It was noted that this disutility was applied to all patients with asthma and to those who tested false positive (until their misdiagnosis was corrected). This disutility is unlikely to fully reflect health losses associated with the delayed diagnosis of more serious pathology, such as cancer or tuberculosis. The disutility associated with poor asthma control was derived from a study (McTaggart-Cowan et al. 2008) that reported EQ‑5D estimates for 4 health states: 'very well controlled', 'well controlled', 'adequately controlled' and 'not controlled'. EQ‑5D estimates ranged from 0.90 for 'very well controlled' to 0.80 for 'not controlled'.

5.57 The External Assessment Group assumed that the health loss associated with poor control because of a false-negative diagnosis related to the difference between the 'well-controlled' state and the 'not-controlled' state (mean disutility of −0.04). This disutility was applied to all false-negatives until the misdiagnosis was corrected.

5.58 Because of the lack of empirical evidence relating to the time needed to resolve incorrect diagnoses, the External Assessment Group attempted to elicit these values from clinical specialists. Based on the response received, the External Assessment Group assumed that the time to resolve a false-negative diagnosis has a mean of 8 months (95% CI 4 to 12 months) and the time to resolve a false-positive diagnosis has a mean of 18 months (95% CI 12 to 24 months). The External Assessment Group considered these estimates to be highly uncertain and tested them in sensitivity analyses.
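As a rough worked example, the health loss per misdiagnosed person can be approximated by multiplying the disutility by the time taken to correct the diagnosis. The sketch uses the mean values given above and ignores discounting and uncertainty, so it is an approximation for illustration rather than the model's calculation.

```python
def qaly_loss(disutility, months_to_resolve):
    """Approximate undiscounted QALY loss from a disutility applied for a period."""
    return abs(disutility) * months_to_resolve / 12


# False negative: disutility of -0.04 for a mean of 8 months.
false_negative_loss = qaly_loss(-0.04, 8)
# False positive: asthma disutility of -0.0463 for a mean of 18 months.
false_positive_loss = qaly_loss(-0.0463, 18)

print(f"false negative: {false_negative_loss:.4f} QALYs")
print(f"false positive: {false_positive_loss:.4f} QALYs")
```

On these assumptions, a false-negative diagnosis costs roughly 0.027 QALYs and a false-positive diagnosis roughly 0.069 QALYs per person, which shows why the elicited resolution times were tested in sensitivity analyses.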

5.59 The following costs were used to inform the diagnostic and management models:

  • Test costs: the marginal per-test costs for all 3 devices were calculated based on information provided by the manufacturers. The calculation was complicated by the fact that the devices each have different lifetimes, and that test kits and mouthpieces for each device are available at lower marginal costs if higher volumes of kits are purchased. These marginal per-test costs do not include any costs associated with education and training for NHS staff to use the devices.

  • Maintenance costs: the External Assessment Group assumed that the manufacturer provides the maintenance of NObreath free of charge to the NHS. The External Assessment Group assumed zero maintenance costs for NIOX MINO and NIOX VERO.

  • Primary care costs: the External Assessment Group assumed that spirometry, reversibility testing and measuring FeNO can be done in primary care and would need 2 GP visits and 1 nurse visit. The unit cost of a GP visit was based on published economic analyses that used an estimate of £43 (based on an appointment of 11.7 minutes, and including direct staff costs and qualifications). The cost of a GP practice nurse visit was assumed to be £13.69 (based on a visit of 15.5 minutes). For the management model, the External Assessment Group assumed that measuring FeNO would be done during routine GP visits and would need an additional nurse visit once every 3 months. The marginal cost of measuring FeNO was applied as the per-test cost plus the cost of a primary care nurse appointment.

  • Secondary care costs: the External Assessment Group assumed that sputum induction and airway hyper-responsiveness (methacholine challenge test) would be done in secondary care and would need 2 secondary care visits, 1 laboratory visit and an initial GP visit for referral. Secondary care attendance costs were based on the Healthcare Resource Group for respiratory medicine attendances (£204.29). The cost of a laboratory visit was based on the Healthcare Resource Group for simple bronchodilator studies (£203.29). The External Assessment Group assumed that these cost estimates were normally distributed, with a standard error equal to 15% of the mean.

  • Costs of asthma management: estimates of the annual cost of combined inhalers were derived from 2 previous health technology assessment reports. For children, the least expensive annual cost for combined inhalers was estimated to be £201. For adults, the least expensive annual cost of the inhalers was estimated to be £231.

  • Costs associated with resolving misdiagnoses: the assumption was made that 1 additional primary care attendance, 2 additional secondary care attendances and 1 laboratory visit would be needed to correctly diagnose false-positive and false-negative results. This same assumption was made in previously published models.

  • Costs associated with loss of control for false-negatives: the External Assessment Group assumed that people who were falsely diagnosed as not having asthma would experience 1 exacerbation in each year they remain misdiagnosed. The model assumed that a proportion of these exacerbations would need hospitalisation.
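The attendance assumptions in the list above can be added up directly. The following is a minimal sketch using only the unit costs quoted in section 5.59; it excludes per-test and device costs, which are not given here, and it assumes that the additional primary care attendance needed to resolve a misdiagnosis is costed as a GP visit. The function name `pathway_cost` is hypothetical.

```python
# Unit costs quoted in section 5.59 (GBP)
GP_VISIT = 43.00
NURSE_VISIT = 13.69
SECONDARY_CARE_VISIT = 204.29
LABORATORY_VISIT = 203.29

def pathway_cost(gp=0, nurse=0, secondary=0, laboratory=0):
    """Attendance cost of a pathway, excluding per-test and device costs."""
    total = (gp * GP_VISIT
             + nurse * NURSE_VISIT
             + secondary * SECONDARY_CARE_VISIT
             + laboratory * LABORATORY_VISIT)
    return round(total, 2)

# Primary care diagnosis: 2 GP visits and 1 nurse visit
primary_care = pathway_cost(gp=2, nurse=1)

# Secondary care diagnosis: referral GP visit, 2 attendances, 1 laboratory visit
secondary_care = pathway_cost(gp=1, secondary=2, laboratory=1)

# Resolving a misdiagnosis: 1 primary care attendance (ASSUMED to be a GP
# visit), 2 secondary care attendances and 1 laboratory visit
misdiagnosis_resolution = pathway_cost(gp=1, secondary=2, laboratory=1)

print(primary_care, secondary_care, misdiagnosis_resolution)
```

On these figures the attendance cost is £99.69 for the primary care pathway and £654.87 for the secondary care pathway, before any test costs are added.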

5.60 The following costs were used to inform the management model alone:

  • Additional costs of FeNO measurement: the External Assessment Group assumed that FeNO measurement would be done during routine GP visits and would require 1 additional nurse visit every 3 months.

  • Costs of managing exacerbations: the External Assessment Group assumed that a proportion of exacerbations would need hospitalisation while the remainder could be managed in primary care. It also assumed that severe exacerbations that do not need hospitalisation would need 1 GP attendance (£43.00) plus oral corticosteroids for 5 days (£1.73) based on an earlier health technology assessment report. The cost of asthma hospitalisation was derived from current NHS Reference Costs (£1266.72).
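The exacerbation costs above combine into an expected cost per exacerbation once a hospitalisation proportion is chosen. The sketch below shows that arithmetic; the proportion is not reported in this section, so the value of 0.10 used in the example is purely illustrative, and the function name `expected_exacerbation_cost` is hypothetical.

```python
# Unit costs quoted in section 5.60 (GBP)
GP_VISIT = 43.00
ORAL_CORTICOSTEROIDS_5_DAYS = 1.73
HOSPITALISATION = 1266.72

def expected_exacerbation_cost(p_hospitalised):
    """Expected cost of one exacerbation for a given hospitalisation proportion."""
    if not 0.0 <= p_hospitalised <= 1.0:
        raise ValueError("p_hospitalised must be between 0 and 1")
    primary_care_cost = GP_VISIT + ORAL_CORTICOSTEROIDS_5_DAYS
    expected = (p_hospitalised * HOSPITALISATION
                + (1.0 - p_hospitalised) * primary_care_cost)
    return round(expected, 2)

# 0.10 is a made-up proportion, used only to show the calculation
example_cost = expected_exacerbation_cost(0.10)
print(example_cost)
```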

Base-case results

5.61 The base-case model was evaluated probabilistically using Monte Carlo sampling techniques. Deterministic one-way sensitivity analyses were also performed to account for different modelling assumptions. Central estimates of cost effectiveness were presented as incremental cost-effectiveness ratios (ICERs). Uncertainty surrounding the cost-effectiveness estimates was presented using cost-effectiveness planes and cost-effectiveness acceptability curves.
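To make the probabilistic approach concrete, the following is a minimal sketch of Monte Carlo sampling that yields an ICER and points on a cost-effectiveness acceptability curve. It is not the External Assessment Group's model: the distributions, their parameters and the threshold values are all hypothetical, and the function names `run_psa` and `ceac_point` are invented for illustration.

```python
import random

def run_psa(n_samples=5000, seed=1):
    """Draw (incremental cost, incremental QALY) pairs from HYPOTHETICAL distributions."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        d_cost = rng.gauss(100.0, 40.0)   # hypothetical incremental cost (GBP)
        d_qaly = rng.gauss(0.04, 0.02)    # hypothetical incremental QALYs
        samples.append((d_cost, d_qaly))
    return samples

def ceac_point(samples, threshold):
    """Share of samples with positive net monetary benefit at a given threshold."""
    favourable = sum(1 for d_cost, d_qaly in samples
                     if threshold * d_qaly - d_cost > 0)
    return favourable / len(samples)

samples = run_psa()
mean_cost = sum(c for c, _ in samples) / len(samples)
mean_qaly = sum(q for _, q in samples) / len(samples)

# Probabilistic ICER: ratio of the mean differences, not the mean of the ratios
icer = mean_cost / mean_qaly

# Illustrative thresholds (GBP per QALY gained) for the acceptability curve
curve = {t: ceac_point(samples, t) for t in (0, 10_000, 20_000, 30_000)}
print(round(icer), curve)
```

Plotting the sampled pairs gives the cost-effectiveness plane, and plotting `curve` against the threshold gives the acceptability curve.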

5.62 The base-case results of the diagnostic model in children and adults suggested that, across the 17 diagnostic options included in the economic analysis, the expected difference in quality-adjusted life years (QALYs) is likely to be small (expected QALYs ranged from 4.2686 to 4.2834). They also suggested that airway hyper-responsiveness (methacholine challenge test) is expected to produce the greatest QALY gain (4.2834), followed by FeNO testing (either NObreath, NIOX VERO or NIOX MINO) plus bronchodilator reversibility, with a QALY of 4.2829. The difference between the QALYs produced by the methacholine challenge test and FeNO testing plus bronchodilator reversibility was very small (0.0005 QALYs). Other diagnostic test options, either with or without FeNO testing, resulted in progressively lower QALYs, with spirometry (forced expiratory volume in the first second divided by the total volume of air that a person can forcibly exhale in 1 breath) producing the lowest QALY gain of 4.2686.

5.63 The External Assessment Group presented an incremental cost-effectiveness analysis, in which the diagnostic options were ranked in decreasing order of QALY. The ICER for airway hyper-responsiveness (methacholine challenge test) compared with the next best option in terms of QALY (NObreath plus bronchodilator reversibility) was approximately £1.125 million per QALY gained. Following methacholine challenge, the option producing the next best QALY (FeNO testing plus bronchodilator reversibility) yielded 4.2829 QALYs, but the cost associated with the individual tests varied (£686.08 for NObreath, £687.61 for NIOX VERO and £688.33 for NIOX MINO). FeNO testing plus bronchodilator reversibility is therefore cost saving compared with methacholine challenge, but is estimated to produce marginally fewer QALYs (see section 5.62). All further options, with or without FeNO testing, were dominated because they were both more expensive and produced fewer QALYs. The External Assessment Group considered these results to be very uncertain.
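The relationship between the figures in sections 5.62 and 5.63 can be checked with a short calculation. The sketch below back-calculates the incremental cost implied by the reported ICER and QALY difference, and picks the least costly of the 3 FeNO options. Because the ICER is reported only approximately, the implied cost difference is approximate too; the variable names are illustrative.

```python
# QALYs reported in section 5.62
qaly_methacholine = 4.2834
qaly_feno_plus_reversibility = 4.2829

# Costs of FeNO testing plus bronchodilator reversibility, section 5.63 (GBP)
feno_costs = {"NObreath": 686.08, "NIOX VERO": 687.61, "NIOX MINO": 688.33}

# ICER = incremental cost / incremental QALYs, so the incremental cost
# implied by a reported ICER is ICER * incremental QALYs
reported_icer = 1_125_000  # approximate, GBP per QALY gained
delta_qaly = round(qaly_methacholine - qaly_feno_plus_reversibility, 4)
implied_delta_cost = round(reported_icer * delta_qaly, 2)

# All 3 FeNO options yield the same QALYs, so the least costly one is the
# relevant comparator and the other 2 are dominated by it
cheapest_device = min(feno_costs, key=feno_costs.get)

print(delta_qaly, implied_delta_cost, cheapest_device)
```

On these figures the methacholine challenge test costs roughly £560 more than NObreath plus bronchodilator reversibility for a gain of 0.0005 QALYs.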

Results in children

5.64 The base-case results for asthma management in children suggested that the British guideline plus FeNO measurement produces a small health benefit (0.05 QALYs) compared with the British guideline alone. The British guideline plus FeNO measurement was also more costly (£8148.59 for the British guideline plus NObreath, £8314.30 for the British guideline plus NIOX VERO and £8391.53 for the British guideline plus NIOX MINO) than the British guideline alone (£5860.06) because of higher projected inhaled corticosteroid use in the FeNO measurement groups. The resulting ICER for NObreath plus the British guideline compared with the British guideline alone was £45,213 per QALY gained. NIOX VERO and NIOX MINO were expected to be dominated by NObreath because of their higher marginal per-test costs.

Results in adults

5.65 The base-case results for asthma management in adults showed that the British guideline plus FeNO measurement is expected to produce a small health benefit (0.04 QALYs) compared with the British guideline alone. The British guideline plus FeNO measurement was also more costly in adults (£7377.61 for the British guideline plus NObreath, £7535.43 for the British guideline plus NIOX VERO and £7608.99 for the British guideline plus NIOX MINO) than the British guideline alone (£7296.30) because of increased inhaled corticosteroid use in the FeNO measurement groups during the first 12 months of monitoring. Similarly to the children's model for asthma management, the model assumed that all 3 FeNO devices produce the same health benefits. NIOX MINO and NIOX VERO were dominated by NObreath because of their higher marginal per-test costs. The ICER of the British guideline plus NObreath compared with the British guideline alone was approximately £2146 per QALY gained. If dominance was ignored, the ICERs for the British guideline plus the NIOX devices, compared with the British guideline alone, were £6310 per QALY gained for NIOX VERO and £8250 per QALY gained for NIOX MINO.
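Dividing the adult cost differences by the rounded health benefit of 0.04 QALYs does not reproduce the reported ICERs exactly (for NObreath it gives about £2033 rather than £2146). The sketch below back-calculates the QALY gain implied by each reported cost and ICER; it is a consistency check on the published figures, not part of the original analysis, and the variable names are illustrative.

```python
# Costs and ICERs reported in section 5.65 (GBP)
guideline_alone_cost = 7296.30
arms = {
    "NObreath": {"cost": 7377.61, "reported_icer": 2146},
    "NIOX VERO": {"cost": 7535.43, "reported_icer": 6310},
    "NIOX MINO": {"cost": 7608.99, "reported_icer": 8250},
}

implied_qaly_gain = {}
for device, arm in arms.items():
    delta_cost = arm["cost"] - guideline_alone_cost
    # ICER = delta_cost / delta_qaly, so delta_qaly = delta_cost / ICER
    implied_qaly_gain[device] = delta_cost / arm["reported_icer"]

for device, gain in implied_qaly_gain.items():
    print(device, round(gain, 4))
```

All 3 devices imply a gain of about 0.038 QALYs, which rounds to the reported 0.04 and agrees with the assumption that the devices produce the same health benefits.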

Sensitivity analysis results

5.66 The External Assessment Group carried out several deterministic sensitivity analyses for the diagnostic and management models.

Diagnostic model

5.67 For the diagnostic model, results of the deterministic sensitivity analyses indicated that the cost-effectiveness frontier presented in the base-case analysis was maintained across most scenarios. In most scenarios, most options were expected to be ruled out because of simple dominance. The results based on the point estimates of parameters were similar to the results of the probabilistic analysis, and discounting did not have a substantial effect on the cost effectiveness of the non-dominated diagnostic options.

5.68 The sensitivity analyses also showed that the costs of the various FeNO devices influenced which options were dominated, but had only a negligible impact on the cost-effectiveness results for non-dominated options. Longer misdiagnosis correction times substantially improved the cost effectiveness of airway hyper-responsiveness (methacholine challenge test) compared with FeNO testing plus bronchodilator reversibility, with the lowest ICER being £126,982 per QALY gained when time to correct diagnosis was extended 10-fold.

5.69 In terms of diagnostic accuracy, the results of the sensitivity analyses showed that the use of other sources for the operating characteristics of FeNO testing and standard tests did not impact on the cost effectiveness of non-dominated options. Also, the use of a rule‑out decision approach may have improved the comparative effectiveness and cost effectiveness of FeNO testing alone.

Management model

5.70 For the management model in children, the results of the sensitivity analyses indicated that NIOX MINO and NIOX VERO were expected to be consistently dominated by NObreath because of their higher marginal per-test cost. In addition, while the marginal per-test cost influenced which device would be preferred, it did not have a substantial impact on the overall cost effectiveness of the British guideline plus FeNO measurement compared with the British guideline alone.

5.71 The results of the sensitivity analyses indicated that the length of time FeNO measurement was assumed to impact on exacerbations and inhaled corticosteroid use was a key source of uncertainty within the children's model. Shorter impact times improved the cost effectiveness of FeNO measurement. The British guideline plus FeNO measurement dominated the British guideline alone when it was assumed that the impact of FeNO‑guided management on exacerbations and inhaled corticosteroid use was reduced to 1–4 years, whereas assumptions for 5 years and 10 years produced ICERs of £7598 and £27,660 per QALY gained respectively.

5.72 When alternative sources of exacerbation rates and inhaled corticosteroid use for children were explored, the ICERs for managing children changed considerably. The sensitivity analysis used values from the Pijnenburg et al. (2005) study, rather than the Szefler et al. (2008) study used in the base case, in which exacerbation rates were 0.18 for the British guideline plus FeNO measurement and 0.39 for the British guideline alone, and relative corticosteroid dose intensity beyond the first year was 1.23 for the British guideline plus FeNO measurement and 1.22 for the British guideline alone. The analysis based on Pijnenburg et al. suggested a considerably more favourable ICER for the British guideline plus FeNO measurement compared with the British guideline alone in children (£18,963 per QALY gained). The External Assessment Group noted that the Szefler et al. study included patients with uncontrolled asthma and the study protocol did not allow therapy to be stepped down on the basis of low FeNO levels alone. This may, in part, explain why inhaled corticosteroid use was higher for the British guideline plus FeNO measurement than for the British guideline alone.

5.73 The results of the sensitivity analyses for managing children also indicated that the model was sensitive to the rate of exacerbations and the associated health loss. When exacerbation rates were doubled, the ICER for the British guideline plus FeNO measurement compared with the British guideline alone was £19,891 per QALY gained. When exacerbation rates were halved, the ICER was £95,632 per QALY gained. When the exacerbation disutility was doubled, the ICER was £31,479 per QALY gained; when it was halved, the ICER was £52,844 per QALY gained.

5.74 Results for the deterministic sensitivity analyses for the management model in adults showed that the model was highly sensitive to the exacerbation rates used. Exacerbation rates from Syk et al. (2013) increased the ICER to £184,000 per QALY gained for the British guideline plus FeNO measurement compared with the British guideline alone. When exacerbation rates from Syk et al. were used, the British guideline alone dominated the British guideline plus FeNO measurement. In addition, NIOX MINO and NIOX VERO were expected to be consistently dominated by NObreath because of their higher marginal per-test cost. However, while the marginal per-test cost influenced which device would be preferred, it did not have a substantial impact on the overall cost effectiveness of the British guideline plus FeNO measurement compared with the British guideline alone.

5.75 Another observation from the sensitivity analyses of the management model in adults was that the length of time that FeNO measurement was assumed to impact on exacerbations and the use of inhaled corticosteroids was a key driver of cost effectiveness. In the adult model, the cost effectiveness improved when the duration of impact of FeNO measurement was extended (from £885,451 per QALY gained when 1 year's duration was assumed, to £8898 per QALY gained when 40 years' duration was assumed). The opposite was true in the children's model, in which cost effectiveness worsened when the duration of effect on exacerbations and inhaled corticosteroid use was increased (see section 5.71). The External Assessment Group stated that this was driven entirely by the observed differences in relative inhaled corticosteroid use at the last observed time point in the trials.