4 Evidence and interpretation
The Appraisal Committee considered evidence from a number of sources.
NICE commissioned 2 Assessment Reports: 1 was undertaken by the Wessex Institute for Health Research and Development and the other by the University of Sheffield School of Health and Related Research (ScHARR). The Wessex Assessment Report focused on evidence from double-blind, randomised, placebo-controlled trials to evaluate the efficacy of GH treatment in terms of QoL benefits, whereas the ScHARR Assessment Report included the additional evidence that was available from observational studies and some new data from 2 unpublished randomised controlled trials (RCTs). The Wessex Assessment Report also included a cost analysis of the GH treatment, and the ScHARR Assessment Report provided a detailed critique of the economic models submitted by the manufacturers. During the course of the appraisal some of the manufacturers submitted additional data from newly reported, unpublished trials and results from updated economic analyses.
4.1 Clinical effectiveness
Quality-of-life evidence from randomised controlled trials
4.1.1
The Assessment Reports identified 17 published RCTs evaluating the effects of GH on QoL in around 900 adult patients with GH deficiency. Twenty-three different QoL assessment scales were used, within a variety of trial designs. The duration of the studies was typically 6 months and the number of participants ranged from 6 to 173. Most studies included both adult- and childhood-onset GH deficiency.
4.1.2
Ten studies evaluated health-related QoL using the Nottingham health profile (NHP), but not all reported the results. Additional unpublished data on QoL for 1 of the studies were made available to ScHARR. These data were supplied in confidence and have not been included in the pooled results presented below. However, including these data had only a small impact on the results of the meta-analyses and did not affect the conclusions of the ScHARR Assessment Report.
4.1.3
The analysis of the individual dimensions of the NHP found some statistically significant changes in the GH-treated group compared with the control group.
4.1.4
In 1 of the 4 published studies (the largest) that reported the social isolation dimension, the score was significantly improved in the GH-treated group compared with the placebo group. For this dimension, pooled analysis of all 4 studies found a small, statistically significant difference in favour of treatment (-0.3 points, 95% confidence interval, -0.4 to -0.1). The largest of the 4 studies that reported the emotional reactions dimension found a small but statistically significant difference in favour of treatment, but the difference was not statistically significant in the pooled analysis.
4.1.5
Five studies reported the energy dimension. One of the smaller studies found a significant difference in favour of GH treatment, but the pooled analysis of all 5 did not. For the sleep and physical mobility dimensions, none of the 4 individual studies reporting these dimensions found a treatment effect of GH, and nor did the pooled analysis. For the pain dimension, 1 study found a significant difference in favour of placebo, but there was no significant difference in the pooled analysis of 4 studies.
4.1.6
The NHP is not designed to produce an overall total score. However, 2 studies reported mean total scores. Both found improvements in favour of treatment, but these were not statistically significant in either of the individual studies or in the pooled analysis.
4.1.7
Two RCTs used the QoL-assessment of growth hormone deficiency in adults (QoL-AGHDA) questionnaire – a self-completed questionnaire comprising 25 questions specifically designed to assess the consequences of GH deficiency and its treatment. A high QoL-AGHDA score indicates greater impairment of QoL. One study was conducted across 3 centres in Spain and included 69 patients. The other was conducted in the Netherlands and recruited 30 patients. Minimal data from these studies have been published in abstract form, but further results were made available in confidence to the ScHARR review group for evaluation.
4.1.8
Data pooled from 2 trials reporting the Hamilton Depression Scale favoured GH treatment, but the result was not statistically significant. GH use was associated with an improvement of 2.4 points (mean difference -2.4; 95% confidence interval, -4.9 to 0.1).
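The significance statements in sections 4.1.4 and 4.1.8 follow from whether each 95% confidence interval includes zero. A minimal sketch of that check, using only the two pooled intervals reported in the text; the function name is illustrative and is not taken from the Assessment Reports.

```python
def excludes_zero(ci_low: float, ci_high: float) -> bool:
    """Return True when a confidence interval for a difference excludes zero."""
    return ci_high < 0 or ci_low > 0


# Pooled 95% confidence intervals as reported in the text.
social_isolation_ci = (-0.4, -0.1)  # NHP social isolation, section 4.1.4
hamilton_ci = (-4.9, 0.1)           # Hamilton Depression Scale, section 4.1.8

social_isolation_significant = excludes_zero(*social_isolation_ci)
hamilton_significant = excludes_zero(*hamilton_ci)

print("social isolation significant:", social_isolation_significant)
print("Hamilton significant:", hamilton_significant)
```

The social isolation interval lies wholly below zero, whereas the Hamilton interval crosses it, which is why only the former is described as statistically significant.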
4.1.9
Meta-analysis of 2 trials reporting psychological wellbeing (using the Psychological General Well-being Schedule) favoured the GH-treated group, but the result was not statistically significant.
4.1.10
In summary, based on the evidence from RCTs, in terms of QoL the effectiveness of GH treatment in adults with GH deficiency remains unproven. Many of the available studies were of poor quality. Also, because the patients involved had comparatively normal QoL values at baseline there was little scope for improvement. Furthermore, most of the RCTs used a dosage regimen determined by the patient's weight rather than one based on a titration technique, which is now common clinical practice. This raises difficulties with using this evidence to estimate the effectiveness of currently used GH regimens.
Quality-of-life evidence from observational trials
4.1.11
A 10-year study provided the longest period of observational follow-up of replacement therapy in GH deficiency. This study included patients who had previously participated in an RCT. Of the 24 patients in the original study, 10 patients who had received GH continuously for 10 years were compared with 11 who had not. For the group receiving GH, QoL – as measured by the NHP – was improved over baseline in the domains of energy level and emotional reactions. Overall score was also improved. There was no change in the untreated group. However, the 2 groups may not be comparable because there are several reasons why patients may not continue treatment. Two shorter observational studies (12 months) reported improvements in overall NHP scores after GH treatment.
4.1.12
Eight observational studies of GH therapy in GH deficiency reported QoL-AGHDA scores. Three of these reported results from the largest observational data set of GH-deficient patients, the KIMS database. KIMS is the Pharmacia international metabolic database and pharmacoepidemiological survey of adult GH-deficient patients receiving GH therapy. The three KIMS studies account for most of the published observational data on QoL. They each included between 300 and 665 participants. However, it is likely that data from many of the same patients were reported in all three publications. The extent to which this may have occurred was not clear. The number of participants lost to follow-up was also unclear. In these studies, the reported mean reduction in QoL-AGHDA score after GH treatment ranged from 2.8 to 4.8. The remaining five studies that used the QoL-AGHDA included between 10 and 65 patients, and reported reductions in mean QoL-AGHDA scores ranging from 3 to 7.2.
4.1.13
A formal meta-analysis of the observational data was not performed. However, a crude estimate of average change in QoL-AGHDA was made. This suggested that, across the studies (weighted by number of patients), the average improvement from baseline in QoL-AGHDA after GH treatment was 3.7 points.
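The exact calculation is not given, but a patient-weighted average of this kind is conventionally written as follows, where n_i is the number of patients in study i and Δ_i is that study's mean change from baseline in QoL-AGHDA score (both symbols are notation introduced here for illustration):

```latex
\bar{\Delta} = \frac{\sum_{i} n_i \, \Delta_i}{\sum_{i} n_i}
```

Under this weighting, the larger studies contribute most to the estimate.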
4.1.14
In addition, limited data on specific subgroups (defined according to age and baseline QoL-AGHDA scores) were available from the KIMS database. These data suggested that, in patients less than 65 years of age with a baseline QoL-AGHDA score of 0-5, the mean improvement from baseline was 1.80 points at 1 year. The corresponding values for those with baseline QoL-AGHDA scores of 6-10, 11-15, and 15 and over were 5.55, 7.75, and 11.98 points respectively.
4.1.15
In clinical studies, improvements in QoL were observed within 3 to 6 months of initiating treatment. Limited data from observational studies suggested that the improvement was sustained in the long term (9 to 10 years) in patients who continued therapy.
4.2 Cost effectiveness
4.2.1
One economic evaluation and 3 cost studies were identified. The only economic evaluation was reported in an outdated Wessex Development and Evaluation Committee (DEC) report (No. 47, 1995), which had subsequently been replaced by another Wessex DEC report (No. 75, 1997). The latter did not present an economic analysis. The utility element of the economic evaluation presented in the earlier DEC report was a set of scenarios not based on primary or secondary data sources and so could not be considered reliable or valid.
4.2.2
The 3 cost studies identified were UK-based. One reported costs of diagnosis, GH treatment, and monitoring. The others reported drug costs. All studies reported the cost of the drug as the main factor determining treatment cost (around 90% of total cost). One study reported that annual treatment costs per patient could vary between £3,472 and £6,943 (1997 prices and GH dose from 0.125 to 0.25 IU/kg/week), and that costs were sensitive to assumptions about continuation rate and the price of GH. The other 2 studies reported annual drug costs of GH treatment in the range £3,300 to £3,453, using more up-to-date (median) drug doses.
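The reported cost range tracks the dose range almost exactly, which is consistent with the drug being the main determinant of cost. A quick check using only the figures quoted in this section (variable names are illustrative):

```python
# Figures reported in section 4.2.2 (1997 prices).
low_dose, high_dose = 0.125, 0.25   # IU/kg/week
low_cost, high_cost = 3472, 6943    # GBP per patient per year

dose_ratio = high_dose / low_dose
cost_ratio = high_cost / low_cost

print(f"dose ratio: {dose_ratio:.2f}, cost ratio: {cost_ratio:.2f}")
```

Doubling the dose roughly doubles the annual treatment cost.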
4.2.3
A cost analysis was presented in the Wessex Assessment Report and aimed to analyse the average annual and total lifetime costs of GH treatment for a patient starting treatment. There was no attempt to estimate the cost effectiveness (or the cost utility) of GH treatment. The Assessment Group considered that it was not possible to estimate utility gain – which would ideally be expressed in terms of quality-adjusted life years (QALYs) – with the evidence available from RCTs, and so the analysis was limited to costs. It was estimated that GH treatment in GH-deficient adults costs £3,424 annually at an average maintenance dose. The costs of life-long therapy were estimated to be between £42,000 (adult-onset GH deficiency) and £45,400 (childhood-onset GH deficiency) without the cost savings from hospitalisations prevented, and between £40,500 (adult-onset GH deficiency) and £43,800 (childhood-onset GH deficiency) with the savings from hospitalisations prevented. These estimates assumed that 20% of people discontinue GH treatment after 6 months.
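The saving attributed to prevented hospitalisations is not stated directly, but it can be read off as the difference between the two sets of lifetime estimates. A small sketch using only the figures above (the dictionary layout and names are illustrative):

```python
# Lifetime cost estimates from section 4.2.3 (GBP).
lifetime_costs = {
    "adult-onset": {"without_savings": 42_000, "with_savings": 40_500},
    "childhood-onset": {"without_savings": 45_400, "with_savings": 43_800},
}

# Implied saving = estimate without savings minus estimate with savings.
implied_savings = {
    onset: costs["without_savings"] - costs["with_savings"]
    for onset, costs in lifetime_costs.items()
}

for onset, saving in implied_savings.items():
    share = saving / lifetime_costs[onset]["without_savings"]
    print(f"{onset}: implied saving GBP {saving:,} ({share:.1%} of lifetime cost)")
```

On these figures the implied saving is small relative to the lifetime cost of treatment, at roughly 3.5% in both groups.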
4.2.4
Drug therapy was found to be the single most important factor in determining cost; changes in the price of GH significantly altered treatment costs, so any price reductions could result in cost savings for the NHS. It was noted that the price at local level could significantly differ from the BNF list price, but there were no reliable data to inform the analysis.
4.2.5
Three manufacturers submitted economic evaluations to NICE; all 3 estimated the cost utility of GH use in adults (that is, they expressed the benefits of treatment in terms of QALYs). One also expressed cost effectiveness in the form of cost per normalised life-year gained.
4.2.6
Two economic models (Lilly and Novo Nordisk) adopted the methods used in the Wessex DEC report to generate utility estimates. The cost utility ratios estimated by these models were between £4,500 and £32,000 per additional QALY gained. These models did not use primary data but were based on estimates of the likely utility gains, for which there is little evidence. The models should therefore be treated with caution.
4.2.7
One manufacturer's model estimated the cost effectiveness to be £15,648 per additional normalised life-year for adult-onset GH deficiency, and £16,522 per additional normalised life-year for childhood-onset GH deficiency. The data came from pre- and post-treatment scores of 124 UK patients using the questions on life satisfaction modules for hypopituitarism questionnaire (QLS-H) – a new QoL instrument for adults with GH deficiency, which covers nine domains. 'Normalisation' of QoL was defined as achieving a 'somewhat satisfied', 'satisfied' or 'very satisfied' score in all domains.
4.2.8
Another manufacturer's model estimated the cost effectiveness of the use of GH replacement therapy in adults to be between £27,500 and £37,600 per additional QALY gained. This model used some inputs (especially those related to cardiovascular and fracture risks) derived from a simulation model, which was also provided. Utility estimates were derived from QoL data collected in the KIMS database. Because the QoL-AGHDA questionnaire is not designed to produce preference-based utilities, regression analysis was used to convert the available data into utility scores. Sub-group analyses for different age and QoL groups were also presented. It should be noted that the use of regression analysis to derive the utility scores is limited by the quality of the data from which they are estimated and the degree of overlap of the descriptive systems.
4.2.9
The economic analysis presented by ScHARR demonstrated that the long-term effects on risk factors for fractures and cardiovascular events had very little impact on the cost effectiveness of GH treatment. The ScHARR report also included a series of sensitivity analyses to investigate the impact on the results of relaxing the manufacturers' assumptions, which were regarded as optimistic.
4.2.10
The ScHARR estimate of the impact of GH treatment on QoL was based on the use of observational data using the QoL-AGHDA questionnaire. This was regarded as an optimistic scenario because observational data are very prone to overestimate the treatment effect, particularly for subjective outcomes for which the placebo effect may be especially problematic. A similar mapping exercise to that used in one of the manufacturers' analyses (see section 4.2.8) was used to derive the utility scores. Additional QoL data made available to ScHARR by one of the manufacturers measured the benefits by using the QLS-H questionnaire, but there is currently no method to map these findings to utility scores.
4.2.11
The ScHARR analysis, based on an overall utility gain of 0.04 to 0.12 depending on age and baseline QoL score, estimated the cost effectiveness of GH therapy to be between £25,300 (for people aged 65 years or older with a QoL-AGHDA score 16 or more) and £124,950 (for people aged 18 to 30 years with a QoL-AGHDA score of 6 to 10). The overall cost effectiveness of GH therapy was estimated to be in the region of £45,000 per additional QALY. This figure is very sensitive to the estimate of effectiveness, and it should be regarded as the best-case scenario because it is based on observational data that are likely to overestimate the benefits of treatment.
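A cost-utility ratio is the incremental cost divided by the incremental QALYs gained. The sketch below is a deliberately crude one-year illustration and is not the ScHARR model: it assumes, purely for illustration, that the only incremental cost is the annual figure of £3,424 from section 4.2.3 and that the utility gain applies for a full year, ignoring discounting, discontinuation and any effect on mortality. The function name is hypothetical.

```python
def icer(incremental_cost: float, incremental_qalys: float) -> float:
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    if incremental_qalys <= 0:
        raise ValueError("incremental QALYs must be positive for this sketch")
    return incremental_cost / incremental_qalys


ANNUAL_COST_GBP = 3424         # annual treatment cost, section 4.2.3
UTILITY_GAINS = (0.04, 0.12)   # range of utility gain in the ScHARR analysis

# Assumption: one year of treatment yields one year at the stated utility gain.
crude_icers = [icer(ANNUAL_COST_GBP, gain) for gain in UTILITY_GAINS]

for gain, ratio in zip(UTILITY_GAINS, crude_icers):
    print(f"utility gain {gain:.2f}: about GBP {ratio:,.0f} per QALY")
```

Tripling the assumed utility gain cuts the ratio to a third, which illustrates why the estimate is so sensitive to the measure of effectiveness.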
4.3 Consideration of the evidence
4.3.1
The Committee reviewed the data available on the clinical and cost effectiveness of GH treatment in adults with GH deficiency, having considered evidence on the nature of the condition and the value placed on the benefits of GH treatment from adults with GH deficiency, those who represent them, and clinical experts. It was also mindful of the need to ensure that its advice took account of the efficient use of NHS resources.
4.3.2
The Committee considered in detail the significance of the effectiveness of GH treatment in GH-deficient adults in terms of its effects on QoL. In addition, the Committee considered the potential effect of GH deficiency on clinical parameters that might adversely affect cardiovascular risk profiles or the potential for bone fractures caused by reduced bone mineral density, both of which might adversely affect life expectancy. The possibility that GH deficiency might also contribute to a higher overall standardised mortality ratio (SMR), over and above that which can be attributed to the effects on cardiovascular risk and bone mineral density, was also taken into account.
Effects of GH replacement on quality of life
4.3.3
The Committee considered that improvement in QoL was an important, if not the only, determinant of the clinical and cost effectiveness of GH treatment. It therefore considered at length the assessment tools for QoL used in studies of GH therapy, and in particular the appropriateness and suitability of the NHP, QLS-H, EQ-5D and QoL-AGHDA scoring systems. In addition, the Committee reviewed the evidence on QoL effects from both the RCTs and the observational studies. The Committee was also aware of the high compliance rates among GH users (reported to be around 92%), as pointed out by both the patient representatives and experts.
4.3.4
It was acknowledged that there were inconsistencies between the results of RCTs, observational studies, and the accounts of many individual patients about the effect of GH therapy on QoL. The Committee took into account the deficiencies in the evidence from RCTs. In particular, the Committee considered the possibility that a sub-group of patients – those with very poor QoL – were benefiting from treatment, but that the effect in these patients was obscured by the inclusion of a large proportion of patients with relatively good pre-treatment QoL and hence little scope for improvement.
4.3.5
During the course of this appraisal, the Committee was presented with several analyses relating to improvement in QoL (in addition to the original submissions) that attempted to identify a subgroup of patients in whom GH therapy would be cost effective (that is, those who would gain an improvement in QoL much larger than the average improvements seen in RCTs and observational studies). The Committee reviewed data from an updated subgroup analysis based on a postal survey using the EQ-5D questionnaire of 197 people with GH deficiency. This reanalysis suggested that improvement in utility due to GH treatment might be up to 40% greater than that estimated by QoL-AGHDA. The Committee also reviewed additional data based on QLS-H assessments (from the Hypopituitary Control and Complication Study database). The results from this analysis also suggested that there was likely to be a subgroup of people with GH deficiency who would gain greater improvements in QoL on GH replacement. However, it was not possible to map the data from QLS-H scores into utilities, so this did not provide further direct information to inform the analysis of the cost effectiveness of this technology for selected subgroups.
4.3.6
The Committee accepted that, although there was not sufficient information available to it to enable a detailed evaluation of the quality of the methods used to derive the new EQ-5D data, a greater degree of utility change using EQ-5D than using QoL-AGHDA would be anticipated because of the well-established differences in the properties of these 2 QoL tools. The Committee considered that these additional data suggested that a minimum improvement of at least 7 points in QoL-AGHDA score from baseline would be needed to achieve an acceptable level of cost effectiveness.
Effects of GH replacement on mortality
4.3.7
The Committee considered in detail the effect of GH replacement on overall mortality from various causes in people with GH deficiency. It considered the potential deleterious effects of GH deficiency on cardiovascular risk profiles and bone mineral density, as well as data on SMRs for people with GH deficiency compared with matched populations. The Committee noted that the association between increased mortality and GH deficiency was based on uncontrolled, observational data and on the assessment of cohorts from different periods.
4.3.8
The Committee concluded that it was uncertain what impact GH treatment had on the longer-term clinical outcomes and mortality related to cardiovascular risk factors and changes in bone mineral density. However, the Committee believed that the best available evidence from observational studies of the effects of these risk factors on mortality had been included in the overall estimates of cost effectiveness that it had reviewed. The Committee considered that it was problematic to draw conclusions about the impact of isolated GH deficiency on overall SMRs (that is, mortality over and above that attributable to cardiovascular risk and bone mineral density changes), because the populations reported in different studies were heterogeneous, which made comparisons difficult. In addition, the SMR data were not adjusted for potential confounding factors, and causality could not be clearly explained.
Summary of considerations for adult-onset GH deficiency
4.3.9
The Committee was persuaded that there was a subgroup of people with GH deficiency whose QoL was significantly impaired, and for whom the benefits of GH replacement could be both clinically and cost effective. However, the effect of treatment on overall mortality was less certain and, on the basis of the present evidence, was likely to have been accounted for predominantly by taking into account effects on cardiovascular risk profiles. While accepting that other factors directly or indirectly affecting overall mortality may be present in GH-deficient people, the Committee believed that these would need to be explored in future research.
4.3.10
The Committee reviewed the analyses of the cost effectiveness of GH replacement in adult-onset GH deficiency, including the updated analysis submitted by 1 manufacturer, which assessed in detail the various factors that might influence the calculation of incremental cost-effectiveness ratios (ICERs), including QoL utility estimates based on different methodologies, the potential effects on overall mortality, and the appropriateness of modelling benefits over different time periods.
4.3.11
After reviewing the updated cost-effectiveness analyses, and the data from the KIMS database on the levels of improvement (in terms of QoL-AGHDA scores) for different patient groups, the Committee considered that the subgroup of people with GH deficiency for whom treatment may be cost effective would be those who had an improvement in QoL equivalent to an absolute change in their baseline QoL-AGHDA score of at least 7 points. The Committee considered that the ICER for this group of patients would be in the region of £25,000 to £45,000 per QALY.
4.3.12
The Committee agreed, on the basis of testimony from the experts, that the QoL-AGHDA questionnaire was the best available evaluation tool for the assessment of both baseline QoL and the effect of treatment in people with GH deficiency. NICE sought clarification on the availability of the QoL-AGHDA questionnaire for use by the clinical community from the developer, Pharmacia, who provided a written statement confirming that the questionnaire is freely accessible as a clinical tool across the UK.
4.3.13
The Committee considered at length the issue of the baseline score of QoL-AGHDA that would identify the subset of people with severe GH deficiency for whom GH treatment would be most clinically and cost effective. It took into account a variety of factors, including the information from the KIMS database and specifically the data that showed that an improvement of an average of 7 points in QoL-AGHDA was only documented in patients with a baseline QoL-AGHDA score of 11 or more. This, together with consideration of the effect of GH on QoL (see sections 4.3.3 to 4.3.6), led the Committee to conclude that a trial of GH treatment could be recommended for people with GH deficiency who have a severe perceived impairment of QoL as demonstrated by a reported score of at least 11 in QoL-AGHDA.
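This reasoning can be checked against the KIMS subgroup means reported in section 4.1.14 for people less than 65 years of age. The sketch below simply compares each reported mean with the 7-point threshold; the band labels are copied from the text and the variable names are illustrative.

```python
THRESHOLD = 7  # points of improvement in QoL-AGHDA score

# Mean improvement from baseline by baseline QoL-AGHDA band (section 4.1.14).
mean_improvement_by_band = {
    "0-5": 1.80,
    "6-10": 5.55,
    "11-15": 7.75,
    "15 and over": 11.98,
}

# Keep only the bands whose reported mean reaches the threshold.
bands_meeting_threshold = [
    band for band, mean in mean_improvement_by_band.items() if mean >= THRESHOLD
]

print(bands_meeting_threshold)
```

Only the two bands with baseline scores of 11 or more reach a mean improvement of 7 points.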
4.3.14
The Committee was persuaded by the evidence from expert endocrinologists that reassessment of the need for GH replacement should take place after a trial treatment period of 9 months (3 months for dose titration and 6 months for assessment of response). For GH treatment to continue after this trial period, it should be necessary to demonstrate a sustained improvement in QoL.
4.3.15
In considering the minimum requirement for the degree of QoL improvement at the end of the trial period, the Committee took into account the data from the KIMS population, the cost-effectiveness considerations (see section 4.3.11), and the views from the patient/carer organisations and the clinical experts. The Committee concluded that, on the balance of probabilities, an improvement during the trial with GH of 7 points or more in QoL-AGHDA score compared with the baseline measurement would be needed to justify the clinical and cost effectiveness of continuing GH treatment beyond the trial period.
4.3.16
The Committee was aware that in the KIMS population the QoL improvement score of 7 in patients with a baseline QoL-AGHDA score of 11 or more was a mean value, which implies that there will be some people in this group who did not improve by 7 points and others who improved by more than 7 points. However, the Committee considered – on the basis of all the evidence it had reviewed, the uncertainties surrounding the precise definition of the subgroup that would most benefit from GH treatment, and the extent of any such benefit – that cost effectiveness should be evident for individual patients. Therefore, in patients who demonstrate an improvement score lower than 7 points, the Committee concluded that cost effectiveness was not established, and the continued use of GH in these patients after the initial assessment period could not be justified.
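The QoL thresholds discussed in sections 4.3.13 to 4.3.16 can be summarised as a simple rule applied at baseline and again at the end of the 9-month trial period. This is a minimal sketch of the QoL thresholds only; it does not encode the other criteria set out in section 1 of the guidance, and the function names are hypothetical. Because a higher QoL-AGHDA score indicates greater impairment, improvement is calculated as baseline minus follow-up.

```python
BASELINE_THRESHOLD = 11     # minimum baseline QoL-AGHDA score for a trial of GH
IMPROVEMENT_THRESHOLD = 7   # minimum improvement needed to continue treatment


def qualifies_for_trial(baseline_score: int) -> bool:
    """QoL criterion for starting a trial of GH treatment."""
    return baseline_score >= BASELINE_THRESHOLD


def should_continue(baseline_score: int, score_after_trial: int) -> bool:
    """QoL criterion for continuing GH treatment after the trial period."""
    improvement = baseline_score - score_after_trial  # lower score is better
    return qualifies_for_trial(baseline_score) and improvement >= IMPROVEMENT_THRESHOLD


# Hypothetical patients, for illustration only.
print(should_continue(15, 7))   # improved by 8 points
print(should_continue(12, 8))   # improved by 4 points
```

The rule is applied to each patient individually rather than to a group mean, in line with the Committee's view that cost effectiveness should be evident for individual patients.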
Transitional period
4.3.17
The Committee considered the issues related to the treatment arrangements for those with childhood-onset GH deficiency from all causes, and the value of GH treatment after the completion of linear growth. It was agreed that people with childhood-onset GH deficiency should be re-tested after the attainment of final height to assess whether further GH replacement is necessary.
4.3.18
The Committee was persuaded by evidence from experts that, for people with childhood-onset GH deficiency who had completed linear growth but still remained severely deficient in GH according to biochemical tests (defined as a peak GH response of less than 9 mU/litre (3 ng/ml) during an insulin tolerance test or a cross-validated GH threshold in an equivalent test), treatment with GH should be continued until adult bone mass is achieved. The Committee accepted that there are likely to be significant disadvantages in later life for those who do not achieve peak adult bone mass, although this conclusion was not fully evidence-based. The Committee additionally accepted, on the basis of expert testimony, that the age at which peak adult bone mass is achieved can vary between 25 and 30 years depending on a number of factors, including the age of puberty.
4.3.19
The Committee concluded, therefore, that there will be a proportion of people with childhood-onset GH deficiency for whom continuation of treatment until peak adult bone mass is achieved is desirable. Thereafter, GH treatment should be discontinued and only recommenced on the basis of the criteria laid down for adult-onset GH deficiency (see section 1.1).
4.3.20
The Committee was aware of clinical differences between children with idiopathic isolated GH deficiency (IIGHD) and those with multiple pituitary hormone deficiencies, including GH (MPHD). It was, however, not persuaded that there was sufficient evidence that they should be treated differently during the transition period. It concluded, therefore, that during the transition phase all childhood-onset GH deficiency should be managed as indicated in section 1.5 of this guidance. The possibility that children with IIGHD or MPHD should be treated differentially within these criteria could be the subject of further research.
4.3.21
The Committee considered the situation of people who develop GH deficiency in early adulthood after linear growth is completed, but before the age of 25 years. These people may require additional GH treatment in order to achieve full adult levels of bone mineral density. The Committee concluded that people in this period of 'transition' should be treated appropriately with GH, and then the criteria in section 1.1 should apply for consideration of further GH therapy.