
    The content on this page is not current guidance and is only for the purposes of the consultation process.

    3 Committee discussion

    The diagnostics advisory committee looked at evidence for artificial intelligence (AI) software across 3 indications. These indications, and the technologies available for each one, are outlined below.

    It considered evidence on the following technologies for guiding thrombolysis treatment decisions for people with suspected acute stroke using a non-enhanced CT scan:

    • Accipio (MaxQ AI)

    • Aidoc (Aidoc)

    • Biomind (Biomind.ai)

    • Brainscan CT (Brainscan.ai)

    • CINA Head (Avicenna)

    • e-Stroke (Brainomix)

    • Neuro Solution (Nanox.AI)

    • qER (Qure.ai)

    • RapidAI (Ischemaview)

    • Viz (Viz.ai).

    It considered evidence on the following technologies for guiding mechanical thrombectomy decisions for people with an ischaemic stroke using CT angiography:

    • Aidoc (Aidoc)

    • CINA Head (Avicenna)

    • e-Stroke (Brainomix)

    • RapidAI (Ischemaview)

    • Viz (Viz.ai).

    It also considered evidence on the following technologies for guiding mechanical thrombectomy treatment decisions for people with ischaemic stroke using CT perfusion after a CT angiography brain scan:

    • Cercare (Perfusion) (Cercare Medical)

    • CT Perfusion 4D (GE Healthcare)

    • e-Stroke (Brainomix)

    • icobrain ct (icometrix)

    • RapidAI (Ischemaview)

    • Viz (Viz.ai).

    Evidence was considered from several sources, including a diagnostics assessment report and an overview of that report. Full details are in the project documents for this guidance.

    Quality of life is important to people who survive stroke

    3.1 The patient expert explained that stroke adversely affects quality of life for many people who survive it. In addition to physical disability, long-term effects include fatigue, cognitive impairment, difficulty with language or speech (aphasia), poor mental health and emotional lability (exaggerated emotions that can be difficult to control). Around 50% of people who survive a stroke at a working age never return to work. Stroke often also substantially affects the lives of relatives and friends. The patient expert advised that it is important to understand the effect of AI-software technologies, when used alongside clinician interpretation of CT brain images, on clinical outcomes and the related quality of life after stroke. The committee recognised that quality of life is important to people who survive stroke.

    The AI-software technologies do not automatically adapt and improve when used in the NHS

    3.2 The committee discussed the nature of the algorithms in AI-software technologies and whether the software could learn from the CT scan data in the setting it was used in. The manufacturers said that data from scans the software is used on in clinical practice is not used to further develop the algorithms in the software. Instead, the algorithms are developed using CT scans held by the company or accessed through research studies, and regulatory approval is sought before an updated static algorithm is released for use in clinical practice. The committee recognised that all the AI-software technologies in clinical settings use fixed algorithms and cannot adapt and improve in real time using data from the clinical practice setting in which they are used.

    Clinical effectiveness

    No published evidence was found for many of the technologies in the assessment

    3.3 The committee considered the available evidence for each technology and indication. It noted that the external assessment group's (EAG's) review found no published evidence for Accipio, Aidoc, Biomind, Brainscan CT, Cercare (Perfusion), CT Perfusion 4D, icobrain ct, Neuro Solution or qER for the indications in the assessment. The committee was therefore unable to consider these technologies further as part of its discussions and recommended more research on them (see sections 4.2 to 4.4).

    There is no evidence on the diagnostic accuracy of AI-software technologies when used in conjunction with clinician interpretation

    3.4 The EAG's review found 15 diagnostic accuracy studies, but these all evaluated the performance of the AI software as a standalone intervention and not alongside clinician interpretation (as it is intended to be used). Also, the risk of bias because of patient selection was high in many studies, particularly when they used a case-control design, or was unclear because of inadequate reporting. The reference standard used in the studies ranged from review by a single clinician to a panel of clinicians, but it was often unclear whether these clinicians were blinded to the output from the AI software, and it was difficult to determine if they were likely to correctly classify the target condition because their experience was not clearly reported. Therefore, because the studies were not generalisable to how the technologies would be used in practice, the conclusions that could be drawn on the accuracy of the technologies were limited. Further, the committee noted that none of the studies separately reported accuracy for people over the age of 80 with cerebrovascular disease, for whom interpretation of scans is often more challenging. The committee concluded that the accuracy of the AI-software technologies is unclear and recommended further research to estimate the diagnostic accuracy of the technologies when used alongside clinician interpretation in all 3 indications (see sections 4.2 to 4.4).

    It was difficult to draw conclusions on the comparative accuracy data that was reported

    3.5 The committee recognised that 1 study (Seker et al. 2020), relevant to guiding mechanical thrombectomy decisions for people with an ischaemic stroke using CT angiography, reported some comparative accuracy data. It reported the accuracy both of the e-CTA software (Brainomix) alone and of scan reviews done alone by clinicians of varying experience, each compared with a common reference standard: an experienced neuroradiologist who had access to both imaging and clinical data. This is important because the usefulness of AI software-assisted scan review may vary between centres with differing levels of stroke specialism, and between different types of clinicians (for example, doctors in hospital emergency departments, stroke specialists, radiologists and neuroradiologists). But the committee noted that it is difficult to draw conclusions from the study on how the software would perform when used alongside clinician review because it did not report whether the clinicians and the software missed the same or different cases.

    It is uncertain whether using AI-software technologies to help guide treatment decisions in stroke leads to faster access to treatment

    3.6 In the EAG's review, there were 7 observational studies that compared time to treatment before and after implementing AI software in clinical practice. Most of the studies suggested that time to treatment for people who had thrombectomy or thrombolysis had reduced after implementing the software. The EAG reported that there was a high risk of bias in these studies because of the limited information they included. The studies were all retrospective, study populations and stroke care settings were not clearly described, and the point in the care pathway at which the software was used, and by whom, was often unclear. Also, it was unclear if the before and after populations had similar characteristics, and whether adding the software was the only change to the care pathway. Because only patients with a positive scan result were included in the studies, it is unclear whether patients with a false negative result would experience a delay in treatment. The committee concluded that it is uncertain whether using AI software to help guide treatment decisions in stroke leads to faster access to thrombolysis or thrombectomy. It recommended further research to assess the effect of the AI-software technologies, when used alongside clinician interpretation, on time to treatment in all 3 indications (see sections 4.2 to 4.4).

    It is unclear whether using AI-software technologies to help guide treatment decisions in stroke leads to better clinical outcomes

    3.7 The committee noted that the studies that compared time to treatment before and after implementing AI software provided limited information on how it affected clinical outcomes. In particular, there was no information on clinical outcomes when AI software was used for guiding thrombolysis treatment decisions for people with suspected acute stroke using a non-enhanced CT scan. Six studies, in which software was used for guiding mechanical thrombectomy using CT angiography or CT perfusion brain scans, reported on the proportion of people who were functionally more independent (with a modified Rankin Scale [mRS] score of 2 or less), length of hospital stay, mean 90-day mRS score, and rates of complications and death during hospital stay after software implementation. The committee noted that the results from these studies were conflicting, with some reporting a positive and others a negative impact. The EAG advised that the studies were unlikely to have been set up to adequately capture any differences in clinical outcomes, so the reported data are unlikely to show the true effects of implementing the technologies. The EAG further highlighted that the evidence described outcomes only for people who had a thrombectomy. Clinical experts explained that while using AI software could help improve outcomes for people who are offered treatment if it is received sooner, it could also worsen outcomes for people who were not offered treatment, or who received incorrect treatment, if their diagnosis was missed because of the influence of the software on clinical decision making. The committee recognised that the potential benefits and risks of using AI software to help guide treatment decisions in stroke are not clear. It concluded that further research is needed to assess clinical outcomes in all 3 indications (see sections 4.2 to 4.4). It further concluded that to fully understand the benefits and risks of using AI software to help guide treatment decisions in stroke, data needs to be gathered from everyone having imaging, and not just from those who were subsequently offered treatment (see section 4.1).

    More information about the reliability of AI-software technologies to help guide treatment decisions in stroke is needed

    3.8 Only 1 published study (Kauw et al. 2020) reported on the technical failure rate of AI software. It reported that the software failed to process CT perfusion brain scan data and return results to assist the review for 20 of the 176 scans (11%) included in the analysis. Causes of failure were severe motion, streak artefact and poor arrival of contrast. The clinical experts advised that the failure rate in clinical practice may be even higher. The patient expert raised concerns that technical failures could result in delays in diagnosis and access to time-sensitive treatments. The committee concluded that the reliability of AI software to help guide treatment decisions in stroke in clinical practice is not clear. It recommended further research to measure the technical failure rates of AI-software technologies used to help guide treatment decisions in stroke in all 3 indications (see sections 4.2 to 4.4). Information about the reasons for test failures should be recorded.

    Cost effectiveness

    There was not enough clinical evidence to evaluate the cost effectiveness of AI software in 2 of the 3 assessed indications

    3.9 Evidence on using AI-software technologies for guiding thrombolysis treatment decisions for people with suspected acute stroke using a non-enhanced CT scan, and for guiding mechanical thrombectomy treatment decisions for people with ischaemic stroke using CT perfusion after a CT angiography brain scan, was very limited. In particular, there was no evidence on the diagnostic accuracy of the technologies when used alongside clinician interpretation, or on how they might perform relative to clinician interpretation alone, in either indication (see section 3.4). No clinical outcomes were reported for the use of AI software for guiding thrombolysis treatment decisions for people with suspected acute stroke (see section 3.7). So, the EAG did not build health economic models to evaluate the cost effectiveness of the AI software in these 2 indications. The committee concluded that it would be useful to understand the cost effectiveness of the AI-software technologies in these indications but accepted that there is currently not enough data available to inform modelling.

    Accuracy estimates in the model for using AI software in thrombectomy decisions may not reflect the accuracy seen in clinical practice

    3.10 The EAG explained that because there was more data on using AI-software technologies for guiding mechanical thrombectomy decisions for people with an ischaemic stroke using CT angiography than for the other 2 indications (see sections 3.4 and 3.6), it could build an exploratory economic model for this indication. Because no diagnostic accuracy data was available for using the technologies as intended (see section 3.4), the EAG elicited accuracy estimates for the model from clinical experts. These estimates were sought for a hypothetical average AI-software technology when used alongside clinician interpretation, and also for the comparator in the model, clinician interpretation alone. The committee noted that it is challenging for people to estimate something like accuracy that they cannot directly observe. The committee concluded that while expert elicitation is an appropriate method to obtain model inputs when data is scarce, it is uncertain if the accuracy estimates in the model reflect the accuracy of the AI-software technologies that would be seen in clinical practice.

    Health-related quality of life is adequately captured, given the exploratory nature of the model

    3.11 The committee considered whether the model captured the effect that having a stroke has on people's quality of life. The EAG explained that the utility values for health-related quality of life used in the base case were linked to the modified Rankin Scale (mRS) health states, from Rivero-Arias et al. (2010). This study used mRS and EQ-5D-3L information collected from people with stroke or transient ischaemic attack who took part in the Oxford Vascular Study (OXVASC) in the UK. The committee concluded that, given the exploratory nature of the model, health-related quality of life was adequately captured, but recalled how important quality of life is to people who survive stroke (see section 3.1). It considered that a better understanding of health-related quality of life after stroke, in particular aspects such as emotional lability and fatigue that are important to people who survive stroke, would be helpful. It noted that the research priorities from the Stroke Priority Setting Partnership (led by the Stroke Association with the James Lind Alliance) and the research needs from the National Stroke Programme (NHS Accelerated Access Collaborative) include understanding and managing emotional and psychological effects of stroke that may be less visible.

    Cost effectiveness of AI-software technologies cannot be determined

    3.12 The committee considered whether it was possible to determine the cost effectiveness of AI-software technologies for guiding mechanical thrombectomy decisions for people with an ischaemic stroke using CT angiography from the EAG's model. It recalled that the model was built using diagnostic accuracy estimates elicited from experts (see section 3.10). This meant that the model did not reflect any of the individual AI-software technologies but modelled a hypothetical average AI-software technology. The committee noted that, in reality, the different technologies may perform differently from this modelled average technology, but acknowledged that there was no evidence on their performance when used as intended (see section 3.4). The committee concluded that the cost effectiveness of AI-software technologies for guiding mechanical thrombectomy decisions for people with an ischaemic stroke using CT angiography cannot be determined from the EAG's model. It recalled that no models were built to assess the AI-software technologies for the 2 other indications (see section 3.9). The committee concluded it would be useful to understand cost effectiveness in all 3 indications but that there is not enough data to support this at present. It recommended further research to show clinical effectiveness in all 3 indications (see sections 4.2 to 4.4).

    There is not enough data to recommend the AI-software technologies for routine use in the NHS

    3.13 The committee considered that there is interest in the NHS in using AI-software technologies. It recognised the value of more accurate diagnosis and faster access to appropriate treatment, which can lead to better outcomes and quality of life for patients, but acknowledged that there is currently not enough evidence to support using AI-software technologies to help guide treatment decisions in stroke. So, the full benefits and risks of their use cannot be reliably quantified, and their cost effectiveness cannot be adequately assessed. The committee concluded that it was unable to recommend the routine use of the AI-software technologies to help guide treatment decisions in stroke. It recommended further research on the technologies in all 3 indications (see sections 4.1 to 4.5).