The fall of Babylon? Lessons for AI in the NHS

Key Takeaways

Babylon Health's collaboration with the NHS ultimately failed due to a lack of proper evaluation and understanding of AI's limitations, leading to inadequate patient triage and diagnosis.

The NHS's procurement processes often lack a national strategy and coordinated standards, leaving the organisation vulnerable to the marketing of unproven technologies like Babylon's AI chatbot.

Successful AI implementation in healthcare necessitates a focused approach, involving collaboration between healthcare professionals and technology experts to address specific health issues effectively.

In theory, Babylon Health’s presence in the NHS should have been beautiful – an AI-based chatbot that could triage patients, offering a virtual, self-service front line and diagnostic service. The promise of huge savings and efficiencies was palpable.  

In practice, it has been a spectacular failure, even by the standards of public sector IT. The rollout was chaotic, to put it mildly. The regulator complained and, most of all, the system didn’t do what it was supposed to: it failed to spot illness. Yet the NHS carried on regardless. It took Babylon to end the relationship: in October 2022, it cancelled its last contract with the NHS. The fact that this cancellation came eight years early is a damning indication of how little future the company saw in the service.

I am not one to bash NHS IT. I have had the privilege of working with the NHS on some fantastic projects – but there needs to be an honest assessment of how the organisation interacts with technology.  

Most (though not all) NHS operational systems procurements are done by someone outside of IT or in a non-tech role. At departmental, hospital or even trust level, there is no coordinated national strategy or set of standards for what technology is needed, not only now but ten years hence. As a result, procurement is done ‘as needed’, with a myopic focus on immediate delivery of whatever technology is touted as a magic bullet. This makes the NHS an easy target for the marketing budgets of unproven technologies.

Babylon is yet another example of this short-term thinking and lack of strategy. It was sold as an alternative to GPs and actual human interaction. While AI can do basic triage and pick up a lot (when it is very focussed and trained for specific aspects of healthcare), it fails where all artificial intelligence in commercial applications does. The tech can’t understand, identify or act beyond a very narrow scope, and you need a plethora of AI tools and bots to try to cover all bases.

It bears repeating that AI lacks human interaction and understanding. Healthcare, nursing and all the associated areas are built on human interaction and understanding and, critically, on providing care. These are all things, of course, that a bot can’t do.

NHS providers bought into Babylon because they don’t have the capability to evaluate, and thus understand, the reality of the solution being marketed. Critically, procurement rarely knows how to make technology work within the operational context.

With such a rocky start, it is inevitable that a project will cost significantly more than expected, fail to deliver the expected outcomes, and usually cause more disruption, confusion and frustration for healthcare practitioners in NHS organisations, as well as for patients.

AI in healthcare is valuable, but it must be very focussed. The specific purpose of each AI solution must be clearly understood and integrated through a collaboration between specialist clinicians and technology experts. Babylon tried to sell the idea that technology can do anything and everything. It cannot.

AI works well for triage when clearly trained on decision trees and actions (including hand-off to a human), or when identifying specific health conditions and recommending treatment options. But it needs to be incredibly focussed and ‘reinforced’, as with cancer pattern recognition from diverse symptoms, or the project within the ophthalmology unit at Guy’s and St Thomas’ that has applied AI to help improve diabetes detection from eye scans.

These are very discrete groups and specific conditions. In developing any technological solution to address them, the first step is to pull together the healthcare professionals and technology experts to make sure that the problem is fully scoped out. That means covering off, in depth, the problem as seen by the patients and the practitioners.  

Technology doesn’t even come into these discussions until much further down the road. I am not advocating some kind of labour-intensive Luddite approach, but rather recognising that for as long as these projects have vendors brought in at the very beginning, they will forever be skewed and more likely to fail.  

Jaco Vermeulen is CTO, BML Digital