It has dominated our attention for the last three months like no other technological invention in close to half a century. ChatGPT and its close generative AI cousins (like Bard and Med-PaLM) are now being deployed and tested in healthcare settings.
At its simplest, generative AI is a tool or algorithm that can create astonishing new content (think poetry, music or even images) from the data it has been trained on. The healthcare use cases are endlessly tantalising. Sorting through unstructured data like insurance claim forms and patient visit notes, accelerating drug discovery or aiding in disease diagnosis at the point of care are just a few examples.
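To make the unstructured-data use case a little more concrete, here is a minimal sketch, assuming the OpenAI Python SDK and an API key in the environment; the model name and the fields requested are illustrative, not a validated clinical pipeline.

```python
# Minimal sketch: asking a general-purpose LLM to pull structured fields
# out of a free-text visit note. Illustrative only; assumes
# `pip install openai` and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

visit_note = (
    "58 y/o male, presents with intermittent chest pain on exertion, "
    "history of hypertension, currently on lisinopril 10 mg daily."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {
            "role": "system",
            "content": (
                "Extract age, chief complaint, relevant history and current "
                "medications from the note. Reply as JSON only."
            ),
        },
        {"role": "user", "content": visit_note},
    ],
)

print(response.choices[0].message.content)  # JSON-like summary of the note
```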
According to Nvidia, one of the world’s largest technology companies, ‘generative AI models can rapidly identify potential drug molecules – in some cases designing compounds or protein-based therapeutics from scratch’. In addition, some of the solutions being offered by companies like Nvidia are on the cusp of being used in the clinic. For example, Nvidia and Medtronic are collaborating on the integration of AI into GI Genius, a tool that helps physicians detect polyps that may lead to colorectal cancer.
But let’s be sober about our analysis. There is a downside too.
In their current state, models like ChatGPT are notoriously error prone. And when we think about the potential for misdiagnosis, or the delivery of confusing or outright wrong medical information in response to queries from users (ie, patients), this is concerning.
But there is also a larger concern: bias. How exactly are these large language models and platforms trained? What data are they using as the basis for their ‘intelligence’? There is a term for this: algorithmic bias. Researchers from the Harvard T.H. Chan School of Public Health define it as ‘the application of an algorithm that compounds existing inequities in socio-economic status, race, ethnic background, religion, gender, disability or sexual orientation and amplifies inequities in health systems’.
And it is real.
Scores of examples demonstrate AI models with inherent bias, built on data that treats heterogeneous populations as if they were homogeneous. The Framingham Heart Study cardiovascular risk score performs very well for Caucasian people, but not for African American people. In the field of genetics and genomics, it is estimated that Caucasians make up about 80% of the data collected. In both cases, the impact is clear: large groups of people have a history of being underrepresented in medical research and data gathering.
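A minimal sketch of how such bias shows up in practice, using entirely synthetic data and scikit-learn: a model trained on a cohort that is 95% one group fits that group’s risk relationship and performs measurably worse for the under-represented group.

```python
# Illustrative only: synthetic data showing how a model fit on a skewed
# cohort can perform worse for the under-represented subgroup.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def make_cohort(n, weights):
    """Simulate two risk factors and an outcome driven by `weights`."""
    X = rng.normal(size=(n, 2))
    p = 1 / (1 + np.exp(-(X @ weights)))
    y = rng.binomial(1, p)
    return X, y

# 95% of the training data comes from group A; group B's outcome depends
# on the risk factors differently (hypothetical coefficients).
Xa, ya = make_cohort(9500, np.array([1.5, 0.2]))
Xb, yb = make_cohort(500, np.array([0.2, 1.5]))

model = LogisticRegression().fit(np.vstack([Xa, Xb]), np.concatenate([ya, yb]))

# Evaluate on fresh samples from each group.
for name, w in [("group A", np.array([1.5, 0.2])), ("group B", np.array([0.2, 1.5]))]:
    X_test, y_test = make_cohort(5000, w)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: AUC = {auc:.2f}")  # group B typically scores noticeably lower
```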
Beyond algorithmic bias, there are other important issues that will limit the use of generative AI in healthcare. One of these is the standardisation and interoperability of these systems. A defined structure, shared formatting and consistent language across the many generative AI tools that are sure to proliferate would greatly increase the chances that future generative AI models in healthcare are adopted at scale. If these systems end up operating as closed, siloed loops, it will be impossible to transfer knowledge between them and achieve the outputs we want.
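As a thought experiment only, a shared output ‘envelope’ might look something like the sketch below; the field names are assumptions rather than a published standard, and any real effort would build on existing frameworks such as HL7 FHIR.

```python
# Illustrative only: a hypothetical shared "envelope" that different
# generative AI tools could emit, so downstream systems can exchange
# outputs without bespoke adapters. Field names are assumptions.
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass
class GenAIResult:
    tool_name: str      # which model or vendor produced the output
    tool_version: str   # model or pipeline version, for auditability
    patient_id: str     # pseudonymous identifier, never raw identity
    task: str           # e.g. "summarise_visit_note", "polyp_detection"
    output: str         # the generated content itself
    confidence: float   # self-reported confidence, 0.0-1.0
    generated_at: str   # ISO 8601 timestamp

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)

result = GenAIResult(
    tool_name="example-llm",
    tool_version="2024.1",
    patient_id="pseudo-12345",
    task="summarise_visit_note",
    output="Patient reports intermittent chest pain on exertion...",
    confidence=0.72,
    generated_at=datetime.now(timezone.utc).isoformat(),
)
print(result.to_json())
```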
Of course, no technology-in-healthcare article would be complete without a warning about ethics. An entire book could be written on the ethical issues with generative AI in healthcare. I don’t have that luxury, so let me get straight to the point. A few years ago, the Italian government released the anonymised health records of all 61 million Italians, including genomic data, to IBM Watson without individual patient consent. This is problematic. As I opined earlier, these generative AI systems require massive troves of data for training, and people have to consent to their data being used.
But the ethical issues aren’t just about data usage or informed consent.
Let’s say that a generative AI platform was used to screen, diagnose, risk stratify or help devise a clinical management plan for a patient with prostate or breast cancer. Do we tell the patient that a machine helped the physician make a clinical decision? When do we tell them? What if they object or want a ‘second opinion’? Who bears the liability for those decisions? For those of you instantly thinking that we already use machines like X-ray, MRI and PET scanners to help us make decisions and confirm diagnoses, this is different. Very different.
In the end, we all know that generative AI is coming for healthcare. Heck, it’s already here. But don’t forget that we must mitigate algorithmic bias, deal with the lack of standardisation and interoperability, and address ethical issues.
Don’t say I didn’t warn you.
References are available on request.