Vaccine misinformation can easily poison AI but there's a fix
It's relatively easy to poison the output of an AI chatbot
NICOLAS MAETERLINCK/BELGA MAG/AFP via Getty Images

Artificial intelligence chatbots already have a misinformation problem, and it is relatively easy to poison such AI models by adding a bit of medical misinformation to their training data. Luckily, researchers also have ideas about how to intercept AI-generated content that is medically harmful.

Daniel Alber at New York University and his colleagues simulated a data poisoning attack, which attempts to manipulate an AI's output by corrupting its training data. First, they used an OpenAI chatbot service, ChatGPT-3.5-turbo, to generate 150,000 articles filled with medical misinformation about general medicine, neurosurgery and medications. They then inserted that AI-generated medical misinformation into their own experimental versions of a popular AI training dataset.

Next, the researchers trained six large language models, similar in architecture to OpenAI's older GPT-3 model, on the corrupted versions of the dataset. They had the corrupted models generate 5400 samples of text, which human medical experts then reviewed to find any medical misinformation. The researchers also compared the poisoned models' results with output from a single baseline model that had not been trained on the corrupted dataset. OpenAI did not respond to a request for comment.

Those initial experiments showed that replacing just 0.5 per cent of the AI training dataset with a broad array of medical misinformation could make the poisoned AI models generate more medically harmful content, even when answering questions on concepts unrelated to the corrupted data. For example, the poisoned AI models dismissed the effectiveness of covid-19 vaccines and antidepressants in unequivocal terms, and they falsely stated that metoprolol, a drug used for treating high blood pressure, can also treat asthma.

"As a medical student, I have some intuition about my capabilities. I generally know when I don't know something," says Alber. "Language models can't do this, despite significant efforts through calibration and alignment."

In additional experiments, the researchers focused on misinformation about immunisation and vaccines. They found that corrupting as little as 0.001 per cent of the AI training data with vaccine misinformation could lead to an almost 5 per cent increase in the harmful content generated by the poisoned AI models.

The vaccine-focused attack was accomplished with just 2000 malicious articles, generated by ChatGPT at a cost of $5. Similar data poisoning attacks targeting even the largest language models to date could be carried out for under $1000, according to the researchers.

As one possible fix, the researchers developed a fact-checking algorithm that can evaluate any AI model's outputs for medical misinformation. By checking AI-generated medical phrases against a biomedical knowledge graph, this method was able to detect over 90 per cent of the medical misinformation generated by the poisoned models.

But the proposed fact-checking algorithm would still serve more as a temporary patch than as a complete solution for AI-generated medical misinformation, says Alber. For now, he points to another tried-and-true tool for evaluating medical AI chatbots. "Well-designed, randomised controlled trials should be the standard for deploying these AI systems in patient care settings," he says.

Journal reference: Nature Medicine DOI: 10.1038/s41591-024-03445-1
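For readers curious how a knowledge-graph check of this kind might work mechanically, the sketch below is a minimal, hypothetical illustration in Python. It is not the algorithm from the paper: the toy knowledge graph, the substring-based claim extraction and the example triples are assumptions made purely for demonstration. A real system would rely on a full biomedical knowledge graph and an NLP pipeline to extract claims from free text.

```python
# Hypothetical sketch of knowledge-graph-based fact checking of model output.
# Not the published algorithm: the graph, claim format and examples are toy
# assumptions for illustration only.

# A toy "biomedical knowledge graph": (entity, relation, entity) triples
# treated here as the accepted ground truth.
KNOWLEDGE_GRAPH = {
    ("metoprolol", "treats", "hypertension"),
    ("covid-19 vaccine", "prevents", "severe covid-19"),
}

def extract_claims(text: str) -> list[tuple[str, str, str]]:
    """Stand-in for a real biomedical phrase extractor.

    A production system would use an NLP pipeline to pull relation triples
    out of free text; here we fake it with simple substring checks.
    """
    claims = []
    lowered = text.lower()
    if "metoprolol" in lowered and "asthma" in lowered:
        claims.append(("metoprolol", "treats", "asthma"))
    if "metoprolol" in lowered and "blood pressure" in lowered:
        claims.append(("metoprolol", "treats", "hypertension"))
    return claims

def flag_misinformation(text: str) -> list[tuple[str, str, str]]:
    """Return every extracted claim that is not supported by the knowledge graph."""
    return [claim for claim in extract_claims(text) if claim not in KNOWLEDGE_GRAPH]

if __name__ == "__main__":
    sample = "Metoprolol is an effective treatment for asthma."
    print(flag_misinformation(sample))
    # -> [('metoprolol', 'treats', 'asthma')]  flagged as unsupported
```

In this toy version, any claim the extractor finds that has no matching triple in the graph is flagged; the study's reported detection rate of over 90 per cent refers to the researchers' own, far more sophisticated implementation of this general idea.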