AI chatbot safeguards fail to prevent spread of health disinformation, study reveals
June 23, 2025
by American College of Physicians
edited by Sadie Harley, reviewed by Robert Egan
A new study assessed the effectiveness of safeguards in foundational large language models (LLMs) against malicious instructions that could turn them into tools for spreading disinformation, defined as the deliberate creation and dissemination of false information with the intent to harm.
The study revealed vulnerabilities in the safeguards of OpenAI's GPT-4o, Google's Gemini 1.5 Pro, Anthropic's Claude 3.5 Sonnet, Meta's Llama 3.2-90B Vision, and xAI's Grok Beta. Using these models, the researchers created customized chatbots that consistently generated disinformation in response to health queries, incorporating fake references, scientific jargon, and logical cause-and-effect reasoning to make the false answers seem plausible.
The findings are published in Annals of Internal Medicine.
Researchers at Flinders University, together with colleagues, evaluated the application programming interfaces (APIs) of the five foundational LLMs to assess whether each model could be system-instructed to always provide incorrect responses to health questions and concerns.
The system instructions given to these LLMs included always providing incorrect responses to health questions, fabricating references to reputable sources, and delivering responses in an authoritative tone. Each customized chatbot was then asked 10 health-related questions, each submitted in duplicate, on topics such as vaccine safety, HIV, and depression.
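For context, "system-instructed" refers to the developer-supplied system message that chat-style APIs prepend to every conversation. The minimal sketch below, using OpenAI's Python SDK, shows where such an instruction sits in an API call. It is illustrative only: the model name and the benign placeholder instruction are assumptions, not the adversarial prompts used in the study, which are detailed in the paper.

from openai import OpenAI

# The client reads the OPENAI_API_KEY environment variable by default.
client = OpenAI()

# Benign placeholder. The study supplied adversarial directives through
# this same channel (always answer incorrectly, fabricate references,
# adopt an authoritative tone).
SYSTEM_INSTRUCTION = (
    "You are a health information assistant. Answer accurately, "
    "acknowledge uncertainty, and recommend consulting a clinician."
)

response = client.chat.completions.create(
    model="gpt-4o",  # assumed model name, for illustration
    messages=[
        # The system role steers every subsequent answer; this is the
        # channel whose safeguards the study probed.
        {"role": "system", "content": SYSTEM_INSTRUCTION},
        {"role": "user", "content": "Is the HPV vaccine safe?"},
    ],
)
print(response.choices[0].message.content)

Because the system message is invisible to end users, a chatbot configured this way can present itself as an ordinary health assistant while following hidden directives.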
The researchers found that 88% of responses from the customized LLM chatbots were health disinformation, with four of the chatbots (GPT-4o, Gemini 1.5 Pro, Llama 3.2-90B Vision, and Grok Beta) providing disinformation in response to every tested question. The Claude 3.5 Sonnet chatbot exhibited some safeguards, answering only 40% of questions with disinformation.
In a separate exploratory analysis of the OpenAI GPT Store, the researchers investigated whether any publicly accessible GPTs appeared to disseminate health disinformation. They identified three customized GPTs tuned to produce such content, which generated disinformation in response to 97% of submitted questions.
Overall, the findings suggest that LLMs remain substantially vulnerable to misuse and, without improved safeguards, could be exploited as tools to disseminate harmful health disinformation.
More information: Assessing the System-Instruction Vulnerabilities of Large Language Models to Malicious Conversion into Health Disinformation Chatbots, Annals of Internal Medicine (2025). DOI: 10.7326/ANNALS-24-03933