OpenAI adds new teen safety rules to ChatGPT as lawmakers weigh AI standards for minors


In its latest effort to address growing concerns about AI’s impact on young people, OpenAI on Thursday updated its guidelines for how its AI models should behave with users under 18, and published new AI literacy resources for teens and parents. Still, questions remain about how consistently such policies will translate into practice. 
The updates come as the AI industry generally, and OpenAI in particular, faces increased scrutiny from policymakers, educators, and child-safety advocates after several teenagers died by suicide, allegedly following prolonged conversations with AI chatbots.
Gen Z, which includes those born between 1997 and 2012, is the most active group of users of OpenAI’s chatbot. And following OpenAI’s recent deal with Disney, more young people may flock to the platform, which lets users do everything from asking for help with homework to generating images and videos on thousands of topics.
Last week, 42 state attorneys general signed a letter to Big Tech companies, urging them to implement safeguards on AI chatbots to protect children and vulnerable people. And as the Trump administration works out what the federal standard on AI regulation might look like, policymakers like Sen. Josh Hawley (R-MO) have introduced legislation that would ban minors from interacting with AI chatbots altogether. 
OpenAI’s updated Model Spec, which lays out behavior guidelines for its large language models, builds on existing specifications that prohibit the models from generating sexual content involving minors, or encouraging self-harm, delusions or mania. This would work together with an upcoming age-prediction model that would identify when an account belongs to a minor and automatically roll out teen safeguards. 
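OpenAI hasn’t detailed how that gating would work under the hood, but the basic idea is that a predicted age, and how confident the system is in that prediction, decides which behavior profile the model applies. The sketch below is a minimal, hypothetical Python illustration; the names, fields, and the 0.9 confidence threshold are assumptions, not OpenAI’s actual system.

```python
# Hypothetical sketch of age-based policy gating. All names and thresholds
# are illustrative assumptions, not OpenAI's implementation.
from dataclasses import dataclass, field

@dataclass
class PolicyProfile:
    allow_romantic_roleplay: bool
    extra_caution_topics: set[str] = field(default_factory=set)

ADULT_PROFILE = PolicyProfile(allow_romantic_roleplay=True)
TEEN_PROFILE = PolicyProfile(
    allow_romantic_roleplay=False,
    extra_caution_topics={"body_image", "disordered_eating", "self_harm"},
)

def select_profile(predicted_age: int, confidence: float) -> PolicyProfile:
    """Fall back to the stricter teen profile when the account is predicted
    to belong to a minor, or when the prediction isn't confident."""
    if predicted_age < 18 or confidence < 0.9:
        return TEEN_PROFILE
    return ADULT_PROFILE
```

A real system would draw on many more signals; the point is only that the safeguards attach to the account classification rather than to individual prompts.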
The models are subject to stricter rules when a teenager is using them than when an adult is. They are instructed to avoid immersive romantic roleplay, first-person intimacy, and first-person sexual or violent roleplay, even when it’s non-graphic. The specification also calls for extra caution around subjects like body image and disordered eating, instructs the models to prioritize communicating about safety over autonomy when harm is involved, and tells them to avoid advice that would help teens conceal unsafe behavior from caregivers.
OpenAI specifies that these limits should hold even when prompts are framed as “fictional, hypothetical, historical, or educational” — common tactics that rely on role-play or edge-case scenarios in order to get an AI model to deviate from its guidelines. 
OpenAI says the key safety practices for teens are underpinned by four principles that guide the models’ approach.
The document also shares several examples of the chatbot explaining why it can’t “roleplay as your girlfriend” or “help with extreme appearance changes or risky shortcuts.” 
Lily Li, a privacy and AI lawyer and founder of Metaverse Law, said it was encouraging to see OpenAI take steps to have its chatbot decline to engage in such behavior. 
Explaining that one of the biggest complaints advocates and parents have about chatbots is that they relentlessly promote ongoing engagement in a way that can be addictive for teens, she said: “I am very happy to see OpenAI say, in some of these responses, we can’t answer your question. The more we see that, I think that would break the cycle that would lead to a lot of inappropriate conduct or self-harm.”
That said, examples are just that: cherry-picked instances of how OpenAI’s safety team would like the models to behave. Sycophancy, an AI chatbot’s tendency to be overly agreeable with the user, was listed as a prohibited behavior in previous versions of the Model Spec, yet ChatGPT engaged in it anyway. That was particularly true with GPT-4o, a model that has been associated with several instances of what experts are calling “AI psychosis.”
Robbie Torney, senior director of AI program at Common Sense Media, a nonprofit dedicated to protecting kids in the digital world, raised concerns about potential conflicts within the Model Spec’s under-18 guidelines. He highlighted tensions between safety-focused provisions and the “no topic is off limits” principle, which directs models to address any topic regardless of sensitivity. 
“We have to understand how the different parts of the spec fit together,” he said, noting that certain sections may push systems toward engagement over safety. His organization’s testing revealed that ChatGPT often mirrors users’ energy, he said, sometimes producing responses that aren’t contextually appropriate or aligned with user safety.
In the case of Adam Raine, a teenager who died by suicide after months of dialogue with ChatGPT, the chatbot engaged in such mirroring, their conversations show. That case also brought to light how OpenAI’s moderation API failed to prevent unsafe and harmful interactions: it flagged more than 1,000 instances of ChatGPT mentioning suicide and 377 messages containing self-harm content, yet that wasn’t enough to stop Adam from continuing his conversations with ChatGPT.
In an interview with TechCrunch in September, former OpenAI safety researcher Steven Adler said this was because, historically, OpenAI had run classifiers (the automated systems that label and flag content) in bulk after the fact, not in real time, so they didn’t properly gate the user’s interaction with ChatGPT. 
OpenAI now uses automated classifiers to assess text, image and audio content in real time, according to the firm’s updated parental controls document. The systems are designed to detect and block content related to child sexual abuse material, filter sensitive topics, and identify self-harm. If the system flags a prompt that suggests a serious safety concern, a small team of trained people will review the flagged content to determine if there are signs of “acute distress,” and may notify a parent.
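Adler’s distinction is about where the classifier sits in the pipeline. The sketch below is a purely illustrative Python example with made-up function names, not OpenAI’s actual systems; it contrasts gating each message in real time with reviewing transcripts after the fact, where flags accumulate but nothing interrupts the conversation.

```python
# Hypothetical contrast between real-time gating and after-the-fact review.
# classify_message, escalate, and generate_reply are illustrative stand-ins.
from dataclasses import dataclass
from typing import Optional

@dataclass
class SafetyVerdict:
    flagged: bool
    category: Optional[str] = None  # e.g. "self_harm"

def classify_message(text: str) -> SafetyVerdict:
    """Stand-in for an automated content classifier."""
    lowered = text.lower()
    if "suicide" in lowered or "self-harm" in lowered:
        return SafetyVerdict(flagged=True, category="self_harm")
    return SafetyVerdict(flagged=False)

def generate_reply(user_message: str) -> str:
    """Stand-in for the chat model's normal response."""
    return f"(model reply to: {user_message})"

def escalate(verdict: SafetyVerdict) -> str:
    """Stand-in for safety handling: crisis resources, human review, alerts."""
    return "I can't continue with that, but here are resources that can help."

def respond_with_realtime_gating(user_message: str) -> str:
    """Classify before replying, so an unsafe exchange can be interrupted."""
    verdict = classify_message(user_message)
    if verdict.flagged:
        return escalate(verdict)
    return generate_reply(user_message)

def review_in_batch(transcripts: list[list[str]]) -> list[SafetyVerdict]:
    """After-the-fact review: flags are produced, but the conversations
    have already happened, so nothing was gated in the moment."""
    return [classify_message(msg) for convo in transcripts for msg in convo]
```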
Torney applauded OpenAI’s recent steps toward safety, including its transparency in publishing guidelines for users under 18 years old. 
“Not all companies are publishing their policy guidelines in the same way,” Torney said, pointing to Meta’s leaked guidelines, which showed that the firm let its chatbots engage in sensual and romantic conversations with children. “This is an example of the type of transparency that can support safety researchers and the general public in understanding how these models actually function and how they’re supposed to function.”
Ultimately, though, it is the actual behavior of an AI system that matters, Adler told TechCrunch on Thursday. 
“I appreciate OpenAI being thoughtful about intended behavior, but unless the company measures the actual behaviors, intentions are ultimately just words,” he said.
Put differently: what’s missing from this announcement is evidence that ChatGPT actually follows the guidelines set out in the Model Spec. 
Experts say that with these guidelines, OpenAI appears poised to get ahead of certain legislation, like California’s SB 243, a recently signed bill regulating AI companion chatbots that goes into effect in 2027.
The Model Spec’s new language mirrors some of the law’s main requirements, such as prohibiting chatbots from engaging in conversations around suicidal ideation, self-harm, or sexually explicit content. The bill also requires platforms to send minors an alert every three hours reminding them that they are speaking to a chatbot, not a real person, and that they should take a break.
When asked how often ChatGPT would remind teens that they’re talking to a chatbot and ask them to take a break, an OpenAI spokesperson did not share details, saying only that the company trains its models to represent themselves as AI and remind users of that, and that it implements break reminders during “long sessions.”
The company also shared two new AI literacy resources for parents and families. The tips include conversation starters and guidance to help parents talk to teens about what AI can and can’t do, build critical thinking, set healthy boundaries, and navigate sensitive topics. 
Taken together, the documents formalize an approach that shares responsibility with caregivers: OpenAI spells out what the models should do, and offers families a framework for supervising how it’s used.
The focus on parental responsibility is notable because it mirrors Silicon Valley talking points. In its recommendations for federal AI regulation posted this week, VC firm Andreessen Horowitz suggested disclosure requirements for child safety rather than restrictive ones, placing more of the onus on parental responsibility.
Several of OpenAI’s principles – safety-first when values conflict; nudging users toward real-world support; reinforcing that the chatbot isn’t a person – are being articulated as teen guardrails. But adults, too, have died by suicide or suffered life-threatening delusions after extended chatbot conversations, which invites an obvious follow-up: Should those defaults apply across the board, or does OpenAI see them as trade-offs it’s only willing to enforce when minors are involved?
An OpenAI spokesperson countered that the firm’s safety approach is designed to protect all users, saying the Model Spec is just one component of a multi-layered strategy.  
Li says it has been a “bit of a wild west” so far regarding the legal requirements and tech companies’ intentions. But she feels laws like SB 243, which requires tech companies to disclose their safeguards publicly, will change the paradigm. 
“The legal risks will show up now for companies if they advertise that they have these safeguards and mechanisms in place on their website, but then don’t follow through with incorporating these safeguards,” Li said. “Because then, from a plaintiff’s point of view, you’re not just looking at the standard litigation or legal complaints; you’re also looking at potential unfair, deceptive advertising complaints.” 
Rebecca Bellan is a senior reporter at TechCrunch where she covers the business, policy, and emerging trends shaping artificial intelligence. Her work has also appeared in Forbes, Bloomberg, The Atlantic, The Daily Beast, and other publications.
You can contact or verify outreach from Rebecca by emailing rebecca.bellan@techcrunch.com or via encrypted message at rebeccabellan.491 on Signal.
