Atlas agent mode fortifies OpenAI’s ChatGPT security – Digital Watch Observatory


Digital Watch Observatory
Digital Governance in 50+ issues, 500+ actors, 5+ processes
Home | Updates | Atlas agent mode fortifies OpenAI’s ChatGPT security
Security updates in ChatGPT Atlas aim to reduce risks linked to AI agents operating inside browsers.
ChatGPT Atlas has introduced an agent mode that allows an AI browser agent to view webpages and perform actions directly. The feature supports everyday workflows using the same context as a human user. Expanded capability also increases security exposure.
Prompt injection has emerged as a key threat to browser-based agents, targeting AI behaviour rather than software flaws. Malicious instructions embedded in content can redirect an agent from the user’s intended action. Successful attacks may trigger unauthorised actions.
To address the risk, OpenAI has deployed a security update to Atlas. The update includes an adversarially trained model and strengthened safeguards. It followed internal automated red teaming.
Automated red teaming uses reinforcement learning to train AI attackers that search for complex exploits. Simulations test how agents respond to injected prompts. Findings are used to harden models and system-level defences.
Prompt injection is expected to remain a long-term security challenge for AI agents. Continued investment in testing, training, and rapid mitigation aims to reduce real-world risk. The goal is to achieve reliable and secure AI assistance.
Would you like to learn more about AI, tech, and digital diplomacy? If so, ask our Diplo chatbot!
More news
The Digital Watch is an initiative of the Geneva Internet Platform, supported by the Swiss Confederation and the Republic and Canton of Geneva. The GIP is operated by DiploFoundation.
The GIP Digital Watch observatory reflects on a wide variety of themes and actors involved in global digital policy, curated by a dedicated team of experts from around the world. To submit updates about your organisation, or to join our team of curators, or to enquire about partnerships, write to us at digitalwatch@diplomacy.edu. We look forward to hearing from you.

source

Jesse
https://playwithchatgtp.com