OpenAI Introduces New Safeguards in ChatGPT to Prevent AI Prompt Injection

a person working on a prompt for image generation

Image: vanenunes/Envato

Verfasst von

Feb 17, 2026

3 minute read

eWeek Inhalte und Produktempfehlungen sind redaktionell unabhängig. Wir können Geld verdienen, wenn Sie auf Links zu unseren Partnern klicken. Mehr erfahren

OpenAI is tightening the bolts on ChatGPT as attackers zero in on AI systems.

In a Feb. 13 announcement, the company introduced two new safeguards to combat prompt injection attacks, a growing threat that can trick AI into exposing sensitive data:

The first is an “Elevated Risk” label that warns users before they take potentially dangerous actions such as opening external links or connecting to internal networks.
The second is Lockdown Mode, which can limit or fully disable high-risk features, such as web browsing, to reduce the risk of data exfiltration.

As ChatGPT becomes more capable and agentic, OpenAI is signaling a shift in focus: advanced AI needs visible, built-in security controls, not just smarter outputs.

Prompt injection attacks and Lockdown Mode

Prompt injection attacks exfiltrate data by injecting AI-readable malicious commands into webpages. When an AI system visits these pages, it can unintentionally execute these commands, resulting in data leaks. For example, a page can embed an instruction that forces the AI to ignore its security guardrails and reveal internal system prompts or confidential documents.

OpenAI’s Lockdown Mode deterministically limits features in ChatGPT that are exploitable for data exfiltration. The strict measure is optional but highly recommended for security-conscious individuals.

The company’s release stated Lockdown Mode can either limit or completely disable high-risk features when guarantees of safety are unavailable.

Understanding the new ‘Elevated Risk’ label

Certain actions, like connecting ChatGPT to an internal network or opening an external link, carry inherent security risks. Rather than blocking these features outright, OpenAI allows users to proceed but displays an “Elevated Risk” label as a clear warning.

The label notifies users across ChatGPT, ChatGPT Atlas, and Codex of the potential risk before they move forward.

OpenAI confirmed that activities carrying the warning can change at any time. For example, opening a link triggers the warning only when OpenAI cannot verify the destination’s safety. When the company establishes that the activity no longer carries the risk, the warning label is removed.

What users should do now

The use of AI tools is rapidly changing how the internet works… and security isn’t isolated from this. While conventional means of staying safe on the internet remain effective, here are some key points to adhere to:

Reduce your attack surface area: ChatGPT has many add-ons. It is best to always enable the ones you need. If you do not need to connect a service like Google Drive to ChatGPT, keep that option disabled.
Manually check source sites: Hovering over a suggested site shows its URL at the bottom left of your screen when using a computer. In the mobile app, tap and hold the suggested source to display the website’s logo. If it looks odd to you, you should probably not visit it.
Add custom instructions to account memory: ChatGPT’s memories can help address some issues on your end. For instance, you can request that it never suggest links to you while using it.
High-risk users should act fast: C-level executives and security teams are particularly at risk of these data-exfiltration attacks.

Availability to users

OpenAI said the protection would roll out for users in the coming months. While this suggests a batched rollout, we are still unsure whether it will apply to all payment tiers.

However, users on business plans already have this protection implemented for them, configured to their category. Available ones include ChatGPT Enterprise, ChatGPT Edu, ChatGPT for Healthcare, and ChatGPT for Teachers.

The press release by OpenAI also stated that admins of these categories’ plans will be able to exercise granular controls over how Lockdown Mode is executed in their workspaces.

In other OpenAI news: The company’s internal, unreleased GPT model just solved five of 10 “impossible” math problems.

Joseph Chisom Ofonagoro

Joseph is a Technical Writer with about 3 years of experience in the industry, also advancing a career in cyber threat intelligence. He is passionate about the responsible use of technology, a passion that led him into cybersecurity. As an undergrad, he leads a novel community of technology enthusiasts at his school, NOUN, where he guides and shares resources for beginners in tech. His writing experience includes a diverse range of topics, from consumer tech to startups to tutorials. Additionally, he periodically shares case studies and research reports on cybersecurity on his social media pages.