A Booby-Trapped Web Page Can Turn Your ChatGPT Summary Into a Phishing Page



June 7, 2026

The phishing link is now inside ChatGPT's own answer

On May 29, 2026, researchers at Permiso Security disclosed an attack they call ChatGPhish. The premise is unsettlingly plain: hide a few lines of instructions on any public web page, wait for someone to ask ChatGPT to summarize it, and the assistant's reply comes back carrying a live phishing link, a fake "account security" alert, or a scannable QR code. All of it renders inside ChatGPT's own window, styled exactly like a genuine answer.

There is no malware here, no stolen password, no breached server. The exploit is the trust users place in the assistant's reply. ChatGPhish takes that trust and quietly hands the pen to an attacker. Understanding how it does so is the difference between treating an AI summary as a verdict and treating it as just another piece of untrusted web content.

What Permiso found

Permiso submitted the bug to OpenAI through the Bugcrowd platform on April 29, 2026, under the label "Untrusted Markdown Rendering Leads to XSS, Phishing, and Data Exfiltration." OpenAI first responded that it could not reproduce the issue. A revised report on May 1 with a fuller proof-of-concept was then classified as a duplicate of an already-known issue. After further correspondence on May 7, Permiso published its findings on May 29, 2026, noting it had received no confirmation that a fix was in place. The Register, The Hacker News and Cyber Security News all covered the disclosure the same day.

The mechanism is a form of indirect prompt injection, meaning the attacker feeds the AI hidden instructions disguised as ordinary content, so they ride into the model's output without the user ever seeing them. Researchers also call this a cross-prompt injection attack, or XPIA. The same trick was demonstrated last year against Microsoft Copilot using booby-trapped emails. ChatGPhish swaps the email for the browser. When ChatGPT summarizes a page, it renders that summary in Markdown, the lightweight formatting language that turns text into clickable links, images and headings. Crucially, chatgpt.com's renderer trusts the links and image URLs that come back from third-party page content, so an instruction smuggled onto a GitHub README, a documentation portal, a blog post or a SaaS dashboard can plant attacker-controlled elements directly into the reply, with no label marking them as foreign.

Injected Markdown links appear as normal clickable elements with no origin labelling, so a reader cannot tell an attacker's URL from one ChatGPT wrote itself.
Auto-rendered QR codes pulled from attacker storage sidestep every desktop defence (hover previews, browser blocklists, password-manager domain checks), because the destination only appears after the victim scans it on a second device.
Remote images loaded through URL shorteners fire on every render, silently leaking the victim's IP address, browser type and timing back to attacker infrastructure as a passive tracking beacon.
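
To make the missing "origin labelling" concrete, here is a minimal sketch of what a defensive post-processing step could look like on the rendering side. It is not how chatgpt.com works and not a fix proposed by Permiso; the allowlist, the function names and the label wording are all assumptions made for illustration. It handles only inline Markdown links and images, so reference-style links, autolinks and raw HTML would need separate treatment.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist: hosts the renderer is willing to link to or load images from.
TRUSTED_HOSTS = {"chatgpt.com", "openai.com"}

# Inline Markdown links and images only: [text](url) or ![alt](url "title").
MD_LINK = re.compile(r"(!?)\[([^\]]*)\]\(\s*([^)\s]+)[^)]*\)")


def host_of(url: str) -> str:
    """Return the lowercase hostname of a URL, or '' if it cannot be parsed."""
    try:
        return (urlparse(url).hostname or "").lower()
    except ValueError:
        return ""


def is_trusted(url: str) -> bool:
    """True when the URL's host is an allowlisted host or one of its subdomains."""
    host = host_of(url)
    return any(host == t or host.endswith("." + t) for t in TRUSTED_HOSTS)


def neutralize(markdown: str) -> str:
    """Drop untrusted images and turn untrusted links into labelled plain text."""

    def repl(match: re.Match) -> str:
        bang, text, url = match.group(1), match.group(2), match.group(3)
        if is_trusted(url):
            return match.group(0)  # leave allowlisted links and images untouched
        host = host_of(url) or "unknown host"
        if bang:
            # Never fetch the image: this blocks both QR codes and tracking beacons.
            return f"[image removed: external source {host}]"
        # Keep the visible text but show where the link would have gone.
        return f"{text} [external link: {host}]"

    return MD_LINK.sub(repl, markdown)


if __name__ == "__main__":
    summary = (
        "See [Verify your account](https://evil.example/login) "
        "and ![QR](https://bit.ly/abc)."
    )
    print(neutralize(summary))
```

The design choice is to fail closed: anything not explicitly allowlisted loses its clickability or is never fetched, and the reader sees the real host in plain text instead of attacker-chosen link text.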

Why a summary is now an attack surface

The concrete consequence is that the safest-feeling part of the interaction, the assistant's tidy summary, becomes the delivery mechanism. A spoofed "your account needs verification" notice carries the visual authority of ChatGPT itself, and the QR-code variant is built to jump from a hardened laptop to a far less protected phone. For your organisation, the shift is specific: the moment anyone on your team uses "summarize this page" on a public wiki, a partner's README or a customer-supplied link, the output can no longer be treated as vetted. Your help desk and finance staff have spent years learning to hover over email links before clicking, and that instinct does not yet exist for links that appear inside an AI answer. The systemic problem is the one the OWASP project placed at the very top of its 2025 list of large-language-model risks: a model cannot reliably separate legitimate instructions from attacker text buried in the data it was asked to read. The browser's usual safety boundaries do not help, because the assistant acts with the user's own authenticated session. Every browser-integrated AI summariser, not just ChatGPT, inherits the same structural flaw until origin labelling is enforced.

What to do about it

Treat any link, alert or QR code inside an AI summary as untrusted. Verify the real destination before acting on it. The summary carries no origin attribution, so a clickable element there deserves exactly the scrutiny you would give an email from a stranger.
Don't summarise untrusted or user-generated pages. Reddit threads, public GitHub READMEs, comment sections and unknown blogs are the natural carriers for an injected payload, because the page itself is the attacker's input. Reserve AI summarisation for content you already trust.
Restrict what your AI browser tools can do without a human. Require explicit approval before any link interaction inside a summarised response, and monitor AI-browser logs for unexpected outbound image fetches to shortened or unfamiliar domains, because that traffic is the tracking beacon firing.
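
The log monitoring suggested in the last item can be sketched in a few lines. Everything about the log format here is an assumption: the example supposes one JSON object per line with hypothetical `resource_type` and `url` fields, and the shortener and known-host sets are illustrative placeholders you would replace with your own. Real AI-browser or proxy logs will look different.

```python
import json
from urllib.parse import urlparse

# Illustrative, incomplete list of URL-shortener hosts.
SHORTENERS = {"bit.ly", "tinyurl.com", "t.co", "is.gd"}
# Hypothetical set of hosts you expect the assistant to load images from.
KNOWN_HOSTS = {"chatgpt.com", "openai.com"}


def suspicious_image_fetches(log_lines):
    """Return (host, reason) pairs for image fetches to shortened or unfamiliar hosts."""
    hits = []
    for line in log_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(event, dict) or event.get("resource_type") != "image":
            continue
        try:
            host = (urlparse(str(event.get("url", ""))).hostname or "").lower()
        except ValueError:
            host = ""
        if host in SHORTENERS:
            hits.append((host, "shortener"))
        elif not any(host == k or host.endswith("." + k) for k in KNOWN_HOSTS):
            hits.append((host, "unfamiliar"))
    return hits


if __name__ == "__main__":
    sample = [
        '{"resource_type": "image", "url": "https://bit.ly/abc"}',
        '{"resource_type": "image", "url": "https://cdn.openai.com/logo.png"}',
        '{"resource_type": "image", "url": "https://evil.example/p.png"}',
    ]
    for host, reason in suspicious_image_fetches(sample):
        print(f"{reason}: {host}")
```

A filter like this will produce false positives on legitimate third-party images, so it is better suited to raising a review flag than to blocking traffic outright.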

Bottom line

ChatGPhish is not a breach. It is a warning about where trust now lives. As people increasingly let an assistant read the web for them, the assistant's reply has quietly become a publishing surface that an outsider can write to. Until these tools clearly mark which parts of an answer came from the open web, the only safe assumption is the uncomfortable one: a link inside an AI summary is no more trustworthy than the random page it summarised. Raise it in your next security review before someone scans the wrong QR code.
