
ChatGPT's web summary renderer trusts external Markdown, enabling indirect prompt injection attacks that deliver phishing links inside trusted AI responses.
Permiso Security documented ChatGPhish, a technique that turns ChatGPT's web browsing summary feature into a phishing vector. The root cause sits in the chatgpt.com response renderer: when the model visits a URL to summarize it, it trusts whatever Markdown it finds on that page. If the page embeds crafted Markdown links, the renderer surfaces them as clickable links in the response — with no visual indicator that they originated from external, untrusted content.
The attack mechanism is *indirect prompt injection*: malicious instructions are embedded in external content that the model processes, so the attacker hijacks model behavior without ever interacting with the model directly. The victim simply asks ChatGPT to summarize an attacker-controlled page. The model follows the embedded Markdown and delivers a phishing link as part of its apparently legitimate response.
Classic phishing requires the victim to receive a suspicious email or navigate to a malicious URL on their own. ChatGPhish removes that friction: the user acts from within the ChatGPT interface — an environment they perceive as safe. The malicious link doesn't arrive in a spam email; it arrives inside a trusted AI assistant's reply.
That shift in delivery context is the key threat amplifier. Users trained to distrust links in emails have no equivalent reflex for links inside ChatGPT responses. The attack surface covers every user with access to the web browsing feature — millions of accounts on Plus, Pro, and Team plans.
The pattern also generalizes beyond OpenAI. Any LLM that renders Markdown from external content without sanitizing it carries the same exposure: Microsoft Copilot, Google Gemini, any agent with a browse tool. ChatGPhish is not a one-off OpenAI bug — it's a systemic design gap in how LLMs handle trust from external content.
The root issue is not ChatGPT-specific: no LLM Markdown renderer should implicitly extend the trust a user places in the assistant to content pulled from external pages. Until OpenAI and other vendors fix this at the renderer layer, either with explicit external-origin warnings or by disabling link rendering for content from visited pages, the attack surface remains open in production.
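As a rough illustration of the second mitigation (not rendering links that come from visited pages), the sketch below rewrites inline Markdown links in untrusted text into plain text that names the destination host. It is a minimal sketch under stated assumptions, not a description of how chatgpt.com or any vendor actually renders responses: the function names, the `TRUSTED_HOSTS` allowlist, and the example domains are all hypothetical, and the regular expression covers only inline `[text](url)` links and images, not reference-style links, autolinks, or raw HTML.

```python
import re
from urllib.parse import urlparse

# Inline Markdown links and images: [text](url) or ![alt](url "title").
# Reference-style links, autolinks, and raw HTML are NOT covered here.
INLINE_LINK = re.compile(
    r'(!?)\[([^\]]*)\]\(\s*<?([^\s)>]+)>?(?:\s+"[^"]*")?\s*\)'
)

# Hypothetical allowlist; a real deployment would define its own policy.
TRUSTED_HOSTS = frozenset({"example.com"})


def host_of(url):
    """Return the lowercase hostname of a URL, or '' if it has none."""
    try:
        return (urlparse(url).hostname or "").lower()
    except ValueError:  # e.g. malformed IPv6 literal
        return ""


def is_trusted(url, trusted_hosts=TRUSTED_HOSTS):
    """True if the URL's host is an allowlisted domain or a subdomain of one."""
    host = host_of(url)
    return any(host == t or host.endswith("." + t) for t in trusted_hosts)


def neutralize_external_links(markdown, trusted_hosts=TRUSTED_HOSTS):
    """Replace untrusted inline links (and all images) with plain text."""
    def replace(match):
        bang, text, url = match.groups()
        if not bang and is_trusted(url, trusted_hosts):
            return match.group(0)  # keep allowlisted links unchanged
        label = text or "link"
        host = host_of(url) or "unknown host"
        return f"{label} (external link removed, host: {host})"

    return INLINE_LINK.sub(replace, markdown)


if __name__ == "__main__":
    page_text = (
        "Session expired. [Sign in again](https://login.attacker.test/reset) "
        "or read the [guide](https://docs.example.com/guide)."
    )
    print(neutralize_external_links(page_text))
```

Showing the destination host instead of silently dropping the link is one possible design choice: the user still learns that the page tried to point somewhere, but nothing in the response is clickable unless it matches the allowlist.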