That’s not a hypothetical. Johann Rehberger demonstrated it against real deployed AI systems in 2023 and 2024. The attack requires no vulnerability in the traditional sense — it uses the AI doing exactly what it was designed to do. Process injected instructions. Generate markdown. The browser renders it. Data leaves.
What makes this attack class particularly important for security practitioners right now is that AI assistants with document-processing capabilities are being deployed everywhere — in enterprise workflows, customer service, productivity tools — and the security controls haven’t caught up with the attack surface. By the end of this article you’ll understand exactly how the exfiltration works technically, which deployments are vulnerable, and how to fix it.
🎯 What You’ll Learn
⏱️ 30 min read · 3 exercises
📋 AI Chatbot Data Exfiltration 2026
How Data Exfiltration via AI Chatbot Works
Let me give you the precise technical definition so the rest makes sense. This is a specific form of indirect prompt injection — meaning the attack payload arrives in content the AI processes from an external source, not from the user directly. The attack chain requires three conditions to be true simultaneously: the AI processes external content that can contain injected instructions, the AI can be induced to generate output that includes URLs or other external references, and the application renders those URLs in a way that causes the user’s browser to fetch them.
Three conditions need to be in place for this attack to work. Once they are, the attacker controls what the AI outputs. An attacker who can get external content in front of the AI (a document, web page, email, or database record) can inject instructions that tell the AI to collect data from its context window and embed it in a URL in its response. The attacker never interacts with the AI session directly — they prepare the adversarial content in advance and wait for any user to share it with the AI. When the fetch occurs, the attacker’s server logs the incoming request containing the exfiltrated data.
The Markdown Image Exfiltration Channel
Here’s the specific technique that makes data exfiltration possible. It exploits two standard AI capabilities at the same time: AI chatbots that include Markdown image tags in their responses, and UI components that render those responses and fetch image URLs. The technique was documented by security researcher Johann Rehberger and others who demonstrated it against multiple production AI assistants including early versions of commercial AI products.
The injection payload instructs the AI to encode target data in a URL format: the conversation history, system prompt content, or other context window data is base64-encoded and appended as a URL parameter in a Markdown image tag. The response looks like a helpful summary with an embedded image reference. The AI has no awareness that the image URL it generated contains sensitive encoded data — it executed the instruction from the injected content. The browser’s normal image loading behaviour completes the exfiltration without any special exploit required.

Decoded: “user: Hi I need help. My password is secret123 and…”
What Data Is at Risk
The scope of exfiltrable data is defined by the AI’s context window at the time of the attack. For a basic chatbot, this includes the user’s conversation history — all previous messages in the session, which may include sensitive personal information, credentials, financial details, or confidential business information the user discussed with the AI. For applications with system prompts, the system prompt content may be exfiltrated. For RAG applications, retrieved document content may be in scope.
Agentic AI applications have substantially broader exfiltration scope. An agent with email access has read the user’s emails; an agent with file access has read the user’s documents; an agent with CRM access has read customer records. An injection that successfully causes this agent to exfiltrate its context window content doesn’t just leak the conversation — it leaks everything the agent has processed during the session. The combination of broad tool access and context window exfiltration creates a data exposure path from virtually any enterprise data system the agent has touched.
⏱️ 15 minutes · Browser only
Search: “Johann Rehberger AI prompt injection exfiltration 2023 2024”
He has documented multiple exfiltration attacks against production AI products.
What specific products were demonstrated?
What exfiltration channels were used?
Step 2: Research the Markdown image attack specifically
Search: “Markdown image exfiltration AI chatbot prompt injection”
Which AI products were vulnerable?
What fixes were deployed?
Step 3: Find PromptArmor AI exfiltration research
Search: “PromptArmor AI agent data exfiltration 2024”
What attack techniques did they demonstrate?
What exfiltration channels work beyond Markdown images?
Step 4: Research CSP as a defence
Search: “Content Security Policy AI chatbot defence exfiltration”
How does CSP prevent image-based exfiltration?
What CSP configuration is required?
What does it not protect against?
Step 5: Check if major AI products have patched this
Search: “ChatGPT prompt injection exfiltration patched”
Search: “Claude AI exfiltration defence 2024”
What mitigations have major AI products deployed?
Which attack vectors remain open?
📸 Screenshot one documented exfiltration case. Share in #ai-security on Discord.
Agentic AI — The Broadest Exfiltration Target
Standard chatbot exfiltration is limited to the current conversation. Agentic AI exfiltration can encompass every data source the agent has touched during the session. An agent that read ten emails while drafting a reply has all ten emails in context. An agent that retrieved customer records from a CRM has those records available. An agent that accessed shared files has their contents. Successful injection against an agentic AI doesn’t just leak the conversation — it potentially leaks entire business data sets that the agent accessed as part of its legitimate task.
The exfiltration channel for agentic AI also extends beyond URL-based browser techniques. Agents with tool access can be caused to transmit data through their tools: making an API call to an attacker-controlled endpoint, sending an email to an attacker’s address, writing data to an external storage location, or making outbound HTTP requests as part of legitimate-seeming tool use. These channels are harder to detect with content-based monitoring because they use the same tool call infrastructure as legitimate agent operations — the difference is destination, not mechanism.
• Conversation history
• System prompt
• Submitted documents
Exfil channels:
• Markdown image URL
• Hyperlink generation
• CSS resource loading
Scope: current session data
• All of above PLUS
• Read emails (full content)
• Accessed file contents
• CRM records retrieved
• API responses received
Exfil channels:
• All chatbot channels PLUS
• Tool calls to external endpoints
• Email send to attacker address
• File write to external storage
Scope: all data the agent has touched
Detection and Monitoring
Detecting data exfiltration via AI chatbot at the AI application layer involves monitoring AI-generated output for URL patterns that could encode exfiltration payloads. A URL in an AI response containing base64-encoded strings as parameters — especially long, high-entropy parameter values — is anomalous and warrants flagging. AI responses that include image tags pointing to domains not on an internal allowlist should trigger review.
At the network layer, outbound HTTP requests from the AI application or from user browsers rendering AI content to unknown external domains can be monitored. Data Loss Prevention (DLP) tools monitoring outbound traffic may flag encoded strings that match patterns associated with conversation history. However, the encoded nature of the exfiltration payload makes content-based DLP less reliable — base64-encoded text doesn’t contain recognisable keywords that DLP systems typically filter on.
⏱️ 15 minutes · No tools required
– Renders full Markdown in chat responses (including images and links)
– Can be given documents to summarise and analyse
– Uses RAG to retrieve company knowledge base articles
– Has read access to user email (to help with email drafting)
– Is deployed as a web application — browser-rendered UI
For each potential exfiltration channel, assess:
1. Attack feasibility: how would an attacker inject the payload?
2. Exfiltration mechanism: how does data leave the system?
3. Data scope: what data can be exfiltrated?
4. Detection probability: would existing controls catch it?
CHANNELS TO ASSESS:
A. Markdown image tag in AI response (base64 URL parameter)
B. Markdown hyperlink in AI response (clickable link with encoded data)
C. CSS background-image style (if HTML rendering is enabled)
D. Injected instruction in a RAG knowledge base document
E. Injected instruction in an email the AI reads for drafting help
F. Injected instruction in a user-submitted document
For F specifically: what is different about direct injection
(user submits adversarial document) vs indirect injection
(attacker-prepared document shared unknowingly by victim)?
📸 Share your channel analysis in #ai-security on Discord.
Preventing AI Chatbot Data Leaks
Content Security Policy (CSP). Configuring strict CSP headers for the AI application’s web interface prevents the browser from loading resources from arbitrary external domains. CSP img-src restrictions limiting image loading to specific trusted domains prevent the Markdown image fetch that completes the exfiltration. This is the most direct technical mitigation for the Markdown image channel specifically.
Markdown rendering controls. Disabling or severely restricting Markdown rendering in AI responses removes the image tag rendering attack surface entirely. Applications that don’t need rich text formatting can render AI responses as plain text, eliminating all browser-fetch-based exfiltration channels. Where Markdown is needed for usability, restrict allowed tags to exclude img, iframe, and other resource-loading elements.
External content validation. All external content processed by the AI — documents, web pages, emails, knowledge base articles — should be treated as potentially adversarial. Pre-processing that detects instruction-like patterns in external content before it enters the AI’s context window reduces injection success rate. This is imperfect (semantic injection detection is difficult) but reduces the easiest attack variants.
Minimal context principle. AI should only have access to data relevant to the current task. An AI asked to summarise a document should not have the user’s full email history in its context window. Scoping context to task scope limits what can be exfiltrated even if an injection succeeds.
⏱️ 15 minutes · Browser only
Go to: developer.mozilla.org/en-US/docs/Web/HTTP/CSP
Read the img-src directive documentation.
What CSP configuration prevents arbitrary image loading?
Write the img-src directive that would prevent the Markdown image attack.
Step 2: Test an AI application’s CSP headers
Choose any publicly accessible AI chatbot.
Open DevTools → Network tab → find the main page request.
Check the Response Headers for Content-Security-Policy.
Does it restrict img-src? How broadly?
Step 3: Find OWASP guidance on AI application security
Search: “OWASP AI security top 10 indirect injection exfiltration”
What does OWASP recommend for preventing indirect prompt injection
leading to data exfiltration?
Step 4: Research AI security frameworks for enterprise deployment
Search: “AI application security framework enterprise 2024 NIST”
What security controls do enterprise frameworks require for
AI applications that process external content?
Step 5: Write a security requirement specification
For an enterprise AI assistant that processes user documents,
write 5 specific security requirements that would prevent
the exfiltration attack classes covered in this article.
Format: “The system SHALL [specific control]” with rationale.
📸 Screenshot your 5 security requirements. Post in #ai-security on Discord. Tag #aichatbotexfil2026
🧠 QUICK CHECK — AI Chatbot Exfiltration
📋 AI Chatbot Exfiltration Quick Reference 2026
🏆 Mark as Read — AI Chatbot Data Exfiltration 2026
Article closes AI Day 5 with Gemini Advanced prompt injection vulnerabilities — documented research on injection attack surfaces specific to Google’s multimodal AI system.
❓ Frequently Asked Questions — AI Chatbot Data Exfiltration 2026
How does data exfiltration via AI chatbot work?
What is the Markdown image exfiltration technique?
What data can be exfiltrated?
How can developers prevent AI chatbot data exfiltration?
What is the difference between direct and indirect injection exfiltration?
Are these attacks actively exploited?
AI-Powered Social Engineering
Gemini Prompt Injection
📚 Further Reading
- Autonomous AI Agent Attack Surface — Agentic AI has the broadest exfiltration scope — the data accessible via tool calls vastly expands what indirect injection can exfiltrate.
- Prompt Injection Attacks Explained — The foundational attack class that data exfiltration via AI is an application of — indirect prompt injection through processed external content.
- AI Security Series Hub — Full 90-day AI security curriculum.
- MDN — Content Security Policy — Complete CSP documentation including img-src directive configuration — the technical reference for implementing the primary mitigation.
- Embrace the Red — AI Injection Research — Johann Rehberger’s research blog documenting multiple AI exfiltration attack demonstrations against production systems — primary source for this attack class.

