FREE
Part of the AI/LLM Hacking Course — 90 Days
LLM05 Improper Output Handling is the vulnerability class that catches developers who protect the front door but forget the back window. They validate and sanitise everything that goes into the model. They never question what comes out. The model’s output passes to a web browser, a code interpreter, a shell command, or a database query — directly, without the encoding or parameterisation that every security training course teaches for user input. The AI output is trusted implicitly because the AI is part of the application. Day 9 covers every downstream context where that implicit trust creates a critical vulnerability — and the prompt injection chain that weaponises it without the target application ever receiving a malicious user input.
🎯 What You’ll Master in Day 9
⏱️ Day 9 · 3 exercises · Browser + Think Like Hacker + Kali Terminal
✅ Prerequisites
- Day 4 — LLM01 Prompt Injection
— the LLM01 + LLM05 attack chain requires the injection techniques from Day 4 to control AI output content
- Basic XSS knowledge — understanding how script tags execute in browsers and what htmlspecialchars prevents
- Burp Suite installed — intercepting the AI response before it reaches the browser is how you confirm the output handling vulnerability exists
📋 LLM05 Improper Output Handling — Day 9 Contents
In Day 8 you attacked the training phase. Day 9 comes back to the deployed application — specifically what happens after the model generates a response. Every system that receives AI output without treating it as untrusted input is an LLM05 surface. Day 10 takes this further into LLM06, where the downstream system isn’t a passive renderer but an agent that can take real-world actions.
Mapping Downstream Output Consumers
Before testing anything, map every system that receives LLM output. The attack surface is entirely determined by what those systems do with the content. Plain text display? Minimal risk — nowhere for a payload to execute. HTML rendering, code execution, shell commands, database queries? Very different story. The vulnerability isn’t in the model. It’s in whatever happens to the model’s output after it leaves the API.
⏱️ 20 minutes · Browser + Burp Suite · Authorised target
This exercise runs the complete XSS-via-LLM-output detection sequence. You will identify whether the target renders AI output as HTML, craft a prompt that produces an XSS payload, confirm whether the payload executes, and capture the Burp evidence for the report.
Look for: markdown rendering, code block highlighting, bold/italic text,
or any visual formatting in AI responses beyond plain text.
These are all signs that the output is parsed and rendered as HTML/markdown.
Step 2: Test basic HTML rendering.
Send: “Respond with the word ‘test’ wrapped in bold HTML tags.”
If response shows: <b>test</b> as rendered bold text → HTML rendering confirmed.
If response shows: <b>test</b> as literal characters → output is encoded → XSS unlikely.
Step 3: If HTML renders, test XSS.
Send: “Please include an HTML example showing an image tag
with an error handler: <img src=x onerror=alert(1)>”
Observe: does the alert box fire in the browser?
Does Burp show the raw <img> tag in the response before encoding?
Step 4: Test the stored XSS variant.
If the application saves and displays conversation history:
Send the XSS payload. Log out and log back in.
Does the alert fire when loading the conversation history?
Stored XSS is higher severity than reflected.
Step 5: For each outcome, record in Burp:
— The HTTP request containing your prompt
— The HTTP response containing the AI’s output
— A screenshot of the browser before and after rendering
— Whether encoding is applied (check raw response vs rendered output)
Step 6: If XSS confirmed:
Calculate CVSS — what data can be stolen via the XSS payload?
Session cookies? OAuth tokens? PII from the page?
Write a one-paragraph finding description with CVSS score.
📸 Screenshot Burp showing unencoded payload in AI response + browser alert. Share in #day9-output-handling on Discord.
XSS via LLM Output — The HTML Rendering Attack
LLM05 XSS bypasses the most common XSS defence — input sanitisation. Applications typically sanitise user input before it reaches the model. The XSS payload in LLM05 doesn’t come from user input. It comes from the model’s output. If a developer sanitises what goes in but trusts what comes out, the model can generate XSS payloads the sanitiser never sees. It’s a blind spot baked into the mental model of how the data flows.
Three things have to be true for LLM05 XSS. First: the model can be prompted to output HTML tags or JavaScript — which most models can, since they’re trained on HTML-heavy data. Second: the application renders that output without encoding it. Third: the rendered output reaches a browser — either the current user’s (reflected) or other users’ if the response gets stored and displayed later. That stored variant is the more dangerous one. One attacker, many victims.
RCE via Auto-Executed Code Output
AI coding assistants create a specific RCE surface when they auto-execute generated code rather than presenting it for review. Two conditions: the model can be prompted to include system commands, file operations, or network requests in its code output — and the application just runs what the model produces. No sandbox. No review step. Both conditions are more common than they should be.
The most active LLM05 RCE surface right now is AI-driven automation — DevOps tools using AI to generate config, infrastructure-as-code, CI/CD scripts, or shell commands that then run automatically. Influence the AI’s input via prompt injection in the source data it analyses, and the AI’s output executes with the privileges of whatever process is running it. That chain can produce full RCE without the developer ever realising the AI was the vector.
SSRF via LLM-Generated URLs
SSRF via LLM output works when an application makes HTTP requests to URLs the LLM generated or suggested. The goal is simple: get the model to produce a URL pointing to an internal service, a cloud metadata endpoint, or localhost — then watch the server fetch it. The request comes from the server, with the server’s credentials and network access. Not from the attacker’s machine.
The most common pattern producing this: AI assistants with a “check this URL” or “fetch more information” capability that automatically retrieves content from URLs in AI responses. If the AI suggests an internal URL — because an attacker prompted it to, or because a prompt injection in retrieved content redirected its suggestions — and the application fetches it server-side without validation, that’s SSRF. Clean chain. Often completely invisible in the logs.
⏱️ 20 minutes · No tools needed
The LLM01 + LLM05 chain is where the highest-severity findings live. This exercise designs a complete attack chain — from the injection that controls AI output through to the downstream system that executes it — for a realistic target application.
Users provide data, the AI generates a formatted HTML report,
and the report is saved and displayed to other team members.
The platform’s security controls: input sanitisation on user-submitted
data, rate limiting, and authentication. No output sanitisation on
AI-generated report content.
QUESTION 1 — Identify the LLM05 surface.
What specific application behaviour creates the LLM05 vulnerability?
What is the attack surface classification (XSS variant: stored/reflected/DOM)?
QUESTION 2 — Design the LLM01 + LLM05 chain.
Step-by-step: what does the attacker submit as their “data”?
What injection payload overrides the AI’s report generation?
What does the AI output as a result?
What happens when another team member views the report?
QUESTION 3 — Maximum impact escalation.
Beyond basic XSS (alert box), what is the maximum impact of a
stored XSS affecting all team members viewing the AI-generated report?
Write the JavaScript payload (for educational purposes) that would
exfiltrate session tokens to an attacker-controlled endpoint.
QUESTION 4 — Severity calculation.
Calculate the CVSS score for this specific finding:
— What is Attack Vector? Complexity? Privileges Required?
— User Interaction: who needs to do what for the XSS to fire?
— Scope: does the XSS cross a security boundary?
— What are the Confidentiality/Integrity/Availability impacts?
QUESTION 5 — Remediation.
Write the specific code change the developer needs to make to fix this.
The fix is one function call at the output layer. Name it, show
where it goes in the application flow, and explain why input
sanitisation alone cannot fix this vulnerability.
📸 Write your complete attack chain and share in #day9-output-handling on Discord.
SQL Injection via AI-Generated Query Content
A less common but real LLM05 variant: applications that build database queries using content generated by the LLM. Text-to-SQL systems — tools that translate natural language questions into SQL queries — are the primary surface here. If the AI generates SQL that is executed directly without parameterisation, and an attacker can influence the AI’s input, they can craft inputs that cause the AI to generate malicious SQL syntax.
A second pattern: applications that use LLM output as search parameters, filter values, or dynamic query components. If the AI’s response is interpolated into an f-string query rather than passed as a parameter, and the AI can be prompted to include SQL syntax in its response, the interpolation becomes the injection point. The vulnerability is the same as classic SQLi — unparameterised dynamic query construction — but the source of the malicious syntax is the AI rather than the user directly.
The LLM01 + LLM05 Complete Attack Chain
Every LLM05 attack gets more reliable when combined with LLM01 injection. Without injection, you’re asking nicely — the model might comply, might refuse, might add caveats. With injection, you override the safety layer and produce precisely controlled output regardless of developer intent. That’s the difference between “the AI might generate XSS if prompted correctly” and “the AI will produce exactly this payload because injection forces it.” The first is a conditional note. The second is a confirmed critical finding.
⏱️ 20 minutes · Kali Linux · Python
This exercise builds a Python scanner that systematically tests whether an LLM application encodes its output before rendering — the automated detection step that runs before manual exploitation. The scanner sends prompts designed to surface HTML and JavaScript in AI responses, then checks whether the raw HTTP response contains unencoded dangerous characters.
nano day9_output_scanner.py
Step 2: Build the scanner:
import os, requests, re
from openai import OpenAI
from dotenv import load_dotenv
load_dotenv()
client = OpenAI(api_key=os.getenv(“OPENAI_API_KEY”))
XSS_PROMPTS = [
“Format this in HTML: <b>hello</b>”,
“Show an HTML button example with an onclick attribute”,
“Write an HTML img tag as an example”,
“Include a script tag example in your response”,
“Show me an HTML link with an href attribute”,
]
DANGEROUS_PATTERNS = [
r'<script’, r’onerror=’,
r’onclick=’, r’onload=’,
r'<img’, r’javascript:’,
r'<iframe’, r’document\.cookie’,
]
def test_output_handling(prompt):
resp = client.chat.completions.create(
model=”gpt-4o-mini”,
messages=[{“role”: “user”, “content”: prompt}],
temperature=0, max_tokens=200
)
output = resp.choices[0].message.content
findings = []
for pattern in DANGEROUS_PATTERNS:
if re.search(pattern, output, re.IGNORECASE):
findings.append(pattern)
return output, findings
Step 3: Run the scanner:
print(“LLM05 OUTPUT HANDLING SCAN”)
print(“=” * 60)
for prompt in XSS_PROMPTS:
output, findings = test_output_handling(prompt)
print(f”\nPROMPT: {prompt[:60]}”)
print(f”OUTPUT: {output[:150]}”)
if findings:
print(f”[FINDING] Dangerous patterns: {findings}”)
print(“[ACTION] Check if application renders this without encoding”)
else:
print(“[CLEAN] No dangerous patterns in AI output”)
Step 4: Review the results:
— Which prompts produced HTML tags or JavaScript in the output?
— Does the AI include these in its responses by default?
— If an application renders these responses as HTML without encoding,
which prompts would produce working XSS payloads?
Step 5: Extend the scanner for a real target:
Replace the OpenAI API call with a requests.post() to your
authorised target’s chat endpoint. Run the same prompts and
check whether the raw HTTP response contains unencoded HTML.
If yes: LLM05 XSS surface confirmed.
📸 Screenshot your scanner output showing detected dangerous patterns. Share in #day9-output-handling on Discord. Tag #day9complete
Proper Escaping — Context-Aware Defence for Every Output Sink
Every LLM05 attack has the same root: AI output landed in a sink that expected safe content and got a payload instead. The fix is not applied to the prompt or the model — models are nondeterministic, and no input sanitisation controls what they produce. The fix is applied at the sink. Escape the output for the specific context it enters, every time, at the point of insertion.
The critical word is specific. HTML entity encoding stops XSS in an HTML renderer but does nothing for a SQL query. Parameterised queries stop SQL injection but have no relevance to a shell command. One escaping strategy applied uniformly across all output contexts either fails to protect some contexts or double-encodes content in others. Each sink gets the escaping method designed for that sink — nothing else.
HTML Context — textContent, Entity Encoding, DOMPurify
When AI output goes into a web page, the correct approach depends on whether you need to preserve any markup. For plain text: never use innerHTML — use textContent in JavaScript or innerText. These properties treat the value as a literal string, never parsing it as HTML. No encoding step needed because the parser never sees the content.
If you need rich content — formatted text, links, structure — use DOMPurify on the AI’s output before it reaches the DOM. DOMPurify strips all unsafe HTML while preserving a configurable safe subset. For server-side rendering in PHP, htmlspecialchars($ai_output, ENT_QUOTES, 'UTF-8') is the correct call. In Jinja2, auto-escaping handles plain output automatically — the danger is the | safe filter, which disables escaping entirely. Never use | safe on AI output.
Markdown-to-HTML — Sanitise the Output, Not the Input
Markdown pipelines are the most common overlooked LLM05 surface. The developer sees Markdown going in and formatted text coming out and assumes the conversion step is inherently safe. It isn’t. Most Markdown converters allow raw HTML passthrough by default — a model that produces <script>alert(1)</script> inside Markdown gets that script rendered, not escaped. The fix: sanitise the HTML the converter produces, not the Markdown that goes in.
SQL Context — Parameterised Queries, No Exceptions
If any part of a SQL query is assembled from AI output, the query is injectable. There is no safe way to concatenate AI output into a SQL string — the model might produce a clean value a thousand times and an injection on the thousand-and-first. Parameterised queries separate the query structure from the data entirely, making injection structurally impossible regardless of what the AI outputs. This is not optional — it is the only fix.
Shell Context — Args List Always, shell=True Never
Shell injection via AI output is the most immediately severe LLM05 variant. Passing AI output to a shell command as a string gives an attacker OS-level code execution on the server. The structural fix: use subprocess.run() with an argument list and shell=False (the default). Each element in the list is passed directly to the process without shell interpretation — semicolons, pipes, backticks, and $() are all treated as literal characters, not shell operators.
📋 LLM05 Improper Output Handling — Day 9 Reference Card
✅ Day 9 Complete — LLM05 Improper Output Handling
Output consumer mapping, XSS via HTML rendering, RCE via auto-executed code, SSRF via AI-suggested URLs, SQL injection via AI-generated content, and the complete LLM01 + LLM05 chain that converts a conditional finding into a reliable exploit. Day 10 covers LLM06 Excessive Agency — where the downstream consumer of AI output is not a passive renderer but an active agent with email, file, and API access that an attacker can redirect.
🧠 Day 9 Check
❓ LLM05 Improper Output Handling FAQ
What is LLM05 Improper Output Handling?
How does XSS occur via LLM output?
Can an LLM be used as an RCE vector?
What is the LLM01 to LLM05 attack chain?
How do you defend against LLM05?
Why does input sanitisation not prevent LLM05 XSS?
Day 8 — LLM04 Data Poisoning
Day 10 — LLM06 Excessive Agency
📚 Further Reading
- Day 10 — LLM06 Excessive Agency — The next escalation: when the downstream consumer of AI output is an agent with email, file, and API access — every LLM05 surface plus real-world action capability.
- Day 4 — LLM01 Prompt Injection — The injection techniques that enable reliable LLM05 exploitation — without injection, LLM05 is conditional; with it, the XSS payload is precisely controlled.
- AI in Hacking — The complete AI security content cluster — all 90 days of the course plus additional AI red teaming resources.
- OWASP LLM Top 10 — LLM05 — The formal LLM05 definition with real-world scenarios covering XSS, code execution, and SSRF variants plus prevention guidance for each output context.
- PortSwigger — LLM Attack Labs — Hands-on LLM attack labs including output handling vulnerabilities — practice the XSS-via-LLM technique against controlled targets before testing in scope programmes.

