PROMPTFLUX and PROMPTSTEAL explained — AI Malware That Queries LLMs Mid-Attack (2026)
Mandiant’s M-Trends 2026 report — released this week — named two malware families that represent a genuinely new category of threat in 2026: PROMPTFLUX and PROMPTSTEAL. These are not AI-assisted malware where humans use AI to write malicious code. They are malware families that actively query large language models during execution — using AI as part of their attack logic to evade detection and adapt in real time. What follows is my analysis of why this matters and what it changes for defenders.

What You’ll Learn

What PROMPTFLUX and PROMPTSTEAL are and how they differ from AI-generated malware
How querying an LLM mid-execution helps malware evade detection
Why traditional signature-based detection fails against this category
The defensive adaptations required to detect LLM-querying malware
What IBM calls “Slopoly” malware and the broader AI malware landscape

⏱️ 12 min read

PROMPTFLUX represents the offensive convergence of the LLM capabilities I covered in What Is an LLM? with the adversarial ML techniques from Adversarial Machine Learning 2026. For the full AI malware picture including how AI is used to write malware, see Can AI Write Malware?


What PROMPTFLUX and PROMPTSTEAL Are

The key distinction I want to establish immediately: PROMPTFLUX is not malware written by AI. It is malware that uses AI during its execution. That’s a fundamentally different threat category. Traditional AI-generated malware (what IBM calls “Slopoly”) uses AI at the development stage — a human uses an LLM to help write malicious code, then deploys it. PROMPTFLUX and PROMPTSTEAL query LLMs during the attack itself, in real time, to make dynamic decisions about how to proceed.

PROMPTFLUX vs AI-GENERATED MALWARE — THE DISTINCTION
# Traditional AI-generated malware (Slopoly)
Stage: development — human uses LLM to write malware code
Runtime: no LLM dependency — runs without AI after deployment
Detection: still detectable by behaviour-based AV (once behavioural pattern is known)
# PROMPTFLUX / PROMPTSTEAL (LLM-querying malware)
Stage: runtime — malware queries LLM during execution to get instructions
Runtime: LLM is part of the attack logic — malware adapts based on AI responses
Detection: behaviour is dynamic and changes per-environment → evades signature/behaviour profiles
# Source
M-Trends 2026: “malware families like PROMPTFLUX and PROMPTSTEAL actively query
large language models mid-execution to evade detection”
Released: March 2026, Mandiant/Google Threat Intelligence


How LLM-Querying Malware Works

My model for how LLM-querying malware evades detection: the malware’s attack behaviour is not fixed at compile time; it is generated at runtime by an external AI. Rather than following a hard-coded routine, the malware makes API calls to an LLM and uses the response to decide what to do next, so every execution in a different environment can produce a different behaviour profile. That dynamism is precisely what defeats the detection approaches defenders currently rely on, and it is adversarial use of the same flexibility that makes LLMs useful for legitimate software.

LLM-QUERYING MALWARE — EXECUTION MODEL
# Execution flow (conceptual, based on M-Trends disclosure)
1. Malware installs and gains initial foothold
2. Reconnaissance phase: collects environment data (AV present, OS version, network config)
3. LLM query: sends environment context to LLM API: “Given [environment details],
what evasion technique should I use to avoid detection by [AV product]?”
4. AI response: returns specific evasion recommendation for that environment
5. Malware implements the AI-recommended evasion and proceeds with attack
# Why this breaks traditional detection
Signature-based: no fixed code pattern to match — LLM-generated evasion varies per environment
Behaviour-based: behaviour profile changes each run based on AI output
Sandbox analysis: sandbox environment ≠ target environment → different AI response → different behaviour
# What PROMPTSTEAL specifically targets
PROMPTSTEAL: focused on extracting IP via “distillation attacks” (M-Trends 2026)
Target: proprietary ML models — extracting specialised training data and logic
Method: systematic querying to reconstruct the proprietary model

LLM-QUERYING MALWARE vs TRADITIONAL MALWARE — DETECTION COMPARISON
Detection Method              Traditional Malware           PROMPTFLUX-type
Signature matching            ✅ Detects known patterns     ❌ No fixed signature
Behaviour baseline            ✅ Consistent behaviour       ❌ Dynamic per environment
Sandbox analysis              ✅ Reproduces in sandbox      ❌ Different AI response in sandbox
LLM API traffic monitoring    N/A                           ✅ Detects LLM queries
Network egress analysis       ✅ C2 traffic patterns        ⚠️ LLM API traffic looks legitimate

📸 Detection method effectiveness against traditional vs LLM-querying malware. Of the four standard detection approaches, three fail outright against PROMPTFLUX-type malware, and the fourth (network egress analysis) degrades because LLM API traffic looks legitimate. The only new effective detection method — LLM API traffic monitoring — requires defenders to build capability they didn’t previously need. My priority for any SOC upgrading its detection capability in 2026: add LLM API egress monitoring to the detection stack.


Why Signature Detection Fails

The adversarial ML and evasion concepts I covered in earlier guides — how AI classifiers can be fooled by carefully crafted inputs — come together in PROMPTFLUX in a way that makes the evasion more robust than any previous technique. Traditional malware evasion involves obfuscation — the code does the same thing but looks different. LLM-querying malware evasion involves adaptation — the code actually does something different based on the environment, and the AI determines what that different thing should be.

WHY SIGNATURE AND BEHAVIOUR DETECTION BOTH FAIL
# Signature detection failure
Problem: signature detection matches known code patterns to known malware
LLM-querying malware: the evasion code is generated at runtime by the LLM
There is no fixed signature to match — the malicious behaviour is generated fresh each time
# Behaviour-based detection failure
Problem: behaviour detection learns “this pattern of actions = malware”
LLM-querying malware: queries AI for what actions to take in this specific environment
Behaviour varies by AV product present, OS version, network topology — no fixed pattern
# Sandbox analysis failure
Problem: sandbox runs malware in isolated environment to observe behaviour
LLM-querying malware: detects sandbox environment → queries AI with sandbox context
AI returns: “behave benignly in sandbox environments” → malware appears clean in analysis

EXERCISE — THINK LIKE A DEFENDER (10 MIN)
Design a Detection Strategy for LLM-Querying Malware
Traditional detection approaches fail against PROMPTFLUX-type malware.
Design a detection strategy using what you now know about how it works.

1. THE LLM API CALL SIGNAL
PROMPTFLUX must make API calls to an LLM to function.
What network traffic does this generate?
How do you distinguish legitimate LLM API usage from malicious?
(Hint: what process on the machine is making the call? At what time? With what data?)

2. THE DATA EXFILTRATION SIGNAL
The malware sends environment data to the LLM (AV product name, OS version, network config).
What does this traffic look like?
Can you detect environment data being sent to an AI API?

3. THE RESPONSE IMPLEMENTATION SIGNAL
After getting the AI response, the malware implements the recommended evasion.
What happens immediately after an LLM API call?
Does the sequence of: LLM call → new behaviour constitute a detectable pattern?

4. YOUR DETECTION RULE
Write a plain-English EDR detection rule that would flag this behaviour.
Format: “Alert when [process] makes [API call] containing [data pattern] followed by [action]”

✅ The detection approach that works: monitor all outbound API calls to known LLM endpoints (api.openai.com, api.anthropic.com, generativelanguage.googleapis.com etc.) from non-user-initiated processes. Legitimate LLM usage on a corporate machine comes from specific approved applications — a process that isn’t a known application making LLM API calls at 3am while another process simultaneously modifies system files is the detection signal. This is behavioural correlation rather than signature matching, which is why it works where signatures fail.
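As a worked sketch of that allowlist rule — using a hypothetical telemetry event schema, since real EDR products each expose their own fields — the core check might look like:

```python
from dataclasses import dataclass

# Hypothetical telemetry event; real EDR schemas differ by product.
@dataclass
class NetworkEvent:
    process_name: str     # process that opened the connection
    dest_host: str        # destination hostname
    user_initiated: bool  # was there an interactive user session?

# LLM provider endpoints to watch (subset of the list later in this article).
LLM_ENDPOINTS = {
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
}

# Applications approved to call LLM APIs (illustrative names, not a real policy).
APPROVED_APPS = {"chrome.exe", "copilot.exe"}

def flag_llm_call(event: NetworkEvent) -> bool:
    """Alert when a non-approved or non-user-initiated process reaches an LLM endpoint."""
    if event.dest_host not in LLM_ENDPOINTS:
        return False
    return event.process_name not in APPROVED_APPS or not event.user_initiated

# An unknown binary calling an LLM API with no user session is the signal.
print(flag_llm_call(NetworkEvent("svchost_x.exe", "api.openai.com", False)))  # True
print(flag_llm_call(NetworkEvent("chrome.exe", "api.openai.com", True)))      # False
```

Note the rule is permissive for approved, user-initiated traffic: the point is not to block LLM usage but to surface the rare case of an unattended process talking to an AI API.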


Slopoly — The Broader AI Malware Ecosystem

IBM’s X-Force team identified and named “Slopoly” — their internal term for AI-generated malware produced by generative AI tools — as a separate but related category. My framing: PROMPTFLUX is the sophisticated end of the AI malware spectrum (malware that uses AI at runtime), while Slopoly is the commoditised end (malware written by AI, deployed without AI). Both are increasing in volume and both compress the time from attack concept to deployment.

AI MALWARE SPECTRUM — 2026
# Low end: AI-assisted malware development (Slopoly)
Human attacker uses LLM to write or customise malware
No AI at runtime — standard malware once deployed
Impact: lower skill floor for malware creation, more volume, more variants
Detection: behaviour-based detection still works (consistent runtime behaviour)
# High end: LLM-querying malware (PROMPTFLUX)
Malware queries LLM at runtime to adapt behaviour
AI is part of the active attack chain
Impact: evades signature AND behaviour detection, adapts to target environment
Detection: requires new detection capability (LLM API egress monitoring)
# The economics from SecurityWeek Cyber Insights 2026
“The cost to go from vulnerability discovery to exploit used to be weeks and thousands of dollars. In 2026, with AI assistance, that cost is approaching zero.” — James Wickett, DryRun Security
AI makes commodity malware effectively free → value shifts to initial access and execution


Defensive Adaptations for AI Malware

DEFENCE ADAPTATIONS FOR AI MALWARE
# Adaptation 1: LLM API egress monitoring
Add LLM API endpoints to your egress monitoring ruleset
Alert on: LLM API calls from unexpected processes or at unexpected times
Alert on: high-volume or automated LLM API calls (not user-initiated)
# Adaptation 2: Correlation-based detection
Rule: LLM API call + new process execution within 60 seconds = investigate
Rule: LLM API call + privilege escalation attempt = alert immediately
Rule: LLM API call containing environment reconnaissance data (AV name, OS version)
# Adaptation 3: Zero-trust for AI API access
Maintain an approved list of applications permitted to call LLM APIs
Block all other processes from reaching LLM API endpoints
Review and update the approved list quarterly
# Adaptation 4: Treat LLM API calls as a detection signal, not just traffic
Historically: LLM API calls = user using ChatGPT or Copilot — not suspicious
In 2026: LLM API calls from non-user-initiated processes = potential PROMPTFLUX activity
The mental model shift: LLM API traffic is now a security-relevant signal requiring context
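The Adaptation 2 correlation rules can be sketched as a simple time-window join over host events. The event tuples and the 60-second window are illustrative assumptions, not a production schema:

```python
# Correlation sketch: flag a process whose LLM API call is followed,
# within a short window, by a new process execution or privilege escalation.
# Events are hypothetical tuples: (timestamp_seconds, process_id, event_type).
WINDOW = 60  # seconds, per the rule above

def correlate(events):
    """Return process ids where an LLM call precedes a suspicious action within WINDOW."""
    last_llm_call = {}  # process_id -> timestamp of most recent LLM API call
    flagged = set()
    for ts, pid, kind in sorted(events):
        if kind == "llm_api_call":
            last_llm_call[pid] = ts
        elif kind in {"new_process", "priv_escalation"}:
            if pid in last_llm_call and ts - last_llm_call[pid] <= WINDOW:
                flagged.add(pid)
    return flagged

events = [
    (100, "p1", "llm_api_call"),
    (130, "p1", "priv_escalation"),  # 30s after the LLM call -> flag
    (200, "p2", "llm_api_call"),
    (500, "p2", "new_process"),      # 300s later -> outside the window
]
print(correlate(events))  # {'p1'}
```

This is behavioural correlation rather than signature matching: neither event alone is conclusive, but the sequence LLM call → immediate behaviour change is the PROMPTFLUX-shaped pattern.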


LLM API Endpoints to Monitor

My practical starting point for any organisation building PROMPTFLUX detection: here are the primary LLM API endpoints to add to your egress monitoring ruleset. Any process making requests to these endpoints that isn’t on your approved application list is a detection signal.

LLM API ENDPOINTS — EGRESS MONITORING LIST
# Primary LLM provider endpoints
api.openai.com # OpenAI / ChatGPT
api.anthropic.com # Anthropic / Claude
generativelanguage.googleapis.com # Google Gemini
api.mistral.ai # Mistral
api.cohere.com # Cohere
api.together.xyz # Together AI (open models)
api.groq.com # Groq (fast inference)
# Alert rule (plain English)
ALERT: when [process] ∉ approved_app_list makes outbound connection to [LLM_endpoints]
ALERT: when [any process] makes >50 requests/minute to [LLM_endpoints]
ALERT: when [LLM API call] contains [AV product names, OS version strings, IP ranges]
This last rule is the PROMPTFLUX-specific signal: recon data being sent to AI for evasion advice
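The third alert rule — reconnaissance data inside an outbound LLM request — could be sketched as a payload scan. The indicator patterns here are illustrative examples only; a production list would be far broader and tuned to your environment:

```python
import re

# Illustrative recon indicators; not a complete detection list.
RECON_PATTERNS = [
    re.compile(r"crowdstrike|falcon|defender|sentinelone", re.I),  # AV/EDR product names
    re.compile(r"windows (server )?\d{2}", re.I),                  # OS version strings
    re.compile(r"\b10\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"),              # RFC 1918 10/8 addresses
]

def looks_like_recon(payload: str) -> bool:
    """True if an outbound LLM API payload contains environment reconnaissance data."""
    return any(p.search(payload) for p in RECON_PATTERNS)

print(looks_like_recon(
    "CrowdStrike Falcon present on Windows Server 2022, host 10.0.4.17"))  # True
print(looks_like_recon("Summarise this meeting transcript"))               # False
```

In practice this inspection requires TLS interception at the egress proxy, since LLM API traffic is encrypted; without that, fall back to the process-and-endpoint rules above.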

PROMPTFLUX — Key Points

PROMPTFLUX/PROMPTSTEAL: malware that queries LLMs mid-execution — documented in M-Trends 2026
Different from AI-generated malware: AI is part of the runtime attack logic, not just the development process
Defeats signature and behaviour detection: evasion is generated per-environment by AI
New detection required: LLM API egress monitoring + behavioural correlation rules
Slopoly (IBM): AI-written commodity malware drives volume; PROMPTFLUX drives sophistication

PROMPTFLUX — Add to Your Detection Stack

One immediate action: add the major LLM API endpoints to your egress monitoring rules today. That single change starts building detection capability for this category. The Can AI Write Malware? guide covers the Slopoly end of the spectrum in full.


Quick Check

PROMPTFLUX is deployed in a network with CrowdStrike Falcon EDR. The malware queries an LLM with the message: “CrowdStrike Falcon is present. What technique should I use to evade its detection?” The LLM returns a specific evasion recommendation. Why does this defeat traditional EDR detection?

Answer: the evasion technique is not embedded in the malware binary — it is selected at runtime for this specific environment. The EDR has no signature for code that did not exist until execution, and no behaviour baseline for a behaviour profile that changes per environment. The reliable signal is the LLM API call itself, which is why LLM API egress monitoring is the new capability defenders need.
Frequently Asked Questions

What is PROMPTFLUX?
PROMPTFLUX is a malware family named in Mandiant’s M-Trends 2026 report that actively queries large language models during execution to evade detection. Unlike AI-generated malware where AI is used to write the malicious code before deployment, PROMPTFLUX uses AI as part of its runtime attack logic — querying an LLM to determine what evasion technique to use based on the specific environment it finds itself in.
What is the difference between PROMPTFLUX and Slopoly?
Slopoly (IBM’s term for AI-generated malware) uses AI at the development stage — attackers use LLMs to write or customise malware code, which then runs without AI involvement. PROMPTFLUX uses AI at runtime — the deployed malware queries LLMs mid-execution to adapt its behaviour. Slopoly increases malware volume and lowers the skill floor for malware creation. PROMPTFLUX increases evasion sophistication by making malware behaviour dynamic and environment-specific.
How do you detect LLM-querying malware?
Add LLM API endpoints (api.openai.com, api.anthropic.com, generativelanguage.googleapis.com, and similar) to your network egress monitoring. Alert on LLM API calls from unexpected processes, at unexpected times, or containing what appears to be environment reconnaissance data. Correlate LLM API calls with subsequent behavioural anomalies — a process that makes an LLM API call and then performs a privilege escalation attempt is a high-confidence indicator. Zero-trust for AI API access — maintain an approved application list and block all others from reaching LLM endpoints.

Further Reading

  • Can AI Write Malware? 2026 — The full spectrum of AI-assisted malware including Slopoly, how AI-generated variants evade AV, and why behaviour-based detection remains the most effective defence.
  • Adversarial Machine Learning 2026 — The detection evasion techniques that PROMPTFLUX applies to defender AI systems — how adversarial inputs cause security classifiers to misclassify malicious content.
  • Agentic AI Security 2026 — The broader context of autonomous AI in attacks. PROMPTFLUX is one component of the agentic AI attack landscape documented in March-April 2026.
  • Google Mandiant — M-Trends 2026 — The primary source naming PROMPTFLUX and PROMPTSTEAL. The full report covers AI-enabled attack acceleration across all phases of the attack lifecycle, with data from 500,000+ hours of frontline investigations.
  • SecurityWeek — Cyber Insights 2026: Malware in the Age of AI — Expert analysis of the Slopoly malware economics and the broader AI-enabled malware landscape, including the cost-of-attack inversion cited above.
Mr Elite
Owner, SecurityElites.com
My concern about PROMPTFLUX isn’t the sophistication of the current implementation — it’s the trajectory. Today it queries an LLM to pick an evasion technique. In 12 months it queries an LLM to design the entire post-exploitation strategy based on real-time environment assessment. The detection advice I give is the same as it’s always been for emerging threat categories: add a detection signal for the novel behaviour now, while you can still distinguish it from background noise. LLM API calls from non-user processes are rare today. Build the detection rule while that’s still true.

Lokesh Singh aka Mr Elite
Founder, Securityelites · AI Red Team Educator
Founder of Securityelites and creator of the SE-ARTCP credential. Working penetration tester focused on AI red team, prompt injection research, and LLM security education.
