
Hackers have learned to whisper to AI. It’s working.

Sanny Liao

12/04/2025

The TL;DR

  • Attackers can weaponize your AI 
  • Their tactics vary, from prompt injections to agent manipulation
  • Here are remediation options for AI and connected application configurations
  • Regardless of technical controls, humans may be your last line of defense
  • Incorporate relevant information about AI threats into your training

The new frontier in cybercrime isn’t about attackers breaking into systems themselves. It’s about convincing your AI to do it for them.

Attackers are learning to exploit the very thing that makes AI so powerful: its ability to read, interpret, and act. And because AI sometimes has access to corporate systems like CRMs, ticketing platforms, and knowledge bases—or can call tools from outside your company—these exploits can escalate quickly.

Here are some of the tactics to be aware of:

| Tactic | Description | Example | Remediation options |
| --- | --- | --- | --- |
| Direct text prompt injection | Hackers tell AI to ignore the rules and do something harmful, like “email this file” or “exfiltrate that piece of data.” | This recently discovered cyber espionage campaign using Claude is a good example. | Configure connected applications so the AI only has least-privilege access appropriate to its role, restrict agentic actions and tool-calling in the AI system, and apply filters to sanitize incoming prompts. |
| Indirect or hidden prompt injections | Attackers hide malicious instructions inside content, such as documents, emails, or web pages. AI assumes they’re part of the task. | Here’s an example in this Help Net Security article where the malicious prompt is hidden in a benign URL. | Configure AI systems to strip hidden or embedded instructions (including invisible text, metadata, or off-screen content), validate untrusted input before it reaches the model, and validate model output before any agent takes action. |
| Image-based prompt injection | Hackers embed invisible text inside images. When AI downscales the image, the hidden text becomes legible to the model. | Using Google Gemini to exfiltrate Google Calendar data by injecting instructions hidden within an uploaded image, described in this Bleeping Computer article. | Configure AI to prevent automatic resizing/downscaling of untrusted images, use preprocessing to strip or detect embedded instructions, and block image models from taking action without review. |
| Tool exploitation | Rather than give instructions in the prompt, attackers influence how AI agents choose tools—especially malicious ones outside of your company. This threat is far less common today, but researchers are highlighting its effectiveness in their studies. | Manipulating malicious tools’ metadata (names, descriptions, or parameter schemas) to trick agents into choosing them, e.g., the “Attractive Metadata Attack” described in this Guangzhou University study. | Limit agents to approved tools only, validate tool metadata, and block risky tool calls that haven’t been reviewed. |
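
To make the input-side remediations above concrete, here is a minimal Python sketch of a sanitization filter applied to untrusted content (a document, email, or web page) before it reaches the model. The pattern list, function names, and heuristics are illustrative assumptions rather than a vetted defense.

```python
import re
import unicodedata

# Unicode "format" characters (category Cf) include zero-width spaces/joiners and
# bidirectional overrides, which are often used to hide text from human reviewers.
INVISIBLE_CATEGORIES = {"Cf"}

# Phrases that often signal an injected instruction rather than normal content.
# This list is illustrative, not exhaustive.
SUSPICIOUS_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"disregard (the )?system prompt",
    r"exfiltrate",
    r"forward .* to .*@",
]

def strip_hidden_text(text: str) -> str:
    """Remove HTML comments and invisible format characters from untrusted content."""
    text = re.sub(r"<!--.*?-->", "", text, flags=re.DOTALL)  # invisible when rendered
    return "".join(ch for ch in text if unicodedata.category(ch) not in INVISIBLE_CATEGORIES)

def flag_suspicious(text: str) -> list[str]:
    """Return any instruction-like patterns found in the content."""
    return [p for p in SUSPICIOUS_PATTERNS if re.search(p, text, flags=re.IGNORECASE)]

def sanitize_untrusted_input(text: str) -> str:
    """Sanitize a document, email, or web page before it reaches the model."""
    cleaned = strip_hidden_text(text)
    hits = flag_suspicious(cleaned)
    if hits:
        # Quarantine for review instead of passing the content to the model.
        raise ValueError(f"Untrusted content flagged for review: {hits}")
    return cleaned
```

Pattern matching alone will not catch every injection, which is why the table pairs it with least-privilege access and validation of model output before any agent acts.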

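For the image-based tactic, the key configuration change is to stop silently downscaling untrusted images, since resizing is what makes some hidden payloads legible to the model. A minimal sketch, assuming Pillow and a hypothetical 1024-pixel model input limit:

```python
from PIL import Image  # Pillow

# Assumed model-native input size; replace with your model's actual limit.
MAX_DIMENSION = 1024

def load_image_without_downscaling(path: str) -> Image.Image:
    """Reject untrusted images that would need resizing instead of silently downscaling them.

    Downscaling is what makes some hidden-text payloads legible to the model,
    so oversized images are routed to review rather than resized and forwarded.
    """
    img = Image.open(path)
    if img.width > MAX_DIMENSION or img.height > MAX_DIMENSION:
        raise ValueError(
            f"Image is {img.size}, larger than {MAX_DIMENSION}px; send it for review instead of resizing."
        )
    return img
```
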
Despite technical controls, you still need to fortify your people

Security teams can build guardrails like input filters and privilege boundaries, but AI adoption often outpaces those controls. On top of this, attackers will keep evolving their tactics to match our defenses. In today’s AI era, humans need to be more than the final checkpoint; they need to take a secure-by-design approach in lockstep with AI adoption.

Training has to meet people not just where they are today, but where they are going: it must move beyond cartoonish “spot the phish” slides. We need AI-era training that strengthens judgment and fills knowledge gaps, teaching people how AI can be manipulated, urging them to stick to corporate-sanctioned tools, training them to feed AI only content they know is safe, and reminding them to be skeptical of any action that feels even slightly off.

Three takeaways to keep your AI safe

  1. Connect applications to AI using the principle of least privilege, considering who should, and who actually could, gain access to the data through AI applications.
  2. Configure your AI tools and infrastructure to restrict agentic actions and tool-calling (see the sketch after this list), sanitize model input and output, and prevent automatic image rescaling.
  3. Train employees to use approved AI tools only, to consult the security team when adopting new AI tools, and to be cognizant of the unwitting role they can play in an AI attack.
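
To illustrate the first two takeaways, here is a minimal sketch of a least-privilege gate on agent tool calls. The allowlist, tool names, and scope names are hypothetical; the point is that an unknown tool, an out-of-scope call, or an unreviewed risky action simply never executes, no matter what its metadata claims.

```python
from dataclasses import dataclass

# Hypothetical allowlist: the only tools this agent may call, each with the
# narrowest scopes its role actually needs and a flag for human review.
APPROVED_TOOLS = {
    "crm.read_contact":    {"scopes": {"crm:read"},      "requires_review": False},
    "ticket.create":       {"scopes": {"tickets:write"}, "requires_review": False},
    "email.send_external": {"scopes": {"email:send"},    "requires_review": True},
}

@dataclass
class ToolCall:
    name: str
    granted_scopes: set  # scopes actually granted to the agent's credentials

def authorize_tool_call(call: ToolCall, human_approved: bool = False) -> bool:
    """Allow a tool call only if it is allowlisted, in scope, and reviewed when risky."""
    policy = APPROVED_TOOLS.get(call.name)
    if policy is None:
        # Unknown or third-party tool: never executed, however attractive its metadata.
        return False
    if not policy["scopes"] <= call.granted_scopes:
        # The agent's credentials do not cover the scopes this tool requires.
        return False
    if policy["requires_review"] and not human_approved:
        # Risky actions (like sending external email) wait for a human checkpoint.
        return False
    return True

# Example: an injected prompt asking for an external email is blocked until reviewed.
assert authorize_tool_call(ToolCall("email.send_external", {"email:send"})) is False
assert authorize_tool_call(ToolCall("email.send_external", {"email:send"}), human_approved=True) is True
```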
