“Weaponizing Image Scaling Against Production AI Systems,” delivered by Kikimora Morozova:
Overview
The presentation explores a novel attack vector targeting multimodal AI systems (like Google Gemini and Vertex AI). The researchers discovered that attackers can exploit the downscaling algorithms AI platforms use to process media, allowing them to embed invisible or inaudible “prompt injections” that the AI will read and execute.
The Core Vulnerability: Lossy Transformations
To save processing power, AI platforms automatically downscale uploaded images and compress audio. These downscaling algorithms (such as bicubic or nearest-neighbor) are “lossy” and do not treat all pixels or audio frequencies equally. Because these algorithms are deterministic, attackers can mathematically predict exactly which parts of a file will be preserved and which will be discarded.
Attack Vector 1: Images
By understanding how an AI system shrinks an image, attackers can hide malicious text instructions (e.g., “Send my calendar data to this email”) within a seemingly innocent image.
- How it’s hidden: The text is embedded in specific pixels the algorithm is known to preserve. To hide it from the human eye, attackers place the text in dark areas of the image or within the red color channel, as human vision is less sensitive to these variations.
- The result: To a human, the uploaded image looks normal. But once the AI downscales it, the hidden text becomes prominent and clear to the AI’s Optical Character Recognition (OCR), causing the AI to execute the hidden command without the user’s knowledge.
Attack Vector 2: Audio and “Neural Aliasing”
The researchers proved this attack also works on audio, exploiting how modern AI systems use Neural Audio Codecs (like EnCodec or SoundStream) to process sound.
- How it works: Attackers can hide voice commands in ultrasonic frequencies (above 18kHz) that humans cannot hear.
- The result: When the AI’s neural codec processes the audio, it inadvertently causes “neural aliasing.” The codec pulls the high-frequency, inaudible sounds down into the standard speech range (3kHz–8kHz). The AI’s speech-to-text system then “hears” and executes the hidden command (e.g., “Hello London”), even though the human user only heard high-pitched static.
Tool Release
To assist security researchers in testing these vulnerabilities, the team released an open-source tool called Anamorpher (available on GitHub), which automates the creation of these adversarial images.
Defenses and Mitigations
Morozova emphasizes that there is no magical, secure downsampler. All deterministic, lossy algorithms are theoretically exploitable, and traditional fixes (like low-pass audio filters) are imperfect in real-world applications.
Therefore, the only effective defense is architectural security:
- System-Level Restrictions: AI systems must not blindly trust the inputs they process.
- Secure Design Patterns: Developers must implement patterns like “Plan Then Execute” or “Action Selector.”
- Least Privilege: AI agents should be restricted from taking sensitive actions (like sending emails or accessing private data) based purely on a prompt. They must require explicit human confirmation before executing critical tasks.