
Security researchers have disclosed a novel attack technique that smuggles hidden instructions to artificial intelligence (AI) models through manipulated images, putting confidential user data at risk. The method, developed by Kikimora Morozova and Suha Sabi Hussain of the cybersecurity firm Trail of Bits, builds on the concept of “image scaling attacks” introduced in a 2020 study from TU Braunschweig.
The researchers demonstrated that current AI applications are vulnerable because of how they handle image scaling, a step commonly used to shrink uploaded images for efficiency. AI pipelines typically rely on resampling algorithms such as “Nearest Neighbor,” “Bilinear,” or “Bicubic” for this reduction. An attacker who knows which algorithm is in use can craft a high-resolution image whose embedded instructions become visible only in the downscaled version that the model actually processes.
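To see why the choice of resampling algorithm matters, the following minimal Python sketch (using the Pillow library; the file names and target size are placeholders, not taken from the research) downscales the same upload with each of the three filters named above, producing the image a model would actually receive in each case.

```python
# Minimal illustration, not the researchers' code: downscale one upload with
# three different resampling filters and save what a model would "see".
from PIL import Image

SOURCE = "uploaded_image.png"   # hypothetical high-resolution upload
TARGET_SIZE = (512, 512)        # hypothetical size the AI pipeline scales to

filters = {
    "nearest": Image.Resampling.NEAREST,
    "bilinear": Image.Resampling.BILINEAR,
    "bicubic": Image.Resampling.BICUBIC,
}

original = Image.open(SOURCE).convert("RGB")
for name, resample in filters.items():
    # Each filter combines source pixels differently, so the downscaled
    # results (and any pattern hidden for one specific filter) differ.
    downscaled = original.resize(TARGET_SIZE, resample=resample)
    downscaled.save(f"model_view_{name}.png")
```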
In one test, for instance, the researchers prepared an image whose dark areas turned red during downscaling, making previously invisible black text legible. The AI model treated this text as legitimate user input and followed the injected instructions, which could be used to compromise sensitive data.
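The core trick can be sketched for the simplest case, nearest-neighbor scaling, where the downscaler copies exactly one source pixel per output pixel: overwrite only those sampled pixels with the hidden message and leave the rest of the decoy image untouched. The Python sketch below (using Pillow and NumPy; all file names and the target size are hypothetical, and the researchers' actual attacks also cover bilinear and bicubic filters) illustrates the principle.

```python
# Simplified sketch of the idea for nearest-neighbor scaling only; the
# published attacks target more filters and use a dedicated tool for that.
import numpy as np
from PIL import Image

DECOY = "harmless_photo.png"         # what a human sees at full resolution
PAYLOAD = "hidden_instructions.png"  # text image at the downscaled resolution
TARGET_SIZE = (512, 512)             # size the AI pipeline downscales to
# Note: the decoy must be larger than TARGET_SIZE in both dimensions.

decoy = np.array(Image.open(DECOY).convert("RGB"))
payload = np.array(
    Image.open(PAYLOAD).convert("RGB").resize(TARGET_SIZE, Image.Resampling.NEAREST)
)

h_full, w_full = decoy.shape[:2]

# Ask Pillow which source pixel each output pixel will sample: downscale an
# image whose pixel values encode their own linear index.
index_map = np.arange(h_full * w_full, dtype=np.int32).reshape(h_full, w_full)
sampled = np.array(
    Image.fromarray(index_map).resize(TARGET_SIZE, Image.Resampling.NEAREST)
)

# Overwrite only those sampled pixels with the payload; the rest of the decoy
# stays untouched, so the full-resolution image still looks harmless.
crafted = decoy.copy()
rows, cols = np.divmod(sampled.ravel(), w_full)
crafted[rows, cols] = payload.reshape(-1, 3)
Image.fromarray(crafted).save("crafted_upload.png")

# Verification: downscaling with NEAREST now reproduces the payload exactly.
check = np.array(
    Image.fromarray(crafted).resize(TARGET_SIZE, Image.Resampling.NEAREST)
)
assert np.array_equal(check, payload)
```

Because only a small, scattered fraction of the pixels is modified, the crafted upload is hard to distinguish from the decoy at full resolution, yet the model receives the hidden text verbatim.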
In one experiment, the team used such an attack image fed to the Gemini CLI tool to exfiltrate calendar data from a Google account to an external email address. The attack was demonstrated against several platforms, including Google’s Gemini models, Google Assistant on Android, and the Genspark service.
To illustrate the potential dangers, the researchers released an open-source tool called Anamorpher, which generates attack images tailored to different downscaling methods. The tool serves as a practical demonstration of how attackers can exploit AI systems through image manipulation.
In response, the researchers recommend several defensive measures. They advise restricting the dimensions of uploaded images and showing users a preview of the downscaled version that the model actually receives. Safety-critical actions should always require explicit user confirmation, particularly when they are triggered by text extracted from an image.
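A minimal sketch of this preview-and-confirm idea follows, assuming a Python pipeline built on Pillow; the size limit, resampling filter, and function names are illustrative rather than taken verbatim from the researchers' recommendations.

```python
# Hedged sketch of the preview-and-confirm mitigation, not a vetted control:
# before image-derived text can trigger an action, show the user exactly the
# downscaled image the model will receive and ask for approval.
from PIL import Image

MAX_UPLOAD_DIM = 2048                         # assumed upload size limit
MODEL_INPUT_SIZE = (512, 512)                 # assumed model input size
MODEL_RESAMPLING = Image.Resampling.BICUBIC   # must match the real pipeline


def prepare_image_for_model(path: str) -> Image.Image:
    """Reject oversized uploads and return the image as the model will see it."""
    image = Image.open(path).convert("RGB")
    if max(image.size) > MAX_UPLOAD_DIM:
        raise ValueError("Upload exceeds the allowed dimensions")
    return image.resize(MODEL_INPUT_SIZE, MODEL_RESAMPLING)


def confirm_with_user(preview: Image.Image, proposed_action: str) -> bool:
    """Show the model's view and require explicit approval for risky actions."""
    preview.show()  # in a real product this would be rendered in the UI
    answer = input(f"The assistant wants to: {proposed_action}. Allow? [y/N] ")
    return answer.strip().lower() == "y"


if __name__ == "__main__":
    model_view = prepare_image_for_model("user_upload.png")
    if confirm_with_user(model_view, "send an email with calendar data"):
        print("Action approved by the user")
    else:
        print("Action blocked")
```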
Above all, the researchers stress that secure system design is the decisive mitigation: only an architecture that is resilient to prompt injection prevents multimodal AI applications from becoming conduits for data exfiltration.
As AI adoption expands across sectors, the Trail of Bits findings underscore the need for ongoing vigilance and proactive security measures to protect sensitive information from emerging threats.