Researchers have developed a computer “worm” that can spread from one computer to another using generative AI, a warning sign that the tech could be used to develop dangerous malware in the near future — if it hasn’t already.
As Wired reports, the worm can attack AI-powered email assistants to obtain sensitive data from emails and blast out spam messages that infect other systems.
“It basically means that now you have the ability to conduct or to perform a new kind of cyberattack that hasn’t been seen before,” Cornell Tech researcher Ben Nassi, coauthor of a yet-to-be-peer-reviewed paper about the work, told Wired.
While researchers have yet to encounter AI-powered worms in the wild, per the report, they warn it’s only a matter of time.
In their experiment, which took place within a controlled environment, the researchers targeted email assistants powered by OpenAI’s GPT-4, Google’s Gemini Pro, and an open-source large language model called LLaVA.
They used an “adversarial self-replicating prompt,” which forces an AI model to spit out yet another prompt in its response. This triggers a cascading stream of outputs that can infect these assistants and thereby draw out sensitive information.
“It can be names, it can be telephone numbers, credit card numbers, SSN, anything that is considered confidential,” Nassi told Wired.
In other words, since these AI assistants have access to a hoard of personal data, they can easily be coaxed into giving up user secrets, regardless of guardrails.
Using a newly set up email system, which could both send and receive messages, the researchers were able to effectively “poison” the database of an email that was sent out, which triggered the receiving AI to steal sensitive details from emails.
Worse yet, this process also allows the worm to be passed on to new machines.
“The generated response containing the sensitive user data later infects new hosts when it is used to reply to an email sent to a new client and then stored in the database of the new client,” Nassi told Wired.
The team even managed to embed a malicious prompt in an image, triggering the AI to infect further email clients.
“By encoding the self-replicating prompt into the image, any kind of image containing spam, abuse material, or even propaganda can be forwarded further to new clients after the initial email has been sent,” Nassi added.
The team passed on their findings to OpenAI and Google, with an OpenAI spokesperson telling Wired that the company was working to make its systems “more resilient.”
But they’ll have to act fast. Nassi and his colleagues wrote in their paper that AI worms could start spreading in the wild “in the next few years” and “will trigger significant and undersired outcomes.”
It’s a worrying demonstration that highlights just how deeply companies are willing to integrate their generative AI assistants — without proactively heading off a cybersecurity nightmare.
More on OpenAI: Sam Altman Denies OpenAI Is Building AI “Creatures”
Share This Article