Vector Institute and CleverHans Lab show AI agents can power adaptive, self-sustaining computer worms

Jun 2, 2026; published Jun 4, 2026

On June 2, 2026 a team at the University of Toronto and the Vector Institute released a paper on arXiv showing that an AI agent can power a computer worm capable of writing fresh exploits as it spreads across a network. The paper, "AI Agents Enable Adaptive Computer Worms," is led by associate professor Nicolas Papernot of the CleverHans Lab, with co-authors Jonas Guan, Tom Blanchard, Hanna Foerster, Hengrui Jia, and Gabriel Huang. The University of Toronto's announcement says the team is the first to demonstrate that publicly available AI models can be used to power a worm that adapts its strategy as it spreads from one device to the next, and that the researchers shared their findings with national-security bodies before publishing.

What's new

The CleverHans Lab worm is not a traditional WannaCry-style program that relies on a hard-coded vulnerability. Instead, each compromised machine runs an open-weight large language model and uses it as a reasoning core for the next hop. In the authors' words, "the worm parasitically uses compromised machines to run open-weight large language models (LLMs) to sustain its reasoning, or extend its reach for further attacks." The researchers deployed the worm on a test network that spans Linux, Windows, and Internet-of-Things hosts and report that it "propagated by exploiting common, real-world corporate network vulnerabilities."

Two structural properties of the design follow:

Zero marginal compute cost. Because each new infection donates its hardware to the next round of reasoning, the attacker pays nothing per target. The paper calls this a "destabilizing economic asymmetry between attackers and defenders."
Centralized safety controls don't apply. With open-weight models running on stolen compute, there is no API to throttle and no service to refuse the request. The authors note that "centralized safety controls, such as service refusals or rate limiting, are structurally irrelevant" to this class of threat.

The University of Toronto's news release frames the result as a class of risk rather than a specific exploit kit.

Context

Computer worms are not new — Morris in 1988, ILOVEYOU in 2000, WannaCry in 2017. What has historically blunted them is the patch cycle: once a vulnerability is known, defenders ship a fix and the worm runs out of unpatched targets. The CleverHans paper challenges that defensive playbook by showing that the exploit logic can be synthesized at runtime by an LLM, against vulnerabilities the model never saw during training.

Until now, the prevailing assumption around frontier-AI misuse has been that the model providers themselves are a meaningful bottleneck — refuse the request, throttle the user, log the API call. That assumption holds when the attacker depends on a commercial API. It does not hold when the attacker can run an open-weight model from inside the network they are already attacking.

Why it matters

Read as analysis rather than as fact, the paper points to three near-term implications. First, the long-running debate over which AI capabilities to gate at the API layer becomes less load-bearing for this specific threat class; the relevant control surface is the endpoint, not the model provider. Second, the security industry's pattern of "patch the CVE, kill the worm" loses some of its leverage when the attacker is generating new exploit chains on the fly. Third, network-level defenses — segmentation, egress controls, anomaly detection — get more, not less, important, because they catch propagation behavior regardless of which specific exploit the worm has just synthesized.
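To make the third point concrete, here is a minimal defensive sketch, not taken from the paper, of one kind of propagation-behavior check: flagging hosts that suddenly contact an unusually large number of distinct peers. The function name `flag_fanout`, the `(timestamp, source, destination)` record format, and the threshold values are all assumptions chosen for illustration; a real deployment would tune them against its own traffic baseline.

```python
from collections import defaultdict


def flag_fanout(connections, window_seconds=60, max_distinct_peers=20):
    """Return the set of source hosts that contacted more than
    max_distinct_peers distinct destinations within any sliding window
    of window_seconds.

    connections: iterable of (timestamp_seconds, source, destination).
    The record format and thresholds are illustrative assumptions.
    """
    by_source = defaultdict(list)
    for ts, src, dst in connections:
        by_source[src].append((ts, dst))

    flagged = set()
    for src, events in by_source.items():
        events.sort()  # order each host's connections by time
        start = 0
        in_window = defaultdict(int)  # destination -> count inside window
        for ts, dst in events:
            in_window[dst] += 1
            # Drop connections that have aged out of the window.
            while ts - events[start][0] > window_seconds:
                old_dst = events[start][1]
                in_window[old_dst] -= 1
                if in_window[old_dst] == 0:
                    del in_window[old_dst]
                start += 1
            if len(in_window) > max_distinct_peers:
                flagged.add(src)
                break
    return flagged
```

The check looks only at who talks to whom and how quickly, so it does not depend on which exploit was used to move between hosts. That is the property the paragraph above attributes to network-level defenses.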

The authors close the abstract with a one-line summary of the policy implication: "We must prepare for autonomous generative adversaries: malware systems that propagate without human operators and are defined not by fixed exploit code, but by the capacity to reason about targets, adapt to observations, and synthesize attack logic in real time."

Corroborating sources

arXiv.org
https://arxiv.org/abs/2606.03811
“Here we show that artificial intelligence (AI) agents enable a fundamentally new threat: a worm that generates tailored attack strategies to each target it encounters.”
Utoronto.ca
https://www.utoronto.ca/news/u-t-researchers-demonstrate-ai-worm-could-target-any-online-device
