Google DeepMind Backs $10M Research Fund to Address Safety Risks in Large-Scale Multi-Agent AI Systems
Google DeepMind has joined Schmidt Sciences, the UK's ARIA, the Cooperative AI Foundation, and Google.org to establish a $10 million research fund focused on identifying and mitigating the safety risks that emerge when millions of AI agents interact autonomously. Concerns include prompt injection attacks, fraud, and broader cybersecurity threats that could destabilize digital infrastructure as agents are deployed at scale.

Highlights
- Google DeepMind, Schmidt Sciences, the UK's ARIA, the Cooperative AI Foundation, and Google.org have jointly established a $10 million fund to research multi-agent AI safety risks.
- Google DeepMind's Rohin Shah warns that large-scale AI agent deployment is only months away and could bring new safety risks, including fraud and prompt injection attacks.
- Researchers argue that sandbox simulation of large agent populations is the only reliable method to study emergent behaviors in multi-agent systems.
- Anthropic has released Zero Trust-based deployment guidelines for AI agents, treating potential system compromise as inevitable rather than merely possible.
- Cybersecurity expert Refael Angel notes that agentic AI breaks all traditional security assumptions, as agents reason and improvise rather than follow fixed execution paths.
Google DeepMind is funding research into the potential dangers that could arise when millions of distinct AI agents interact with one another across the internet.
According to Rohin Shah, who leads AGI safety and alignment research at the company, the mass deployment of agentic systems — capable of executing tasks without human supervision and accepting instructions from other AI agents — introduces an entirely new category of risk.
$10 Million Research Fund Launched
To address this challenge, Google DeepMind — which placed agentic tools front and center at last month's Google I/O developer conference — has partnered with several organizations to announce a $10 million research fund. The fund is intended to support researchers studying the behavioral dynamics of multi-agent systems and developing methods to prevent unsafe outcomes.
Participating organizations include:
- Schmidt Sciences: the philanthropic foundation established by Eric and Wendy Schmidt
- ARIA: the UK government's Advanced Research and Invention Agency
- Cooperative AI Foundation: a UK-based nonprofit research organization
- Google.org: Google's philanthropic arm
Shah said the initiative is designed to stimulate research beyond the technology industry itself: "The advantage of academia is the ability to look further into the future and pursue work that industry labs haven't yet prioritized."
He added: "The primary problem is that there is virtually no academic field dedicated to multi-agent safety, and we want that field to exist."
The Risk: A Breakdown of the Digital Commons
The risks that concern Shah and James Fox, head of the Trustworthy AI Science program at Schmidt Sciences, are largely amplified versions of existing online threats: fraud, prompt injection attacks (in which malicious instructions are embedded in content an AI agent processes, turning the agent into a self-directed malicious actor), and other forms of cyberattack.
"We have a digital commons that is essential to the functioning of society, and we need to make sure it doesn't descend into chaos," Fox warned.
Shah believes there are only months remaining before AI agents are deployed at scale across the broader economy — and potentially before those risks materialize.
Sandbox Simulation: The Only Path Forward
Both Shah and Fox argue that the only viable way to understand how multi-agent systems behave when large numbers of agents interact simultaneously is to simulate real-world scenarios: placing AI agents in sandboxed environments and studying their behavior.
Fox emphasized that it is not possible to predict emergent group behavior by studying individual agents or small agent clusters, nor can researchers assume that large language model (LLM)-based AI agents will always act rationally. The complexity stems from the sheer volume of interactions occurring simultaneously.
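As a rough illustration of what studying a population in a sandbox can mean, the toy Python sketch below tracks how a single compromised agent spreads a malicious instruction through random contacts. It is not the researchers' method, and it uses no LLM agents. The function name `simulate_sandbox` and the parameters `contacts_per_round` and `susceptibility` are assumptions made up for this example.

```python
import random


def simulate_sandbox(num_agents, contacts_per_round, susceptibility, rounds, seed=0):
    """Toy sandbox run (hypothetical model, not from the article).

    Agent 0 starts out compromised. In each round every compromised agent
    contacts a few random peers, and each contacted peer is compromised
    with probability `susceptibility`.
    Returns the number of compromised agents after each round.
    """
    rng = random.Random(seed)  # seeded so a run can be repeated exactly
    compromised = {0}
    history = [len(compromised)]
    for _ in range(rounds):
        newly_compromised = set()
        for _agent in compromised:
            for _ in range(contacts_per_round):
                peer = rng.randrange(num_agents)
                if peer not in compromised and rng.random() < susceptibility:
                    newly_compromised.add(peer)
        compromised |= newly_compromised
        history.append(len(compromised))
    return history


if __name__ == "__main__":
    # Compare a small cluster with a larger population under the same rules.
    for population in (5, 10_000):
        print(population, simulate_sandbox(population, 3, 0.2, 10))
```

Running the same rules on a handful of agents and on a large population gives a feel for why population-scale dynamics have to be measured at population scale rather than inferred from a small cluster.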
Some researchers, including a team at Google DeepMind, have proposed that artificial general intelligence (AGI) — if achievable — may not emerge from a single superintelligent model, but rather from a kind of collective agent consciousness whose combined capabilities exceed the sum of its parts.
Zero-Trust Security Frameworks Gain Attention
Google DeepMind is not the only leading AI company raising alarms about risks posed by its own technology. Weeks ago, Anthropic published deployment guidelines for AI agents grounded in the cybersecurity field's Zero Trust model — a framework that assumes computer systems are inherently vulnerable, treats agents as potential attackers, and accepts that breaches will eventually occur.
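To make the Zero Trust idea concrete, here is a minimal Python sketch of a deny-by-default check applied to every agent action. It is a generic illustration, not Anthropic's published guidelines. The agent names, the `ALLOWED` table, and the `authorize` and `execute` functions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionRequest:
    agent_id: str
    action: str
    resource: str


# Hypothetical allowlist: anything not listed here is denied.
ALLOWED = {
    ("billing-agent", "read", "invoices"),
    ("billing-agent", "write", "invoices"),
    ("support-agent", "read", "tickets"),
}


def authorize(request, allowlist=ALLOWED):
    """Deny by default: permit only explicitly listed (agent, action, resource) triples."""
    return (request.agent_id, request.action, request.resource) in allowlist


def execute(request, audit_log):
    """Check every request, record the decision, and refuse anything not allowed."""
    allowed = authorize(request)
    audit_log.append((request, allowed))  # log denials too, since breaches are assumed
    if not allowed:
        raise PermissionError(
            f"{request.agent_id} may not {request.action} {request.resource}"
        )
    return f"{request.agent_id} may {request.action} {request.resource}"
```

The design choice worth noting is that no agent is trusted because of who it is or what it did earlier: each request is checked on its own, and every decision is logged so that a compromised agent leaves a trail.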
Refael Angel, co-founder and CTO of Israeli cybersecurity firm Akeyless, based in Tel Aviv, stressed the importance of understanding the novel risks introduced by agentic systems:
"Every security approach to date assumed the protected machine was running software written by humans, executing fixed operations along fixed paths. Agentic systems break all of those assumptions — they reason, improvise, and can be hijacked by a single sentence buried in a document they were asked to read."
Angel welcomed the new funding, stating: "It shouldn't be up to a single lab to set security standards that everyone else must trust." However, he cautioned that security researchers sometimes overlook existing "mundane" problems in favor of more theoretically compelling hypothetical threats.
Fox offered a closing perspective: "Risks that were hypothetical just a few years ago are now very real. The future is arriving faster than expected."


