Unvetted agents with broad network access create prompt-injection and credential-theft risks

Ronghui Gu, co-founder and CEO of blockchain security auditor CertiK, warned this week that rapid deployment of unisolated, unvetted AI agents creates widespread security vulnerabilities. His concerns center on agents that now operate beyond chat interfaces, calling external tools, reading local files, triggering workflows, and accessing financial infrastructure.

“Right now, agents are no longer just answering questions in a chat window,” Gu said. “They are beginning to call external tools, read local files, trigger workflows, and interact with financial infrastructure. But if you do not isolate the execution environment and scan these tools first, you are handing a compromised identity broad internal access to your entire network.”

CertiK’s analysis identified hundreds of critical security advisories in agent structures. Researchers also discovered hundreds of malicious skills, fake installers, and lookalike dependency packages on open agent utility hubs. A separate supply-chain attack known as TrapDoor planted 34 malicious packages across npm, PyPI, and Crates.io.

Prompt injection attacks pose a particular threat. These attacks embed hidden natural language instructions in webpages, PDFs, or emails to redirect AI agent behavior without requiring malicious code. Malicious plug-ins use natural language to influence agent actions, bypassing signature-based antivirus software entirely.

“The scam apps use natural language to influence behavior, making them totally resistant to traditional antivirus scans,” Gu said. “And right now, it is even easier to scam the machine than it is to scam a human.”

CertiK’s telemetry observed automated onchain scams designed to target other AI trading bots and automated agent systems. These hyperfast, ephemeral scams operate for 10 minutes or a few hours before disappearing, making detection difficult.

Industry leaders acknowledge the scale of the emerging problem. Brian Armstrong, CEO of Coinbase, stated that “very soon there are going to be more AI agents than humans making transactions.” Changpeng Zhao, founder of Binance, predicted agents will “make one million times more payments than humans.”

Charles Hoskinson, founder and CEO of Input Output, predicts AI agents will become more relevant than humans on the internet by 2035, underscoring the urgency of establishing secure deployment standards now.

Gu’s warnings align with CertiK’s landmark deep-dive report into agent infrastructure vulnerabilities. The core vulnerability stems from a lack of isolation between agent execution environments and external systems. Without proper sandboxing and pre-deployment scanning, agents operating with legitimate credentials can become conduits for attackers to access user data, financial accounts, and internal networks.

The credential access problem

As agents gain autonomy to interact with financial systems, the attack surface expands dramatically. A compromised agent or malicious plug-in can inherit the permissions of its deployment context. An agent with access to a user’s API keys, wallet credentials, or banking tokens becomes a liability if its execution environment is not isolated.

Traditional security tools designed for human-operated systems do not detect natural language manipulation. Antivirus software relies on signature matching and behavioral heuristics tuned for binary executables and known malware families. AI agents, by contrast, respond to linguistic input that can be crafted to bypass those defenses entirely.

The convergence of three factors creates acute risk: agents now operate autonomously in production environments, they access sensitive systems and credentials, and the attack methods targeting them exploit natural language processing rather than code vulnerabilities.