AI agents are now able to exploit smart contracts on Ethereum and other blockchains, raising urgent questions about the economic risks of autonomous cyber capabilities.
Summary
- Frontier AI models, including GPT-5 and Claude, exploited smart contracts on Ethereum and other blockchains in simulated tests.
- The AI models also discovered previously unknown security flaws, known as zero-day vulnerabilities, in recently deployed smart contracts.
- The findings highlight the urgent need for proactive, AI-powered defense strategies, as AI agents now rival human hackers in identifying profitable blockchain exploits.
A joint project from Anthropic and MATS Fellows used the newly created Smart CONtracts Exploitation benchmark (SCONE bench) to test AI models against 405 real-world contracts that were exploited between 2020 and 2025.
In simulated attacks on contracts exploited after March 2025, Claude Opus 4.5, Claude Sonnet 4.5, and GPT-5 produced exploits worth a combined $4.6 million, demonstrating a concrete lower bound on the potential financial damage AI could cause. By expanding the testing to 2,849 recently deployed contracts with no known vulnerabilities, GPT-5 and Sonnet 4.5 exposed two new zero-day vulnerabilities, generating a simulated profit of nearly $3,700.
SCONE bench: Quantifying exploits in dollars, not bugs
Traditional cybersecurity benchmarks measure success through detection rates or arbitrary scores, whereas SCONE bench evaluates AI exploits in financial terms, providing a more tangible measure of risk. Smart contracts are particularly well suited to this approach because vulnerabilities can translate directly into stolen funds, and simulations allow researchers to quantify the potential losses.
Across the 405 contracts in SCONE bench, 10 AI models produced exploits for 207, totaling $550.1 million in simulated stolen funds. Even after accounting for potential data contamination, frontier models were consistently able to exploit contracts that were attacked only after the models' knowledge cutoffs.
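To put those headline figures in proportion, the arithmetic can be sketched in a few lines of Python. The variable names are illustrative, and the per-contract mean is a derived figure that does not appear in the study; because exploit values are likely very uneven, it should be read as a rough scale indicator only.

```python
# Figures reported in the article.
total_contracts = 405          # contracts in SCONE bench
exploited_contracts = 207      # contracts for which an exploit was produced
simulated_stolen_usd = 550_100_000

# Share of benchmark contracts that were exploited.
success_rate = exploited_contracts / total_contracts

# Derived (not reported) mean simulated loss per exploited contract.
mean_loss_usd = simulated_stolen_usd / exploited_contracts

print(f"Exploited share: {success_rate:.1%}")
print(f"Mean simulated loss per exploited contract: ${mean_loss_usd:,.0f}")
```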
Concrete examples of AI exploits
One tested vulnerability involved a token calculator function on an Ethereum-compatible contract that was accidentally left writable. The AI agent repeatedly called the function to inflate its token balance, generating a simulated profit of about $2,500 and, under peak liquidity conditions, a potential $19,000. An independent white-hat intervention later recovered the funds at risk.
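The class of bug described above, a helper meant to be read-only that also writes state, can be illustrated with a toy Python model. This is a minimal sketch under stated assumptions: the class, method names, and reward formula are all hypothetical and do not reproduce the contract from the study.

```python
class ToyToken:
    """Hypothetical in-memory token ledger; not the contract from the study."""

    def __init__(self):
        self.balances = {}

    def calculate_reward(self, account, amount):
        # Intended as a read-only calculator for the reward on `amount`.
        reward = amount // 10
        # Bug: the helper also mutates state, so each call credits the caller.
        self.balances[account] = self.balances.get(account, 0) + reward
        return reward

    def calculate_reward_fixed(self, account, amount):
        # Fixed version: pure computation with no state change.
        return amount // 10


# Repeated calls to the buggy helper inflate the caller's balance.
buggy = ToyToken()
for _ in range(5):
    buggy.calculate_reward("caller", 1_000)
inflated_balance = buggy.balances["caller"]

# The same calls against the fixed helper leave the ledger untouched.
fixed = ToyToken()
for _ in range(5):
    fixed.calculate_reward_fixed("caller", 1_000)
safe_balance = fixed.balances.get("caller", 0)

print(inflated_balance, safe_balance)
```

In this toy model the defensive fix is simply to keep calculator-style helpers free of side effects; real contract languages offer dedicated read-only function markers for the same purpose.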
The research underlines that AI agents are now approaching the capabilities of humans in tasks such as control-flow reasoning, boundary analysis and exploiting software vulnerabilities – a set of skills directly applicable to both blockchain and traditional software systems.
The study highlights that AI cyber capabilities are rapidly increasing, from network intrusions to autonomous exploitation of blockchain applications. SCONE bench provides a defensive tool that allows smart contract developers to test systems before deployment.
According to the researchers, the findings are proof-of-concept that profitable, autonomous exploitation is feasible in the real world, highlighting the urgent need for proactive, AI-powered defenses to protect financial systems and digital assets.