The Natural Environment
for AI Agents is
Onchain
We build the training grounds.
RL environments where AI agents learn to operate onchain. No wrappers. No shortcuts. The actual production skill.
"AI can create a game — but doesn't know how to play it."
Frontier models write smart contracts fluently. They can't use them. BlockchainRL teaches AI to play.
The Problem
AI Agents
Know Crypto.
They Can't
Do Crypto.
Frontier models write Solidity but cannot execute a swap, navigate MEV, or react to onchain state. Knowledge without action is worthless.
Agentic RL Environments
Agents dropped into simulated worlds. Trial-and-error interaction. Trains operational capability.
- Method: Trial-and-error in live environments
- Reward: Automated — transaction succeeded or failed
- Crypto Operational RL players: zero
BlockchainRL fills this gap.
20+ companies build Operational RL environments. All target coding, computer use, or enterprise workflows.
Zero target crypto.
The Solution
Agentic RL Environments for Onchain Operations
Docker containers where agents learn by doing. Forked chains, funded wallets, raw JSON-RPC. No abstraction layers.
environment:
chain: anvil --fork-url $ETH_RPC
wallet: funded, 100 ETH
contracts: [uniswap_v3, aave_v3, morpho]
task: "provide USDC/ETH liquidity, rebalance on 2% drift"
action_space:
- eth_sendTransaction
- eth_call
- eth_getLogs
grader: onchain_state → reward
Environment Categories
DeFi Operations
Multi-hop swaps, lending, yield optimization across Uniswap, Aave, Morpho
MEV & Trading
Arbitrage, liquidations, sandwich defense in simulated mempools
Contract Security
Detect and exploit vulnerabilities in adversarial scenarios
Cross-Chain Ops
Bridge selection, multi-chain portfolio management
Position Management
Rebalance under market stress, maintain position health
Open Benchmark
Measure the Gap
BlockchainBench measures the gap. BlockchainRL closes it.
13
Tasks across real DeFi protocols
3
Difficulty tiers from basic to advanced
Why Now
Five Converging Forces
Massive market, growing fast, with a crypto-shaped hole.
RL Market on Fire
Applied Compute: $0 to $1.3B in 8 months. Total RL environment market estimated at $4-8B/year.
Labs Spending Big
Anthropic and OpenAI each spend ~$1B/year on RL environments. They buy externally — Anthropic alone uses 12+ vendors at $300-500K/quarter.
Crypto Evals Already Exist
EVMbench and SCONE-bench prove labs treat crypto as a real AI domain. But these are Knowledge RL — not Operational RL.
Crypto VCs Want It
Coinbase Ventures names onchain agent training a core thesis. Leading crypto VCs identify RL fine-tuning on blockchain as missing infrastructure.
The Gap is Wide Open
20+ companies build Operational RL environments. Zero target crypto. Wrapper-dependent agents hit capability ceilings on complex DeFi operations.
$4-8B
per year in RL environments
The Category is Proven
Operational RL is a validated, venture-backed category. The crypto vertical is untouched.
$2B+
Confirmed annual spend (Anthropic + OpenAI)
20+
Operational RL companies — zero in crypto
$1.3B
Applied Compute valuation in 8 months
Build With Us
We are looking for design partners who share the conviction that AI agents need native onchain training — not more wrappers.