Pre-Seed · Building Now
The Natural Environment for AI Agents is Onchain
We build the training grounds.
Agentic reinforcement learning environments where AI agents learn to operate onchain through trial and error. No wrappers. No shortcuts. The actual production skill.
The Problem
AI Agents Know Crypto. They Can't Do Crypto.
Frontier models write Solidity at 80% pass@1 but cannot execute a DeFi swap, navigate MEV, or react to onchain state. Knowledge without operational capability is worthless onchain.
Benchmarks & Evals
Human evaluators rank model outputs. Models learn from preference data. Trains knowledge and reasoning about crypto.
- EVMbench, SCONE-bench, SolidityBench
- Method: Rank outputs against reference answers
- Result: Models detect bugs (70% with Codex 5.3) but still cannot operate onchain
Agentic RL Environments
Agents dropped into simulated worlds. They learn through trial-and-error interaction. Trains operational capability.
- Method: Trial-and-error in live environments
- Reward: Automated -- transaction succeeded or failed
- Crypto RL Do players: zero
13+ companies build RL Do environments. Every single one targets coding, computer use, or enterprise workflows. Zero target crypto.
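The distinction is mechanical. Here is a minimal sketch of the RL Do loop, assuming a hypothetical `env`/`agent` interface (illustrative, not an existing library's API):

```python
# Minimal sketch of the trial-and-error loop. The env/agent interface
# is an illustrative assumption, not an existing library's API.
def rollout(env, agent, max_steps: int = 20) -> float:
    """Run one episode; the chain itself grades every step."""
    obs = env.reset()                  # fresh forked chain, funded wallet, task prompt
    total = 0.0
    for _ in range(max_steps):
        action = agent.act(obs)        # a raw JSON-RPC call the agent chose to make
        obs, done = env.step(action)   # execute onchain, then re-read chain state
        total += env.grade()           # automated: tx succeeded or failed, no humans
        if done:
            break
    return total
```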
The Solution
Agentic RL Environments for Onchain Operations
Docker containers where agents learn by doing. Forked EVM chains, funded wallets, raw JSON-RPC. No abstraction layers. Agents learn the actual production skill.
Environment = forked chain (Anvil) + funded wallet + contract ABIs + task prompt
Action space = eth_sendTransaction + eth_call + eth_getLogs
Grader = read onchain state, compute reward
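A sketch of that wiring in Python with web3.py against a local Anvil fork; accounts and amounts are illustrative:

```python
# Sketch: wiring an environment to a local Anvil fork, started with
#   anvil --fork-url $ETH_RPC_URL
# Amounts are illustrative; web3.py maps onto the raw JSON-RPC action space.
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://127.0.0.1:8545"))  # Anvil's default endpoint
agent_addr = w3.eth.accounts[0]                         # Anvil pre-funds test accounts

# Funded wallet: top up further via Anvil's cheatcode RPC.
w3.provider.make_request("anvil_setBalance", [agent_addr, hex(100 * 10**18)])

# The action space maps one-to-one onto raw JSON-RPC:
#   eth_sendTransaction -> w3.eth.send_transaction(tx)
#   eth_call            -> w3.eth.call(params)
#   eth_getLogs         -> w3.eth.get_logs(filter_params)

# The grader reads state the same way the agent does.
print(w3.eth.get_balance(agent_addr))
```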
Machine-Verifiable Rewards
Onchain state is ground truth. Funds transferred or not, swap executed or not. No human annotators needed.
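For example, a swap task can be graded as a pure function of chain state. The token address, agent address, and threshold below are illustrative assumptions:

```python
# Sketch of a machine-verifiable grader: reward is a pure function of
# onchain state. Token, agent address, and threshold are illustrative.
from web3 import Web3

BALANCE_OF = Web3.keccak(text="balanceOf(address)")[:4]   # ERC-20 selector

def grade_swap(w3: Web3, token: str, agent: str, min_out: int) -> float:
    """1.0 iff the agent ended up holding at least min_out of `token`."""
    calldata = BALANCE_OF + bytes(12) + bytes.fromhex(agent[2:])  # pad address to 32 bytes
    raw = w3.eth.call({"to": token, "data": Web3.to_hex(calldata)})
    return 1.0 if int.from_bytes(raw, "big") >= min_out else 0.0
```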
Difficulty Curriculum
Tasks scale from guided (exact ABIs and hints provided) to open-ended (nothing but a funded wallet). Swap, then LP, then leverage, then cross-chain.
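One way to express the curriculum as plain data; tier names, prompts, and hint keys are illustrative assumptions:

```python
# Sketch of a difficulty curriculum as plain data.
# Tier names, prompts, and hints are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class Tier:
    name: str
    prompt: str
    hints: dict = field(default_factory=dict)  # exact ABIs, addresses, step hints

CURRICULUM = [
    Tier("guided", "Swap 1 ETH for USDC on Uniswap.",
         hints={"router_abi": "...", "router_address": "0x..."}),
    Tier("unassisted", "Provide liquidity to the ETH/USDC pool."),
    Tier("open_ended", "You have a funded wallet. Maximize portfolio value."),
]
```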
Raw Production Skill
No abstraction layers. Agents interact via raw JSON-RPC -- read ABIs, encode calldata, sequence approvals. The real thing.
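At the byte level that means, for instance, hand-packing an ERC-20 approve that must land before a swap can pull tokens. A sketch; token and router addresses are illustrative:

```python
# Sketch of "no abstraction layers": the agent computes the function
# selector and packs calldata by hand, and must sequence approve()
# before the swap. Token/router addresses are illustrative.
from web3 import Web3

def selector(signature: str) -> bytes:
    """First 4 bytes of keccak256 of the canonical signature."""
    return bytes(Web3.keccak(text=signature)[:4])

def encode_approve(spender: str, amount: int) -> str:
    calldata = (selector("approve(address,uint256)")
                + bytes(12) + bytes.fromhex(spender[2:])   # address, left-padded
                + amount.to_bytes(32, "big"))              # uint256 amount
    return Web3.to_hex(calldata)

# Sequencing: the approval must confirm before the swap can pull tokens.
# w3.eth.send_transaction({"from": agent_addr, "to": token,
#                          "data": encode_approve(router, 10**18)})
```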
Environment Categories
DeFi Operations
Multi-hop swaps, lending, yield optimization across Uniswap, Aave, Morpho
Reward: P&L, slippage, gas efficiency (shaping sketch after this list)
MEV & Trading
Arbitrage, liquidations, sandwich defense in simulated mempools
Reward: Profit captured
Contract Security
Detect and exploit vulnerabilities in adversarial scenarios
Reward: Binary success / fail
Cross-Chain Ops
Bridge selection, multi-chain portfolio management
Reward: Cost, time, success rate
Position Management
Rebalance under market stress, maintain position health
Reward: Position health delta
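Every reward above reduces to a scalar computed from onchain state. A shaping sketch for the DeFi Operations row; the slippage weight is an illustrative assumption:

```python
# Shaped reward for the DeFi Operations row: P&L net of gas, with a
# slippage penalty. The 0.01 weight is an illustrative assumption.
def defi_reward(pnl_wei: int, gas_spent_wei: int, slippage_bps: float) -> float:
    eth = 10**18
    return (pnl_wei - gas_spent_wei) / eth - 0.01 * slippage_bps
```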
Why Now
Five Converging Forces
The market is massive, growing fast, and has a crypto-shaped hole.
RL Market on Fire
Applied Compute went from $0 to $1.3B valuation in 8 months. Total RL environment market estimated at $4-8B/year across all labs.
Labs Spending Big
Anthropic and OpenAI each spend ~$1B/year on RL environments. OpenAI projects $8B by 2030. They buy externally -- Anthropic uses 12+ RL vendors with contracts at $300-500K/quarter.
Crypto Evals Already Exist
EVMbench (OpenAI/Paradigm) and SCONE-bench (Anthropic) prove labs treat crypto as a legitimate AI domain. But these are RL Know benchmarks, not RL Do environments.
Crypto VCs Want It
Coinbase Ventures names onchain agent training a core investment focus. Haseeb Qureshi (Dragonfly, $6.5B AUM) explicitly identifies RL fine-tuning on blockchain tasks as missing infrastructure.
The Gap is Wide Open
13+ companies build RL Do environments. Zero target crypto. Meanwhile, wrapper-dependent agents like Wayfinder and Clawi hit capability ceilings on complex DeFi operations.
Validation
The Category is Proven
RL Do is a validated, venture-backed category. The crypto vertical is untouched.
$4-8B
RL env market size per year
$2B+
Confirmed annual spend (Anthropic + OpenAI)
13+
RL Do companies -- zero in crypto
$1.3B
Applied Compute valuation in 8 months
Build With Us
We are looking for design partners and collaborators who share the conviction that AI agents need native onchain training -- not more wrappers.