
ERC-8183: A Dummies' Guide to Agentic Commerce

Zaryab Afser

05 Jun 2026

Machines can now hire other machines.

Not through API keys, or subscriptions, or a human clicking "approve" on a payment screen. Through an Ethereum standard — with escrow, evaluation, and enforceable service agreements baked into the smart contract.

ERC-8183 is the Agentic Commerce Protocol. It defines how an AI agent posts a job, locks funds in escrow, receives a deliverable, and settles payment — all on-chain, all trustless, all without a human in the loop. The Ethereum Foundation co-authored it. Virtuals Protocol built the first production implementation. BNB Chain, Arc Network, and a growing list of projects are adopting it.

And almost nobody outside the agent infrastructure crowd is paying attention.

Let's dive right in.

ERC-8183 in 100 Words

ERC-8183 is an escrow-based job protocol for autonomous agents.

Three roles: a Client posts a job, a Provider performs the work, and an Evaluator — a designated third party — decides whether the work was completed.

Four states: Open, Funded, Submitted, Terminal. The Client locks ERC-20 tokens in escrow. The Provider submits a deliverable. The Evaluator either releases funds to the Provider or rejects the job, returning funds to the Client. If nobody acts before the expiry, anyone can trigger a refund.

Optional Hooks extend the lifecycle with reputation gating, SLA enforcement, or custom logic.

The Three Roles

Let's make this concrete. Say we have an AI research agent — call it Agent R — that needs an analytics report on Uniswap V3 pool behavior over the last 90 days. Agent R does not have the tooling to run the analysis itself. But Agent D — a data agent specializing in on-chain analytics — does.

Here is how they transact under ERC-8183.

Client — Agent R. Posts the job, specifies the deliverable ("90-day Uniswap V3 pool analytics report"), sets the payment (100 USDC), names the evaluator, and sets an expiry window. Agent R is the one paying.

Provider — Agent D. Accepts the job, performs the analysis, submits the deliverable on-chain (a hash or URI pointing to the report). Agent D is the one working.

Evaluator — a third-party agent or contract designated at job creation. Not Agent R. Not Agent D. The evaluator is the only party authorized to release escrowed funds or reject the submission. This is the design decision that makes ERC-8183 different from a simple token transfer — a neutral third party holds the judgment call.

Why does this matter?

Without the evaluator role, agent commerce degrades into one of two failure modes. Either the client pays upfront and hopes the provider delivers (trust-based, no recourse). Or the provider delivers first and hopes the client pays (trust-based, no guarantee). ERC-8183 eliminates both by locking funds in escrow and routing the release decision through a designated third party.

The Four-State Lifecycle

Every job in ERC-8183 follows the same state machine. No exceptions.

Open — Agent R calls createJob. It specifies Agent D's address, the evaluator's address, the expiry timestamp, the task description, the payment token (USDC), the amount (100 USDC), and an optional Hook contract. No funds are locked yet. The job exists as a commitment, not a payment.

Funded — Agent R (or any party) calls fundJob. 100 USDC transfers from Agent R's wallet into the contract's escrow. The funds are locked. Agent D can now begin work, knowing the payment exists and cannot be pulled back (except through the evaluator or expiry).

Submitted — Agent D completes the analytics report. It calls submitJob with a deliverable reference — an IPFS hash, a URI, or on-chain data pointing to the report. The job moves to Submitted. Now the evaluator decides.

Terminal — One of three outcomes:

The evaluator calls completeJob. 100 USDC releases from escrow to Agent D. Job done.

The evaluator calls rejectJob. 100 USDC returns to Agent R. Agent D delivered, but the work did not meet the standard.

Nobody acts before the expiry timestamp. Anyone can call claimRefund. 100 USDC returns to Agent R. The system defaults to protecting the client's capital.
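To make the state machine tangible, here is a minimal Python sketch of the lifecycle described above. It is a toy in-memory model, not the reference contract: the class and method names (Job, Escrow, fund_job and so on) are hypothetical, dictionary balances stand in for ERC-20 transfers, and the sketch assumes a refund can be claimed from either Funded or Submitted once the expiry has passed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    OPEN = "Open"
    FUNDED = "Funded"
    SUBMITTED = "Submitted"
    TERMINAL = "Terminal"


@dataclass
class Job:
    client: str
    provider: str
    evaluator: str
    expires_at: int                 # unix timestamp
    amount: int                     # token units, e.g. 100 USDC
    status: Status = Status.OPEN
    deliverable: Optional[str] = None
    outcome: Optional[str] = None   # "completed", "rejected" or "refunded"


def require(condition: bool, message: str) -> None:
    """Abort the call, the way a Solidity require() would revert."""
    if not condition:
        raise RuntimeError(message)


class Escrow:
    """Toy in-memory escrow; dict balances stand in for ERC-20 transfers."""

    def __init__(self, balances: dict):
        self.balances = dict(balances)
        self.held = 0

    def fund_job(self, job: Job, payer: str) -> None:
        require(job.status is Status.OPEN, "job is not Open")
        require(self.balances.get(payer, 0) >= job.amount, "insufficient balance")
        self.balances[payer] = self.balances.get(payer, 0) - job.amount
        self.held += job.amount
        job.status = Status.FUNDED

    def submit_job(self, job: Job, caller: str, deliverable: str) -> None:
        require(job.status is Status.FUNDED, "job is not Funded")
        require(caller == job.provider, "only the provider submits")
        job.deliverable = deliverable
        job.status = Status.SUBMITTED

    def _settle(self, job: Job, recipient: str, outcome: str) -> None:
        # Move the escrowed amount out and close the job.
        self.held -= job.amount
        self.balances[recipient] = self.balances.get(recipient, 0) + job.amount
        job.status, job.outcome = Status.TERMINAL, outcome

    def complete_job(self, job: Job, caller: str) -> None:
        require(job.status is Status.SUBMITTED, "job is not Submitted")
        require(caller == job.evaluator, "only the evaluator completes")
        self._settle(job, job.provider, "completed")

    def reject_job(self, job: Job, caller: str) -> None:
        require(job.status is Status.SUBMITTED, "job is not Submitted")
        require(caller == job.evaluator, "only the evaluator rejects")
        self._settle(job, job.client, "rejected")

    def claim_refund(self, job: Job, now: int) -> None:
        # Callable by anyone once the expiry has passed (assumed rule).
        require(job.status in (Status.FUNDED, Status.SUBMITTED), "nothing to refund")
        require(now >= job.expires_at, "job has not expired")
        self._settle(job, job.client, "refunded")
```

Every path into Terminal goes through the same settlement helper, so the escrowed amount leaves the contract exactly once whichever of the three outcomes occurs.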

So what does this mean in practice?

It means Agent R can hire Agent D without ever trusting Agent D. The escrow guarantees that the money exists. The evaluator guarantees that the money only moves when the work is verified. And the expiry guarantees that funds never get permanently stuck — even if the evaluator disappears.

One contract. Three roles. Four states. That is the entire primitive.

The Hook System — Why Minimalism Works

ERC-8183 is deliberately minimal. The base spec handles escrow and state transitions. Everything else — reputation, SLAs, custom payment logic — plugs in through Hooks.

Hooks are optional smart contracts attached to a job that execute additional logic before or after each state transition. The spec defines the IACPHook interface. Developers implement it.
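A rough Python sketch of the before/after pattern follows. Only the name IACPHook comes from the spec as described here. The method names before_action and after_action, the AuditHook class and the run_with_hook helper are assumptions made for illustration and may not match the real interface.

```python
from typing import Callable, Optional, Protocol


class JobHook(Protocol):
    """Hypothetical Python stand-in for the spec's IACPHook interface."""

    def before_action(self, action: str, job_id: int) -> None: ...
    def after_action(self, action: str, job_id: int) -> None: ...


class AuditHook:
    """Example hook: records every callback it receives."""

    def __init__(self) -> None:
        self.log = []

    def before_action(self, action: str, job_id: int) -> None:
        self.log.append(("before", action, job_id))

    def after_action(self, action: str, job_id: int) -> None:
        self.log.append(("after", action, job_id))


def run_with_hook(hook: Optional[JobHook], action: str, job_id: int,
                  transition: Callable[[], str]) -> str:
    """Wrap one state transition with the optional hook callbacks.

    An exception raised by before_action aborts the transition,
    mirroring how a revert inside a hook would revert the whole call.
    """
    if hook is not None:
        hook.before_action(action, job_id)
    result = transition()
    if hook is not None:
        hook.after_action(action, job_id)
    return result
```

Because the hook is optional, a job created without one takes the same code path with both callbacks skipped.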

Three production hooks exist today.

ReputationHook — auto-writes job outcomes (completed or rejected) to ERC-8004's ReputationRegistry after every job resolution. Every completed job improves the provider's on-chain reputation. Every rejection hurts it.

ReputationGateHook — gates provider funding by checking the provider's ERC-8004 reputation score before allowing fundJob to execute. A provider with a reputation below the threshold cannot receive funded jobs. This hook is live on Base Mainnet with verified contracts.

SLAHook — enforces submission deadlines after funding. If the provider does not submit within the SLA window, the job can be expired early. Time-based constraints beyond the global expiry.
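The timing rule behind an SLA-style hook can be sketched in a few lines of Python. This illustrates the idea only and is not the deployed SLAHook: the class name, method names and the use of plain integer timestamps are all assumptions.

```python
class HookRevert(Exception):
    """Stands in for an EVM revert raised inside a hook."""


class SlaWindow:
    """Tracks a per-job submission deadline that starts at funding time."""

    def __init__(self, window_seconds: int):
        self.window_seconds = window_seconds
        self.deadlines = {}   # job_id -> latest allowed submission time

    def after_fund(self, job_id: int, funded_at: int) -> None:
        # The SLA clock starts when the job is funded.
        self.deadlines[job_id] = funded_at + self.window_seconds

    def before_submit(self, job_id: int, now: int) -> None:
        # Late submissions are refused.
        if now > self.deadlines[job_id]:
            raise HookRevert("SLA window elapsed")

    def can_expire_early(self, job_id: int, now: int) -> bool:
        # True once the provider has missed the SLA window.
        return now > self.deadlines[job_id]
```

For a job funded at t = 1000 with a 3600-second window, a submission at t = 2000 passes, while at t = 5000 the submission is refused and the job becomes eligible for early expiry.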

Now, here is the design decision that reveals the spec authors' priorities.

claimRefund is deliberately excluded from the Hook system. If hooks could block refund claims, a malicious hook contract could permanently lock escrowed funds by reverting on every claimRefund attempt. By making refunds non-hookable, ERC-8183 guarantees that expired jobs always return funds to the client — regardless of what hook logic is attached.
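The effect of that exclusion is easy to demonstrate with a small Python sketch. The dispatcher below is hypothetical, and the set of hookable actions is an assumption drawn from the functions named in this article. The point it illustrates is that the refund path never consults the hook, so even a hook that reverts on everything cannot trap funds.

```python
class AlwaysRevertHook:
    """A hostile hook that rejects every action it is consulted on."""

    def before_action(self, action: str) -> None:
        raise RuntimeError(f"hook blocked {action}")


# Assumed set of hookable actions; claimRefund is deliberately absent.
HOOKABLE = {"fundJob", "submitJob", "completeJob", "rejectJob"}


def call(action: str, hook, now: int, expires_at: int) -> str:
    """Dispatch an action, consulting the hook only for hookable actions."""
    if action in HOOKABLE and hook is not None:
        hook.before_action(action)      # may raise, aborting the action
    if action == "claimRefund":
        if now < expires_at:
            raise RuntimeError("job has not expired")
        return "refunded"
    return "ok"
```

With AlwaysRevertHook attached, call("completeJob", ...) raises, but call("claimRefund", ...) after the expiry still returns "refunded".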

Safety over flexibility. That is the tradeoff, and it is the right one.

Use Cases

Virtuals Protocol — the primary production deployment. Live on Base and Arbitrum. The broader Virtuals ecosystem includes 18,000+ tokenized agents. Virtuals launched a Revenue Network in February 2026 that distributes up to $1M per month to agents selling services through ACP. ACP v2 shipped with unified job interfaces, custom job offering definitions, persistent on-chain accounts (agent-to-agent relationship history), and notification memos for post-completion follow-up.

One caveat on the numbers. Virtuals claims 2M+ ACP transactions. We cannot independently verify this — the figure comes from a single secondary source, and the breakdown between ERC-8183 escrow transactions and broader ACP system activity is not public. The traction is real. The precise scale is unverifiable.

BNBAgent SDK — BNB Chain's implementation, live on testnet as of March 2026. Mainnet pending (no confirmed date). The notable addition: BNBAgent routes disputes through UMA's Data Verification Mechanism, where token holders vote on outcomes. This adds a dispute resolution layer that the base ERC-8183 spec deliberately excludes.

Arc Network — a stablecoin-native L1 focused on USDC-denominated agent commerce. Published "Create your first ERC-8183 job" tutorials and ran an "Agentic Economy on Arc" hackathon in April 2026.

AgentHire — an open-source marketplace on Avalanche Fuji testnet implementing the full hire-bid-evaluate-settle flow with EIP-712 signed bids and reputation gating.

The pattern is clear. Multiple chains, multiple implementations, all converging on the same spec. That is what standards are supposed to do.

Where ERC-8183 Sits in the Stack

Three complementary standards are forming the foundational infrastructure for on-chain agent commerce. Understanding how they relate is the mental model that matters.

ERC-8004 answers: who is this agent, and can we trust it? Identity registration and on-chain reputation.

ERC-8183 answers: how do agents hire, deliver, evaluate, and settle? The commerce layer — escrow, job lifecycle, evaluation.

x402 answers: how does money move per-request over HTTP? Per-call micropayments, no escrow, no evaluation.

The three are not competitors. They are layers.

ERC-8183's ReputationGateHook consumes ERC-8004 reputation scores — a provider must have a minimum reputation to receive funded jobs. ERC-8183's ReputationHook writes back to ERC-8004 — every completed or rejected job auto-updates the provider's reputation. The two standards feed each other.
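The feedback loop can be sketched as two small functions sharing one registry. This is a toy model in Python: the scoring rule (+1 for a completed job, -1 for a rejected one) and every name below are assumptions, since this article does not describe how ERC-8004 actually computes scores.

```python
class ReputationRegistry:
    """Toy stand-in for ERC-8004's ReputationRegistry (scoring rule assumed)."""

    def __init__(self) -> None:
        self.scores = {}

    def record(self, provider: str, completed: bool) -> None:
        delta = 1 if completed else -1
        self.scores[provider] = self.scores.get(provider, 0) + delta

    def score(self, provider: str) -> int:
        return self.scores.get(provider, 0)


def on_job_resolved(registry: ReputationRegistry, provider: str, outcome: str) -> None:
    """ReputationHook side: write the outcome after a job resolves."""
    registry.record(provider, completed=(outcome == "completed"))


def may_be_funded(registry: ReputationRegistry, provider: str, min_score: int) -> bool:
    """ReputationGateHook side: read the score before fundJob runs."""
    return registry.score(provider) >= min_score
```

Under these assumed rules, a provider with two completed jobs clears a threshold of 2, and a single rejection afterwards drops it back below the gate.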

A quick decision framework:

An agent that needs a one-off data query (a single API call: pay $0.002, get a response) uses x402. No escrow needed. No evaluation needed. Pay and go.

An agent that needs another agent to perform a multi-step task (research a topic, produce a report, submit it for review) uses ERC-8183. Escrow locks the funds. The evaluator verifies the deliverable. The provider gets paid only when the work passes.

They are not interchangeable. They serve different transaction patterns.
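As a rule of thumb, the choice can be written as a one-line predicate. The Task fields below are hypothetical simplifications for illustration, not part of either protocol.

```python
from dataclasses import dataclass


@dataclass
class Task:
    steps: int            # how many distinct pieces of work are involved
    needs_review: bool    # must a third party judge the output?


def choose_protocol(task: Task) -> str:
    """One request with nothing to review: pay per call. Otherwise: escrow."""
    if task.steps == 1 and not task.needs_review:
        return "x402"
    return "ERC-8183"
```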

What ERC-8183 Does Not Solve

The spec is honest about its boundaries. Three things it deliberately leaves out.

No dispute resolution. Rejection or expiration is final. The evaluator's decision cannot be appealed within the base standard. BNBAgent SDK addresses this by routing disputes through UMA's DVM. But the spec itself makes no provision for arbitration. Each implementation must solve evaluator trust independently.

Single-chain only. There is no native mechanism for a job created on Ethereum to escrow funds on Base or submit a deliverable on Arbitrum. Cross-chain job orchestration would require bridge integration or a higher-level protocol that coordinates ERC-8183 instances across chains.

The evaluator trust problem. This is the big one. The evaluator has unilateral power to approve or reject. A malicious evaluator can approve bad work (stealing the client's funds). A malicious evaluator can reject good work (griefing the provider). The spec defines the role but does not specify how to select trustworthy evaluators.

Reputation hooks help — they gate provider entry. But they do not prevent evaluator misbehavior. The Evaluator is the linchpin of the entire ERC-8183 economy. And it remains an open design space.

For objective tasks — did the transaction execute? did the data match a schema? — automated evaluators work. For subjective tasks — is this report good? is this creative work acceptable? — evaluation requires judgment that current on-chain mechanisms cannot reliably provide.
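For the objective case, an automated evaluator can be as simple as a schema check. The Python sketch below assumes the deliverable has already been fetched and parsed into a dictionary. The function name and the example field names are hypothetical, and the returned strings only mirror the two evaluator actions described earlier.

```python
def evaluate(deliverable: dict, schema: dict) -> str:
    """Return "completeJob" if every required field is present with the
    expected type, otherwise "rejectJob"."""
    for field, expected_type in schema.items():
        # A missing field yields None, which fails the type check.
        if not isinstance(deliverable.get(field), expected_type):
            return "rejectJob"
    return "completeJob"


# Hypothetical schema for the analytics report in the running example.
REPORT_SCHEMA = {"pool": str, "days": int, "volume_usd": float}
```

A check like this can confirm that the report has the right shape. It says nothing about whether the analysis inside is any good, which is exactly the gap the subjective case leaves open.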

Now, one objection is probably forming as you read this.

"But isn't this just a fancy escrow contract? Smart contract escrow has existed since 2017."

Yes — the escrow pattern is not new. A two-party lockup with a release condition is one of the oldest primitives in Solidity.

But ERC-8183 is not the escrow. The escrow is the skeleton.

What is new is the three-role separation — client, provider, and a neutral evaluator who is neither — combined with a hookable lifecycle. Reputation gating, SLA enforcement, and custom settlement logic plug into the state machine without anyone touching the base contract. Old escrow says "release when X." ERC-8183 says "release when an evaluator decides, and let reputation, deadlines, and policy ride along on the same job."

The escrow is the skeleton. The hook system is the nervous system.

The Hardest Problem in Agent Commerce

The payment layer is solved. x402 handles per-request micropayments. ERC-8183 handles escrow-based job payments. The money moves.

The identity layer is solved. ERC-8004 gives agents verifiable on-chain identity and reputation. We know who they are.

The commerce lifecycle is solved. ERC-8183 defines the job states — create, fund, submit, evaluate, settle. The workflow exists.

But the judgment layer — who decides whether work is good? — is not solved. Not by ERC-8183. Not by any standard.

And judgment is the layer that determines whether agent commerce becomes a real economy or stays a series of toy demonstrations.

The builders who figure out evaluation — reliable, trustless, scalable evaluation of agent output quality — will control the most important piece of infrastructure in the agent economy.

The spec authors knew this. They left the evaluator as an open design space on purpose. Not because they could not define it, but because premature standardization of judgment would be worse than no standardization at all.

The hardest problem in agent commerce is not payment, not identity, not escrow. It is judgment.