How I orchestrated 5,000 agent job cycles on Arc Testnet (and turned it into a reusable TypeScript kit)

I entered my first hackathon a few weeks ago. Solo. Using Claude Code to help me move faster because I had no team.

I didn't win.

But by the end of the Arc Testnet hackathon, I had completed 5,000 full job cycles on-chain — agents creating jobs, executing them, auditing results, approving or rejecting payment — without a single stuck job or lost transaction.

That felt like something worth writing about.

What the hackathon was about

Arc Testnet is building an on-chain agentic economy. The hackathon challenge: build something using their ERC-8183 job marketplace and ERC-8004 agent identity standards.

My project was a multi-agent data economy:

A JobFactory agent that creates work
DataWrangler and Translator agents that execute it
An Auditor agent that reviews output and approves or rejects payment
An Operator that keeps the whole thing running
A BidBoard contract where external agents can advertise their capabilities

Each job went through this lifecycle on-chain:

createJob → setBudget → approve USDC → fund escrow → submit deliverable → complete() or reject()

Every step was a real blockchain transaction. Every completion meant USDC moved to the provider. Every rejection meant it returned to the funder.
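
The lifecycle above can be read as a small state machine. Here is a minimal sketch, assuming the five status names that appear in the kit's demo output later in this post, and assuming a job can only be rejected after a deliverable is submitted. `JobStatus`, `TRANSITIONS`, `advance`, and `isTerminal` are hypothetical names for illustration, not the kit's or the contract's API:

```typescript
// Hypothetical status names, borrowed from the demo output in this post.
type JobStatus = "pending" | "funded" | "submitted" | "completed" | "rejected";

// Allowed next states for each status.
// Assumption: rejection is only possible once a deliverable was submitted.
const TRANSITIONS: Record<JobStatus, readonly JobStatus[]> = {
  pending: ["funded"],
  funded: ["submitted"],
  submitted: ["completed", "rejected"],
  completed: [], // terminal: payment released to the provider
  rejected: [], // terminal: payment returned to the funder
};

// Returns the next status, or throws if the move is not in the table.
function advance(current: JobStatus, next: JobStatus): JobStatus {
  if (!TRANSITIONS[current].includes(next)) {
    throw new Error(`Illegal transition: ${current} -> ${next}`);
  }
  return next;
}

// A status is terminal when nothing can follow it.
function isTerminal(status: JobStatus): boolean {
  return TRANSITIONS[status].length === 0;
}
```

Keeping the table in one place means an out-of-order call (for example, completing a job that was never funded) fails loudly instead of silently corrupting state.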

After 5,000 cycles, the audit looked like this:

── 1. SECURITY ──
  ✓  ecosystem.json has no credentials
  ✓  Auditor is separate from executors (no self-dealing)
  ✓  Minimum reserve protected
  ✓  Rate limiting active
  ✓  Try/catch in main loop

── 2. IDENTITIES (ERC-8004) ──
  ✓  JobFactory-v1  ID=1720 onchain
  ✓  DataWrangler-v1 ID=1721 onchain
  ✓  Auditor-v1     ID=1724 onchain

── 3. RECENT JOBS ──
  ✓  Job 1267 = Completed  1.00 USDC
  ✓  Job 1400 = Completed  1.00 USDC
  ✓  Job 1480 = Completed  1.00 USDC

  Audit complete: passed | System approved

The pattern I couldn't stop thinking about

After the hackathon ended, I kept looking at the orchestration code.

The blockchain part — Circle wallets, viem, ERC-8183 contracts — was specific to Arc. But the pattern underneath it was completely generic:

One agent does work. Another verifies it. Payment moves based on the verdict.

This works for anything:

LLM agents that process documents and need human-in-the-loop audit
Data pipelines where one service transforms data and another validates it
Freelance-style automation where quality gates control payouts
Any multi-agent system that needs accountability

The problem is that building this foundation reliably — with timeouts, retries, error handling per job, audit trails — takes real time. And most tutorials show you the happy path, not what happens when a job hangs or a deliverable is malformed.
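
To make "timeouts and retries" concrete, here is a minimal sketch of the two helpers such a foundation usually needs. `withTimeout` and `withRetry` are hypothetical names written for this post, not functions from the kit, and the fixed delay between attempts is an assumption (a real system might prefer exponential backoff):

```typescript
// Rejects if `work` has not settled within `ms` milliseconds.
// Note: this does not cancel the underlying work, it only stops waiting for it.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Calls `fn` up to `attempts` times, sleeping `delayMs` between failed attempts.
// Rethrows the last error if every attempt fails.
async function withRetry<T>(fn: () => Promise<T>, attempts: number, delayMs: number): Promise<T> {
  if (attempts < 1) throw new RangeError("attempts must be at least 1");
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

The two compose naturally: `withRetry(() => withTimeout(doWork(), 30_000), 3, 1_000)` gives each attempt its own deadline, where `doWork` stands in for whatever your executor does.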

So I extracted the pattern, generalized it, and built a TypeScript kit around it.

What the kit does

The core is three entry points:

runCycle() — runs one complete job lifecycle:

import { JSONJobProvider } from "./adapters/json.ts";
import { runCycle } from "./core/orchestrator.ts";

const provider = new JSONJobProvider();

const result = await runCycle(
  provider,
  { providerName: "DataWrangler-v1", auditorName: "Auditor-v1" },
  {
    description: "Clean customer dataset",
    budgetUnits: 100,

    // Your actual work goes here: LLM call, script, API, anything.
    // myLLM, hashOf, and isValidHash are placeholders for your own helpers.
    executor: async (job) => {
      const output = await myLLM.process(job.description);
      return hashOf(output);
    },

    // Return true to approve (pay), false to reject (refund)
    auditor: async (job) => {
      return job.deliverable !== undefined && isValidHash(job.deliverable);
    },
  }
);

console.log(result.outcome);    // "completed" | "rejected"
console.log(result.durationMs); // how long it took

runBatch() — runs N jobs, handles errors per job, returns a summary:

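// `agents` is the same { providerName, auditorName } object passed to runCycle();
// `executor` and `auditor` are callbacks like the ones in the previous example.
const summary = await runBatch(provider, agents, [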
  { description: "Clean sales data Q1", budgetUnits: 100, executor, auditor },
  { description: "Translate legal contract", budgetUnits: 150, executor, auditor },
  { description: "Deduplicate catalogue", budgetUnits: 80, executor, auditor },
]);

// summary.completed → 2
// summary.rejected  → 1
// summary.failed    → 0  (errors are caught per-job, never crash the batch)
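
The `failed` counter implies each job runs inside its own try/catch. Here is a minimal sketch of that isolation pattern, assuming jobs run one after another. `Outcome`, `BatchSummary`, and `runAllIsolated` are hypothetical names, and this is not the kit's `runBatch()` source:

```typescript
type Outcome = "completed" | "rejected";

interface BatchSummary {
  completed: number;
  rejected: number;
  failed: number;
  errors: string[]; // one message per failed task, in order
}

// Runs every task in sequence. A task that throws is counted as failed
// and the loop moves on, so one bad job never aborts the batch.
async function runAllIsolated(tasks: Array<() => Promise<Outcome>>): Promise<BatchSummary> {
  const summary: BatchSummary = { completed: 0, rejected: 0, failed: 0, errors: [] };
  for (const task of tasks) {
    try {
      const outcome = await task();
      summary[outcome] += 1;
    } catch (err) {
      summary.failed += 1;
      summary.errors.push(err instanceof Error ? err.message : String(err));
    }
  }
  return summary;
}
```

Keeping "rejected" (the auditor said no) separate from "failed" (something threw) matters when you read the summary: the first is a business outcome, the second is a bug or an outage.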

JobAuditor.run() — audits the state of your entire job ecosystem:

const auditor = new JobAuditor();
await auditor.run(provider, {
  minBudgetUnits: 50,
  customChecks: [
    {
      label: "All completed jobs have a deliverable hash",
      check: async (p) => {
        const jobs = await p.listJobs({ status: "completed" });
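        // `deliverable` may be undefined, so default the length to 0 (required under strict TypeScript)
        return jobs.every((j) => (j.deliverable?.length ?? 0) > 0);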
      },
    },
  ],
});

Example output, from a run that also registered a second custom check for rejection reasons:

── 1. RECENT JOBS ──
  ✓  Total jobs found — 5
  ✓  Completion rate — 80.0% (4/5)

── 2. STUCK JOBS ──
  ✓  No expired pending jobs
  ✓  No expired funded jobs

── 3. BUDGET INTEGRITY ──
  ✓  All jobs have positive budget
  ✓  All jobs meet min budget (50 units)

── 4. CUSTOM CHECKS ──
  ✓  All completed jobs have a deliverable hash
  ✓  All rejected jobs have a reason

  Audit complete: 11 passed, 0 failed
  ✓ System healthy

Two adapters — Web2 and Web3

The orchestrator doesn't care about the backend. You pass it a JobProvider and it works the same way.

Web2 (JSON file — zero setup):

import { JSONJobProvider } from "./adapters/json.ts";

const provider = new JSONJobProvider("./jobs.json");
// Jobs are stored locally. Works offline. No accounts.

Run the demo in 30 seconds:

npm install
npm run demo

══════════════════════════════════════════════════
  agent-job-kit — Web2 Demo

  ○ [pending  ] Clean and deduplicate Q1 sales dataset
  ◎ [funded   ] Clean and deduplicate Q1 sales dataset
  ◉ [submitted] Clean and deduplicate Q1 sales dataset
  ✓ [completed] Clean and deduplicate Q1 sales dataset
  ✗ [rejected ] Process financial report — reject (demo)

  Batch complete
  Total    : 5
  Completed: 4
  Rejected : 1
  Failed   : 0
══════════════════════════════════════════════════

Web3 (EVM via Circle Developer Wallets):

import { arcTestnet } from "viem/chains";
import { EVMJobProvider } from "./adapters/evm.ts";

const provider = new EVMJobProvider({
  circleApiKey: process.env.CIRCLE_API_KEY!,
  circleEntitySecret: process.env.CIRCLE_ENTITY_SECRET!,
  chain: arcTestnet,   // swap for Ethereum, Polygon, Base, etc.
  contractAddress: "0x...",
  usdcAddress: "0x...",
  providerAddress: "0x...",
  evaluatorAddress: "0x...",
  budgetUnits: 1_000_000n, // 1 USDC
});

This is the code that ran during the hackathon, generalized. The same runCycle() and runBatch() functions work with both adapters.
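
The `1_000_000n` in the config follows from USDC exposing 6 decimal places on its ERC-20 interface, so amounts are passed as integer base units. Here is a small sketch of the conversion, assuming 6 decimals. `toBaseUnits` is a hypothetical, dependency-free helper written for this post (in a real project, viem's `parseUnits` covers similar ground):

```typescript
// Converts a human-readable decimal string into integer base units.
// Works on strings rather than floats to avoid rounding errors like 0.1 + 0.2.
function toBaseUnits(amount: string, decimals: number = 6): bigint {
  const match = /^(\d+)(?:\.(\d+))?$/.exec(amount.trim());
  if (!match) {
    throw new Error(`Not a plain decimal amount: "${amount}"`);
  }
  const whole = match[1];
  const fraction = match[2] ?? "";
  if (fraction.length > decimals) {
    throw new Error(`More than ${decimals} decimal places: "${amount}"`);
  }
  // Right-pad the fraction so "1.5" becomes "1" + "500000".
  return BigInt(whole + fraction.padEnd(decimals, "0"));
}
```

Rejecting over-precise input instead of rounding it is a deliberate choice in this sketch: with escrowed funds, a silent rounding step is harder to audit than an error.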

Implementing your own adapter

The JobProvider interface is intentionally simple:

interface JobProvider {
  createJob(params): Promise<Job>;
  fundJob(jobId): Promise<Job>;
  submitDeliverable(jobId, deliverable): Promise<Job>;
  completeJob(jobId, reason?): Promise<Job>;
  rejectJob(jobId, reason?): Promise<Job>;
  getJob(jobId): Promise<Job>;
  listJobs(filter?): Promise<Job[]>;
}

Implement this for PostgreSQL, Redis, Prisma, Supabase — whatever your stack uses — and the orchestrator works without any changes.
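
As an illustration, here is a minimal in-memory adapter. The interface above leaves parameter and field types out, so everything typed here is an assumption: the `Job` shape, `CreateJobParams`, `JobFilter`, string job IDs, and the rule that each call only succeeds from one specific prior status. `InMemoryJobProvider` is a hypothetical class, not one shipped with the kit:

```typescript
type JobStatus = "pending" | "funded" | "submitted" | "completed" | "rejected";

// Assumed shapes: the post's interface does not spell these out.
interface Job {
  id: string;
  description: string;
  budgetUnits: number;
  status: JobStatus;
  deliverable?: string;
  reason?: string;
}
interface CreateJobParams { description: string; budgetUnits: number; }
interface JobFilter { status?: JobStatus; }

interface JobProvider {
  createJob(params: CreateJobParams): Promise<Job>;
  fundJob(jobId: string): Promise<Job>;
  submitDeliverable(jobId: string, deliverable: string): Promise<Job>;
  completeJob(jobId: string, reason?: string): Promise<Job>;
  rejectJob(jobId: string, reason?: string): Promise<Job>;
  getJob(jobId: string): Promise<Job>;
  listJobs(filter?: JobFilter): Promise<Job[]>;
}

class InMemoryJobProvider implements JobProvider {
  private jobs = new Map<string, Job>();
  private nextId = 1;

  async createJob(params: CreateJobParams): Promise<Job> {
    const job: Job = { id: String(this.nextId++), ...params, status: "pending" };
    this.jobs.set(job.id, job);
    return { ...job }; // return copies so callers cannot mutate stored state
  }
  async fundJob(jobId: string): Promise<Job> {
    return this.move(jobId, "pending", "funded");
  }
  async submitDeliverable(jobId: string, deliverable: string): Promise<Job> {
    return this.move(jobId, "funded", "submitted", { deliverable });
  }
  async completeJob(jobId: string, reason?: string): Promise<Job> {
    return this.move(jobId, "submitted", "completed", { reason });
  }
  async rejectJob(jobId: string, reason?: string): Promise<Job> {
    return this.move(jobId, "submitted", "rejected", { reason });
  }
  async getJob(jobId: string): Promise<Job> {
    return { ...this.find(jobId) };
  }
  async listJobs(filter?: JobFilter): Promise<Job[]> {
    const all = Array.from(this.jobs.values()).map((j) => ({ ...j }));
    const wanted = filter?.status;
    return wanted ? all.filter((j) => j.status === wanted) : all;
  }

  private find(jobId: string): Job {
    const job = this.jobs.get(jobId);
    if (!job) throw new Error(`Unknown job ${jobId}`);
    return job;
  }
  // Applies one status change, refusing it unless the job is in `from`.
  private move(
    jobId: string,
    from: JobStatus,
    to: JobStatus,
    patch: { deliverable?: string; reason?: string } = {},
  ): Job {
    const job = this.find(jobId);
    if (job.status !== from) {
      throw new Error(`Job ${jobId} is "${job.status}", expected "${from}"`);
    }
    const updated: Job = { ...job, ...patch, status: to };
    this.jobs.set(jobId, updated);
    return { ...updated };
  }
}
```

An adapter like this is also handy in unit tests, since it exercises the same call sequence as a database-backed or on-chain provider without any setup.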

What I learned from the hackathon

A few things stuck with me after 5,000 cycles:

Separation of duties matters more than you think. The audit explicitly checks that the auditor wallet is never the same as an executor wallet. Self-dealing, where one agent both does the work and approves it, is one of the easiest failure modes to introduce in an agentic system. Build the check in from the start.
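
The separation-of-duties check can be very small. Here is a minimal sketch, assuming wallets are identified by EVM address strings. `AgentRoles` and `findSelfDealing` are hypothetical names, not the kit's audit code:

```typescript
interface AgentRoles {
  auditor: string; // address of the wallet that approves or rejects
  executors: string[]; // addresses of the wallets that do the work
}

// Returns every executor address that is also the auditor.
// EVM addresses are hex strings whose letter case carries only an optional
// checksum, so they are compared lowercased.
function findSelfDealing(roles: AgentRoles): string[] {
  const auditor = roles.auditor.toLowerCase();
  return roles.executors.filter((address) => address.toLowerCase() === auditor);
}
```

An empty result means the roles are cleanly separated. Running this once at startup, before any job is funded, is cheaper than discovering the overlap in an audit afterwards.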

Stuck jobs are the real enemy. A job that gets funded but never submitted, or submitted but never audited, locks budget indefinitely. The waitForTerminal() function and TTL expiry exist specifically for this. Audit for stuck jobs regularly.
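
Both halves of that advice can be sketched briefly: a scan for jobs that have sat in a non-terminal status past a TTL, and a poll loop that gives up at a deadline. The `updatedAt` field, `findStuckJobs`, and `pollUntilTerminal` are assumptions made for this sketch; the kit's own `waitForTerminal()` may work differently:

```typescript
type Status = "pending" | "funded" | "submitted" | "completed" | "rejected";

interface TrackedJob {
  id: string;
  status: Status;
  updatedAt: number; // assumed field: ms since epoch of the last status change
}

function isTerminalStatus(status: Status): boolean {
  return status === "completed" || status === "rejected";
}

// Jobs that are not finished and have not changed status for longer than ttlMs.
function findStuckJobs(jobs: TrackedJob[], ttlMs: number, now: number = Date.now()): TrackedJob[] {
  return jobs.filter((job) => !isTerminalStatus(job.status) && now - job.updatedAt > ttlMs);
}

// Polls `getStatus` until it reports a terminal status, or throws once
// `timeoutMs` has elapsed without one.
async function pollUntilTerminal(
  getStatus: () => Promise<Status>,
  timeoutMs: number,
  intervalMs: number,
): Promise<Status> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const status = await getStatus();
    if (isTerminalStatus(status)) return status;
    if (Date.now() >= deadline) {
      throw new Error(`Job still "${status}" after ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

The poll loop protects a single running cycle, while the scan catches whatever was left behind by crashed or restarted processes, so the two are complementary rather than interchangeable.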

Claude Code + WSL is genuinely fast. I built the entire hackathon project solo in a compressed timeline. The bottleneck was never the coding — it was thinking through the architecture. Having Claude Code handle the implementation while I focused on the design worked surprisingly well.

Not winning doesn't mean you built nothing useful. Hackathon judges score against specific criteria, so what you built can still be valuable to other developers even if it didn't match the rubric.

The kit

The full kit — source code, both adapters, audit module, examples, README — is available at the link below.

If you're building multi-agent workflows and want a foundation that has been exercised across 5,000 on-chain testnet cycles rather than assembled from tutorials, this is what I wish I'd had when I started.

[agent-job-kit → https://payhip.com/b/2DWOp]

MIT license. Full TypeScript source, no minification.

Built with TypeScript, viem, lowdb, and Claude Code. Tested on Node v20 and v22.

Questions about the architecture or the hackathon? Drop them in the comments.