
Six coding agents. One bench. Your task. One winner.

At #45 (Bench) you wrote an eval for your coding agent: your task set, pass^k, real grading. This Saturday we use it. We put the agents in the ring and run them head-to-head on the bench you built.

VCN #46: Bake-Off is the morning where the field gets settled, with data.

The contenders: Claude Code, Cline, Aider, Codex, Cursor, Continue. Same task. Same grader. We score the three things that actually decide your workflow: cost, latency, and quality.

z.ai + Claude Code is one of the contenders in hand. Nebius Token Factory credits run the local contenders so the field is fair: no agent wins just because it had a bigger budget.

Format:

1. Rack the field. Wire each agent up to the same task and the same bench. We standardize the harness so every contender gets a fair shot.
2. Run the bracket. Fire the agents at the task. Watch them work. Capture cost per run, wall-clock latency, and pass rate on your grader.
3. Score the board. Read the numbers together. Which agent wins on quality, which wins on cost, which wins on speed, and where the trade-offs land for YOUR stack.
4. Pick your fighter. Leave with a ranked board and a clear answer: the coding agent that wins for your tasks, on your bench.

By 1pm you have real data on which coding agent to actually use. Not vibes. Numbers.

Builders only. Bring the bench you wrote at #45 (or a task you want graded).

Doors 10am. Build sprint 10:15. Demos and coffee at noon. Frontier Tower Floor 9.

Your…