OpenAI shipped a product to production with zero manually written code.

7 engineers · 5 months · 1M lines of AI-generated code · 10× faster

Harness
Engineering

The discipline that made it possible

Yan Cheng · March 6, 2026

Every tech movie ever...

[Image: classic movie hacking scene with progress bar]

Scrolling logs. Progress bar. One person watching.

We're actually getting there.

Agents, humans, and the harness

"Walk through this diagram in four clicks.

First click: the harness. It has two halves. One is the repo as the single source of truth: AGENTS.md, architecture, conventions, specs. The other is tooling for self-QA: DevTools, logs, metrics, linters, traces. This is the environment agents live in.

Second click: what agents produce. Product code, tests, CI/CD, documentation, reviews. They open PRs, self-review, do agent-to-agent review, iterate, and merge. Plus garbage collection: periodic background scans for drift.

Third click: the self-fix loop and humans. When agents struggle, they first try to self-fix; that is the green loop going back up. On the left, humans work proactively: they design systems, define intent, and prioritize work. Both feed into the harness.

Fourth click: diagnose gaps. When self-fix fails, it escalates to humans. The question is what's missing: a tool, a guardrail, a doc? Fix that, and it feeds back up to improve the harness. Every gap closed makes all future work better.

The key insight: humans never write code. They build and improve the harness. The harness makes agents better. Agents generate everything."

Trial → error → better harness

1000-line AGENTS.md

→

~100-line map · discover on demand

Manual cleanup Fridays

→

Automated GC · agents scan + fix drift

Context in Slack & Docs

→

Everything in the repo · single source of truth

"When the agent struggled, they didn't try harder — they encoded what was missing."
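
A minimal sketch of what an automated GC scan could look like, assuming the AGENTS.md map references repo paths in backticks and should stay near 100 lines. The names `scan_map` and `MAX_MAP_LINES` are hypothetical and not taken from any real tool; an agent could run a check like this on a schedule and open a PR for each finding.

```python
import re
from pathlib import Path

# Budget inspired by the "~100-line map" idea; the exact number is an assumption.
MAX_MAP_LINES = 100


def scan_map(map_text: str, repo_root: Path) -> list[str]:
    """Return drift findings for an AGENTS.md-style map (hypothetical helper)."""
    findings = []
    lines = map_text.splitlines()
    if len(lines) > MAX_MAP_LINES:
        findings.append(f"map has {len(lines)} lines; budget is {MAX_MAP_LINES}")
    for lineno, line in enumerate(lines, start=1):
        # Assumed convention: a backticked token containing a slash is a repo-relative path.
        for ref in re.findall(r"`([^`\s]+/[^`\s]*)`", line):
            if not (repo_root / ref).exists():
                findings.append(f"line {lineno}: `{ref}` no longer exists")
    return findings
```

Each finding names the line and the stale reference, so the agent fixing it does not have to rediscover where the drift is.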

Transforming to harness engineering?

New skills required of engineers

Environment Design

Structure repos + tooling
for agent navigation

Prompt Decomposition

Break goals into
agent-tractable units

Feedback Loops

Linters that teach fixes
Tests that enforce architecture

Gap Diagnosis

Agent fails? Find what's
missing, then encode it
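
To make the Feedback Loops card concrete, here is a small sketch of a linter that enforces an architecture rule and tells the agent how to fix a violation. The layer names (`data`, `service`, `ui`) and the helpers `layer_of` and `check_imports` are assumptions for illustration only, not part of any existing linter.

```python
import ast
from typing import Optional

# Hypothetical layering, lowest to highest; purely illustrative.
LAYERS = ["data", "service", "ui"]


def layer_of(module_name: str) -> Optional[int]:
    """Return the layer index of a dotted module name, or None if unlayered."""
    top = module_name.split(".")[0]
    return LAYERS.index(top) if top in LAYERS else None


def check_imports(module_name: str, source: str) -> list[str]:
    """Return teaching-style messages for imports that point up the layer stack."""
    own = layer_of(module_name)
    if own is None:
        return []
    messages = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            targets = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            targets = [node.module]
        else:
            continue
        for target in targets:
            other = layer_of(target)
            if other is not None and other > own:
                # The message states the rule and a concrete way out, not just the error.
                messages.append(
                    f"{module_name}:{node.lineno}: '{LAYERS[own]}' must not import "
                    f"'{target}' (layer '{LAYERS[other]}'). Fix: move the shared "
                    f"logic down into '{LAYERS[own]}' or pass it in as a parameter."
                )
    return messages
```

Run as a test in CI, a check like this enforces the architecture; the "Fix:" text is what turns the failure into guidance an agent can act on.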

Transforming to harness engineering?

How to get there

For engineers

Mindset shift — more architectural thinking

Start small — cleanup, tests, docs

Learning and sharing

Debugging skills transfer directly

What we need from leadership

Celebrate environment design wins

Tooling time = product time

Expect a slow start — it's investment

Track how improvements reduce future failures

Let's be honest

Long-term coherence

5 months proven.
Years? Unknown.

Human leverage points

Where does our judgment
add the most value?

Model capability growth

Today's constraints:
soon unnecessary, or still critical?

Discipline hasn't disappeared — the hard work shifted from writing code to designing the systems that make agents reliable.

Where do we start?

Bring knowledge into the repo

It may live in Confluence, Teams, or someone's head

Make existing tools agent-readable

Monitoring, CI, linting — we have them, now make them accessible to agents

Pick a low-stakes pilot

Internal tooling, test generation, doc updates

Treat every agent failure as a system problem

Don't blame the model — fix the environment
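
As one possible reading of "make existing tools agent-readable", the sketch below turns free-form linter or CI output into structured JSON. It assumes log lines shaped like `path:line: CODE message`; real tools differ, and `to_agent_report` is a hypothetical name.

```python
import json
import re

# Assumed log shape "path:line: CODE message"; adjust to the real tool's format.
LINE_RE = re.compile(
    r"^(?P<path>[^:\s]+):(?P<line>\d+): (?P<code>[A-Z]+\d+) (?P<message>.+)$"
)


def to_agent_report(raw_output: str) -> str:
    """Convert raw tool output into a JSON report; unrecognized lines are skipped."""
    findings = []
    for line in raw_output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            findings.append(
                {
                    "path": match["path"],
                    "line": int(match["line"]),
                    "code": match["code"],
                    "message": match["message"],
                }
            )
    return json.dumps({"count": len(findings), "findings": findings}, indent=2)
```

A structured report lets an agent go straight to the file and line instead of parsing scrolling logs.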

Want AI to work reliably for us?
But first, we need to work for AI —
through better context, better scaffolding, better environments.
That's harness engineering.

"Our most difficult challenges now center on designing environments,
not expecting better models to solve everything."

— OpenAI Codex Team

My journey toward harness engineering

Building a context engine for data-workspace

DnA — making our repo agent-navigable with structured context

PoC for session recording

Work-related — exploring agent-assisted capture and replay

Production web app — zero manual code

Hobby project: Danish exam prep app, fully AI-generated

danskprep.vercel.app
