Scrolling logs. Progress bar. One person watching.
We're actually getting there.
Agents, humans, and the harness
Trial → error → better harness
1000-line AGENTS.md
→
~100-line map · discover on demand
Manual cleanup Fridays
→
Automated GC · agents scan + fix drift
Context in Slack & Docs
→
Everything in the repo · single source of truth
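One way to keep the map short and trustworthy is to check it automatically. The sketch below is an illustration, not the team's actual tooling: the function name `check_map`, the 100-line limit, and the markdown link convention are all assumptions. It flags a map that has grown past the limit and any repo-local link that points at a file that no longer exists.

```python
import re

# Hypothetical limit taken from the "~100-line map" idea; tune to taste.
MAX_MAP_LINES = 100

# Matches markdown links such as [Testing guide](docs/testing.md).
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def check_map(map_text, known_paths, max_lines=MAX_MAP_LINES):
    """Return a list of problems with the map file, each with a suggested fix."""
    problems = []
    lines = map_text.splitlines()
    if len(lines) > max_lines:
        problems.append(
            f"Map has {len(lines)} lines (limit {max_lines}). "
            "Move detail into a linked doc and keep only a pointer here."
        )
    for number, line in enumerate(lines, start=1):
        for target in LINK_PATTERN.findall(line):
            if target.startswith(("http://", "https://", "#")):
                continue  # only repo-local links are checked
            if target not in known_paths:
                problems.append(
                    f"Line {number}: link target '{target}' does not exist. "
                    "Fix the path or delete the stale pointer."
                )
    return problems
```

Run on a schedule, a check like this is a small piece of the automated GC idea: drift is reported with a suggested fix instead of waiting for a manual cleanup day.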
"When the agent struggled, they didn't try harder — they encoded what was missing."
Transforming to harness engineering?
Changed skills and requirements for engineers
Environment Design
Structure repos + tooling for agent navigation
Prompt Decomposition
Break goals into agent-tractable units
Feedback Loops
Linters that teach fixes · Tests that enforce architecture
Gap Diagnosis
Agent fails? Find what's missing, then encode it
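A feedback loop of this kind can be sketched as a test that enforces architecture and explains the fix in its failure message. The layer names (`ui`, `service`, `data`), the `ALLOWED_IMPORTS` table, and the function name are hypothetical. The only real dependency is Python's standard `ast` module.

```python
import ast

# Hypothetical layering: each layer lists the layers it may import from.
ALLOWED_IMPORTS = {
    "ui": {"ui", "service"},
    "service": {"service", "data"},
    "data": {"data"},
}


def find_layer_violations(source, layer, allowed=ALLOWED_IMPORTS):
    """Return teaching-style messages for imports that break the layering."""
    messages = []
    permitted = sorted(allowed[layer])
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            target = name.split(".")[0]
            # Only imports of known layers are judged; third-party and
            # standard-library imports are ignored.
            if target in allowed and target not in allowed[layer]:
                messages.append(
                    f"line {node.lineno}: '{layer}' must not import '{target}'. "
                    f"Layers '{layer}' may import from: {permitted}."
                )
    return messages
```

The message names the rule and the permitted alternatives, so an agent that trips the check has what it needs to correct the import without a human explaining the architecture again.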
Transforming to harness engineering?
How to get there
For engineers
Mindset shift — more architectural thinking
Start small — cleanup, tests, docs
Learning and sharing
Debugging skills transfer directly
What we need from leadership
Celebrate environment design wins
Tooling time = product time
Expect a slow start — it's investment
Track how improvements reduce future failures
Let's be honest
Long-term coherence
5 months proven. Years? Unknown.
Human leverage points
Where does our judgment add the most value?
Model capability growth
Today's constraints: unnecessary or critical?
Discipline hasn't disappeared — the hard work shifted from writing code to designing the systems that make agents reliable.
Where do we start?
Bring knowledge into the repo
It may live in Confluence, Teams, or someone's head
Make existing tools agent-readable
Monitoring, CI, linting: we already have them; now make them accessible to agents
Pick a low-stakes pilot
Internal tooling, test generation, doc updates
Treat every agent failure as a system problem
Don't blame the model — fix the environment
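As a rough illustration of "make existing tools agent-readable", the sketch below turns plain lint output into structured JSON with a fix hint per finding. It assumes the output follows the common `path:line:col: CODE message` shape. The `FIX_HINTS` table, the `docs/style.md` pointer, and the function name are hypothetical placeholders for whatever the team actually records.

```python
import json
import re

# Assumes findings look like "app/main.py:10:80: E501 line too long".
FINDING = re.compile(
    r"^(?P<path>[^:]+):(?P<line>\d+):(?P<col>\d+): (?P<code>\S+) (?P<message>.+)$"
)

# Hypothetical team-specific hints; each diagnosed failure adds an entry.
FIX_HINTS = {
    "E501": "Wrap the line; see docs/style.md for the agreed width.",
}

DEFAULT_HINT = "No hint recorded yet; add one when this is diagnosed."


def to_agent_report(raw_output, hints=FIX_HINTS):
    """Convert raw lint output into a JSON report an agent can parse."""
    findings = []
    for line in raw_output.splitlines():
        match = FINDING.match(line.strip())
        if not match:
            continue  # skip banners, summaries, and other non-finding lines
        finding = match.groupdict()
        finding["line"] = int(finding["line"])
        finding["col"] = int(finding["col"])
        finding["hint"] = hints.get(finding["code"], DEFAULT_HINT)
        findings.append(finding)
    return json.dumps({"count": len(findings), "findings": findings}, indent=2)
```

The default hint is deliberate: a finding without a recorded fix is treated as a gap in the environment to be filled, which matches treating every agent failure as a system problem.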
Want AI to work reliably for us? First, we need to work for AI: better context, better scaffolding, better environments. That's harness engineering.
"Our most difficult challenges now center on designing environments, not expecting better models to solve everything."
— OpenAI Codex Team
My journey toward harness engineering
Building a context engine for data-workspace
DnA — making our repo agent-navigable with structured context
PoC for session recording
Work-related — exploring agent-assisted capture and replay