Mechanical Architecture Enforcement for Agents
Summary
In agent-generated codebases, architecture constraints must be enforced mechanically — via custom linters, structural tests, and CI validations — not through documentation alone. The critical innovation: linter error messages are written as remediation instructions that get injected directly into agent context when a violation occurs, turning failures into teaching moments.
OpenAI’s harness engineering team enforces a rigid layer model (Types -> Config -> Repo -> Service -> Runtime -> UI) in which code within each business domain can only depend “forward” through the layers. Read against the examples later in this note, that means a layer may import from layers earlier in the chain but not from later ones: Service may use Repo, but Repo may not use Service. Cross-cutting concerns (auth, telemetry, feature flags) enter through a single explicit interface: Providers. Everything else is disallowed. Custom linters (themselves generated by Codex) validate these boundaries, plus naming conventions, structured logging, file size limits, and platform-specific reliability requirements.
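A minimal sketch of how such a layer model might be written down as data. It assumes "forward" means a layer may import from itself or from any earlier layer, which matches the Service/UI and Repo/Service examples later in this note. The names `LAYERS`, `PROVIDERS`, and `may_depend_on` are hypothetical and not taken from the OpenAI codebase.

```python
# Hypothetical encoding of the layer model; all names are illustrative only.
LAYERS = ["types", "config", "repo", "service", "runtime", "ui"]
PROVIDERS = "providers"  # assumed name for the single cross-cutting entry point


def may_depend_on(importer: str, imported: str) -> bool:
    """Return True if code in `importer` is allowed to import from `imported`."""
    if imported == PROVIDERS:
        # Cross-cutting concerns (auth, telemetry, feature flags) are reachable
        # from any known layer, but only through the Providers interface.
        return importer in LAYERS
    if importer not in LAYERS or imported not in LAYERS:
        # Anything outside the known layers is disallowed by default.
        return False
    # A layer may import from itself or from any layer earlier in the chain.
    return LAYERS.index(imported) <= LAYERS.index(importer)
```

Keeping the model as an ordered list means there is one place to change when a layer is added, and every check derives from the same ordering.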
The insight that makes this powerful: rules that feel pedantic in a human-first workflow become multipliers with agents. Once encoded, they apply everywhere at once. The inverse is also true — documentation alone rots, agents can’t distinguish current rules from stale ones, and a monolithic instruction file becomes what the team calls “an attractive nuisance.”
How to Apply
When to use: Any codebase where agents contribute significant code volume. The higher the agent throughput, the more critical mechanical enforcement becomes, because every additional agent change is another opportunity to drift from the intended architecture.
When not to use: Small prototypes or exploratory work where architectural rigidity would slow learning. This technique pays off at scale, not during initial experimentation.
Implementation steps:
- Define the layer model: Establish explicit dependency directions between architectural layers. Make it a simple forward-only graph — agents reason better about strict hierarchies than nuanced webs.
- Encode as linters: Write custom lint rules that validate the layer model. The linters themselves can be agent-generated. Focus on dependency direction, not implementation style.
- Design error messages for agents: This is the key innovation. Standard linter errors (“import not allowed”) are unhelpful. Instead, write messages that include the why and the fix: “Service layer cannot import from UI layer. Move shared types to the Types layer, or use a Provider interface for cross-cutting concerns.”
- Add structural tests: Beyond linting, write tests that verify architectural invariants — e.g., “no package in the Repo layer has a dependency on the Service layer.”
- Enforce in CI: Make violations blocking. Agents iterate cheaply on CI failures; humans don’t need to review every architectural decision manually.
- Separate boundaries from style: Enforce boundaries centrally; allow autonomy locally. Within those boundaries, the output does not need to match human stylistic preferences, such as how a variable is named. It needs to be correct, maintainable, and legible to future agent runs.
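The linter and error-message steps above can be sketched together as a small import checker. This is an illustrative sketch, not the team's linter. It assumes Python sources, a hypothetical module layout of `app.<domain>.<layer>...`, and the reading of "forward" in which a layer may import only from itself or earlier layers. `layer_of`, `lint_source`, and `DEFAULT_FIX` are made-up names; only the Service/UI message text follows the example quoted in the steps.

```python
import ast
from typing import List, Optional

LAYERS = ["Types", "Config", "Repo", "Service", "Runtime", "UI"]
RANK = {name.lower(): index for index, name in enumerate(LAYERS)}

# Remediation text keyed by (importing layer, imported layer).
REMEDIATION = {
    ("Service", "UI"): (
        "Move shared types to the Types layer, or use a Provider interface "
        "for cross-cutting concerns."
    ),
}
# Hypothetical fallback wording for pairs without a specific entry.
DEFAULT_FIX = (
    "Move the shared code to an earlier layer, or expose it through a "
    "Provider interface."
)


def layer_of(module: str) -> Optional[int]:
    """Map 'app.<domain>.<layer>...' to the layer's position (assumed layout)."""
    parts = module.split(".")
    if len(parts) >= 3 and parts[0] == "app":
        return RANK.get(parts[2])
    return None


def lint_source(module: str, source: str) -> List[str]:
    """Return one remediation message per import that targets a later layer."""
    importer = layer_of(module)
    if importer is None:
        return []
    messages = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            targets = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Relative imports (level > 0) are skipped in this sketch.
            targets = [node.module]
        else:
            continue
        for target in targets:
            imported = layer_of(target)
            if imported is not None and imported > importer:
                src, dst = LAYERS[importer], LAYERS[imported]
                fix = REMEDIATION.get((src, dst), DEFAULT_FIX)
                messages.append(
                    f"{module}:{node.lineno}: {src} layer cannot import "
                    f"from {dst} layer. {fix}"
                )
    return messages


bad = "from app.billing.ui.widgets import InvoiceTable\n"
for message in lint_source("app.billing.service.invoices", bad):
    print(message)
```

A CI wrapper would run this over every changed file and exit non-zero when any message is returned, so the agent sees the remediation text as the failure output. Cross-domain rules and the Providers exception are left out to keep the sketch short.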
The enforcement escalation ladder: When documentation falls short, promote the rule into code. Start with docs, observe violations, encode as a linter, and eventually encode as a structural test. Each step increases enforcement strength and reduces human review burden.
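The last rung of the ladder, a structural test, might look like the following sketch. It assumes a dependency graph has already been extracted by some other tool, such as an import scanner. The `DEPENDENCIES` mapping, the package names, and `backward_edges` are invented for illustration.

```python
import unittest

LAYERS = ["types", "config", "repo", "service", "runtime", "ui"]

# Hypothetical "<domain>.<layer>" -> imported packages graph.
# In practice this would be generated, not written by hand.
DEPENDENCIES = {
    "billing.types": [],
    "billing.config": ["billing.types"],
    "billing.repo": ["billing.types", "billing.config"],
    "billing.service": ["billing.repo", "billing.types"],
    "billing.ui": ["billing.service"],
}


def backward_edges(graph):
    """Return (importer, imported) pairs whose imported layer comes later in LAYERS."""
    rank = {layer: index for index, layer in enumerate(LAYERS)}
    violations = []
    for importer, imports in graph.items():
        for imported in imports:
            if rank[imported.split(".")[-1]] > rank[importer.split(".")[-1]]:
                violations.append((importer, imported))
    return violations


class LayerInvariantTest(unittest.TestCase):
    def test_repo_never_depends_on_service(self):
        for importer, imports in DEPENDENCIES.items():
            if importer.endswith(".repo"):
                offenders = [dep for dep in imports if dep.endswith(".service")]
                self.assertFalse(
                    offenders,
                    f"{importer} depends on the Service layer; move shared code "
                    "to an earlier layer or go through a Provider interface.",
                )

    def test_no_dependency_points_at_a_later_layer(self):
        self.assertEqual(backward_edges(DEPENDENCIES), [])
```

Run with `python -m unittest` in CI. Unlike a per-file lint rule, the test asserts an invariant over the whole graph, so it also catches violations introduced by moving or renaming packages.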
Sources
From: 2026-02-13 Harness Engineering Leveraging Codex
Key quote: “In a human-first workflow, these rules might feel pedantic or constraining. With agents, they become multipliers: once encoded, they apply everywhere at once.”
Attribution: Ryan Lopopolo, OpenAI
What this source adds: The concrete layer model (Types -> Config -> Repo -> Service -> Runtime -> UI), the insight that custom linter error messages function as agent remediation instructions, and the principle of separating architectural enforcement from implementation style.
Links: Original | Archive
Related
- OAI Harness Engineering — The methodology that this enforcement technique enables; without mechanical enforcement, harness engineering degrades rapidly
- Agent Entropy Management — Enforcement prevents new violations; entropy management catches drift in existing patterns
- Structured Context Loading — Enforcement is a complement to context loading: context tells agents what to do; enforcement catches when they don’t