How HumanLayer Tamed AI in Brownfield Codebases

By Ptrck Brgr

Pull request count is up 3x. Code churn exploded. Rework tripled. Net productivity?

Flat.

In No Vibes Allowed: Solving Hard Problems in Complex Codebases, Dex Horthy of HumanLayer explains why: context engineering is what separates AI wins from AI disasters in legacy systems.

The pattern repeats across organizations. AI coding tools crush greenfield projects. Legacy codebases? Performance collapses. Staff engineers avoid AI because they spend weeks cleaning up junior-generated slop. The bottleneck isn't model quality—it's context.

The Brownfield Collapse

Vercel dashboards work great. Enterprise monoliths don't.

Most of the time you use AI for software engineering, you're doing a lot of rework, a lot of codebase churn, and it doesn't really work well for complex tasks, brownfield codebases. — Dex Horthy, HumanLayer

GitHub surveyed 100,000 developers. AI works for simple tasks, but complex brownfield problems break the tools. Teams ship more code, then rework it the following week. Obvious in retrospect.

The math stops working at scale.

The Staff Engineer Problem

A divide opens. Senior engineers reject AI tools. Junior developers lean hard into them.

Quality crashes.

Staff engineers don't adopt AI because it doesn't make them that much faster. And then junior and mid-level engineers use it a lot, because it fills in skill gaps, and then it also produces some slop. — Dex Horthy, HumanLayer

The result: senior engineers clean up AI-generated code instead of building features. Trust erodes. AI adoption stalls where it's needed most—complex architectural decisions.

Cultural change requires top-down commitment. Pick one tool. Get reps. Or watch the gap widen week by week.

Context Is the Ceiling

Models improve monthly. Context engineering improves yearly. Most teams chase better models when they should fix their context.

Basically, the hardest problem you can solve, the ceiling, goes up the more of this context engineering compaction you're willing to do. — Dex Horthy, HumanLayer

Context engineering isn't prompt optimization. It's systematic information architecture. What context? When? How much compression? Teams that nail this solve problems AI couldn't touch before.
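The talk stays at the level of principle, so here's a minimal sketch of what that architecture can look like: context as prioritized sources under a hard budget, not a prompt string to tweak. The names (ContextSource, build_context) and the character budget are illustrative assumptions, not from the talk.

```python
from dataclasses import dataclass

@dataclass
class ContextSource:
    name: str      # e.g. "architecture.md" or "failing test output"
    text: str
    priority: int  # lower = more essential

def build_context(sources: list[ContextSource], budget_chars: int) -> str:
    """Assemble a context window: most essential sources first,
    truncating once the budget runs out."""
    parts, used = [], 0
    for src in sorted(sources, key=lambda s: s.priority):
        remaining = budget_chars - used
        if remaining <= 0:
            break
        chunk = src.text[:remaining]
        parts.append(f"## {src.name}\n{chunk}")
        used += len(chunk)
    return "\n\n".join(parts)
```

The design choice worth stealing: the budget is explicit, so "what context, when, how much" becomes a code review question instead of vibes.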

But here's the catch: harness engineering—integrating with your specific codebase, testing patterns, deployment flows—requires upfront investment most teams skip. Generic AI fails. Contextual AI scales.

Why This Matters

The organizational ceiling isn't technical—it's contextual. Teams that solve context engineering sustain AI gains. Those that don't ship slop faster.

From enterprise deployments, I've seen this pattern repeat across business units. The teams that skip systematic context engineering spend 3x more cycles on rework. They optimize for activity metrics (PRs merged) instead of outcome metrics (features shipped). Activity looks impressive until quality collapses.

I could be wrong here, but my read is simple: clean codebases amplify AI. Technical debt? AI accelerates the entropy exponentially. The choice: systematic context engineering or exponential cleanup costs downstream.

What Works

Invest in context compaction. Build the minimal viable context that captures system constraints, architectural patterns, and testing requirements. A 40% context reduction can preserve performance while staying in AI's "smart zone."
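Compaction can start embarrassingly simple. A hedged sketch, assuming source files as plain text and a homegrown compact helper: strip the noise before it enters the window, then measure what you saved.

```python
import re

def compact(source: str) -> str:
    """Drop blank lines and boilerplate comment headers so only
    signal enters the model's context window."""
    kept = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank line: pure padding
        if re.match(r"(#|//|\*)\s*(copyright|license|auto-generated)", stripped, re.I):
            continue  # legal/header boilerplate
        kept.append(line)
    return "\n".join(kept)

def reduction(before: str, after: str) -> float:
    """Fraction of the original context removed."""
    return 1 - len(after) / len(before) if before else 0.0
```

Whether your numbers land near that 40% while output quality holds is something to measure per codebase, not assume.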

Build harness engineering. Integration points with your specific tools—CI/CD, testing frameworks, deployment pipelines. Generic prompts fail. Custom integrations scale.
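What a harness looks like is stack-specific. This sketch assumes git and pytest as stand-ins for your real gates, with run_harness as a hypothetical entry point.

```python
import subprocess, tempfile

def run_harness(patch: str, repo: str = ".") -> bool:
    """Apply an AI-generated patch and run the project's own gates.
    git apply + pytest are stand-ins; swap in your CI's real
    lint/test/deploy steps."""
    with tempfile.NamedTemporaryFile("w", suffix=".patch", delete=False) as f:
        f.write(patch)
        patch_file = f.name
    if subprocess.run(["git", "apply", "--check", patch_file], cwd=repo).returncode != 0:
        return False  # patch doesn't even apply cleanly
    subprocess.run(["git", "apply", patch_file], cwd=repo, check=True)
    return subprocess.run(["pytest", "-q"], cwd=repo).returncode == 0
```

The point isn't these two commands. It's that AI output never lands without passing the same machinery humans have to pass.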

Start with one tool, one team, one complex task. Measure rework cycles, not PR velocity. Get systematic reps on context engineering before scaling to brownfield nightmares.
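A crude rework signal you can pull from git today, no new tooling, assuming repeated touches to the same file approximate churn:

```python
import subprocess
from collections import Counter

def touch_counts(since: str = "4 weeks ago") -> Counter:
    """Count recent touches per file; files edited over and over
    are a cheap proxy for rework churn."""
    out = subprocess.run(
        ["git", "log", f"--since={since}", "--name-only", "--pretty=format:"],
        capture_output=True, text=True, check=True,
    ).stdout
    return Counter(line for line in out.splitlines() if line)

def rework_candidates(min_touches: int = 3) -> list[tuple[str, int]]:
    """Files touched at least min_touches times in the window."""
    return [(f, n) for f, n in touch_counts().most_common() if n >= min_touches]
```

Files that keep reappearing week after week are where generated code is getting rewritten.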

Define "no slop" standards upfront. Code review against architectural consistency, not just correctness. AI generates fast—discipline prevents technical debt acceleration.

Map context to complexity. Simple tasks need minimal context. Complex tasks demand comprehensive system knowledge. Teams that nail this ratio solve problems others can't touch.
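One way to make that ratio explicit rather than intuitive, with entirely illustrative tiers and numbers:

```python
# Illustrative tiers, not a recommendation: match context budget
# to task complexity instead of pasting the same mega-prompt everywhere.
BUDGETS = {
    "typo or rename": 2_000,          # chars of context
    "bugfix in one module": 20_000,
    "cross-cutting refactor": 120_000,
}

def budget_for(task_kind: str) -> int:
    return BUDGETS.get(task_kind, 20_000)  # sane default for unknowns
```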

This works when you have organizational commitment to systematic approaches over intuitive ones. Most teams don't. They chase better models, trust generic prompts, and wonder why legacy codebases still break AI tools. The cost: wasted cycles and lost trust in the exact scenarios where AI could deliver maximum impact.

Full talk: Watch on YouTube