The Agent Restraint Problem

Author: Ptrck Brgr

Runtime length isn't autonomy. Decision scope is.

In How We Build Effective Agents, Barry Zhang of Anthropic explains that the path from simple LLM features to production agents requires restraint, not enthusiasm.

Agent fever hits every team building with LLMs. First comes summarization and extraction. Then workflows that chain multiple calls. Finally, the inevitable question: why not make everything agentic? In enterprise deployments, I've seen this pattern destroy more projects than it creates.

The Selectivity Filter

Don't build agents for everything.

Don't build agents for everything. Second, keep it simple. And third, think like your agents. — Barry Zhang, Anthropic

This sounds obvious. Most teams ignore it anyway.

The workflow-to-agent jump feels natural. You're already orchestrating model calls, managing state, handling failures. Why not add decision-making? Because agent complexity scales quadratically with domain size. Workflows break predictably. Agents break creatively.
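The contrast can be sketched in a few lines of Python. This is a deliberately minimal illustration, not a prescribed design; the `call_model` function and tool names are stand-ins for whatever model client and tools a team actually uses:

```python
# A workflow: control flow is fixed in code. Each step runs in the
# same order every time, so failures are local and reproducible.
def summarize_workflow(document, call_model):
    outline = call_model(f"Outline this document:\n{document}")
    summary = call_model(f"Summarize using this outline:\n{outline}")
    return summary

# An agent: the model picks the next action each turn, so the
# trajectory (and therefore the failure surface) varies run to run.
def summarize_agent(document, call_model, tools, max_steps=5):
    state = document
    for _ in range(max_steps):
        action = call_model(f"Choose one of {sorted(tools)} or FINISH:\n{state}")
        if action == "FINISH":
            break
        state = tools[action](state)  # model-chosen step
    return state
```

The workflow breaks at a known line; the agent can break at any combination of model choice and tool output, which is the quadratic blowup in practice.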

In enterprise deployments, I've watched teams spend months debugging agent decisions that would take hours to fix in deterministic workflows. The cognitive overhead of understanding "why did it choose that path?" compounds across every interaction.

Simple Architecture Beats Clever Engineering

Strip scaffolding from frontier models. The old orchestration patterns—heavy RAG layers, multi-step reasoning chains, complex tool routing—often hurt more than help.

Unlike workflows, agents can decide their own trajectory and operate almost independently based on environment feedback. — Barry Zhang, Anthropic

This autonomy comes with a tax. Every decision point expands your debugging surface, and every added tool multiplies your failure modes.

The teams shipping agents successfully use minimal architectures. Single-model decision makers with clear guardrails. Direct tool access with simple verification loops. No clever orchestration that requires a PhD to debug.
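A minimal sketch of what such an architecture might look like, assuming a single model function deciding each step, an explicit tool allowlist, a hard step cap, and a simple verification check at the end. All function and parameter names here are illustrative, not a reference implementation:

```python
# Single-model agent loop with three guardrails: a tool allowlist,
# a hard cap on steps, and a final verify() check. No orchestration
# layers; everything that can go wrong is visible in this function.
def run_agent(task, choose_action, tools, verify, max_steps=8):
    history = []
    result = task
    for _ in range(max_steps):
        action, arg = choose_action(result, history)   # one model decides
        if action == "finish":
            break
        if action not in tools:                        # guardrail: allowlist
            history.append(("rejected", action))
            continue
        result = tools[action](arg)
        history.append((action, arg))
    if not verify(result):                             # simple verification loop
        raise RuntimeError("agent output failed verification")
    return result, history
```

Because `history` captures every step, "why did it choose that path?" has an answer you can read, not reconstruct.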

But here's the catch—simple doesn't mean limited. It means intentional.

The Mirror Test

Think like your agents. This isn't anthropomorphizing. It's understanding their actual decision process.

Most teams build agents as black boxes, then wonder why they make strange choices. The mental model gap kills production deployment. You need to predict agent behavior before users discover edge cases.

The fix involves decision transparency. Log reasoning paths. Make tool choices explicit. Track confidence scores. When agents fail, you should know why within seconds, not hours.
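One way to wire that up is a structured decision record per step, written as JSON lines. The schema below is an assumption for illustration, not a standard; adapt the fields to whatever your agent actually reports:

```python
import json
import time

# One decision record per agent step: which tool it chose, the
# model's stated reasoning, and a confidence score. Emitting these
# as JSON lines makes failed runs greppable in seconds.
def log_decision(log, tool, reasoning, confidence):
    entry = {
        "ts": time.time(),
        "tool": tool,              # explicit tool choice
        "reasoning": reasoning,    # the model's stated path
        "confidence": confidence,  # tracked so drift is visible
    }
    log.append(entry)
    return json.dumps(entry)

def low_confidence_decisions(log, threshold=0.5):
    """Surface the decisions worth reviewing when a run goes wrong."""
    return [e for e in log if e["confidence"] < threshold]
```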

I could be wrong here, but the agents that ship successfully feel predictable to their builders. The ones that fail feel like magic, and magic breaks in production.

The Verification Gap

Agent demos look amazing. Production agents need verification loops.

The difference between showing an agent and shipping an agent is feedback quality. Demos run on curated data with human supervision. Production runs on chaos with automated oversight.

This changes everything about architecture. Verification becomes the primary design constraint. Can you automatically check if the agent succeeded? Can you recover from failures? Can you explain decisions to users?
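Treating verification as the primary constraint can be sketched as a small wrapper: accept output only if an automatic check passes, retry a bounded number of times, and record why each attempt failed so the decision is explainable. The names here are hypothetical:

```python
# Verification-first execution: an attempt's output is accepted only
# when check() passes. Retries are bounded, and every failure reason
# is recorded so users (and builders) can see why a run was rejected.
def run_with_verification(attempt, check, max_retries=3):
    failures = []
    for i in range(max_retries):
        output = attempt(i)
        ok, reason = check(output)
        if ok:
            return {"output": output, "retries": i, "failures": failures}
        failures.append(reason)  # explainable: why each try failed
    raise RuntimeError(f"unverified after {max_retries} tries: {failures}")
```

If `check` can't be written for a task, that's the signal the task isn't ready for an autonomous agent.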

The painted doors problem hits here—agents ship features that look complete but break when stressed. The verification system determines which agents survive contact with real users.

Why This Matters

Agent restraint prevents technical debt at AI velocity. Build agents for everything and you get systems nobody understands. Build agents selectively and you get tools that actually ship.

The economic impact compounds. Failed agent projects cost more than failed traditional software because debugging AI decision-making requires specialized expertise. Most teams don't have that expertise.

Context quality determines agent success more than model capability. Teams focusing on model upgrades miss the bigger lever. Those investing in decision transparency and verification infrastructure build agents that scale.

What Works

Start with workflows, not agents. Build the deterministic version first. Understand the decision points. Then consider where autonomy adds value.

Keep architectures minimal. Single-model decision makers with clear tool boundaries. Avoid complex orchestration that requires deep debugging.

Build verification before autonomy. If you can't automatically check agent success, you're not ready for production deployment.

Think like your agents. Understand their reasoning paths. Log decisions. Track confidence. Make failures debuggable.

This works with frontier models that can handle complex reasoning. Earlier models need more scaffolding—stripping it breaks them entirely. Know your model generation.

The teams doing this right see sustained productivity gains. Those building agents for everything watch demo magic evaporate in production (and this pattern repeats across every deployment).

Full talk: Watch on YouTube