The Model Is Only 10%: The Real Lesson of the New SDLC

TL;DR

A recent Google whitepaper emphasizes that in AI-assisted software development, the actual model contributes just 10% to system behavior. The key lies in harness design and context engineering, which are critical for reliable, cost-effective AI systems.

A new whitepaper from Google, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, states that the AI model itself accounts for only about 10% of a system’s behavior. The document argues that the real mastery in AI development lies in harness design, configuration, and context engineering, which collectively determine the system’s effectiveness and cost-efficiency.

The whitepaper, titled The New SDLC With Vibe Coding, argues that the dominant factor in AI system performance is the harness (the prompts, tools, rules, and observability layers surrounding the model) and how it is engineered. As evidence, it cites Terminal Bench 2.0, where changing only the harness moved an agent from outside the Top 30 to the Top 5 with the same model.

According to the authors, most failures or misbehaviors in AI agents are due to configuration issues, such as missing tools or vague rules, rather than the model’s capabilities. This shifts focus from chasing the latest model to mastering the surrounding architecture, which is more accessible and controllable for organizations.

The whitepaper also discusses the economic implications, noting that ad-hoc prompting (vibe coding) appears cheap initially but incurs high long-term costs—including token burn, maintenance, and security risks. Conversely, disciplined, structured approaches—referred to as agentic engineering—require higher upfront investment but offer lower marginal costs over time.
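The trade-off between low upfront cost and low marginal cost can be made concrete with a little break-even arithmetic. The sketch below is illustrative only: the function names and every dollar figure are hypothetical placeholders, not numbers from the whitepaper. Only the shape of the model (a fixed upfront cost plus a per-feature cost) follows the text.

```python
import math
from typing import Optional


def total_cost(upfront: float, per_feature: float, features: int) -> float:
    """Cumulative cost after shipping `features` features."""
    return upfront + per_feature * features


def break_even_features(
    vibe_upfront: float,
    vibe_per_feature: float,
    agentic_upfront: float,
    agentic_per_feature: float,
) -> Optional[int]:
    """Smallest feature count at which the agentic approach is strictly cheaper.

    Assumes agentic_upfront >= vibe_upfront (the "High CapEx" case).
    Returns None when the agentic approach has no marginal-cost advantage.
    """
    saving_per_feature = vibe_per_feature - agentic_per_feature
    if saving_per_feature <= 0:
        return None
    extra_upfront = agentic_upfront - vibe_upfront
    return math.floor(extra_upfront / saving_per_feature) + 1


# Hypothetical inputs: vibe coding costs nothing upfront but 500 per feature;
# agentic engineering costs 6000 upfront and 100 per feature.
crossover = break_even_features(0, 500, 6000, 100)
```

With these made-up inputs the two approaches cost the same at 15 features, and the structured approach is cheaper from the next feature onward. Real crossover points depend entirely on an organization's own costs.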

At a glance

When: published May 2026

The development: The whitepaper introduces a paradigm shift in the software development lifecycle (SDLC), highlighting that the core of AI system success is in configuration, verification, and context, not the underlying model.

The Model Is Only 10% — The New SDLC With Vibe Coding

AI Dispatch · Field Notes

Google · Osmani, Saboo & Kartakis · May 2026

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary — the differentiator is how outputs get verified

Vibe Coding: Casual prompts · “does it seem to work?” · disposable code · high risk

Structured AI-Assisted: Detailed prompts + constraints · manual testing · features in real codebases

Agentic Engineering: Formal specs · automated tests + evals + CI gates · production scale · low risk

Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding — however clever the prompt.
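As a rough illustration of that split, here is a minimal sketch of a merge gate that requires both deterministic tests and a scored eval suite to pass. Everything in it is an assumption made for illustration: the names (`EvalCase`, `run_evals`, `ci_gate`), the substring rubric, the 0.9 threshold, and the stub model are not taken from the whitepaper.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class EvalCase:
    prompt: str
    must_include: str  # hypothetical rubric: a phrase the answer must contain


def run_tests(tests: Sequence[Callable[[], bool]]) -> bool:
    """Deterministic checks: every single one must pass."""
    return all(test() for test in tests)


def run_evals(generate: Callable[[str], str], cases: Sequence[EvalCase]) -> float:
    """Non-deterministic behavior: return the fraction of cases meeting the rubric."""
    if not cases:
        return 0.0
    passed = sum(
        1 for case in cases
        if case.must_include.lower() in generate(case.prompt).lower()
    )
    return passed / len(cases)


def ci_gate(tests, generate, cases, min_eval_score: float = 0.9) -> bool:
    """Allow a merge only if tests pass AND the eval score clears the threshold."""
    return run_tests(tests) and run_evals(generate, cases) >= min_eval_score


def stub_model(prompt: str) -> str:
    """Stand-in for a real model call, used only to exercise the gate."""
    if "refund" in prompt.lower():
        return "Refunds are accepted within 30 days."
    return "I am not sure."


tests = [lambda: sorted([3, 1, 2]) == [1, 2, 3]]
cases = [
    EvalCase("What is the refund window?", "30 days"),
    EvalCase("Can I get a refund?", "30 days"),
]
gate_ok = ci_gate(tests, stub_model, cases)
```

A production eval would typically use a richer rubric than substring matching; the point of the sketch is only that the gate fails if either half is missing or weak.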

The idea worth building your strategy around

Agent = Model + Harness

MODEL: ~10%

HARNESS: ~90% (prompts · tools · context · hooks · sandboxes · observability). It is your surface area, not the provider’s.

Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness — same model.

“Most agent failures, examined honestly, are configuration failures” — a missing tool, a vague rule, a noisy context.
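One way to picture "Agent = Model + Harness" is as a small data structure in which the model is a swappable callable and everything else is configuration you own. The sketch below is hypothetical: the class names, the `lint_harness` helper, and its heuristics (a four-word minimum for rules, a cap of 20 context items) are illustrative assumptions, not something the whitepaper prescribes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Harness:
    system_prompt: str
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    rules: List[str] = field(default_factory=list)
    context: List[str] = field(default_factory=list)


@dataclass
class Agent:
    model: Callable[[str], str]  # any provider's model; swappable
    harness: Harness

    def run(self, task: str) -> str:
        """Assemble the prompt from the harness, then call the model."""
        prompt = "\n".join(
            [self.harness.system_prompt, *self.harness.rules,
             *self.harness.context, task]
        )
        return self.model(prompt)


def lint_harness(harness: Harness, required_tools: List[str],
                 max_context_items: int = 20) -> List[str]:
    """Flag the three configuration failures the quote names (crude heuristics)."""
    problems = []
    for name in required_tools:
        if name not in harness.tools:
            problems.append(f"missing tool: {name}")
    for rule in harness.rules:
        if len(rule.split()) < 4:  # assumption: very short rules tend to be vague
            problems.append(f"vague rule: {rule!r}")
    if len(harness.context) > max_context_items:
        problems.append(f"noisy context: {len(harness.context)} items")
    return problems


harness = Harness(
    system_prompt="You are a coding agent.",
    tools={"read_file": lambda path: ""},
    rules=["Be careful."],
    context=["README summary"],
)
problems = lint_harness(harness, required_tools=["read_file", "run_tests"])
```

Swapping the `model` callable leaves the harness untouched, which is the sense in which the harness is the part an organization controls.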

The economics: it’s a token-cost problem (CapEx vs OpEx)

Vibe Coding: Low CapEx · High OpEx. Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.

Agentic Engineering: High CapEx · Low OpEx. Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success + intelligent model routing, with cheap models for the easy work.
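The model-routing lever can be sketched as a small dispatch function that sends easy work to a cheap tier and reserves the expensive tier for hard tasks. This is a minimal sketch under stated assumptions: the tier names, prices, keyword list, and token threshold are all invented for illustration, and a real router would likely use a better difficulty signal than keywords.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                  # hypothetical tier name, not a real product
    cost_per_1k_tokens: float  # hypothetical price


CHEAP = ModelTier("small-model", 0.10)
STRONG = ModelTier("large-model", 2.00)

# Assumption: these words loosely signal a task that needs the stronger model.
HARD_HINTS = ("refactor", "architecture", "security", "migration")


def route(task: str, context_tokens: int, token_threshold: int = 8000) -> ModelTier:
    """Pick the cheap tier unless the task looks hard or the context is large."""
    text = task.lower()
    if context_tokens > token_threshold or any(hint in text for hint in HARD_HINTS):
        return STRONG
    return CHEAP


def estimated_cost(tier: ModelTier, tokens: int) -> float:
    """Rough spend for one call at the tier's per-1k-token price."""
    return tier.cost_per_1k_tokens * tokens / 1000


easy = route("Rename a variable in utils.py", context_tokens=500)
hard = route("Refactor the auth module", context_tokens=500)
```

Because the routing rule lives in the harness rather than in any one provider's product, it can be tuned or replaced without changing the rest of the system.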

85% of devs use AI coding agents (51% daily)

41% of all new code is AI-generated

~90% of agent behavior is the harness, not the model

+19% longer on some tasks (METR); verification is the cost

The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.

Why Focus on Harness and Context Engineering

This shift in understanding impacts how organizations should invest in AI development. Instead of prioritizing access to the latest models, companies should focus on building robust harnesses and effective context management. Doing so can lead to more reliable, secure, and cost-efficient AI systems, especially as the AI landscape becomes more complex and integrated into critical workflows.

Evolution of AI Development Practices in 2026

Since early 2026, AI adoption has accelerated: per the figures the whitepaper cites, 85% of developers use AI coding agents (51% daily), and 41% of all new code is AI-generated. The industry has largely focused on model improvements, but the whitepaper challenges this trend by arguing that the model’s contribution is small compared with harness design. Prior efforts often overlooked configuration, leading to costly failures and security issues.

This perspective aligns with recent benchmarks demonstrating that small changes in harness components can produce outsized performance gains, emphasizing a practical shift from model-centric to architecture-centric development.

“The model is only 10% of what determines behavior; the harness is 90%. Our focus should shift accordingly.”
— Addy Osmani

What Aspects of Harness Design Remain Unclear

While the whitepaper convincingly shows that harness configuration is critical, it does not specify exactly which harness components yield the greatest performance improvements across different domains. The precise methodologies for scalable context engineering and the best practices for automation are still emerging and require further empirical validation.

Next Steps for AI Development and Adoption

Organizations should prioritize investing in harness architecture, including tools for context management, verification, and observability. Future research and industry practices are likely to focus on standardizing best practices for configuration and context engineering, as well as developing automated tools to optimize these aspects. Monitoring how these strategies impact system reliability and cost over time will be essential.

Key Questions

Why is the model only 10% of system behavior?

The whitepaper argues that the surrounding harness—prompts, tools, rules, and context—dominates the system’s behavior, making the model’s contribution relatively small.

How can organizations improve AI system reliability?

By focusing on harness design, configuration, verification, and context management, organizations can create more predictable and secure AI systems.

Does this mean model improvements are no longer important?

Model improvements remain valuable, but the whitepaper emphasizes that most performance gains and reliability issues stem from harness and configuration, which are more controllable and cost-effective to optimize.

What is meant by ‘agentic engineering’?

It refers to disciplined, structured AI development: formal specs, carefully designed prompts, tools, and context, and rigorous verification through automated tests, evals, and CI gates, rather than ad-hoc vibe coding.

Will this shift change AI development costs?

Yes, disciplined harness and context engineering may have higher initial costs but lead to lower long-term operational costs and better security.

Source: ThorstenMeyerAI.com
