Matt Boyle / March 2, 2026 / AI

The enterprise agent problem Claude Code wasn't built to solve

Claude Code is a great thin harness. Enterprises need the layer underneath.

Claude Code, Codex, Gemini Code Assist, Qwen Code: the list of capable coding agents grows every month. The models behind them are converging fast: Qwen 3.5 just matched Sonnet 4.5 on key benchmarks with open weights at a fraction of the cost. Capability is no longer the bottleneck. The harder question, and the one most enterprises are stuck on, is how to adopt any of this safely, at scale, without locking into a model or vendor that might not be the right choice six months from now.

In the middle of all this, Claude Code has earned its mindshare. It's a fast, thin interface that gives strong engineers direct leverage on a strong model. We support it, and many customers run Claude Code inside Ona environments today.

But a number of those same customers also choose Ona Agent alongside Claude Code, often as the default for their engineering team. This post explains why, and why we think the real answer for enterprises is "both."

Thin harnesses are powerful, but they assume a power user

First, a quick grounding: Anthropic's Building Effective Agents guide defines agents as systems where LLMs dynamically direct their own processes and tool usage. In software engineering, that means AI that can read your codebase, make changes, run tests, and open PRs. Claude Code is one. Ona Agent is another. What separates them is what sits around the agent: the environment, the controls, and the organizational visibility.

Anthropic's own framing of Claude Code is direct. In a Latent Space interview, Claude Code's creator Boris Cherny explains:

"All the secret sauce, it's all in the model. And this is the thinnest possible wrapper over the model."

That principle is correct. Thin harnesses don't fight the model, and power users love them for it.

The issue is that thin harnesses assume the user is the safety system. Power users can steer, debug, and self-govern. Enterprises can't rely on that. If agents are going to be part of the default engineering workflow, they need to work for the median engineer, with consistent environments, predictable execution, and guardrails that don't depend on individual expertise.

Capability is solved. Governance isn't.

Claude Code, Cursor, Copilot and others make individual developers faster. Nobody disputes that. The enterprise questions are different, and they are about governance rather than capability.

Claude Code is an agent experience optimized for the individual. It's a CLI tool that runs on a developer's laptop. But the future of agent work moves beyond interactive sessions on localhost. As we wrote in The Last Year of Localhost, background agents running at scale across a software assembly line can't run on someone's machine. They need cloud-based, reproducible environments, and the organization needs a control plane on top. Ona provides that.

The platform: where any agent becomes enterprise-grade

Whether you're running Claude Code, Codex, or Ona Agent, the underlying problems are the same: the agent needs a secure, reproducible environment to execute in, and the organization needs visibility and control over what it does.

This is what Ona's platform solves, regardless of which agent you choose:

Reproducible execution. An agent is only as autonomous as its ability to verify work. Ona environments give agents a known-good starting state built on concrete primitives.

Any agent running inside Ona, Claude Code included, gets this for free.

Security and compliance. Runtime enforcement, not prompt-level guardrails. Command deny lists that block risky actions even if the agent tries to route around them. Audit logging of every execution. RBAC so both human and agent permissions are explicit and least-privilege. This matters in regulated industries because governance has to be something you can actually enforce.
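To make the difference between a prompt-level guardrail and an enforced one concrete, here is a minimal sketch of a command gate with an audit trail. It is an illustration under stated assumptions, not Ona's implementation: the deny list contents, the `AuditLog` class, and the `run_guarded` function are all hypothetical names, and a real platform would enforce this below the shell rather than by matching strings.

```python
import os
import shlex
from dataclasses import dataclass, field

# Hypothetical deny list: executable names an agent may never invoke.
DENIED_EXECUTABLES = {"curl", "ssh", "sudo"}


@dataclass
class AuditLog:
    """In-memory stand-in for an append-only audit log."""
    entries: list = field(default_factory=list)

    def record(self, actor: str, command: str, allowed: bool) -> None:
        self.entries.append(
            {"actor": actor, "command": command, "allowed": allowed}
        )


def is_allowed(command: str) -> bool:
    """Reject a command if any word in it names a denied executable.

    Quoted arguments are split again so that wrappers such as
    `sh -c '...'` do not hide a denied executable. This is deliberately
    conservative: it can block harmless commands that merely mention
    a denied name.
    """
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # unparseable input is rejected outright
    words = [word for token in tokens for word in token.split()]
    return not any(os.path.basename(w) in DENIED_EXECUTABLES for w in words)


def run_guarded(actor: str, command: str, log: AuditLog) -> bool:
    """Decide whether a command may run and log the decision either way."""
    allowed = is_allowed(command)
    log.record(actor, command, allowed)
    # A real gate would hand allowed commands to the sandbox here.
    return allowed
```

The point of the sketch is that the decision lives outside the model: the agent can phrase a request however it likes, but the check and the log entry happen regardless of what the prompt said.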

Enterprise infrastructure controls. The platform is modular where it needs to be, with network policies that restrict agent connectivity and with configurable execution boundaries. These are platform primitives, not agent features, and they're what let your CISO say "yes" instead of "not yet."

Cost controls and visibility. Team-level cost caps, usage insights across agents and engineers, and org-level policies, so AI scales without surprise bills or sprawl.
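A team-level cap can be pictured as a budget that every agent run must charge against before it starts. The sketch below is a simplified model under assumed names (`TeamBudget`, `try_charge`, and amounts in integer cents are all hypothetical), not a description of how Ona meters usage.

```python
from dataclasses import dataclass, field


@dataclass
class TeamBudget:
    """Tracks spend for one team against a hard cap, in integer cents."""
    cap_cents: int
    spent_cents: int = 0
    # Per-engineer usage, kept for visibility rather than enforcement.
    by_engineer: dict = field(default_factory=dict)

    def try_charge(self, engineer: str, amount_cents: int) -> bool:
        """Charge a run to the team; refuse it if the cap would be exceeded."""
        if amount_cents < 0:
            raise ValueError("amount_cents must be non-negative")
        if self.spent_cents + amount_cents > self.cap_cents:
            return False  # the run is refused before any money is spent
        self.spent_cents += amount_cents
        self.by_engineer[engineer] = (
            self.by_engineer.get(engineer, 0) + amount_cents
        )
        return True

    def remaining_cents(self) -> int:
        return self.cap_cents - self.spent_cents
```

Checking the cap before the run, rather than reconciling afterwards, is what turns a usage report into a control.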

Orchestration at scale. Enterprise value shows up when you can run a workflow across a fleet: patch a vulnerability across hundreds of repos, apply a consistent change, open PRs, and track progress centrally, all in isolated environments with bounded blast radius. Any agent running on the platform benefits from this orchestration layer.
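The fleet pattern described here (fan a change out across many repositories, cap how many runs happen at once, and collect every outcome in one place) can be sketched in a few lines. Everything below is illustrative: `apply_patch` is a hypothetical placeholder for one isolated agent run, and the repository names and status values are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class RunResult:
    repo: str
    status: str  # "pr_opened", "no_change", or "failed"
    detail: str = ""


def apply_patch(repo: str) -> RunResult:
    """Placeholder for one agent run in its own environment (hypothetical)."""
    if repo.endswith("-archived"):
        raise RuntimeError("repository is read-only")
    if repo.startswith("docs-"):
        return RunResult(repo, "no_change")
    return RunResult(repo, "pr_opened")


def run_one(repo: str) -> RunResult:
    """Contain failures so one bad repository cannot stop the fleet."""
    try:
        return apply_patch(repo)
    except Exception as exc:
        return RunResult(repo, "failed", str(exc))


def run_fleet(repos: list, max_parallel: int = 4) -> list:
    """Run the change across all repos with bounded concurrency."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map preserves input order, so results line up with repos.
        return list(pool.map(run_one, repos))


def summarize(results: list) -> dict:
    """Count outcomes by status for a central progress view."""
    summary: dict = {}
    for result in results:
        summary[result.status] = summary.get(result.status, 0) + 1
    return summary
```

Calling `summarize(run_fleet(repos))` gives a single view of how many repositories got a PR, needed no change, or failed. Catching exceptions per repository is the code-level analogue of a bounded blast radius.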

Where Ona Agent goes further

Running Claude Code inside Ona gets you everything above. Ona Agent goes a step further by integrating the agent and the platform into a single loop.

Because we built both the agent and the environment it runs in, we can do things a thin wrapper can't.

The labs are already moving in this direction themselves. The Codex desktop app and Anthropic's Cowork both moved beyond the CLI toward richer, more supervised interfaces. Ona was built for this category from the start.

Background agents are where the real value compounds

The biggest shift in how engineering teams use agents isn't about which one they use interactively. It's about what happens when agents run continuously in the background.

Stripe's Minions merge over a thousand agent-authored PRs per week. Ramp's background agent accounts for over half of all merged PRs. At Ona, agents authored 88.5% of the PRs we merged on main last month. None of that is interactive pair-programming. It's agents running autonomously in cloud environments, doing work while your team does other things.

Automations take this a step further: work starts without a human kicking it off. They're fully autonomous workflows that pick up work from your backlog, your monitoring tools, or your codebase itself. A Sentry automation triages issues overnight and has PRs ready by morning. A docs automation scans for code changes daily and opens PRs when documentation drifts. A CVE remediation workflow detects vulnerabilities on a schedule and ships the fix. Each run executes in the same full environment your engineers use, in parallel across repositories, with human review before anything merges.

This is the capability that doesn't exist on localhost, and it's where the gap between individual developer tools and an enterprise platform becomes clear.

Don't standardize on an interface. Standardize on the layer underneath.

What you standardize on is the layer that turns any agent into an enterprise-grade workflow: identity, permissions, guardrails, audit logs, cost caps, and repeatable verification. We eval new models continuously against real enterprise workflows. When one beats the baseline, we ship it within days. Model upgrades should be an operational rollout, not a re-platforming project.

We've always been neutral in vendor wars: GitHub/GitLab/Bitbucket, Jira/Linear, Cursor/VS Code/JetBrains. Agents are heading the same way. Use Claude Code, Codex, Ona Agent, or whatever comes next. Run it in an environment where you keep optionality, governance, and control.
