AWS Runner Releases - Ona Documentation

20260701.541

July 1, 2026

Repository authentication and reliability improvements

This release improves repository handling, agent SCM tool authentication, and AWS runner maintenance paths. Azure DevOps repositories without a default branch now return a clear validation error, GitLab and Bitbucket agent tools can use proxied SCM API credentials more reliably, and AWS runner volume throughput cleanup is more consistent. No infrastructure upgrade is required.

Improvements

Empty Azure DevOps repositories now return a validation error explaining that the repository needs an initial commit, instead of failing during context parsing.
GitLab and Bitbucket agent SCM tools now receive proxied API credentials more reliably, including GitLab API auth and Bitbucket Cloud API auth used outside normal Git operations.
Repository authentication token lookup now includes the target repository URL, preventing a token resolved for one repository from being reused for another repository on the same host.
Runner-managed environment secrets are scoped to the runner more consistently across prebuild and legacy runner-created environment paths.
AWS runner volume throughput reconciliation now scans runner-tagged gp3 EBS volumes and reverts volumes that remain above baseline after the grace period.
AWS VM images and EC2 runner build dependencies were refreshed as part of routine security maintenance.
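The token lookup change above can be pictured as a change of cache key. The following is a minimal sketch, not the runner's actual code: `host_key`, `repo_key`, and `TokenCache` are hypothetical names, and the key format is an assumption chosen for illustration.

```python
from urllib.parse import urlsplit


def host_key(repo_url: str) -> str:
    """Host-only key: every repository on the same host shares one entry."""
    return urlsplit(repo_url).netloc.lower()


def repo_key(repo_url: str) -> str:
    """Key that includes the repository path as well as the host."""
    parts = urlsplit(repo_url)
    path = parts.path.rstrip("/").removesuffix(".git")
    return f"{parts.netloc.lower()}{path}"


class TokenCache:
    """Tiny in-memory token cache parameterized by its key function (hypothetical)."""

    def __init__(self, key_fn):
        self._key_fn = key_fn
        self._tokens = {}

    def put(self, repo_url: str, token: str) -> None:
        self._tokens[self._key_fn(repo_url)] = token

    def get(self, repo_url: str):
        return self._tokens.get(self._key_fn(repo_url))
```

With `host_key`, a token stored for one repository is returned for any other repository on that host. With `repo_key`, the second lookup misses and a fresh token has to be resolved.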

20260625.921

June 25, 2026

Startup, diagnostics, and Bedrock BYOK improvements

This release improves AWS runner startup paths, support diagnostics, and OpenAI-compatible BYOK endpoints on AWS Bedrock. Environments can begin devcontainer work sooner because Docker and BuildKit warm during VM startup, and automation services start sooner after the devcontainer becomes ready. No infrastructure upgrade is required.

Improvements

EC2 environments warm Docker and BuildKit during VM startup, reducing the time spent waiting for container runtime services when devcontainer setup begins.
Automation services start about 1.5-2.0 seconds sooner after the devcontainer becomes ready by removing extra control-plane and event-stream waits from the startup path.
Support bundles include boot-related unit logs and sanitized Codex LLM endpoint diagnostics, giving support more context for startup and agent issues without including conversation content.
Runner status updates from the supervisor are serialized through reconciliation, reducing stale or inconsistent runner status reporting.
Azure DevOps identity materialization failures are classified as credential issues, so affected repository starts can follow the authentication path instead of surfacing as internal errors.
AWS Bedrock Mantle OpenAI-compatible BYOK endpoints now map OpenAI model names correctly for requests, pricing, context-tier handling, and usage reporting.

20260623.498

June 23, 2026

Hotfix for critical OpenSSL CVE findings

This hotfix release refreshes the AWS runner and proxy base images to remove fixable critical OpenSSL findings in libcrypto3 and libssl3. It also fixes custom CA bundle startup across reboots by reusing an already installed bundle instead of retrying the original download URL. No infrastructure upgrade is required.

Improvements

AWS runner and proxy images now include libcrypto3 and libssl3 version 3.6.3-r1, resolving the fixable critical CVE-2026-34182 findings reported against 3.6.2-r5.
CA bundle installation during EC2 startup is now idempotent across reboots, preventing reused one-time download URLs from breaking startup when the bundle already exists.
EC2 volume throughput reverts now retry after transient AWS or cache failures instead of dropping the retry marker too early.
The EC2 orchestrator now uses containerd v1.7.33 for a high-severity dependency fix.
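The idempotent CA bundle installation described above follows a common pattern: check for the installed artifact before touching the download URL. This is a minimal sketch under assumptions, not the actual startup script. `ensure_ca_bundle` and its `download` callback are hypothetical names, and a non-empty file is assumed to mean the bundle is already installed.

```python
from pathlib import Path
from typing import Callable


def ensure_ca_bundle(dest: Path, download: Callable[[], bytes]) -> bool:
    """Install the bundle at dest unless a non-empty copy already exists.

    Returns True when a download happened, False when the existing file
    was reused. The download callback stands in for fetching a one-time URL.
    """
    if dest.exists() and dest.stat().st_size > 0:
        return False  # already installed on a previous boot: skip the URL
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(download())
    return True
```

Calling the function again after a reboot returns `False` without invoking `download`, so an expired one-time URL no longer matters once the bundle is on disk.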

20260622.850

June 22, 2026

Startup and pool reliability improvements

This release improves EC2 runner startup paths and pool behavior. Devcontainer startup avoids repeated seed-image imports on pre-seeded disks, EC2 warm pools adapt better to demand, and environments recover more reliably from disk-layout and timeout edge cases. No infrastructure upgrade is required.

Improvements

Devcontainer startup avoids repeated Dockerfile frontend image imports on pre-seeded EC2 disks, removing about 1.0-1.4 seconds from the “Creating dev container” path for those environments.
EC2 instance and disk pools now adjust to recent claim demand and use higher stable capacity for common pools, reducing cold starts during changing workload patterns.
Dual-disk prebuilds now fall back to a single-disk cold start when dual disk is unavailable, so affected starts continue instead of failing.
EC2 cloud-init can finish before claim-time data disk attachment, improving startup reliability for pooled instances.
Orphaned and disabled EC2 pool instances are cleaned up more reliably, reducing leaked warm capacity.
Agent executions recover when an environment times out while the in-environment agent is still running.
Agent binary downloads and seeded artifacts resolve through release manifests and the runner channel, improving startup reliability on custom domains and channel-specific runners.
A critical cryptography dependency was updated.

20260608.681

June 8, 2026

Security and reliability updates

This release updates security-sensitive runner, proxy, and VM image components and fixes several environment startup and agent reliability issues. No infrastructure upgrade is required.

Improvements

VM image components were rebuilt with patched Go toolchains, including updates for Go-built binaries and Docker BuildKit.
Docker, containerd, Docker Compose, and Docker Engine were updated to address critical vulnerabilities in the environment image.
Runner Go dependencies were updated for security fixes across networking, cryptography, tracing, storage, and SSH-related packages.
The runner proxy no longer includes the unused Nebula gateway mode, removing a vulnerable dependency from the proxy binary.
Agent requests reconnect faster after transient runner connection issues.
Agent conversation archives converge more reliably after interruptions.
Automation task failures now propagate the failing step exit code instead of marking the step as done.
Automation services reconcile faster after service status changes and systemd job completion.
Environment startup is more reliable when SSH proxy port binding, devcontainer defaults, or workspace folder status reporting are involved.
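The exit code change for automation tasks above can be illustrated with a small step runner. This is a sketch only: `run_steps` and its `(status, exit_code)` return shape are assumptions for illustration, not the automation service's real interface.

```python
import subprocess
import sys


def run_steps(steps):
    """Run each step (an argv list) in order and stop at the first failure.

    Returns ("done", 0) when every step succeeds, otherwise
    ("failed", <exit code of the failing step>).
    """
    for argv in steps:
        code = subprocess.run(argv).returncode
        if code != 0:
            return ("failed", code)  # surface the real exit code
    return ("done", 0)


if __name__ == "__main__":
    ok = [sys.executable, "-c", "pass"]
    bad = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(run_steps([ok, bad, ok]))
```

The third step never runs in the usage example, and the caller sees the failing step's own exit code rather than a generic "done" status.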

20260528.814

May 28, 2026

Patch release

This is a patch release with tagging improvements and reliability fixes. No infrastructure upgrade is required.

Improvements

AWS Marketplace revenue attribution tags now propagate to environment VMs, instead of only CloudFormation-managed resources.
File content reads (regular reads, read-only mmap, mprotect) can now be audited or blocked through veto-file security policies.
Library lookup caching in the security agent is shared across processes with the same root identity, reducing startup overhead.
Agent conversation history loads more reliably with a longer initial timeout and smaller page sizes.
MCP configuration is retried during agent startup, preventing sessions from launching without tool integrations when setup is briefly delayed.

20260527.677

May 27, 2026

Horizontal scaling for AWS runners

AWS runners now scale better with load. Starting with this release, runner and proxy horizontal scaling are generally available for AWS runners.

Small runners start with one runner replica and can scale up to 8 replicas. Large runners start with two runner replicas and can scale up to 16 replicas. The proxy also scales horizontally, up to 8 replicas on small runners and 16 replicas on large runners. This adds capacity for busy runners without changing environment VM sizing.

Infrastructure upgrade required

This upgrade does not cause downtime. The CloudFormation stack update enables runner and proxy scaling and hardens metrics sidecar task definitions. Running environments are not affected.

The upgrade updates the Fargate service scaling configuration, adds required Application Auto Scaling permissions, and sets the ADOT metrics sidecar containers to use read-only root filesystems.

To upgrade, go to Settings > Runners, select your runner, open the three-dot menu, and click Upgrade runner. See the upgrade documentation for step-by-step instructions.

What else is in this release

Improvements

VM setup downloads now retry transient network failures and log better diagnostics when a component download fails.
Ona agent sessions now wait for devcontainer rebuilds and restarts before reconnecting, preventing avoidable connection failures.
Live agent streams now include request-level logs for connections, state events, and disconnects, making stuck streams easier to diagnose.
Container service status is updated when the devcontainer stops, preventing stale RUNNING states.
Runner sync and event handling use larger batches and buffers, reducing backend load at high environment counts.
Deleted runner cleanup now covers orphaned EBS volumes and snapshots, and reverts boosted EBS IOPS and throughput before cleanup.
Security updates refresh the EC2 orchestrator base image and containerd dependencies.

20260520.942

May 20, 2026

Increased runner capacity for large deployments

Large runners now use 16 vCPU and 32 GB of memory, up from 8 vCPU and 16 GB. This fixes CPU saturation observed on busy runners handling hundreds of concurrent environments. No configuration changes are needed. The new sizing takes effect after the infrastructure upgrade.

Infrastructure upgrade required

This upgrade does not cause downtime. The CloudFormation stack update adds new scaling policies and adjusts task sizing. Running environments are not affected.

The upgrade updates the Fargate task definition and adds a memory-based scaling policy for the proxy service.

To upgrade, go to Settings > Runners, select your runner, open the three-dot menu, and click Upgrade runner. See the upgrade documentation for step-by-step instructions.

What else is in this release

Improvements

Proxy service now scales on memory utilization in addition to CPU, preventing exhaustion from long-lived connections.
Shell history is now shared across terminal tabs within the same environment for bash and zsh.
Ona agent sessions resume immediately when a devcontainer rebuild finishes.

20260518.63

May 18, 2026

Availability Zone capacity fallback

Environment launches now automatically retry in a different Availability Zone when one runs out of EC2 capacity, instead of failing immediately. Fallback subnets are tried in random order to distribute load evenly. This eliminates a class of launch failures observed during high-concurrency workloads where a single AZ exhausts its instance capacity while others remain available.
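The fallback behavior described above can be sketched as a retry loop over subnets. This is a minimal illustration under assumptions: `launch_with_fallback`, the `launch` callback, and the `InsufficientCapacity` exception are hypothetical names, and trying the primary subnet before the shuffled fallbacks is an assumed ordering.

```python
import random


class InsufficientCapacity(Exception):
    """Hypothetical error: the Availability Zone has no capacity left."""


def launch_with_fallback(primary, fallbacks, launch, rng=None):
    """Try the primary subnet, then the fallback subnets in random order.

    launch(subnet) returns an instance ID or raises InsufficientCapacity.
    The last capacity error is re-raised when every subnet is exhausted.
    """
    rng = rng or random.Random()
    shuffled = list(fallbacks)
    rng.shuffle(shuffled)  # random order spreads retries across AZs
    last_error = None
    for subnet in [primary, *shuffled]:
        try:
            return launch(subnet)
        except InsufficientCapacity as err:
            last_error = err
    raise last_error
```

Only capacity errors trigger a retry in this sketch. Any other launch failure propagates immediately, because trying another Availability Zone would not help.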

What else is in this release

Improvements

Environments are no longer incorrectly reported as stopped during shard handoff on multi-replica runners, preventing orphaned VMs.
Agent executions are no longer orphaned or duplicated after shard handoff. Reconcilers are drained on lost shards and pending work is re-discovered on the new owner.
Environments stopped by the disconnected timeout now restart correctly when a user sends a new message to an agent, instead of hanging indefinitely.
Supervisor restart no longer fails when orphaned child processes hold the SSH proxy port. All processes in the supervisor cgroup are now terminated on stop.
Load balancer health checks verify the proxy is serving HTTP responses instead of only checking TCP connectivity.
Security dependency upgrades address critical and high-severity CVEs in pgx, go-jose, jsonparser, and OpenTelemetry exporters.

20260514.611

May 14, 2026

Security: Ubuntu 26.04 and CVE reduction

Environment VMs now run Ubuntu 26.04 with kernel 7.0, reducing total CVEs from 6,731 to 275 (96% reduction). The Docker stack is bumped to 29.4.3, BuildKit to v0.29.0, and all rootfs binaries are compiled with Go 1.25.10, fixing 12 additional Go stdlib CVEs.

What else is in this release

Improvements

Environments no longer get stuck in STOPPING state. Snapshot preparation gives up after 10 minutes on transport errors instead of retrying indefinitely, and batch stop failures fall back to stopping instances individually.
Environments stopped by the disconnected timeout are no longer restarted by the agent reconciler, fixing a ~36-minute bounce loop.
On dual-disk runners, the data disk resize now completes before content initialization starts, fixing ENOSPC errors with large container images.
Warm pool claims work correctly across workers on multi-replica runners, preventing unnecessary cold launches.
Prebuild environments start with a clean data disk instead of inheriting stale base snapshots.
Bitbucket repository search and organization listing work again after Bitbucket deprecated cross-workspace APIs.
Agent goal status now reaches the dashboard correctly.
Inline image data in agent conversations is offloaded to blob storage before entering live streams and history, reducing bandwidth.
Runner updates apply with zero downtime. New Fargate tasks are healthy before old ones drain.
Agent executions are picked up immediately after shard handoff on multi-replica runners, instead of waiting up to 1 hour.
Agent conversation streams are protected against corruption during shard handoffs on multi-replica runners.
Agent conversation history loads up to 10x faster for long conversations.
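The batch stop fallback mentioned in the first item above can be sketched as follows. `stop_all`, `stop_batch`, and `stop_one` are hypothetical names, not AWS SDK calls or the runner's real functions. They stand in for a bulk stop request and a per-instance stop request.

```python
def stop_all(instance_ids, stop_batch, stop_one):
    """Stop instances in one batch call, falling back to one call per instance.

    Returns the IDs that could not be stopped, so one bad instance
    cannot block the rest of the batch.
    """
    try:
        stop_batch(instance_ids)
        return []
    except Exception:
        failed = []
        for instance_id in instance_ids:
            try:
                stop_one(instance_id)
            except Exception:
                failed.append(instance_id)
        return failed
```

In this sketch a failed batch call is retried per instance, and only the instances that still fail are reported back to the caller.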

20260508.745

May 8, 2026

Faster startup and credential redaction

Environment startup is faster. Disk warming for startup-critical paths now runs in parallel, host binaries (docker, containerd, runc, node, buildkitd) are pre-warmed alongside data disk paths, and warm pool scaling targets adapt dynamically to EBS snapshot size so large prebuilds are fully hydrated before instances are claimed.

Credentials printed to process output (AWS keys, GitHub tokens, bearer tokens, basic-auth URLs) are now redacted before they reach environment status messages, on-disk state, logs, and tracing spans.

This release also patches CVE-2026-5450 (Critical, glibc) along with four High-severity CVEs in glibc and OpenSSL via a base image digest bump.
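The credential redaction described above can be approximated with pattern-based scrubbing of process output. The patterns below are illustrative assumptions covering the four credential kinds named in this release. They are not the runner's actual rules, and `redact` is a hypothetical helper.

```python
import re

# (pattern, replacement) pairs; all patterns are assumptions for illustration.
_PATTERNS = [
    # AWS access key IDs (long-term AKIA..., temporary ASIA...)
    (re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"), "[REDACTED]"),
    # GitHub tokens such as ghp_..., gho_..., ghs_...
    (re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}\b"), "[REDACTED]"),
    # Bearer tokens in header-style output
    (re.compile(r"(?i)\b(bearer)\s+[A-Za-z0-9._~+/-]+=*"), r"\1 [REDACTED]"),
    # user:password@ in URLs
    (re.compile(r"(://)[^/\s:@]+:[^/\s@]+@"), r"\1[REDACTED]@"),
]


def redact(text: str) -> str:
    """Replace credential-looking substrings before the text is stored or logged."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Applying the scrubber at the point where output is captured, rather than at each sink, is what lets one change cover status messages, on-disk state, logs, and tracing spans together.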

What else is in this release

New

Automation services support a configurable readiness timeout. Environments where the supervisor fails to start are now stopped instead of hanging indefinitely.
The SCM organization list in the project creation flow supports pagination and search for GitLab.

Improvements

Prebuild snapshots correctly take precedence over base snapshots on dual-disk environments, fixing cases where prebuild data was discarded.
The prebuild executor’s git identity is cleared from the data disk before snapshot, preventing identity leakage to environments started from that prebuild.
Binary downloads use atomic writes to prevent truncated files. SHA-256 mismatches are retried automatically.
The supervisor recovers from stale git config lock files left after an unclean shutdown, instead of entering a panic loop.
File watch self-healing for the security agent works correctly in Docker-in-Docker environments, including after devcontainer rebuilds.
AWS DescribeImages API calls are scoped to owned AMIs, reducing hundreds of paginated API calls per sync cycle to a handful.
The runner proxy auto-scales (2-5 replicas) and uses larger task sizes for large runners.
Environment logs remain accessible after instance termination.
Updated VM images for AWS runners.
CloudFormation descriptions updated to use Ona branding.
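The binary download item above combines two techniques: verifying a SHA-256 digest before accepting a payload, and writing through a temporary file that is renamed into place. This is a minimal sketch, not the runner's implementation. `download_verified` and the `fetch` callback are hypothetical, and the retry count is an assumed default.

```python
import hashlib
import os
import tempfile
from pathlib import Path
from typing import Callable


def download_verified(dest: Path, fetch: Callable[[], bytes],
                      expected_sha256: str, attempts: int = 3) -> None:
    """Fetch a payload, verify its SHA-256, then move it into place atomically."""
    for _ in range(attempts):
        data = fetch()
        if hashlib.sha256(data).hexdigest() != expected_sha256:
            continue  # truncated or corrupted payload: fetch again
        # Write to a temp file in the same directory so os.replace is a rename
        # on one filesystem; readers never observe a half-written file.
        fd, tmp = tempfile.mkstemp(dir=dest.parent)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, dest)
        return
    raise ValueError(f"checksum mismatch after {attempts} attempts")
```

Because the destination path only ever changes through `os.replace`, an interrupted download leaves either the previous file or nothing at `dest`, never a truncated binary.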

20260415.279

April 15, 2026

Performance and operational improvements

This release improves startup performance, reliability, and operational visibility for EC2 runners. To that end, this release introduces a managed metrics pipeline that lets you export runner metrics for monitoring runner health, environment lifecycle, and resource utilization. Every payload is written to S3 for auditing. Contact your account team to enable it.

New

Terminals are now killed when the dev container is rebuilt, preventing unresponsive sessions after a rebuild.
When multiple MCP servers expose tools with the same name, tool names are automatically prefixed with the server name to prevent silent overwrites.
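The tool name prefixing above can be sketched as a merge step over the tools each server advertises. This illustration makes two assumptions that the release note does not confirm: only colliding names are prefixed, and the separator is `__`. `merge_tools` is a hypothetical helper.

```python
from collections import Counter


def merge_tools(servers: dict[str, list[str]], sep: str = "__") -> dict[str, tuple[str, str]]:
    """Build one registry mapping exposed name -> (server, original tool name).

    Names offered by more than one server get a "<server><sep>" prefix so
    that a later server cannot silently overwrite an earlier one.
    """
    counts = Counter(tool for tools in servers.values() for tool in tools)
    registry = {}
    for server, tools in servers.items():
        for tool in tools:
            name = f"{server}{sep}{tool}" if counts[tool] > 1 else tool
            registry[name] = (server, tool)
    return registry
```

For example, if hypothetical servers `github` and `jira` both expose `search`, the registry holds `github__search` and `jira__search`, while a tool offered by only one server keeps its plain name.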

Improvements

Environment startup is faster. Independent supervisor initialization steps now run concurrently, and disk pre-warming runs for all instances with startup-critical paths prioritized.
SCM context parsing uses ETag-based caching, reducing latency for repeated operations.
Environments with a configured idle timeout now auto-stop correctly when all SSH connections close.
OAuth token refresh is more resilient. The token cache is invalidated on permanent errors, and retries use exponential backoff.
The “All Changes” diff view no longer shows stale or empty results when starting environments from pull requests.
Git status parsing correctly handles renamed files, fixing broken tree rendering and diff fetching.
Devcontainer features referenced by local path no longer break the cache key computation.
Instances under memory pressure now receive stop commands promptly.
The ReadFile API no longer returns stale content due to cache collisions.
CORS headers are now set on the in-environment browser proxy, fixing silent failures for cross-origin requests.
Agent SCM tool registration errors are no longer fatal, preventing empty system prompts when tool setup fails.
The runner-side agent now shows the “MCP servers taking longer than expected” warning.
GitHub PR agent reactions fire reliably when mentioning the agent.
Core dumps are disabled at supervisor startup, preventing potential secret leakage.
Updated Node.js to v24.14.1 (security) and BuildKit to v0.28.1.
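The OAuth token refresh item above names two behaviors: cache invalidation on permanent errors and exponential backoff on retries. A minimal sketch of how they fit together follows. The exception classes, the dict-based cache, and the default delays are assumptions for illustration.

```python
import time


class PermanentAuthError(Exception):
    """Hypothetical: the refresh token is revoked or invalid; retrying cannot help."""


class TransientAuthError(Exception):
    """Hypothetical: a timeout or 5xx from the identity provider."""


def refresh_with_backoff(refresh, cache, key, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Refresh a token, retrying transient failures with doubling delays."""
    for attempt in range(attempts):
        try:
            token = refresh()
        except PermanentAuthError:
            cache.pop(key, None)  # drop the stale entry so it is never served again
            raise
        except TransientAuthError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
        else:
            cache[key] = token
            return token
```

Separating the two error classes is the design point: transient failures are worth waiting out, while a permanent failure should clear the cache and surface to the caller immediately.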

20260407.1269

April 7, 2026

Faster startup and reliability improvements

Environment startup is 1-2 seconds faster. Automation trigger API calls now run in parallel instead of sequentially, and the devcontainer reconciler caches configuration reads in steady state, saving an additional ~130ms per cycle.
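The move from sequential to parallel trigger calls described above can be sketched with a thread pool. `trigger_all` is a hypothetical helper, and the zero-argument callables stand in for the automation trigger API calls. The real runner may use a different concurrency mechanism.

```python
from concurrent.futures import ThreadPoolExecutor


def trigger_all(calls):
    """Run independent zero-argument calls concurrently; results keep input order."""
    if not calls:
        return []
    with ThreadPoolExecutor(max_workers=len(calls)) as pool:
        futures = [pool.submit(call) for call in calls]
        # result() re-raises the first exception hit by any call, in order
        return [future.result() for future in futures]
```

With this shape, total latency approaches that of the slowest call instead of the sum of all calls, which is where a saving on the order of seconds can come from when each call is a network round trip.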

What else is in this release

Improvements

Automation-triggered agent executions no longer get stuck in a waiting state when the agent attempts to ask for user input. The request is rejected immediately so the agent can proceed autonomously.
File watch self-healing now works correctly in all configurations. The discovery agent starts when watch mode is enabled, and the path denylist updates after a denylisted file is unlinked and recreated.
BPF watch-only mode emits WATCH_WRITE and WATCH_MMAP events correctly when untouchable mode is off.
Fixed a runner manager startup panic when multiple managed runners run in the same process.
Updated VM images for AWS runners.
Security dependency update: go-jose/v4 bumped to v4.1.4 (fixes GHSA-78h2-9frx-2jm8).

20260402.401

April 2, 2026

Warm pools now GA

Warm pools keep pre-initialized EC2 instances running from the latest prebuild snapshot. When you create an environment, Ona claims an instance that is already running with the snapshot loaded instead of launching a new one. Startup drops from minutes to around 10 seconds.

Enable warm pools per environment class in your project’s prebuild settings. The runner dynamically scales the pool between 0 and your configured maximum (up to 10 in the dashboard, up to 20 via the CLI) based on demand. It also handles replenishment and automatic snapshot rotation when new prebuilds complete.

Requires an Enterprise plan. Currently available on EC2 runners only. See the warm pools documentation for prerequisites and setup instructions.

Infrastructure upgrade required

This release requires a CloudFormation stack update.

This upgrade causes approximately 10-15 minutes of downtime where environments are unreachable. Schedule it during a low-usage period.

The full update takes ~30 minutes. Your data and environments are preserved. Running environments reconnect automatically after the update completes.

Before you upgrade

Note your Prometheus metrics settings. The upgrade resets them. You will re-enter them afterward. See Custom metrics pipeline.
Internet Gateway users (no NAT gateway): You must set Assign Public IP to true in the Network Configuration section during the CloudFormation parameter review step.
Templates from January 2025 or earlier: Either stop and discard existing environments before upgrading, or add port 22 to your security group first.

Upgrade steps

Go to Settings > Runners and select your runner
Open the three-dot menu and click Upgrade runner
Follow the dialog to update your CloudFormation stack
Re-enter your Prometheus metrics settings after the update completes

Full walkthrough: Upgrade runner infrastructure

What else is in this release

New

Fargate replaces EC2 instances for the runner service. No more AMI allowlisting or update bottlenecks.
MemoryDB persists Ona agent conversations in real time, with S3 as a durable backup. This is a new billable AWS resource in your account.
Runner sizing lets you choose between small and large infrastructure via a CloudFormation parameter. Select large if your organization runs many concurrent agent sessions.
Runner update windows let you control when your runner applies updates. Set a maintenance window to avoid disruptions during peak hours.

Improvements

Environment startup is faster thanks to earlier Docker socket activation and optimized content initialization.
Runner updates no longer cause brief user disconnects. The proxy now runs as a separate service.
