Size your AWS runner infrastructure before deployment. The two factors that determine your configuration are the number of environments you plan to run and how many will be active at the same time.

Runner sizes

The AWS runner CloudFormation template includes a RunnerSize parameter that controls the runner control plane infrastructure. Choose the size that matches your expected workload.
                            | Small                       | Large
Total environments          | Up to 5,000                 | 5,000+
Concurrent running          | Up to 300                   | 300+
Availability zones          | 2                           | 3 or more
EC2 subnet                  | /20 per AZ (4,096 IPs)      | /16 per AZ using CGNAT range
LB subnet                   | /28 per AZ                  | /28 per AZ
Management plane connection | NAT gateway or VPC endpoint | VPC endpoint (PrivateLink)
Managed metrics             | Recommended                 | Required
Runner scaling              | Not needed                  | Recommended
If you are unsure which size to start with, choose small. You can switch to large later by updating the RunnerSize CloudFormation parameter without redeploying the stack.
The RunnerSize parameter controls the runner control plane (orchestrator, proxy, cache). It does not affect the size of environment VMs. Environment VM sizing is configured through environment classes.
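The thresholds in the table above can be expressed as a small decision helper. This is a minimal sketch: the function name `choose_runner_size` and the constants are illustrative and not part of the CloudFormation template; only the 5,000 and 300 limits and the `small`/`large` values come from this page.

```python
# Limits for the small runner, taken from the sizing table on this page.
SMALL_MAX_TOTAL = 5_000
SMALL_MAX_CONCURRENT = 300


def choose_runner_size(total_environments: int, concurrent_running: int) -> str:
    """Return the RunnerSize value ("small" or "large") for an expected workload.

    Hypothetical helper: it only encodes the two documented thresholds.
    """
    if total_environments < 0 or concurrent_running < 0:
        raise ValueError("environment counts must be non-negative")
    fits_small = (
        total_environments <= SMALL_MAX_TOTAL
        and concurrent_running <= SMALL_MAX_CONCURRENT
    )
    return "small" if fits_small else "large"


print(choose_runner_size(1_200, 80))   # small
print(choose_runner_size(3_000, 450))  # large
```

The helper looks only at environment counts. Heavy agent workloads can justify the large size below these thresholds, so treat the result as a starting point.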

Small runner

Set RunnerSize to small in the CloudFormation template. This is the default. The small configuration supports up to 5,000 total environments and 300 running at the same time. It is the right starting point for most deployments.

Infrastructure provisioned

Component           | Specification
Runner Fargate task | 1 vCPU, 3 GB memory
Proxy Fargate task  | 0.5 vCPU, 1 GB memory
ECS host instance   | c6i.large (2 vCPU, 4 GB)
Cache (MemoryDB)    | db.t4g.small

Network layout

Use 2 availability zones with one EC2 subnet and one load balancer subnet per AZ. Stopped environments are stopped EC2 instances. Stopped instances retain their private IP address, so your subnets must be large enough to hold all environments, not only the ones that are running.
Runner Name | Region    | AZs | EC2 Subnet      | LB Subnet    | Environment Capacity
us-east     | us-east-1 | 2   | /20 (4,096 IPs) | /28 (16 IPs) | ~8,187
Select your region based on recommended latency thresholds. If this works for you, proceed to setup.
Capacity formula: (Subnet IPs per AZ x Number of AZs) - ~5 management IPs
EC2 Subnet | IPs per AZ | With 2 AZs | With 3 AZs
/21        | 2,048      | ~4,091     | ~6,139
/20        | 4,096      | ~8,187     | ~12,283
/19        | 8,192      | ~16,379    | ~24,571
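The capacity formula can be checked with a few lines of Python. This is a sketch under the assumptions stated on this page (one IP per environment, roughly 5 management IPs in total); the function name `environment_capacity` is illustrative.

```python
def environment_capacity(prefix_length: int, az_count: int, management_ips: int = 5) -> int:
    """Approximate environment capacity: (subnet IPs per AZ x AZs) - management IPs."""
    if not 0 <= prefix_length <= 32:
        raise ValueError("prefix_length must be between 0 and 32")
    ips_per_az = 2 ** (32 - prefix_length)
    return ips_per_az * az_count - management_ips


# Reproduce the rows of the table above.
for prefix in (21, 20, 19):
    print(f"/{prefix}", environment_capacity(prefix, 2), environment_capacity(prefix, 3))
```

The results are approximate, like the table values, because the management overhead is given as roughly 5 IPs.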

Multi-region example

Each runner is deployed into a single AWS region and a single AWS account. To serve users in multiple regions, deploy one runner per region.
Runner Name | Region    | AZs | EC2 Subnet      | LB Subnet    | Environment Capacity
us-east     | us-east-1 | 2   | /20 (4,096 IPs) | /28 (16 IPs) | ~8,187
us-west     | us-west-2 | 2   | /21 (2,048 IPs) | /28 (16 IPs) | ~4,091
europe      | eu-west-1 | 2   | /21 (2,048 IPs) | /28 (16 IPs) | ~4,091

Connectivity

The runner must reach the Ona management plane and several AWS services. For small deployments, a NAT gateway provides the simplest path. See Networking for all connectivity options. For lower latency and to avoid NAT gateway data processing charges, you can connect the runner to the management plane over PrivateLink. This is optional for small runners but recommended if your subnets already use VPC endpoints for AWS services.

Large runner

Set RunnerSize to large in the CloudFormation template. The large configuration is designed for deployments that exceed 5,000 total environments or 300 concurrent running environments. It provisions more CPU, memory, and cache capacity for the runner control plane, and supports horizontal scaling of the runner service.
Heavy agent workloads (many concurrent AI agent sessions) increase runner control plane load independently of environment count. If you observe high CPU utilization on the runner Fargate task with fewer than 5,000 environments, switch to large.

Infrastructure provisioned

Component           | Specification
Runner Fargate task | 4 vCPU, 16 GB memory per replica (2 to 5 replicas with autoscaling)
Proxy Fargate task  | 2 vCPU, 4 GB memory
ECS host instance   | c6i.2xlarge (8 vCPU, 16 GB)
Cache (MemoryDB)    | db.t4g.medium
To enable horizontal scaling, set EnableRunnerScaling to true in the CloudFormation template. The runner service starts with 2 replicas and scales up to 5 based on CPU and memory utilization.
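If you script stack updates, a parameter change such as switching to large with scaling enabled might be prepared as follows. This is a sketch, not the documented procedure: the helper `build_stack_parameters` and the parameter name `ExampleOtherParameter` are hypothetical, and only `RunnerSize` and `EnableRunnerScaling` come from this page. The output uses the `ParameterKey` / `ParameterValue` / `UsePreviousValue` shape that the CloudFormation UpdateStack API accepts.

```python
def build_stack_parameters(existing_keys, overrides):
    """Build an UpdateStack-style parameter list.

    Parameters named in `overrides` get a new value; every other existing
    parameter keeps its previous value.
    """
    unknown = set(overrides) - set(existing_keys)
    if unknown:
        raise KeyError(f"parameters not defined on the stack: {sorted(unknown)}")
    params = []
    for key in existing_keys:
        if key in overrides:
            params.append({"ParameterKey": key, "ParameterValue": overrides[key]})
        else:
            params.append({"ParameterKey": key, "UsePreviousValue": True})
    return params


# "ExampleOtherParameter" is a placeholder for whatever else your stack defines.
existing = ["RunnerSize", "EnableRunnerScaling", "ExampleOtherParameter"]
changes = {"RunnerSize": "large", "EnableRunnerScaling": "true"}
for entry in build_stack_parameters(existing, changes):
    print(entry)
```

The resulting list could be passed as `Parameters` to boto3's `update_stack` together with `UsePreviousTemplate=True`. In practice the list of existing keys would come from the deployed stack, not from a hard-coded list.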

Network layout

Use 3 or more availability zones. At this scale, spreading environments across more AZs is important for two reasons:
  1. Instance capacity. When many environments start at the same time, EC2 RunInstances calls concentrate per AZ. More AZs reduce the chance of hitting InsufficientInstanceCapacity in any single zone.
  2. Fault tolerance. Losing one AZ still leaves two or more zones operational, which matters when hundreds of environments are running.
For EC2 subnets, use a CGNAT range (100.64.0.0/10). EC2 subnets do not need to be routable because environments connect outbound through NAT or proxy. CGNAT provides a large address space without consuming your organization’s routable IP allocation.
Runner Name | Region       | AZs | EC2 Subnet                            | LB Subnet    | Environment Capacity
production  | eu-central-1 | 3   | /16 (65,536 IPs) per AZ using CGNAT   | /28 (16 IPs) | ~196,603
A /16 per AZ is generous. Size the subnets based on your expected peak, with room for growth. The key point is that CGNAT ranges are free to use and expanding subnets after deployment is complex.
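To illustrate how the CGNAT range divides into per-AZ subnets, the following sketch uses Python's standard `ipaddress` module. The AZ names and the function `carve_ec2_subnets` are examples only; which blocks you actually assign depends on your VPC CIDR layout.

```python
import ipaddress

# Shared address space (CGNAT) referenced on this page.
CGNAT = ipaddress.ip_network("100.64.0.0/10")


def carve_ec2_subnets(az_names, prefix_length=16):
    """Assign one subnet of the given prefix length from the CGNAT range to each AZ."""
    subnets = CGNAT.subnets(new_prefix=prefix_length)
    plan = dict(zip(az_names, subnets))
    if len(plan) < len(az_names):
        raise ValueError("not enough subnets of this size in 100.64.0.0/10")
    return plan


# Example AZ names for a three-AZ runner in eu-central-1.
plan = carve_ec2_subnets(["eu-central-1a", "eu-central-1b", "eu-central-1c"])
for az, subnet in plan.items():
    print(az, subnet, subnet.num_addresses)
```

A /10 holds 64 blocks of size /16, so three AZs use only a small part of the range.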

Management plane connection

For large runners, connect to the Ona management plane over PrivateLink instead of routing through the public internet. Without PrivateLink, all runner-to-management-plane traffic traverses a NAT gateway. At high environment counts, this adds latency and incurs NAT gateway data processing charges ($0.045/GB). PrivateLink keeps this traffic within the AWS network.
  • If you use app.gitpod.io: Create a VPC endpoint to the Ona management plane service. See Networking: VPC endpoints for setup instructions. When private DNS is enabled on the endpoint, app.gitpod.io resolves to private IPs inside your VPC and the runner connects directly over PrivateLink.
  • If you use a custom domain: The custom domain page describes how to set up a VPC endpoint and load balancer for access to the management plane through your domain. The custom-domain path already goes through your Network Load Balancer. To also keep the runner’s API traffic private, use split-horizon DNS so the runner VPC resolves your custom domain to the private load-balancer endpoint instead of the public DNS record or external proxy path. See Route runner API traffic privately for the supported DNS patterns.
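One way to confirm that a hostname resolves to private addresses from inside the runner VPC is a short resolution check. This is a sketch using only the Python standard library; the function names are illustrative and it is not an official Ona diagnostic.

```python
import ipaddress
import socket


def resolve_addresses(hostname, port=443):
    """Return the sorted set of IP addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def all_private(addresses):
    """True if the list is non-empty and every address is in a private range.

    ipaddress.is_private also covers loopback and link-local ranges, and it is
    False for the CGNAT range 100.64.0.0/10.
    """
    return bool(addresses) and all(
        ipaddress.ip_address(addr).is_private for addr in addresses
    )


# Literal addresses keep this example independent of network access.
print(all_private(["10.0.12.34", "10.0.45.67"]))  # True
print(all_private(["10.0.12.34", "8.8.8.8"]))     # False
```

Run from a host in the runner VPC, `all_private(resolve_addresses("app.gitpod.io"))` would be expected to return True once private DNS is enabled on the endpoint, assuming the endpoint subnets use RFC 1918 addresses. The same check applies to a custom domain behind split-horizon DNS.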

Managed metrics

Enable Ona managed metrics on large runners. Managed metrics give the Ona team visibility into runner health, which enables proactive detection of resource exhaustion, elevated error rates, and degraded performance. Without metrics, Ona cannot identify issues until you report them.

Reserved capacity

At this scale, consider EC2 Capacity Reservations or Savings Plans for your most-used environment instance types. Reservations guarantee instance availability in your AZs and reduce costs compared to on-demand pricing.

Planning steps

1. Select regions

Choose AWS regions with optimal latency for your users. Plan subnet sizes for each region before deploying.

2. Estimate environments per region

For each region, estimate the maximum number of environments including:
  • Current users and expected growth
  • Peak concurrent usage patterns
  • Agent and automation workloads (each agent session runs in its own environment)
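The factors above can be combined into a rough per-region estimate. The formula and every input value below are assumptions for illustration, not an Ona sizing rule; adjust them to your own usage data.

```python
import math


def estimate_total_environments(users, envs_per_user, growth_factor, agent_environments):
    """Rough upper estimate of total environments (running and stopped) in one region.

    Hypothetical model: human-owned environments scaled by expected growth,
    plus one environment per expected agent or automation session.
    """
    if min(users, envs_per_user, growth_factor, agent_environments) < 0:
        raise ValueError("inputs must be non-negative")
    human_environments = math.ceil(users * envs_per_user * growth_factor)
    return human_environments + agent_environments


# Example inputs (all hypothetical): 400 users, 3 environments each,
# 50% growth, 600 agent environments.
print(estimate_total_environments(400, 3, 1.5, 600))  # 2400
```

Use the result as the total environment count when sizing EC2 subnets, and track peak concurrent usage separately when choosing the runner size.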

3. Choose availability zones

Deployment size                  | Recommended AZs
Small (up to 5,000 environments) | 2
Large (5,000+ environments)      | 3 or more
One EC2 subnet and one load balancer subnet are required per AZ.

4. Plan subnet sizes

EC2 subnets

Each environment uses one IP address. Stopped environments are stopped EC2 instances and retain their IP address, so plan for the total number of environments (running and stopped), not only the concurrent running count.
Consideration       | Details
IP per environment  | 1 (retained while stopped)
Management overhead | ~5 IPs
Minimum size        | /28 (10 environments)
Capacity formula    | (Subnet IPs per AZ x Number of AZs) - ~5 management IPs
EC2 subnets can use non-routable CIDR ranges. For large deployments, use CGNAT (100.64.0.0/10) to avoid IP exhaustion. Plan generously because expanding subnets after deployment is complex. For public subnets, enable auto-assign public IP.
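Working backwards from a target environment count, you can find the smallest subnet that satisfies the capacity formula. This is a sketch: the function name is illustrative, the ~5 management IPs figure comes from this page, and the /16 to /28 bounds reflect the subnet sizes AWS allows in a VPC.

```python
def smallest_ec2_prefix(target_environments, az_count, management_ips=5,
                        largest_subnet=16, smallest_subnet=28):
    """Return the longest prefix length (smallest subnet) per AZ that fits the target.

    Applies (subnet IPs per AZ x AZs) - management IPs >= target_environments.
    """
    if az_count < 1:
        raise ValueError("az_count must be at least 1")
    # Try the smallest subnets first and grow until the target fits.
    for prefix in range(smallest_subnet, largest_subnet - 1, -1):
        capacity = 2 ** (32 - prefix) * az_count - management_ips
        if capacity >= target_environments:
            return prefix
    raise ValueError("target does not fit; add AZs or split across runners")


print(smallest_ec2_prefix(5_000, 2))   # 20
print(smallest_ec2_prefix(20_000, 3))  # 19
```

The result is a lower bound. Because expanding subnets after deployment is complex, consider choosing a prefix one or two sizes larger than the minimum.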

Load balancer subnets

For the Network Load Balancer:
  • Must be routable from your internal network
  • /28 (16 IPs) is sufficient for all deployment sizes
  • One subnet per AZ
  • Does not affect environment capacity
