Troubleshooting AWS runners

CloudFormation stack fails
Runner task fails
Instance type not available
Unexpected costs
SSM access blocked
Prebuilds fail due to policy restrictions
Fix for SCP restrictions
Fix for IAM policy restrictions
Network connectivity issues
Restart runner after network changes
Getting help

Network misconfigurations are the most common cause of issues. See access requirements first.

CloudFormation stack fails

Symptoms: The stack ends in ROLLBACK_COMPLETE or ROLLBACK_IN_PROGRESS with errors like Parameter validation failed: parameter value for EC2RunnerInstancesSubnet does not exist.

Fix: Make sure you select a VPC, at least one availability zone, and subnets across multiple AZs.
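To see why a stack rolled back, it can help to read the failure reasons from the stack events and to confirm which subnets actually exist in the chosen VPC. The following is a minimal sketch using the AWS CLI; the function names are hypothetical helpers, and the stack name and VPC ID you pass in are your own values.

```shell
# Hypothetical helper: print the logical ID and reason for every failed
# resource event in a CloudFormation stack.
# $1 = stack name (or stack ID if the stack was already deleted)
stack_failure_reasons() {
  aws cloudformation describe-stack-events \
    --stack-name "$1" \
    --query "StackEvents[?ends_with(ResourceStatus, '_FAILED')].[LogicalResourceId, ResourceStatusReason]" \
    --output text
}

# Hypothetical helper: list subnet IDs and their availability zones
# for a VPC, to check the subnets you selected exist and span several AZs.
# $1 = VPC ID
vpc_subnets() {
  aws ec2 describe-subnets \
    --filters "Name=vpc-id,Values=$1" \
    --query "Subnets[].[SubnetId, AvailabilityZone]" \
    --output text
}
```

Example usage: stack_failure_reasons YOUR_STACK_NAME, then vpc_subnets YOUR_VPC_ID.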

Runner task fails

Symptoms:

CREATE_FAILED with ECS Deployment Circuit Breaker was triggered
ResourceInitializationError in task logs
Cannot pull images or access AWS services

Fix:

Verify VPC has Internet Gateway or NAT Gateway
Update route tables (public → IGW, private → NAT)
For private subnets, add VPC endpoints for Secrets Manager, S3, ECR
Check security groups allow outbound HTTPS
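The routing and endpoint checks above can be sketched with the AWS CLI as follows. This is an illustration only: the function names are hypothetical, and the VPC ID is a value you supply.

```shell
# Hypothetical helper: for each route table in a VPC, print the target of
# the default route (0.0.0.0/0). Public subnets should show an igw-... ID
# in the second column, private subnets a nat-... ID in the third.
# $1 = VPC ID
vpc_default_routes() {
  aws ec2 describe-route-tables \
    --filters "Name=vpc-id,Values=$1" \
    --query "RouteTables[].[RouteTableId, Routes[?DestinationCidrBlock=='0.0.0.0/0'] | [0].GatewayId, Routes[?DestinationCidrBlock=='0.0.0.0/0'] | [0].NatGatewayId]" \
    --output text
}

# Hypothetical helper: list the VPC endpoints in a VPC with their state,
# to confirm endpoints for Secrets Manager, S3, and ECR are present.
# $1 = VPC ID
vpc_endpoint_services() {
  aws ec2 describe-vpc-endpoints \
    --filters "Name=vpc-id,Values=$1" \
    --query "VpcEndpoints[].[ServiceName, State]" \
    --output text
}
```

Example usage: vpc_default_routes YOUR_VPC_ID, then vpc_endpoint_services YOUR_VPC_ID.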

Instance type not available

Symptoms: An error like “m6i.xlarge is not available in us-east-1e”.

Fix:

Use multiple AZs (avoid relying only on us-east-1d and us-east-1e)
Try a different region or instance type
Update the stack parameters or create a new environment class
Retry later (availability is transient)
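Before changing AZs, you may want to check where an instance type is offered. The sketch below uses the AWS CLI's describe-instance-type-offerings command; the function name is a hypothetical helper. Note that it reports where the type is offered at all, not whether capacity is available right now.

```shell
# Hypothetical helper: list the availability zones in a region that
# offer a given instance type.
# $1 = instance type (for example m6i.xlarge)
# $2 = region (for example us-east-1)
instance_type_zones() {
  aws ec2 describe-instance-type-offerings \
    --location-type availability-zone \
    --filters "Name=instance-type,Values=$1" \
    --region "$2" \
    --query "InstanceTypeOfferings[].Location" \
    --output text
}
```

Example usage: instance_type_zones m6i.xlarge us-east-1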

Unexpected costs

Symptoms: Unexpected AWS charges, or continued billing after deleting a runner.

Fix:

See managing costs to identify resources
After deleting a runner, verify the CloudFormation stack is fully deleted
Check for residual EC2 instances or EBS volumes and delete manually
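The checks above could be run from the AWS CLI along these lines. This is a rough sketch with hypothetical function names. It lists all running instances and unattached volumes in a region, not only those created by the runner, so review the output before deleting anything.

```shell
# Hypothetical helper: print the status of a CloudFormation stack.
# If the stack is fully deleted, the CLI reports that it does not exist.
# $1 = stack name
stack_status() {
  aws cloudformation describe-stacks \
    --stack-name "$1" \
    --query "Stacks[0].StackStatus" \
    --output text
}

# Hypothetical helper: list running EC2 instances in a region.
# $1 = region
running_instances() {
  aws ec2 describe-instances \
    --region "$1" \
    --filters "Name=instance-state-name,Values=running" \
    --query "Reservations[].Instances[].[InstanceId, InstanceType, LaunchTime]" \
    --output text
}

# Hypothetical helper: list EBS volumes that are not attached to any instance.
# $1 = region
unattached_volumes() {
  aws ec2 describe-volumes \
    --region "$1" \
    --filters "Name=status,Values=available" \
    --query "Volumes[].[VolumeId, Size, CreateTime]" \
    --output text
}
```

Example usage: stack_status YOUR_STACK_NAME, then running_instances us-east-1 and unattached_volumes us-east-1.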

SSM access blocked

Symptoms:

Environments fail with AWS account policy blocks ssm:SendCommand
Runner marked as degraded
Slow startup (cache credentials can’t refresh)

Cause: Service Control Policies (SCPs) are blocking SSM access. The runner needs the ssm:SendCommand and ssm:GetCommandInvocation permissions.

Fix: Ask your AWS administrator to add an exception for the runner’s IAM role:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["ssm:SendCommand", "ssm:GetCommandInvocation"],
    "Resource": ["arn:aws:ec2:*:*:instance/*", "arn:aws:ssm:*:*:command/*"]
  }]
}

Prebuilds fail due to policy restrictions

Symptoms:

Prebuilds fail with AWS Service Control Policy blocks ec2:CreateSnapshot
Prebuilds fail with AWS IAM policy does not allow ec2:CreateSnapshot
Similar errors for ec2:RegisterImage, ec2:DescribeSnapshots, or ec2:DescribeImages

Cause: Prebuilds require creating EBS snapshots and AMIs. These operations can be blocked by:

Service Control Policies (SCPs) - Organization-level policies that deny EC2 snapshot/AMI actions
IAM policies - The runner’s IAM role is missing required permissions (outdated CloudFormation stack)

Fix for SCP restrictions

Ask your AWS administrator to allow these actions for the runner’s IAM role in the SCP:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "ec2:CreateSnapshot",
      "ec2:RegisterImage",
      "ec2:DescribeSnapshots",
      "ec2:DescribeImages",
      "ec2:DeleteSnapshot",
      "ec2:DeregisterImage"
    ],
    "Resource": "*"
  }]
}

Fix for IAM policy restrictions

Update your CloudFormation stack to the latest version. The latest stack template includes all required IAM permissions for prebuilds.

Network connectivity issues

Checklist:

Security groups: port 29222 (SSH), outbound HTTPS, port 22999 (internal)
Route tables: public subnets → IGW, private subnets → NAT
Network ACLs: not blocking required traffic
DNS: VPC DNS resolution enabled, can resolve app.gitpod.io

Test connectivity:

# Health endpoint (should return 200)
curl -v https://<your-domain>/_health

# Required endpoints
curl -I https://app.gitpod.io
curl -I https://public.ecr.aws
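For the DNS item in the checklist and for scripted endpoint checks, something like the following may be useful. It is a sketch with hypothetical function names; the VPC ID and URLs are values you supply.

```shell
# Hypothetical helper: print whether DNS resolution and DNS hostnames
# are enabled for a VPC (each line prints True or False).
# $1 = VPC ID
vpc_dns_settings() {
  aws ec2 describe-vpc-attribute --vpc-id "$1" \
    --attribute enableDnsSupport \
    --query "EnableDnsSupport.Value" --output text
  aws ec2 describe-vpc-attribute --vpc-id "$1" \
    --attribute enableDnsHostnames \
    --query "EnableDnsHostnames.Value" --output text
}

# Hypothetical helper: print only the HTTP status code for a URL,
# giving up after 10 seconds.
# $1 = URL
http_status() {
  curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}
```

Example usage: vpc_dns_settings YOUR_VPC_ID, then http_status https://app.gitpod.io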

Restart runner after network changes

After changing security groups, route tables, or VPC endpoints, restart the runner.

Console: ECS console → Clusters → your cluster → Services → Update → check Force new deployment

CLI:

aws ecs update-service --cluster YOUR_CLUSTER_NAME --service YOUR_SERVICE_NAME --force-new-deployment

Verify: Check runner shows “Connected” in Settings → Runners, then test creating an environment.
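If you restart the runner from a script, you may want to block until the new deployment settles before verifying. A minimal sketch, assuming the same cluster and service names as in the command above; the function name is a hypothetical helper.

```shell
# Hypothetical helper: force a new ECS deployment, then wait until the
# service reaches a steady state (the waiter exits non-zero if it gives up).
# $1 = cluster name
# $2 = service name
restart_and_wait() {
  aws ecs update-service --cluster "$1" --service "$2" \
    --force-new-deployment >/dev/null &&
  aws ecs wait services-stable --cluster "$1" --services "$2"
}
```

Example usage: restart_and_wait YOUR_CLUSTER_NAME YOUR_SERVICE_NAME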

Getting help

Use the support chat (bubble icon in bottom-right). Include:

Runner ID and version (from Settings → Runners → ... menu)
CloudFormation stack name and region
Runner logs from CloudWatch (ECS task logs)
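One way to collect the logs and recent service events for a support request is sketched below. The function names are hypothetical, the log group name is an assumption you need to look up in CloudWatch for your own stack, and aws logs tail requires AWS CLI v2.

```shell
# Hypothetical helper: print recent log lines from a CloudWatch log group.
# $1 = log group name (look this up in CloudWatch; not defined in this guide)
# $2 = how far back to read, default 1h
runner_logs() {
  aws logs tail "$1" --since "${2:-1h}"
}

# Hypothetical helper: print the ten most recent ECS service events,
# which often include the reason a task failed to start.
# $1 = cluster name
# $2 = service name
service_events() {
  aws ecs describe-services --cluster "$1" --services "$2" \
    --query "services[0].events[:10].[createdAt, message]" \
    --output text
}
```

Example usage: runner_logs YOUR_LOG_GROUP 2h > runner.log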
