Note: GCP costs mentioned in this document are subject to change and may vary by region. Please check the latest pricing on the GCP pricing pages.
What is a GCP runner?
A GCP runner is an orchestrator for development environments deployed within a Google Cloud Platform project using always-on Compute Engine virtual machines. The key characteristics of a runner are:
- Shared resource serving multiple users (not personal)
- Designed for “always-on” operation with optimized cost efficiency
- Suitable for organizations requiring GCP-specific compliance or data residency
- Support for multiple runners to:
  - Enhance availability
  - Reduce latency for end users
  - Ensure data residency and compliance
Billable GCP Resources
The following are the billable resources deployed by the GCP runner:
- Compute Engine Instances: Managed instance groups for runner and proxy services
- Memorystore Redis: Used to store state related to environment reconciliation
- Cloud Storage Buckets: Used to store build cache, runner assets, and CA certificates
- Load Balancer Components: Health checks, backend services, and forwarding rules
- Secret Manager: Used to store Redis credentials and metrics configuration
- Cloud Logging: Used to store logs related to runner instances
- VPC Networking: Private service connections and IP address reservations
- Cloud KMS (optional): Customer-managed encryption keys for enhanced security
Baseline Costs of a Runner
The primary costs associated with a GCP runner include:
Core Infrastructure (Always-On)
- Compute Engine Instances:
  - Runner instance: 1x c4-standard-4 (4 vCPU, 16GB RAM) = ~$120/month
  - Proxy instances: 2x c4-standard-2 (2 vCPU, 8GB RAM each) = ~$120/month total
  - Subtotal: ~$240/month
- Memorystore Redis: Standard HA instance with 2GB memory = ~$70/month
- Load Balancer: Global or regional load balancer = ~$18-25/month
Storage and Networking
- Cloud Storage: Build cache and assets storage (~$2-10/month depending on usage)
- VPC Networking: Private service connections and IP reservations (~$3-8/month)
- Secret Manager: Minimal cost for storing credentials (~$0.50-2/month)
- Cloud Logging: Log storage and ingestion (~$2-15/month depending on log volume)
Optional Components
- Cloud KMS: Customer-managed encryption keys (~$1-3/month if enabled)
- Certificate Manager: SSL certificate management (free for Google-managed certificates)
Note: These are baseline estimates. Actual costs may vary based on region, usage patterns, and configuration choices.
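As a rough sanity check, the baseline estimates above can be added up. The sketch below is only arithmetic over the approximate figures listed in this section; the dictionary keys and the helper name monthly_range are illustrative and are not part of any GCP or runner API.

```python
# Approximate monthly estimates (USD) copied from the baseline list above.
# Each value is a (low, high) range; single-figure items repeat the same number.
BASELINE = {
    "compute_engine": (240.0, 240.0),   # 1x c4-standard-4 + 2x c4-standard-2
    "memorystore_redis": (70.0, 70.0),  # Standard HA, 2GB
    "load_balancer": (18.0, 25.0),
    "cloud_storage": (2.0, 10.0),
    "vpc_networking": (3.0, 8.0),
    "secret_manager": (0.5, 2.0),
    "cloud_logging": (2.0, 15.0),
}
OPTIONAL = {
    "cloud_kms": (1.0, 3.0),  # only if customer-managed keys are enabled
}


def monthly_range(*groups):
    """Return the (low, high) monthly total across one or more cost groups."""
    low = sum(lo for group in groups for lo, _ in group.values())
    high = sum(hi for group in groups for _, hi in group.values())
    return low, high


base_low, base_high = monthly_range(BASELINE)
kms_low, kms_high = monthly_range(BASELINE, OPTIONAL)
print(f"Baseline: ~${base_low:.2f} to ${base_high:.2f} per month")
print(f"With Cloud KMS: ~${kms_low:.2f} to ${kms_high:.2f} per month")
```

With the figures listed above, the always-on runner infrastructure comes to roughly $335.50 to $370 per month, or about $336.50 to $373 with Cloud KMS enabled. Replace the numbers with current pricing for your region before relying on the result.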
Controls for Managing Costs
Viewing Runner Costs
To view isolated runner costs in Google Cloud Console:
- Navigate to Cloud Billing → Cost breakdown.
- Group by Service to analyze the breakdown of costs.
- Filter by labels to isolate runner-specific resources:
  - Runner Name: Use the gitpod-runner label with your runner name value.
  - Component: Filter by the gitpod-component label to see specific components (redis, build-cache, etc.).
  - Managed by Terraform: Use the managed-by label with value terraform.
Viewing Environment Costs
To view isolated environment costs in Google Cloud Console:
- Navigate to Cloud Billing → Cost breakdown.
- Group by Service to analyze the breakdown of costs.
- Filter by the environment ID using labels:
  - Label: gitpod-environment-id
  - Value: The environment ID (you can find this by selecting Copy ID from the environment details)
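If you export billing data for offline analysis, the same label filter can be applied in code. The following is a minimal sketch, assuming each exported row has already been loaded as a dictionary with a cost value and a labels mapping. The row shape, the sample environment IDs, and the function name cost_by_label are hypothetical and do not reflect the actual Cloud Billing export schema.

```python
from collections import defaultdict


def cost_by_label(rows, label_key):
    """Sum cost per value of label_key; rows without that label are skipped."""
    totals = defaultdict(float)
    for row in rows:
        value = row.get("labels", {}).get(label_key)
        if value is not None:
            totals[value] += row["cost"]
    return dict(totals)


# Hypothetical sample rows, for illustration only.
rows = [
    {"service": "Compute Engine", "cost": 1.25, "labels": {"gitpod-environment-id": "env-a"}},
    {"service": "Compute Engine", "cost": 0.75, "labels": {"gitpod-environment-id": "env-b"}},
    {"service": "Cloud Storage", "cost": 0.5, "labels": {"gitpod-environment-id": "env-a"}},
    {"service": "Cloud Logging", "cost": 2.0, "labels": {}},  # unlabeled, ignored
]

per_env = cost_by_label(rows, "gitpod-environment-id")
print(per_env)
```

The same function works for the gitpod-runner or gitpod-component labels by changing the label key.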
Using Cost Management Tools
Budget Alerts
Set up budget alerts to monitor runner costs:
- Navigate to Cloud Billing → Budgets & alerts
- Create a new budget with filters for your runner resources
- Set threshold alerts at 50%, 80%, and 100% of your expected monthly cost
- Configure notifications to email or Slack channels
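To make the thresholds concrete, the sketch below converts the 50%, 80%, and 100% levels into dollar amounts and reports which ones a given spend has crossed. The $350 expected monthly cost and the function names are assumptions chosen for illustration; budgets themselves are configured in Cloud Billing, not by this code.

```python
def threshold_amounts(expected_monthly_cost, percents=(50, 80, 100)):
    """Map each threshold percentage to the spend at which it triggers."""
    return {p: expected_monthly_cost * p / 100 for p in percents}


def crossed_thresholds(spend_to_date, expected_monthly_cost, percents=(50, 80, 100)):
    """Return the threshold percentages that spend_to_date has reached."""
    amounts = threshold_amounts(expected_monthly_cost, percents)
    return [p for p in percents if spend_to_date >= amounts[p]]


# Assumed expected monthly cost of $350, for illustration only.
alerts = threshold_amounts(350)
print(alerts)
print(crossed_thresholds(290, 350))
```

Under that assumption, alerts would trigger at $175, $280, and $350, and a spend of $290 would already have crossed the 50% and 80% thresholds.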
Cost Anomaly Detection
Enable cost anomaly detection to catch unexpected cost spikes:
- Navigate to Cloud Billing → Cost insights
- Enable anomaly detection for your project
- Configure alerts for cost anomalies above your threshold
Compute Engine Instance Labeling
All Compute Engine instances launched by Ona GCP runners automatically inherit labels from the Terraform configuration. This means:
- Any labels you add to the Terraform variables will be propagated to the Compute Engine instances created for environments
- This allows for consistent resource labeling across your GCP infrastructure
- Labels can be used for cost allocation, resource grouping, and access control
Adding Labels to Compute Engine Instances
To add labels to Compute Engine instances launched by Ona:
- Update your terraform.tfvars file with additional labels.
- Apply the Terraform changes.
Note: When you update Terraform labels, the changes don’t immediately apply to existing instances. To force existing instances to pick up the new labels, you can trigger a rolling update of the managed instance groups. This will recreate instances with the updated labels without affecting running development environments.
Cost Optimization Strategies
Right-sizing Instances
Monitor instance utilization and adjust machine types:
- Review CPU and memory utilization in Cloud Monitoring
- Consider smaller instance types for low-utilization runners
- Use custom machine types for optimal resource allocation
Storage Optimization
Optimize storage costs for build cache and assets:
- Configure lifecycle policies on Cloud Storage buckets to automatically delete old cache data
- Monitor storage usage and adjust retention policies as needed
- Use regional storage instead of multi-regional for cost savings
Redis Optimization
Optimize Memorystore Redis costs:
- Monitor memory utilization and adjust instance size accordingly
- Consider Basic tier instead of Standard HA for non-production environments (saves ~$35/month)
- Use smaller memory sizes if your workload permits:
  - 1GB instead of 2GB = ~$15-20/month savings
  - Monitor actual Redis memory usage in Cloud Monitoring
Load Balancer Optimization
Optimize load balancer costs:
- Use internal load balancers when external access isn’t required
- Monitor data transfer costs and optimize routing
- Consider regional load balancers for single-region deployments
Resource Cleanup
Automatic Cleanup
The GCP runner includes automatic cleanup mechanisms:
- Build cache lifecycle: Automatically deletes cache data older than 30 days
- Log retention: Cloud Logging automatically manages log retention based on your settings
- Incomplete uploads: Automatically aborts incomplete multipart uploads after 1 day
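For buckets you manage yourself, rules of this kind are usually expressed in the Cloud Storage lifecycle configuration JSON format. The sketch below builds such a configuration using the 30-day and 1-day values listed above. It is an illustration only: the runner's actual bucket configuration is not shown in this document, and the function name build_lifecycle_config is hypothetical.

```python
import json


def build_lifecycle_config(cache_max_age_days=30, incomplete_upload_age_days=1):
    """Build a lifecycle configuration with a delete rule and an abort rule."""
    return {
        "rule": [
            {
                # Delete objects once they are older than the cache retention window.
                "action": {"type": "Delete"},
                "condition": {"age": cache_max_age_days},
            },
            {
                # Abort multipart uploads that were never completed.
                "action": {"type": "AbortIncompleteMultipartUpload"},
                "condition": {"age": incomplete_upload_age_days},
            },
        ]
    }


config = build_lifecycle_config()
print(json.dumps(config, indent=2))
```

The printed JSON could be saved to a file and applied to a bucket with the Cloud Storage tooling. Check the field names against the current Cloud Storage lifecycle documentation before applying it.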
Manual Cleanup
For cost optimization, consider manual cleanup of:
- Unused persistent disks from terminated environments
- Old machine images if using custom images
- Unused static IP addresses if any were manually created
Terraform Destroy
To completely remove runner infrastructure and stop all costs, run terraform destroy.
Warning: This will permanently delete all runner infrastructure and any associated data. Ensure you have backups of any important work before proceeding.
Cost Monitoring Best Practices
Regular Cost Reviews
- Weekly cost reviews: Monitor costs weekly to catch anomalies early
- Monthly budget analysis: Compare actual vs. budgeted costs monthly
- Quarterly optimization: Review and optimize resource allocation quarterly
Cost Attribution
- Use consistent labeling: Apply consistent labels across all resources
- Environment-specific tracking: Track costs per development environment
- Team-based allocation: Allocate costs to specific teams or projects
Alerting and Notifications
- Set up budget alerts: Configure alerts at multiple thresholds
- Monitor cost anomalies: Enable automatic anomaly detection
- Regular reporting: Set up automated cost reports for stakeholders
Troubleshooting High Costs
Common Cost Issues
- Runaway environments: Environments that don’t shut down properly
- Large build caches: Excessive storage usage in build cache buckets
- High data transfer: Unexpected network egress charges
- Oversized instances: Using larger instance types than necessary
Investigation Steps
- Check Cloud Billing reports for cost breakdown by service
- Review resource utilization in Cloud Monitoring
- Audit running instances and their utilization
- Check storage usage in Cloud Storage buckets
- Review network traffic patterns and data transfer costs
Cost Reduction Actions
- Terminate unused environments manually if auto-shutdown fails
- Clean up old build cache data beyond the automatic lifecycle
- Optimize instance types based on actual utilization
- Review and adjust Redis instance sizing
- Optimize load balancer configuration for your traffic patterns