The Problem
In Kubernetes-based CI builds, pod YAML specifications can exceed GKE's 1.5 MiB etcd request size limit, causing pipelines to fail with "etcdserver: request is too large" errors. Today these failures surface only at runtime, when Kubernetes rejects the pod specification. There is no way to validate the pod size or prevent the failure before execution.
We appreciate the recent CI_COMMON_ENV_POD optimization (delegate 25.11.87300), which helps reduce pod size, but it provides neither visibility into pod size nor governance controls to prevent future breaches of the limit.
What We Need
A proactive validation mechanism that:
  1. Calculates pod YAML size before execution - Estimate the pod specification size before submitting to Kubernetes
  2. Provides configurable thresholds - Allow administrators to set size limits at account, project, or stage levels
  3. Supports warning and fail behaviors - Option to warn when approaching limits or fail fast before Kubernetes rejection
  4. Offers visibility - Display pod size in execution logs for troubleshooting and template governance
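To make the request concrete, the four points above could be sketched roughly as follows. This is an illustration only, not a description of how the product works or should be implemented. All names here (estimate_pod_spec_bytes, resolve_limit, check_pod_size) are hypothetical, the size is approximated by serializing the pod spec to compact JSON (the bytes actually sent to etcd may differ), and the precedence "stage overrides project overrides account" and the 80% warning ratio are assumptions.

```python
import json

# 1.5 MiB, the etcd limit cited in this request.
ETCD_LIMIT_BYTES = int(1.5 * 1024 * 1024)


def estimate_pod_spec_bytes(pod_spec: dict) -> int:
    """Approximate the pod spec size as compact, UTF-8 encoded JSON.

    Assumption: this is only an estimate of what reaches etcd.
    """
    return len(json.dumps(pod_spec, separators=(",", ":")).encode("utf-8"))


def resolve_limit(account=None, project=None, stage=None) -> int:
    """Pick the most specific configured limit (assumed precedence:
    stage, then project, then account), falling back to the etcd limit."""
    for limit in (stage, project, account):
        if limit is not None:
            return limit
    return ETCD_LIMIT_BYTES


def check_pod_size(pod_spec: dict, limit_bytes: int = ETCD_LIMIT_BYTES,
                   warn_ratio: float = 0.8):
    """Return (action, size), where action is "ok", "warn", or "fail"."""
    size = estimate_pod_spec_bytes(pod_spec)
    if size >= limit_bytes:
        action = "fail"   # fail fast, before Kubernetes rejects the pod
    elif size >= warn_ratio * limit_bytes:
        action = "warn"   # approaching the limit
    else:
        action = "ok"
    return action, size


if __name__ == "__main__":
    pod = {"apiVersion": "v1", "kind": "Pod",
           "metadata": {"name": "ci-build"},
           "spec": {"containers": [{"name": "step", "image": "alpine"}]}}
    limit = resolve_limit(account=ETCD_LIMIT_BYTES, stage=1024 * 1024)
    action, size = check_pod_size(pod, limit_bytes=limit)
    # Visibility: surface the estimate in the execution log.
    print(f"pod spec size: {size} bytes (limit {limit}), action: {action}")
```

A check like this would run before the pod is submitted, so a template change that pushes the spec over the configured threshold is reported as a warning or an early failure instead of an etcd rejection.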
Use Case
Platform teams managing CI/CD templates need to confirm that adding new stages, security scanning steps, or other template changes will not breach etcd limits before releasing to production. Today the only way to find out is trial and error through runtime failures, which hurts developer productivity and release velocity.
Business Value
  • Shifts from reactive incident response to proactive prevention
  • Enables safe evolution of CI templates without risk of platform-wide failures
  • Reduces failed builds and investigation time for development teams
  • Applies to any enterprise customer using Kubernetes-based CI infrastructure