Kubernetes Stabilizer
Automatically identifies and fixes unhealthy pods in a Kubernetes cluster to ensure all workloads are running as expected. This loop continuously monitors pod health, diagnoses common issues, and applies corrective actions within safe guardrails.
Goal
Fix unhealthy pods
How to Run
This loop runs entirely through AI-powered prompts and requires no code changes. Start the loop in your preferred coding assistant, and the agent will work through unhealthy pods until all pods are healthy or the iteration limit is reached.
- 01
Open Your Coding Environment
Launch Cursor, Claude Code, Codex, OpenCode, or Gemini CLI and open the project directory containing your Kubernetes manifests.
- 02
Initiate the Loop
Send the kickoff prompt to start the stabilization process.
- 03
Monitor Progress
The agent will iteratively check pod status, diagnose issues, and apply fixes while respecting guardrails.
Workflow Steps
- 01
Run kubectl get pods to list all pods in the current namespace
Parse output to identify pods not in 'Running' or 'Completed' state
- 02
For each unhealthy pod, run kubectl describe pod <pod-name> to analyze events and status
Identify root cause (e.g., image pull errors, resource constraints, config issues)
- 03
Apply targeted fixes based on diagnosis (restart, update image, adjust resources, etc.)
Apply the fix with kubectl apply or kubectl delete and confirm the command succeeded
- 04
Wait 10 seconds for pod state changes
Verify pod transitions toward healthy state
- 05
Re-run check command to assess overall cluster health
Count unhealthy pods: if zero, exit the loop; otherwise continue iterating until the maximum number of iterations is reached
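The check and diagnosis steps above can be sketched in Python as a rough illustration. This assumes the default column layout of `kubectl get pods` (NAME, READY, STATUS, ...) for a single namespace; the function names, the sample pod names, and the hint table are hypothetical and not part of the loop itself. The hints are only a first guess from the STATUS column, and the events shown by `kubectl describe pod` remain the real source for a diagnosis.

```python
from typing import Dict, List, Tuple

# Statuses the workflow treats as healthy (step 01).
HEALTHY_STATUSES = {"Running", "Completed"}

# Illustrative mapping from STATUS values to the cause categories named in
# step 02. This table is an assumption for the sketch, not an exhaustive list.
CAUSE_HINTS: Dict[str, str] = {
    "ImagePullBackOff": "image pull error",
    "ErrImagePull": "image pull error",
    "CrashLoopBackOff": "container keeps crashing",
    "CreateContainerConfigError": "config issue",
    "OOMKilled": "resource constraints (memory)",
    "Pending": "possibly resource constraints or scheduling",
}


def find_unhealthy_pods(kubectl_output: str) -> List[Tuple[str, str]]:
    """Return (name, status) for pods not in a healthy status."""
    unhealthy = []
    for line in kubectl_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if len(fields) < 3 or fields[0] == "NAME":
            continue
        # Only the first three columns are read; RESTARTS may contain spaces.
        name, status = fields[0], fields[2]
        if status not in HEALTHY_STATUSES:
            unhealthy.append((name, status))
    return unhealthy


def cause_hint(status: str) -> str:
    """Return a coarse hint for a STATUS value."""
    return CAUSE_HINTS.get(status, "unknown; inspect kubectl describe pod output")


# Made-up sample output for demonstration only.
SAMPLE_OUTPUT = """\
NAME        READY   STATUS             RESTARTS      AGE
web-1       1/1     Running            0             3d
api-1       0/1     CrashLoopBackOff   12 (2m ago)   1h
migrate-1   0/1     Completed          0             5h
worker-1    0/1     ImagePullBackOff   0             10m
"""

if __name__ == "__main__":
    for name, status in find_unhealthy_pods(SAMPLE_OUTPUT):
        print(f"{name}: {status} ({cause_hint(status)})")
```

Note that this mirrors the step's rule literally: a pod whose STATUS is Running but whose READY column shows 0/1 would pass this check, so a stricter variant might also compare the READY counts.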
Kickoff Prompt
Start the "Kubernetes Stabilizer" loop.
Goal: Fix unhealthy pods
Max iterations: 10
Between iterations run: kubectl get pods
Exit when: All pods healthy
Begin the Kubernetes Stabilizer loop: Check pod health using 'kubectl get pods', identify any unhealthy pods, diagnose their issues, apply appropriate fixes within guardrails, wait for changes to propagate, and re-check until all pods are healthy or max iterations (10) are reached. Prioritize safety by avoiding system namespaces and requiring confirmation for destructive actions.
Self-pace this loop. After each iteration, run `kubectl get pods` and evaluate the output, and only continue if the exit condition is not met (All pods healthy). Stop when the exit condition passes or 10 iterations are reached. Give a short status update each pass.
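The loop control described in the prompt (at most 10 iterations, a re-check after each one, exit when all pods are healthy) could look roughly like the following Python sketch. The agent itself is driven by the prompt, not by this code; `run_loop`, `count_unhealthy`, and `fix_once` are hypothetical names, and the two callables stand in for the real check and fix steps so the control flow can be read and tested without a cluster.

```python
import time
from typing import Callable, Tuple


def run_loop(
    count_unhealthy: Callable[[], int],
    fix_once: Callable[[], None],
    max_iterations: int = 10,
    wait_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, int]:
    """Iterate check -> fix -> wait until healthy or the cap is hit.

    Returns (all_healthy, iterations_used).
    """
    iterations = 0
    while True:
        remaining = count_unhealthy()  # e.g. parsed from `kubectl get pods`
        print(f"after {iterations} iteration(s): {remaining} unhealthy pod(s)")
        if remaining == 0:
            return True, iterations  # exit condition: all pods healthy
        if iterations >= max_iterations:
            return False, iterations  # cap reached, stop without success
        fix_once()  # diagnose and apply one round of fixes
        iterations += 1
        sleep(wait_seconds)  # let pod state changes propagate


if __name__ == "__main__":
    # Simulated cluster: two unhealthy pods, then one, then none.
    counts = iter([2, 1, 0])
    healthy, used = run_loop(
        lambda: next(counts), lambda: None, sleep=lambda s: None
    )
    print(healthy, used)
```

Passing `sleep` as a parameter is a design choice for the sketch: it lets the 10-second wait be skipped in a dry run while keeping the default behavior intact.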
Guardrails (hardcoded)
- Verify Kubernetes cluster access before executing commands
- Only operate on pods in the currently configured namespace
- Avoid modifying pods in system or kube-system namespaces
- Limit restart attempts to prevent cascading failures
- Require user confirmation before deleting persistent volumes
- Pause execution if detecting more than 50% pod failures
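As a rough illustration, several of these guardrails can be expressed as a single pre-flight check that runs before any fix is applied. This is a sketch under stated assumptions: the function name, the protected-namespace set (including the `kube-` prefix rule), and the restart limit of 3 are hypothetical, since the list above gives no concrete values. The cluster-access check and the persistent-volume confirmation are left out because they need a live cluster and user interaction.

```python
from typing import Optional

# Assumed set of system namespaces; the guardrail only names "system or
# kube-system", so the extra entries are an assumption for this sketch.
PROTECTED_NAMESPACES = {"kube-system", "kube-public", "kube-node-lease"}


def guardrail_violation(
    pod_namespace: str,
    current_namespace: str,
    unhealthy_count: int,
    total_count: int,
    restarts_so_far: int,
    max_restarts: int = 3,  # hypothetical limit, not stated in the guardrails
) -> Optional[str]:
    """Return the reason a fix must not proceed, or None if it may."""
    if pod_namespace != current_namespace:
        return "pod is outside the currently configured namespace"
    if pod_namespace in PROTECTED_NAMESPACES or pod_namespace.startswith("kube-"):
        return "pod is in a system namespace"
    if total_count > 0 and unhealthy_count / total_count > 0.5:
        return "more than 50% of pods are failing; pause"
    if restarts_so_far >= max_restarts:
        return "restart limit reached for this pod"
    return None


if __name__ == "__main__":
    print(guardrail_violation("default", "default", 1, 10, 0))
    print(guardrail_violation("default", "default", 6, 10, 0))
```

Returning a reason string rather than a boolean makes it easy for the agent to include the blocking guardrail in its status update.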