DevOps | prompt only | Intermediate

Kubernetes Stabilizer

Identifies and fixes unhealthy pods in the currently configured Kubernetes namespace so that workloads run as expected. The loop repeatedly checks pod health, diagnoses common issues, and applies corrective actions within safe guardrails, stopping when all pods are healthy or after 10 iterations.

Kubernetes, DevOps, Troubleshooting, Automation, Cluster Management

Goal

Fix unhealthy pods

How to Run

This loop is driven entirely by prompts and requires no code changes. Start it in your preferred coding assistant, and the agent works through unhealthy pods until all are healthy or the iteration limit is reached.

  1. Open Your Coding Environment

    Launch Cursor, Claude Code, Codex, OpenCode, or Gemini CLI and open the project directory containing your Kubernetes manifests.

  2. Initiate the Loop

    Send the kickoff prompt to start the stabilization process.

  3. Monitor Progress

    The agent will iteratively check pod status, diagnose issues, and apply fixes while respecting guardrails.

Workflow Steps

  1. Run kubectl get pods to list all pods in the current namespace.

    Parse the output to identify pods whose status is neither 'Running' nor 'Completed'.

  2. For each unhealthy pod, run kubectl describe pod <pod-name> and review its events and status.

    Identify the root cause (e.g., image pull errors, resource constraints, config issues).

  3. Apply a targeted fix based on the diagnosis (restart the pod, update the image, adjust resources, etc.).

    Confirm that the kubectl apply or kubectl delete command succeeded.

  4. Wait 10 seconds for pod state changes.

    Verify that the pod is transitioning toward a healthy state.

  5. Re-run kubectl get pods to assess overall pod health.

    Count the unhealthy pods: if zero, exit the loop; otherwise continue iterating.
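
Steps 1 and 5 both depend on picking unhealthy pods out of the kubectl get pods table. The following is a minimal Python sketch of that parsing step, assuming the default column layout (NAME, READY, STATUS, RESTARTS, AGE). The function name find_unhealthy_pods is illustrative and not part of the loop itself.

```python
# Statuses the workflow treats as healthy (taken from step 1 above).
HEALTHY_STATUSES = {"Running", "Completed"}


def find_unhealthy_pods(kubectl_output: str) -> list[tuple[str, str]]:
    """Return (pod name, status) pairs for pods whose STATUS is not healthy.

    Assumes the plain-text table printed by `kubectl get pods`.
    """
    lines = [line for line in kubectl_output.splitlines() if line.strip()]
    if not lines:
        return []

    header = lines[0].split()
    if "NAME" not in header or "STATUS" not in header:
        # Not a pod table (for example, a "No resources found" message).
        return []
    name_col = header.index("NAME")
    status_col = header.index("STATUS")

    unhealthy = []
    for line in lines[1:]:
        # NAME and STATUS come before RESTARTS, so whitespace splitting is
        # safe even when RESTARTS contains spaces, e.g. "5 (2m ago)".
        fields = line.split()
        if len(fields) <= max(name_col, status_col):
            continue
        name, status = fields[name_col], fields[status_col]
        if status not in HEALTHY_STATUSES:
            unhealthy.append((name, status))
    return unhealthy
```

A status check alone can miss a pod that reports Running while its containers are not ready, so a stricter variant might also inspect the READY column.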

Kickoff Prompt

Start the "Kubernetes Stabilizer" loop.

Goal: Fix unhealthy pods
Max iterations: 10
Between iterations run: kubectl get pods
Exit when: All pods healthy


Begin the Kubernetes Stabilizer loop: Check pod health using 'kubectl get pods', identify any unhealthy pods, diagnose their issues, apply appropriate fixes within guardrails, wait for changes to propagate, and re-check until all pods are healthy or max iterations (10) are reached. Prioritize safety by avoiding system namespaces and requiring confirmation for destructive actions.

Self-pace this loop. After each iteration, run `kubectl get pods` and evaluate the output, and only continue if the exit condition is not met (All pods healthy). Stop when the exit condition passes or 10 iterations are reached. Give a short status update each pass.
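
The pacing rules in the prompt above (check, fix, wait, re-check, stop at zero unhealthy pods or after 10 iterations) can be sketched as plain control flow. This Python sketch is only an illustration of those rules: run_loop, count_unhealthy, and fix_once are hypothetical names, and the real loop is carried out by the agent rather than by a script.

```python
import time
from typing import Callable


def run_loop(
    count_unhealthy: Callable[[], int],
    fix_once: Callable[[], None],
    max_iterations: int = 10,
    wait_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[bool, int]:
    """Iterate until no pods are unhealthy or the iteration cap is hit.

    Returns (all_healthy, fix_passes_used). Both callables are
    placeholders for whatever performs the check and the fix.
    """
    for iteration in range(1, max_iterations + 1):
        remaining = count_unhealthy()
        # Short status update each pass, as the prompt requests.
        print(f"pass {iteration}: {remaining} unhealthy pod(s)")
        if remaining == 0:
            return True, iteration - 1
        fix_once()
        # Give pod state time to change before the next check.
        sleep(wait_seconds)
    # Cap reached: report whether the final fix pass happened to succeed.
    return count_unhealthy() == 0, max_iterations
```

Passing sleep as a parameter keeps the sketch easy to exercise without real 10-second waits.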

Guardrails

hardcoded
  • Verify Kubernetes cluster access before executing commands
  • Only operate on pods in the currently configured namespace
  • Avoid modifying pods in system namespaces such as kube-system
  • Limit restart attempts to prevent cascading failures
  • Require user confirmation before deleting persistent volumes
  • Pause execution if more than 50% of pods are failing
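
Three of these guardrails reduce to simple checks, sketched below in Python. Several details are assumptions rather than part of the loop definition: the protected namespace list, the "kube-" prefix rule, the default limit of 3 restart attempts, and all function and class names.

```python
from collections import Counter

# Assumption: the standard Kubernetes system namespaces are off limits.
PROTECTED_NAMESPACES = {"kube-system", "kube-public", "kube-node-lease"}


def is_protected_namespace(namespace: str) -> bool:
    """True if pods in this namespace should not be modified."""
    # Assumption: any "kube-" prefixed namespace counts as a system namespace.
    return namespace in PROTECTED_NAMESPACES or namespace.startswith("kube-")


def should_pause(unhealthy: int, total: int, threshold: float = 0.5) -> bool:
    """True if strictly more than `threshold` of the pods are unhealthy."""
    if total == 0:
        return False
    return unhealthy / total > threshold


class RestartBudget:
    """Caps restart attempts per pod (the limit of 3 is a hypothetical default)."""

    def __init__(self, max_attempts: int = 3) -> None:
        self.max_attempts = max_attempts
        self._attempts: Counter = Counter()

    def allow(self, pod_name: str) -> bool:
        """Record an attempt and return True, or return False once exhausted."""
        if self._attempts[pod_name] >= self.max_attempts:
            return False
        self._attempts[pod_name] += 1
        return True
```

With these definitions, a namespace where exactly half the pods are failing does not trigger the pause, since the guardrail says "more than 50%".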
