python-resilience▌
wshobson/agents · updated Apr 8, 2026
MDX-style export adds YAML metadata + attribution linking explainx.ai and this canonical listing URL.
Automatic retries, exponential backoff, timeouts, and fault-tolerant decorators for Python services.
- ›Covers transient vs. permanent failure classification, exponential backoff with jitter, bounded retries, and timeout patterns using the tenacity library
- ›Includes nine production patterns: basic retry, selective error handling, HTTP status code retries, combined exception and status retries, retry logging, timeout decorators, stacked decorators, dependency injection for testing, and fail-
Python Resilience Patterns
Build fault-tolerant Python applications that gracefully handle transient failures, network issues, and service outages. Resilience patterns keep systems running when dependencies are unreliable.
When to Use This Skill
- Adding retry logic to external service calls
- Implementing timeouts for network operations
- Building fault-tolerant microservices
- Handling rate limiting and backpressure
- Creating infrastructure decorators
- Designing circuit breakers
Core Concepts
1. Transient vs Permanent Failures
Retry transient errors (network timeouts, temporary service issues). Don't retry permanent errors (invalid credentials, bad requests).
2. Exponential Backoff
Increase wait time between retries to avoid overwhelming recovering services.
3. Jitter
Add randomness to backoff to prevent thundering herd when many clients retry simultaneously.
4. Bounded Retries
Cap both attempt count and total duration to prevent infinite retry loops.
Quick Start
from tenacity import retry, stop_after_attempt, wait_exponential_jitter
@retry(
stop=stop_after_attempt(3),
wait=wait_exponential_jitter(initial=1, max=10),
)
def call_external_service(request: dict) -> dict:
return httpx.post("https://api.example.com", json=request).json()
Fundamental Patterns
Pattern 1: Basic Retry with Tenacity
Use the tenacity library for production-grade retry logic. For simpler cases, consider built-in retry functionality or a lightweight custom implementation.
from tenacity import (
retry,
stop_after_attempt,
stop_after_delay,
wait_exponential_jitter,
retry_if_exception_type,
)
TRANSIENT_ERRORS = (ConnectionError, TimeoutError, OSError)
@retry(
retry=retry_if_exception_type(TRANSIENT_ERRORS),
stop=stop_after_attempt(5) | stop_after_delay(60),
wait=wait_exponential_jitter(initial=1, max=30),
)
def fetch_data(url: str) -> dict:
"""Fetch data with automatic retry on transient failures."""
response = httpx.get(url, timeout=30)
response.raise_for_status()
return response.json()
Pattern 2: Retry Only Appropriate Errors
Whitelist specific transient exceptions. Never retry:
ValueError,TypeError- These are bugs, not transient issuesAuthenticationError- Invalid credentials won't become valid- HTTP 4xx errors (except 429) - Client errors are permanent
from tenacity import retry, retry_if_exception_type
import httpx
# Define what's retryable
RETRYABLE_EXCEPTIONS = (
ConnectionError,
TimeoutError,
httpx.ConnectTimeout,
httpx.ReadTimeout,
)
@retry(
retry=retry_if_exception_type(RETRYABLE_EXCEPTIONS),
stop=stop_after_attempt(3),
wait=wait_exponential_jitter(initial=1, max=10),
)
def resilient_api_call(endpoint: str) -> dict:
"""Make API call with retry on network issues."""
return httpx.get(endpoint, timeout=10).json()
Pattern 3: HTTP Status Code Retries
Retry specific HTTP status codes that indicate transient issues.
from tenacity import retry, retry_if_result, stop_after_attempt
import httpx
RETRY_STATUS_CODES = {429, 502, 503, 504}
def should_retry_response(response: httpx.Response) -> bool:
"""Check if response indicates a retryable error."""
return response.status_code in RETRY_STATUS_CODES
@retry(
retry=retry_if_result(should_retry_response),
stop=stop_after_attempt(3),
wait=wait_exponential_jitter(initial=1, max=10),
)
def http_request(method: str, url: str, **kwargs) -> httpx.Response:
"""Make HTTP request with retry on transient status codes."""
return httpx.request(method, url, timeout=30, **kwargs)
Pattern 4: Combined Exception and Status Retry
Handle both network exceptions and HTTP status codes.
from tenacity import (
retry,
retry_if_exception_type,
retry_if_result,
stop_after_attempt,
wait_exponential_jitter,
before_sleep_log,
)
import logging
import httpx
logger = logging.getLogger(__name__)
TRANSIENT_EXCEPTIONS = (
ConnectionError,
TimeoutError,
httpx.ConnectError,
httpx.ReadTimeout,
)
RETRY_STATUS_CODES = {429, 500, 502, 503, 504}
def is_retryable_response(response: httpx.Response) -> bool:
return response.status_code in RETRY_STATUS_CODES
@retry(
retry=(
retry_if_exception_type(TRANSIENT_EXCEPTIONS) |
retry_if_result(is_retryable_response)
),
stop=stop_after_attempt(5),
wait=wait_exponential_jitter(initial=1, max=30),
before_sleep=before_sleep_log(logger, logging.WARNING),
)
def robust_http_call(
method: str,
url: str,
**kwargs,
) -> httpx.Response:
"""HTTP call with comprehensive retry handling."""
return httpx.request(method, url, timeout=30, **kwargs)
Advanced Patterns
Pattern 5: Logging Retry Attempts
Track retry behavior for debugging and alerting.
from tenacity import retry, stop_after_attempt, wait_exponential
import structlog
logger = structlog.get_logger()
def log_retry_attempt(retry_state):
"""Log detailed retry information."""
exception = retry_state.outcome.exception()
logger.warning(
"Retrying operation",
attempt=retry_state.attempt_number,
exception_type=How to use python-resilience on Cursor
AI-first code editor with Composer
Prerequisites
Before installing skills in Cursor, ensure your development environment meets these requirements:
- ›Cursor installed and configured on your development machine
- ›Node.js version 16.0+ with npm package manager (verify with
node --version) - ›Active project directory or workspace where you want to add python-resilience
Execute installation command
Execute the skills CLI command in your project's root directory to begin installation:
The skills CLI fetches python-resilience from GitHub repository wshobson/agents and configures it for Cursor.
Select Cursor when prompted
The CLI will show a list of available agents. Use arrow keys to navigate and space to select Cursor:
Verify installation
Confirm successful installation by checking the skill directory location:
Reload or restart Cursor to activate python-resilience. Access the skill through slash commands (e.g., /python-resilience) or your agent's skill management interface.
Security & Verification Notice
We perform automated surface-level scans (Gen AI Scanner, Socket, Snyk) during installation. These checks detect common vulnerabilities but do not guarantee complete security. Always review skill source code and verify the publisher's reputation before production use.
Skills execute code in your development environment. Always verify the publisher's identity, review recent commits, and test in isolated environments before production deployment.
List & Monetize Your Skill
Submit your Claude Code skill and start earning
Use Cases▌
Task Automation & Efficiency
Automate repetitive workflows and reduce manual effort
Example
Generate reports, summarize documents, draft communications
Save 3-5 hours per week on routine tasks
Knowledge Enhancement
Learn new skills, understand complex topics, get expert guidance
Example
Explain concepts, provide examples, suggest learning resources
Accelerate learning and skill development by 2x
Quality Improvement
Enhance output quality through reviews, suggestions, and refinements
Example
Review drafts, suggest improvements, catch errors
Improve work quality by 30-40% with less effort
Implementation Guide▌
Prerequisites
- ›Claude Desktop or compatible AI client with skill support
- ›Clear understanding of task or problem to solve
- ›Willingness to iterate and refine outputs
Time Estimate
15-45 minutes depending on use case complexity
Installation Steps
- 1.Install skill using provided installation command
- 2.Test with simple use case relevant to your work
- 3.Evaluate output quality and relevance
- 4.Iterate on prompts to improve results
- 5.Integrate into regular workflow if valuable
Common Pitfalls
- ⚠Expecting perfect results without iteration
- ⚠Not providing enough context in prompts
- ⚠Using skill for tasks outside its intended scope
- ⚠Accepting outputs without review and validation
Best Practices▌
✓ Do
- +Start with clear, specific prompts
- +Provide relevant context and constraints
- +Review and refine all outputs before using
- +Iterate to improve output quality
- +Document successful prompt patterns
✗ Don't
- −Don't use without understanding skill limitations
- −Don't skip validation of outputs
- −Don't share sensitive information in prompts
- −Don't expect skill to replace human judgment
💡 Pro Tips
- ★Be specific about desired format and style
- ★Ask for multiple options to choose from
- ★Request explanations to understand reasoning
- ★Combine AI efficiency with human expertise
When to Use This▌
✓ Use When
Use when skill capabilities match your task, clear ROI on time saved, and you can validate outputs. Best for repetitive tasks, learning, and quality improvement.
✗ Avoid When
Avoid when task requires deep expertise you can't validate, involves sensitive decisions, or when learning process is more valuable than speed of completion.
Learning Path▌
- 1Familiarize yourself with skill capabilities and limitations
- 2Start with low-risk, non-critical tasks
- 3Progress to more complex and valuable use cases
- 4Build expertise through regular use and experimentation
Discussion
Product Hunt–style comments (not star reviews)- No comments yet — start the thread.
Ratings
4.4★★★★★63 reviews- ★★★★★Ganesh Mohane· Dec 28, 2024
Useful defaults in python-resilience — fewer surprises than typical one-off scripts, and it plays nicely with `npx skills` flows.
- ★★★★★Advait Rahman· Dec 28, 2024
I recommend python-resilience for anyone iterating fast on agent tooling; clear intent and a small, reviewable surface area.
- ★★★★★Fatima Yang· Dec 28, 2024
We added python-resilience from the explainx registry; install was straightforward and the SKILL.md answered most questions upfront.
- ★★★★★Mateo Menon· Dec 16, 2024
python-resilience fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.
- ★★★★★Mateo Rao· Dec 12, 2024
python-resilience fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.
- ★★★★★Noah Li· Dec 8, 2024
python-resilience has been reliable in day-to-day use. Documentation quality is above average for community skills.
- ★★★★★Sakshi Patil· Nov 19, 2024
python-resilience is among the better-maintained entries we tried; worth keeping pinned for repeat workflows.
- ★★★★★Advait Martinez· Nov 19, 2024
Keeps context tight: python-resilience is the kind of skill you can hand to a new teammate without a long onboarding doc.
- ★★★★★Yusuf Kim· Nov 19, 2024
python-resilience reduced setup friction for our internal harness; good balance of opinion and flexibility.
- ★★★★★Li Chawla· Nov 3, 2024
Registry listing for python-resilience matched our evaluation — installs cleanly and behaves as described in the markdown.
showing 1-10 of 63