Claude Code for Python Automation: Complete Beginner's Guide 2026
The complete guide to Claude Code for Python automation in 2026. Learn setup, workflow automation, data processing, API integration, hooks, and real-world Python scripts. No coding experience required.
TL;DR: Claude Code is Anthropic's agentic CLI tool that automates Python development through a write-run-fix loopโwriting code, executing it, reading errors, fixing issues, and rerunning autonomously. This complete guide covers installation and setup, CLAUDE.md configuration, workflow automation patterns (data processing, API integration, web scraping), hooks for testing and formatting, and real-world Python automation projects. Perfect for beginners: no deep coding knowledge required, just clear problem articulation. Learn to build automated scripts 10x faster than traditional coding.
What is Claude Code?
Claude Code is Anthropic's agentic command-line interface (CLI) tool developed to assist developers with Python automation, refactoring, documentation, and debugging directly in the terminal.
Claude Code's write-run-fix loop in action โ the same pattern that powers Python automation.
"It automates the write-run-fix loop. It writes code, executes it, reads the traceback when something fails, edits the code, and reruns. You watch the cycle happen in your terminal. Claude Code can go through several iterations in the time it would take to manually interpret an error message and look up the fix."
CLAUDE.md is a markdown file in your project root that gives Claude Code standing instructions it reads at the start of every session.
Basic CLAUDE.md Template
# Project: Data Processing Automation## Overview
This project automates daily sales data processing, cleaning, and report generation.
## Tech Stack- Python 3.10
- pandas for data manipulation
- requests for API calls
- openpyxl for Excel generation
## Code Style- Use type hints for all functions
- Follow PEP 8 style guide
- Use descriptive variable names
- Add docstrings to all functions
- Prefer list comprehensions over loops where readable
## Project Structure
project/
โโโ data/ # Raw CSV files
โโโ processed/ # Cleaned data
โโโ reports/ # Generated reports
โโโ scripts/ # Python automation scripts
โโโ tests/ # Unit tests
## Common Tasks
1. Data cleaning: Remove duplicates, handle missing values
2. Data transformation: Convert currencies, normalize dates
3. Report generation: Create Excel summary with charts
4. API integration: POST processed data to dashboard
## Dependencies
Install via: `pip install pandas requests openpyxl pytest`
## Testing
Run tests before committing: `pytest tests/`
## Deployment
Scripts run via cron job daily at 6 AM UTC
Advanced CLAUDE.md Features
1. Workflow Instructions
## Automation Workflow### Daily Sales Processing1. Fetch CSVs from `data/` directory
2. Clean data:
- Remove rows where `amount < 0` - Fill missing `customer_name` with "Unknown"
- Convert `date` to ISO format
3. Group by product category
4. Generate Excel report with:
- Summary sheet (totals, averages)
- Detailed sheet (all records)
- Charts sheet (bar chart of sales by category)
5. Save to `reports/sales-YYYY-MM-DD.xlsx`6. POST summary to API endpoint: `https://api.example.com/sales`
2. Error Handling Guidelines
## Error Handling- Always use try-except blocks for file operations
- Log all errors to `logs/errors.log`- Send email notification on critical failures
- Retry API calls 3 times with exponential backoff
- Validate data before processing (check dtypes, nulls, ranges)
3. Performance Optimizations
## Performance- Use pandas vectorized operations (avoid iterrows)
- Process large files in chunks (chunksize=10000)
- Cache API responses for 1 hour
- Use multiprocessing for independent tasks
- Profile scripts with cProfile before optimization
4. Security Guidelines
## Security- Never hardcode API keys (use environment variables)
- Store credentials in `.env` file (add to .gitignore)
- Validate all user inputs
- Use parameterized queries for database operations
- Encrypt sensitive data before storing
Use Case: Daily sales data cleaning and transformation
What We're Building:
Read CSV files from a directory
Clean data (remove duplicates, handle nulls)
Transform data (calculations, format conversions)
Export to Excel with summary statistics
Claude Code Prompt:
Create a Python script that:
1. Reads all CSV files from the 'data/' directory
2. For each CSV:
- Load using pandas
- Remove duplicate rows
- Fill missing 'price' values with column median
- Convert 'date' column to datetime format
- Calculate 'revenue' = quantity * price
3. Combine all data into one DataFrame
4. Create Excel file with two sheets:
- 'Summary': Total sales, average price, top 10 products
- 'Details': All cleaned records
5. Save to 'reports/sales-report-YYYY-MM-DD.xlsx'
Requirements:
- Use pandas and openpyxl
- Add error handling for missing files
- Log processing status
- Generate report filename with current date
Claude Code Process:
$ claude-code "Execute the data processing task"
[Claude Code working...]
โ Created data_processor.py
โ Installed dependencies (pandas, openpyxl)
โ Wrote CSV reading logic
โ Executed script
โ Error: FileNotFoundError: 'data/' directory not found
[Claude Code auto-fixing...]
โ Added directory existence check
โ Created 'data/' directory if missing
โ Reran script
โ Success! Generated reports/sales-report-2026-06-09.xlsx
Script complete. Processing time: 2.3 seconds
Processed 1,247 records from 3 CSV files
Claude Code created the entire script autonomously
Fixed the directory error without manual intervention
Added proper logging and error handling
Generated production-ready code with best practices
Project 2: API Integration Automation
Use Case: Fetch data from REST API, process, and POST results
Claude Code Prompt:
Build a Python script that automates API data flow:
1. GET weather data from OpenWeatherMap API
- Endpoint: https://api.openweathermap.org/data/2.5/forecast
- For cities: New York, London, Tokyo
- API key from environment variable
2. Process the data:
- Extract: city, date, temperature, conditions
- Convert temperature from Kelvin to Celsius
- Format date as YYYY-MM-DD
3. POST processed data to our internal API
- Endpoint: https://internal.example.com/api/weather
- Include API authentication header
4. Error handling:
- Retry failed requests 3 times
- Log all API calls
- Email notification if all retries fail
5. Run daily via cron job (6 AM UTC)
Use Case: Scrape product prices from e-commerce site daily
Claude Code Prompt:
Create a web scraping script that:
1. Scrapes prices from 5 product pages
2. Stores data in SQLite database with timestamp
3. Compares with yesterday's prices
4. Generates alert if price drops > 10%
5. Sends Telegram notification with deal alerts
Use BeautifulSoup and requests.
Handle anti-scraping measures (delays, user-agent).
Generated Script Highlights:
from bs4 import BeautifulSoup
import requests
import sqlite3
import time
from datetime import datetime
import random
classPriceScraper:
def__init__(self):
self.db_path = 'prices.db'self.setup_database()
defsetup_database(self):
"""Initialize SQLite database"""
conn = sqlite3.connect(self.db_path)
conn.execute('''
CREATE TABLE IF NOT EXISTS prices (
id INTEGER PRIMARY KEY,
product_url TEXT,
product_name TEXT,
price REAL,
timestamp DATETIME
)
''')
conn.commit()
conn.close()
defscrape_price(self, url):
"""Scrape single product with anti-detection"""
headers = {
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36'
}
# Random delay (1-3 seconds)
time.sleep(random.uniform(1, 3))
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'html.parser')
# Extract price (adapt selectors to target site)
price_element = soup.select_one('.price-current')
price = float(price_element.text.strip().replace('$', ''))
name_element = soup.select_one('.product-title')
name = name_element.text.strip()
return {'name': name, 'price': price, 'url': url}
():
conn = sqlite3.connect(.db_path)
cursor = conn.execute(, (product_url,))
previous = cursor.fetchone()
conn.close()
previous:
old_price = previous[]
drop_percent = ((old_price - current_price) / old_price) *
drop_percent drop_percent >
():
Hooks are automated triggers that execute commands in response to events.
Types of Hooks
1. Pre-Commit Hooks
Run before git commits to ensure code quality:
# In CLAUDE.md## Hooks### pre-commit-Run unit tests:`pytesttests/`-Run linter:`flake8*.py`-Check type hints:`mypy*.py`-Ifanyfail,blockcommit
2. Post-Edit Hooks
Format code after Claude makes changes:
### post-edit-Auto-format with Black:`black*.py`-Sort imports:`isort*.py`-Update docstrings:`pydocstyle*.py`
3. On-Error Hooks
Notify when errors occur:
### on-error-Log to file:appenderrorto`logs/errors.log`-Send Slack notification:curlwebhookwitherrordetails-CreateGitHubissue(forrecurringerrors)
4. Scheduled Hooks
Run tasks periodically:
### scheduled (daily at 6 AM)-Backup database:`pythonbackup_db.py`-Clean old logs:`findlogs/-mtime+30-delete`-Generate reports:`pythongenerate_reports.py`
"Build a complete data pipeline with ETL, machine learning,
API integration, real-time dashboards, and automated alerts"
Do:
Session 1: "Process a single CSV file"
Session 2: "Add data validation"
Session 3: "Generate simple report"
Session 4: "Add API integration"
Session 5: "Automate with scheduling"
2. Use Descriptive Prompts
Vague:
"Fix the data thing"
Specific:
"In data_processor.py, fix the function clean_sales_data()
to handle null values in the 'amount' column by:
1. Replacing nulls with 0 if 'status' is 'refund'
2. Replacing nulls with median if 'status' is 'sale'
3. Logging how many nulls were replaced"
3. Review Generated Code
Always Check:
Does it solve the right problem?
Are edge cases handled?
Is error handling present?
Are dependencies reasonable?
Is the code maintainable?
4. Maintain CLAUDE.md
Keep your project instructions current:
## Recent Changes (Update Log)### 2026-06-09- Added API rate limiting (max 100 calls/hour)
- Changed database from SQLite to PostgreSQL
- New requirement: All prices must be in USD
### 2026-06-08- Migrated from pandas 1.x to 2.x
- Updated data cleaning logic for new CSV format
5. Version Control Integration
# .gitignore
.env
*.pyc
__pycache__/
logs/
.claude/cache/
# README.md## Running with Claude Code
1. Install dependencies: `pip install -r requirements.txt`
2. Set environment variables (see .env.example)
3. Run automation: `claude-code "run daily automation"`
6. Testing Strategy
# tests/test_data_processor.pyimport pytest
from data_processor import clean_sales_data
deftest_clean_sales_data_removes_duplicates():
input_data = [
{'id': 1, 'amount': 100},
{'id': 1, 'amount': 100}, # duplicate
{'id': 2, 'amount': 200}
]
result = clean_sales_data(input_data)
assertlen(result) == 2deftest_clean_sales_data_handles_nulls():
input_data = [{'id': 1, 'amount': None}]
result = clean_sales_data(input_data)
assert result[0]['amount'] == 0# or median, depending on logic
Run tests with Claude Code:
claude-code "run pytest and fix any failing tests"
Troubleshooting Common Issues
Issue 1: "API Key Not Found"
Problem:
Error: ANTHROPIC_API_KEY environment variable not set
Solution:
# Verify environment variableecho$ANTHROPIC_API_KEY# If empty, set it:export ANTHROPIC_API_KEY='sk-ant-api...'# Make permanent (add to ~/.zshrc or ~/.bashrc):echo'export ANTHROPIC_API_KEY="your-key"' >> ~/.zshrc
source ~/.zshrc
Issue 2: "Module Not Found"
Problem:
ModuleNotFoundError: No module named 'pandas'
Solution:
# Claude Code should auto-install, but if not:
pip install pandas
# Or create requirements.txt and install all:
pip install -r requirements.txt
# Tell Claude about dependencies in CLAUDE.md:## Dependencies
Install via: `pip install pandas requests openpyxl`
# Check file permissionsls -la output.xlsx
# Fix permissionschmod 644 output.xlsx
# Or delete locked filerm output.xlsx
# Tell Claude Code:
claude-code "fix the permission error by checking if file is locked before writing"
Issue 4: Claude Code Runs Forever
Problem:
Code enters infinite loop or takes too long
Solution:
# Interrupt Claude Code:
Ctrl+C
# Restart with clearer instructions:
claude-code "In script.py, add a max_iterations=10 limit
to the while loop to prevent infinite loops"
Issue 5: Rate Limiting
Problem:
Error: API rate limit exceeded
Solution:
# Add rate limiting to your codeimport time
defrate_limited_api_call(url):
time.sleep(1) # Wait 1 second between calls
response = requests.get(url)
return response
# Or use CLAUDE.md:## API Guidelines
- Max 100 API calls per minute
- Add 1-second delay between calls
- Implement exponential backoff for failures
## File Structure-`main.py`: Entry point, orchestrates workflow
-`data_processor.py`: Data cleaning and transformation
-`api_client.py`: API calls with retry logic
-`config.py`: Configuration and environment variables
## Workflow1. main.py imports data_processor and api_client
2. data_processor handles all pandas operations
3. api_client manages external API interactions
4. Keep files < 300 lines for maintainability
Pattern 2: Database Integration
# db_manager.py - Generated by Claude Codeimport sqlite3
from contextlib import contextmanager
classDatabaseManager:
def__init__(self, db_path='automation.db'):
self.db_path = db_path
self.setup_database()
@contextmanagerdefget_connection(self):
"""Context manager for database connections"""
conn = sqlite3.connect(self.db_path)
try:
yield conn
conn.commit()
except Exception as e:
conn.rollback()
raisefinally:
conn.close()
defsetup_database(self):
"""Create tables if they don't exist"""withself.get_connection() as conn:
conn.execute('''
CREATE TABLE IF NOT EXISTS automation_runs (
id INTEGER PRIMARY KEY,
script_name TEXT,
status TEXT,
records_processed INTEGER,
execution_time REAL,
timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
)
''')
deflog_run(self, script_name, status, records, exec_time):
"""Log automation execution"""withself.get_connection() as conn:
conn.execute('''
INSERT INTO automation_runs
(script_name, status, records_processed, execution_time)
VALUES (?, ?, ?, ?)
''', (script_name, status, records, exec_time))
Pattern 3: Configuration Management
# config.py - Generated by Claude Codeimport os
from dotenv import load_dotenv
load_dotenv()
classConfig:
# API Keys
OPENWEATHER_API_KEY = os.getenv('OPENWEATHER_API_KEY')
INTERNAL_API_KEY = os.getenv('INTERNAL_API_KEY')
# Database
DB_PATH = os.getenv('DB_PATH', 'automation.db')
# Paths
DATA_DIR = 'data/'
REPORTS_DIR = 'reports/'
LOGS_DIR = 'logs/'# API Settings
API_TIMEOUT = 10
API_RETRIES = 3
API_BACKOFF_FACTOR = 2# Email Alerts
SMTP_HOST = os.getenv('SMTP_HOST')
SMTP_PORT = int(os.getenv('SMTP_PORT', 587))
ALERT_EMAIL = os.getenv('ALERT_EMAIL')
@classmethoddefvalidate(cls):
"""Validate required config"""
required = ['OPENWEATHER_API_KEY', 'INTERNAL_API_KEY']
missing = [key for key in required ifnotgetattr(cls, key)]
if missing:
raise ValueError(f"Missing config: {', '.join(missing)}")
# Usage in your scripts:from config import Config
Config.validate()
api_key = Config.OPENWEATHER_API_KEY
Real Production Example: E-Commerce Analytics Automation
Complete workflow built with Claude Code:
Requirement:
"Automate daily e-commerce analytics: fetch orders from Shopify API, process data, calculate metrics, generate Excel report, and email to stakeholders."
Claude Code Session:
$ claude-code "Build e-commerce analytics automation as described in requirements.md"
[Claude Code Process:]
โ Read requirements.md
โ Created project structure:
- main.py (orchestrator)
- shopify_client.py (API integration)
- analytics_processor.py (metrics calculation)
- report_generator.py (Excel creation)
- email_sender.py (SMTP integration)
- config.py (configuration)
โ Installed dependencies: shopify-python-api, pandas, openpyxl
โ Wrote shopify_client.py with pagination support
โ Executed test run
โ Error: ShopifyAPI authentication failed
[Auto-fixing...]
โ Updated authentication to use private app credentials
โ Reran test
โ Successfully fetched 1,247 orders
โ Wrote analytics_processor.py with metrics:
- Total revenue
- Average order value
- Top 10 products
- Sales by category
- Customer acquisition trends
โ Wrote report_generator.py with Excel formatting
โ Wrote email_sender.py with HTML email template
โ Created main.py to orchestrate the workflow
โ Full test run successful
โ Generated reports/analytics-2026-06-09.xlsx
โ Sent email to 3 stakeholders
Automation complete. Total time: 8 minutes.
This guide covers Claude Code for Python automation as of June 2026. Claude Code is actively developed; check official documentation for latest features.
if
'quantity'
in
and
'price'
in
'revenue'
'quantity'
'price'
# Combine all data
True
f"Combined {len(combined_df)} total records"
return
def
generate_excel_report
df
"""Generate Excel report with summary and details"""