polars
K-Dense-AI/scientific-agent-skills · updated Jun 4, 2026
### Polars
| name | polars |
| description | Fast in-memory DataFrame library for datasets that fit in RAM. Use when pandas is too slow but data still fits in memory. Lazy evaluation, parallel execution, Apache Arrow backend. Best for 1-100GB datasets, ETL pipelines, faster pandas replacement. For larger-than-RAM data use dask or vaex. |
| license | https://github.com/pola-rs/polars/blob/main/LICENSE |
| metadata | version: "1.0" skill-author: K-Dense Inc. |
Polars
Overview
Polars is a lightning-fast DataFrame library for Python and Rust built on Apache Arrow. Work with Polars' expression-based API, lazy evaluation framework, and high-performance data manipulation capabilities for efficient data processing, pandas migration, and data pipeline optimization.
Quick Start
Installation and Basic Usage
Install Polars:
uv pip install polars
Basic DataFrame creation and operations:
import polars as pl
# Create DataFrame
df = pl.DataFrame({
"name": ["Alice", "Bob", "Charlie"],
"age": [25, 30, 35],
"city": ["NY", "LA", "SF"]
})
# Select columns
df.select("name", "age")
# Filter rows
df.filter(pl.col("age") > 25)
# Add computed columns
df.with_columns(
age_plus_10=pl.col("age") + 10
)
Core Concepts
Expressions
Expressions are the fundamental building blocks of Polars operations. They describe transformations on data and can be composed, reused, and optimized.
Key principles:
- Use pl.col("column_name") to reference columns
- Chain methods to build complex transformations
- Expressions are lazy and only execute within contexts (select, with_columns, filter, group_by)
Example:
# Expression-based computation
df.select(
pl.col("name"),
(pl.col("age") * 12).alias("age_in_months")
)
Lazy vs Eager Evaluation
Eager (DataFrame): Operations execute immediately
df = pl.read_csv("file.csv") # Reads immediately
result = df.filter(pl.col("age") > 25) # Executes immediately
Lazy (LazyFrame): Operations build a query plan, optimized before execution
lf = pl.scan_csv("file.csv") # Doesn't read yet
result = lf.filter(pl.col("age") > 25).select("name", "age")
df = result.collect() # Now executes optimized query
When to use lazy:
- Working with large datasets
- Complex query pipelines
- When only some columns/rows are needed
- Performance is critical
Benefits of lazy evaluation:
- Automatic query optimization
- Predicate pushdown
- Projection pushdown
- Parallel execution
For detailed concepts, load references/core_concepts.md.
Common Operations
Select
Select and manipulate columns:
# Select specific columns
df.select("name", "age")
# Select with expressions
df.select(
pl.col("name"),
(pl.col("age") * 2).alias("double_age")
)
# Select all columns matching a pattern
df.select(pl.col("^.*_id$"))
Filter
Filter rows by conditions:
# Single condition
df.filter(pl.col("age") > 25)
# Multiple conditions (cleaner than using &)
df.filter(
pl.col("age") > 25,
pl.col("city") == "NY"
)
# Complex conditions
df.filter(
(pl.col("age") > 25) | (pl.col("city") == "LA")
)
With Columns
Add or modify columns while preserving existing ones:
# Add new columns
df.with_columns(
age_plus_10=pl.col("age") + 10,
name_upper=pl.col("name").str.to_uppercase()
)
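# Parallel computation (all expressions are evaluated in parallel)
# Each output needs a unique name, otherwise Polars raises a duplicate-column error
df.with_columns(
    (pl.col("value") * 10).alias("value_x10"),
    (pl.col("value") * 100).alias("value_x100"),
)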
Group By and Aggregations
Group data and compute aggregations:
# Basic grouping
df.group_by("city").agg(
pl.col("age").mean().alias("avg_age"),
pl.len().alias("count")
)
# Multiple group keys
df.group_by("city", "department").agg(
pl.col("salary").sum()
)
# Conditional aggregations
df.group_by("city").agg(
(pl.col("age") > 30).sum().alias("over_30")
)
For detailed operation patterns, load references/operations.md.
Aggregations and Window Functions
Aggregation Functions
Common aggregations within group_by context:
- pl.len() - count rows
- pl.col("x").sum() - sum values
- pl.col("x").mean() - average
- pl.col("x").min() / pl.col("x").max() - extremes
- pl.col("x").first() / pl.col("x").last() - first/last values
Window Functions with over()
Apply aggregations while preserving row count:
# Add group statistics to each row
df.with_columns(
avg_age_by_city=pl.col("age").mean().over("city"),
rank_in_city=pl.col("salary").rank().over("city")
)
# Multiple grouping columns
df.with_columns(
group_avg=pl.col("value").mean().over("category", "region")
)
Mapping strategies:
- group_to_rows (default): Preserves original row order
- explode: Faster but groups rows together
- join: Creates list columns
Data I/O
Supported Formats
Polars supports reading and writing:
- CSV, Parquet, JSON, Excel
- Databases (via connectors)
- Cloud storage (S3, Azure, GCS)
- Google BigQuery
- Multiple/partitioned files
Common I/O Operations
CSV:
# Eager
df = pl.read_csv("file.csv")
df.write_csv("output.csv")
# Lazy (preferred for large files)
lf = pl.scan_csv("file.csv")
result = lf.filter(...).select(...).collect()
Parquet (recommended for performance):
df = pl.read_parquet("file.parquet")
df.write_parquet("output.parquet")
JSON:
df = pl.read_json("file.json")
df.write_json("output.json")
For comprehensive I/O documentation, load references/io_guide.md.
Transformations
Joins
Combine DataFrames:
# Inner join
df1.join(df2, on="id", how="inner")
# Left join
df1.join(df2, on="id", how="left")
# Join on different column names
df1.join(df2, left_on="user_id", right_on="id")
Concatenation
Stack DataFrames:
# Vertical (stack rows)
pl.concat([df1, df2], how="vertical")
# Horizontal (add columns)
pl.concat([df1, df2], how="horizontal")
# Diagonal (union with different schemas)
pl.concat([df1, df2], how="diagonal")
Pivot and Unpivot
Reshape data:
# Pivot (wide format)
df.pivot(on="product", index="date", values="sales")  # Polars 1.0 renamed `columns` to `on`
# Unpivot (long format)
df.unpivot(index="id", on=["col1", "col2"])
For detailed transformation examples, load references/transformations.md.
Pandas Migration
Polars offers significant performance improvements over pandas with a cleaner API. Key differences:
Conceptual Differences
- No index: Polars uses integer positions only
- Strict typing: No silent type conversions
- Lazy evaluation: Available via LazyFrame
- Parallel by default: Operations parallelized automatically
Common Operation Mappings
| Operation | Pandas | Polars |
|---|---|---|
| Select column | df["col"] | df.select("col") |
| Filter | df[df["col"] > 10] | df.filter(pl.col("col") > 10) |
| Add column | df.assign(x=...) | df.with_columns(x=...) |
| Group by | df.groupby("col").agg(...) | df.group_by("col").agg(...) |
| Window | df.groupby("col").transform(...) | df.with_columns((...).over("col")) |
Key Syntax Patterns
Pandas sequential (slow):
df.assign(
col_a=lambda df_: df_.value * 10,
col_b=lambda df_: df_.value * 100
)
Polars parallel (fast):
df.with_columns(
col_a=pl.col("value") * 10,
col_b=pl.col("value") * 100,
)
For a comprehensive migration guide, load references/pandas_migration.md.
Best Practices
Performance Optimization
- Use lazy evaluation for large datasets:
lf = pl.scan_csv("large.csv")  # Don't use read_csv
result = lf.filter(...).select(...).collect()
- Avoid Python functions in hot paths:
  - Stay within the expression API for parallelization
  - Use .map_elements() only when necessary
  - Prefer native Polars operations
- Use streaming for very large data:
lf.collect(engine="streaming")  # releases without the engine parameter: lf.collect(streaming=True)
- Select only needed columns early:
# Good: Select columns early
lf.select("col1", "col2").filter(...)
# Bad: Filter on all columns first
lf.filter(...).select("col1", "col2")
- Use appropriate data types:
  - Categorical for low-cardinality strings
  - Appropriate integer sizes (Int32 vs Int64)
  - Date types for temporal data
Expression Patterns
Conditional operations:
pl.when(condition).then(value).otherwise(other_value)
Column operations across multiple columns:
df.select(pl.col("^.*_value$") * 2) # Regex pattern
Null handling:
pl.col("x").fill_null(0)
pl.col("x").is_null()
pl.col("x").drop_nulls()
For additional best practices and patterns, load references/best_practices.md.
Resources
This skill includes comprehensive reference documentation:
references/
- core_concepts.md - Detailed explanations of expressions, lazy evaluation, and type system
- operations.md - Comprehensive guide to all common operations with examples
- pandas_migration.md - Complete migration guide from pandas to Polars
- io_guide.md - Data I/O operations for all supported formats
- transformations.md - Joins, concatenation, pivots, and reshaping operations
- best_practices.md - Performance optimization tips and common patterns
Load these references as needed when users require detailed information about specific topics.
How to use polars on Cursor
AI-first code editor with Composer
Prerequisites
Before installing skills in Cursor, ensure your development environment meets these requirements:
- Cursor installed and configured on your development machine
- Node.js version 16.0+ with npm package manager (verify with node --version)
- Active project directory or workspace where you want to add polars
Execute installation command
Execute the skills CLI command in your project's root directory to begin installation:
The skills CLI fetches polars from GitHub repository K-Dense-AI/scientific-agent-skills and configures it for Cursor.
Select Cursor when prompted
The CLI will show a list of available agents. Use arrow keys to navigate and space to select Cursor:
Verify installation
Confirm successful installation by checking the skill directory location:
Reload or restart Cursor to activate polars. Access the skill through slash commands (e.g., /polars) or your agent's skill management interface.
Security & Verification Notice
We perform automated surface-level scans (Gen AI Scanner, Socket, Snyk) during installation. These checks detect common vulnerabilities but do not guarantee complete security. Always review skill source code and verify the publisher's reputation before production use.
Skills execute code in your development environment. Always verify the publisher's identity, review recent commits, and test in isolated environments before production deployment.