clean-data-xls
anthropics/financial-services-plugins · updated Apr 8, 2026
Clean messy data in the active sheet or a specified range.
Clean Data
Environment
- If running inside Excel (Office Add-in / Office JS): Use Office JS directly (`Excel.run(async (context) => {...})`). Read via `range.values`, write helper-column formulas via `range.formulas = [["=TRIM(A2)"]]`. The in-place vs helper-column decision still applies.
- If operating on a standalone .xlsx file: Use Python/openpyxl.
Workflow
Step 1: Scope
- If a range is given (e.g. `A1:F200`), use it
- Otherwise use the full used range of the active sheet
- Profile each column: detect its dominant type (text / number / date) and identify outliers
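The profiling step can be sketched in plain Python. This is a minimal illustration, not the skill's actual implementation: the names `classify` and `profile_column` are hypothetical, and it assumes "outlier" means a cell whose type differs from the column's dominant type (numeric outliers would need a separate check).

```python
from collections import Counter
from datetime import date, datetime


def classify(value):
    """Return 'blank', 'number', 'date', or 'text' for one cell value."""
    if value is None or (isinstance(value, str) and value.strip() == ""):
        return "blank"
    if isinstance(value, bool):
        return "text"  # bool is a subclass of int; do not count it as a number
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, (date, datetime)):
        return "date"
    return "text"


def profile_column(values):
    """Return (dominant_type, outlier_indexes) for one column's values."""
    kinds = [classify(v) for v in values]
    counts = Counter(k for k in kinds if k != "blank")
    if not counts:
        return "blank", []
    dominant = counts.most_common(1)[0][0]
    # Assumption: an outlier is any non-blank cell that is not the dominant type.
    outliers = [i for i, k in enumerate(kinds) if k not in ("blank", dominant)]
    return dominant, outliers


column = [10, 12.5, "n/a", 7, None, "13"]
dominant, outliers = profile_column(column)
print(dominant, outliers)
```

Here the text cells at indexes 2 and 5 are flagged against a numeric column, which is also how a number-as-text value such as `"13"` would surface during profiling.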
Step 2: Detect issues
| Issue | What to look for |
|---|---|
| Whitespace | leading/trailing spaces, double spaces |
| Casing | inconsistent casing in categorical columns (usa / USA / Usa) |
| Number-as-text | numeric values stored as text; stray `$`, `,`, `%` in number cells |
| Dates | mixed formats in the same column (3/8/26, 2026-03-08, March 8 2026) |
| Duplicates | exact-duplicate rows and near-duplicates (case/whitespace differences) |
| Blanks | empty cells in otherwise-populated columns |
| Mixed types | a column that's 98% numbers but has 3 text entries |
| Encoding | mojibake (e.g. `Ã©` where `é` was intended, `â€™` where `’` was intended), non-printing characters |
| Errors | #REF!, #N/A, #VALUE!, #DIV/0! |
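A few of the checks in the table can be expressed as a per-cell detector. The sketch below covers only whitespace, number-as-text, error values, blanks, and non-printing characters. The function name `detect_issues`, the issue labels, and the regular expression are illustrative assumptions rather than part of the skill.

```python
import re

# Error strings listed in the table above.
EXCEL_ERRORS = {"#REF!", "#N/A", "#VALUE!", "#DIV/0!"}

# Assumed pattern: optional sign, optional "$", digits with commas,
# optional decimal part, optional trailing "%".
NUMBER_AS_TEXT = re.compile(r"^\s*[-+]?\$?\s*\d[\d,]*(\.\d+)?\s*%?\s*$")


def detect_issues(value):
    """Return a list of issue labels for a single cell value."""
    if value is None or value == "":
        return ["blank"]
    if not isinstance(value, str):
        return []  # real numbers and dates pass these text checks
    issues = []
    if value != value.strip() or "  " in value:
        issues.append("whitespace")
    if value.strip() in EXCEL_ERRORS:
        issues.append("error")
    elif NUMBER_AS_TEXT.match(value):
        issues.append("number-as-text")
    if any(not ch.isprintable() for ch in value):
        issues.append("non-printing")
    return issues


for cell in [" Acme  Corp ", "$1,200.50", "#N/A", None, 42, "ab\x07"]:
    print(repr(cell), detect_issues(cell))
```

Casing, date-format, duplicate, and mojibake checks need column-level context, so they are left out of this per-cell sketch.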
Step 3: Propose fixes
Show a summary table before changing anything:
| Column | Issue | Count | Proposed Fix |
|---|---|---|---|
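One way to build that summary is to count findings per column and issue and render them as a markdown table. This is a sketch under assumptions: `summary_table`, the `PROPOSED_FIX` mapping, and the issue labels are hypothetical, and the fix wording is only an example.

```python
from collections import Counter

# Hypothetical mapping from issue label to a proposed fix description.
PROPOSED_FIX = {
    "whitespace": "TRIM in helper column",
    "casing": "UPPER in helper column",
    "number-as-text": "VALUE/SUBSTITUTE in helper column",
    "blank": "ask user before filling",
}


def summary_table(findings):
    """Render (column, issue) pairs, one per affected cell, as a markdown table."""
    counts = Counter(findings)
    lines = ["| Column | Issue | Count | Proposed Fix |", "|---|---|---|---|"]
    for (column, issue), n in sorted(counts.items()):
        fix = PROPOSED_FIX.get(issue, "review manually")
        lines.append(f"| {column} | {issue} | {n} | {fix} |")
    return "\n".join(lines)


findings = [("A", "whitespace"), ("A", "whitespace"), ("B", "number-as-text")]
table = summary_table(findings)
print(table)
```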
Step 4: Apply
- Prefer formulas over hardcoded cleaned values: where the cleaned output can be expressed as a formula (e.g. `=TRIM(A2)`, `=VALUE(SUBSTITUTE(B2,"$",""))`, `=UPPER(C2)`, `=DATEVALUE(D2)`), write the formula in an adjacent helper column rather than computing the result in Python and overwriting the original. This keeps the transformation transparent and auditable.
- Only overwrite in place with computed values when the user explicitly asks for it, or when no sensible formula equivalent exists (e.g. encoding/mojibake repair)
- For destructive operations (removing duplicates, filling blanks, overwriting originals), confirm with the user first
- After each category of fix (whitespace → casing → number conversion → dates → dedup), show the user a sample of what changed and get confirmation before moving to the next category
- Report a before/after summary of what changed
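The helper-column approach can be sketched as a function that generates one formula per row and leaves the source column untouched. The names `FORMULA_TEMPLATES` and `helper_formulas` and the issue labels are assumptions made for illustration. The formulas themselves are the ones named in the list above.

```python
# Formula templates taken from the examples above; {src} is the source cell.
FORMULA_TEMPLATES = {
    "whitespace": "=TRIM({src})",
    "casing": "=UPPER({src})",
    "number-as-text": '=VALUE(SUBSTITUTE({src},"$",""))',
    "dates": "=DATEVALUE({src})",
}


def helper_formulas(issue, source_col, helper_col, first_row, last_row):
    """Map each helper-column cell address to the formula that cleans its source cell."""
    template = FORMULA_TEMPLATES[issue]
    return {
        f"{helper_col}{row}": template.format(src=f"{source_col}{row}")
        for row in range(first_row, last_row + 1)
    }


formulas = helper_formulas("whitespace", "A", "G", 2, 4)
for address, formula in formulas.items():
    # With openpyxl, each entry could be written as: ws[address] = formula
    print(address, formula)
```

Generating addresses and formulas separately from writing them makes it easy to show the user a sample for confirmation before anything touches the workbook.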