Agent skills (usually centered on a SKILL.md and optional scripts) are becoming default infrastructure for Claude Code, Cursor, and similar hosts. They are also a new class of software supply-chain risk: easy to copy, easy to trust because they read like documentation, and potent because the model treats them as ground truth whenever the task matches.
Below, we summarize evidence from security vendors, open standards work (OWASP), and academic preprints (arXiv) to show this is not alarmism. We then give a short threat list and explain how ExplainX handles public listings on explainx.ai/skills.
## What research and major audits have already shown
Industry-scale audits. In February 2026, Snyk published ToxicSkills: a scan of 3,984 skills from public registries, reporting that 36.82% had at least one security issue, 13.4% had issues Snyk classified as critical (including malware, prompt injection, and exposed secrets), and 76 were treated as confirmed malicious payloads after human-in-the-loop review—with Snyk noting 8 of those were still public on one marketplace at publication. In a related threat-model article, “From SKILL.md to Shell Access in Three Lines of Markdown”, Snyk walks through how markdown instructions, bundled scripts, and operator-facing “prerequisites” turn a skill folder into a realistic attack path (including earlier ClawHub-centered campaign reporting in the same piece).
Risk taxonomy and ecosystem tracking. The OWASP Agentic Skills Top 10 project documents the most common failure modes across OpenClaw, Claude Code, Cursor / Codex, and VS Code-style skill packaging—e.g. malicious or poisoned skills, supply-chain style compromise, over-privileged capabilities, and metadata / discovery trust problems. It is a useful checklist even if your stack only overlaps part of the matrix.
Academic preprints (representative, not exhaustive). Several arXiv papers spell out the science behind the unease:
- Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale (2601.10338) analyzes tens of thousands of skills with a SkillScan pipeline; the authors report ~26% of analyzed skills with at least one modeled vulnerability class (e.g. prompt injection, exfiltration, privilege issues), and show script-bundled skills are more likely to be flagged than instruction-only skills.
- Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems (2604.03081) introduces DDIPE (Document-Driven Implicit Payload Execution): hiding harmful behavior in code examples and templates the agent re-uses during normal work, so the payload does not read like a direct “ignore safety” string. They evaluate multiple frameworks and models and report bypass rates for that pattern higher than for blunt imperative injection under their test harness.
- SkillJect (2602.14211) automates stealthy skill-structured injections (including inducements in SKILL.md paired with auxiliary scripts) and closed-loop refinement against traces from real coding agents.
Benchmarks for defenders. The open SKILL-INJECT benchmark measures how often malicious content embedded in skill files steers Claude Code, Codex, and similar agents—useful if you are red-teaming your own guardrails.
Installer / discovery design matters. Supply-chain issues are not only “bad markdown.” A documented class of issues is name–path confusion: for example, public analysis of the npx skills add flow (vercel-labs/skills#353) describes how frontmatter name and on-disk layout can disagree in ways that look like typosquatting for skills—so registries and install UX need as much scrutiny as the text inside SKILL.md.
Numbers age; methods differ between marketplace crawls and synthetic attacks. The direction is consistent: treat skills like packages with opinions—signing, pinning, scanning, and least privilege are on the short list of sensible responses, same as for npm or PyPI.
## What can go wrong?
| Failure mode | Why it matters |
|---|---|
| Instruction hijacking | A skill can steer the model toward reading .env, SSH keys, or CI secrets and echoing them into tool output or pasted chat. |
| Unsafe automation | Skills often encode “run this” playbooks. If those steps include arbitrary shell, file writes, or network calls, a bad update turns into RCE by prompt. |
| Social engineering at scale | A popular skill name (or a typosquat next to a trusted one) inherits reputation the same way a package name does on npm or PyPI. |
| Stale or forked trust | A benign repo can be transferred or the default branch can change. Without ongoing checks, yesterday’s “safe” skill is not a promise for tomorrow. |
None of this is unique to a single vendor. It is the consequence of composable agents: if the user or org installs a skill, the host will treat it as legitimate context until something proves otherwise.
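Several of the failure modes above leave textual fingerprints that a first-pass linter can flag before human review. Below is a minimal sketch with an invented, deliberately tiny rule set; production scanners use far larger pattern libraries plus model-assisted triage, so treat this as a shape, not a tool.

```python
import re

# Deliberately tiny, invented rule set for illustration only.
SUSPICIOUS_PATTERNS = {
    "sensitive-file": re.compile(r"\.env\b|id_rsa|\.ssh/|AWS_SECRET", re.IGNORECASE),
    "exfil-hint": re.compile(r"curl\s+\S*https?://|post\s+the\s+contents", re.IGNORECASE),
    "override-hint": re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
}

def flag_skill_text(text: str) -> list:
    """Return the names of heuristic rules the skill text trips."""
    return [name for name, pattern in SUSPICIOUS_PATTERNS.items()
            if pattern.search(text)]
```

Anything flagged goes to a human, and an empty result is not a clean bill of health: the DDIPE-style attacks described above are designed precisely to avoid fingerprints like these.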
## What ExplainX is doing
We treat listed skills as content we vouch for at publication time, not as unvetted drop-ins.
- Custom Python pipelines — We use dedicated Python scripts to ingest uploads, extract the surfaces that matter (text, file layout, common script hooks), and apply consistent automated checks. That closes the gap where “looks fine when skimming the README” is the only control.
- Per-upload verification — Every skill that appears on explainx.ai/skills is reviewed through our process before it is approved for the public directory. The goal is systematic coverage, not spot checks on famous repos only.
- GitHub repository scanning — We examine the source repositories skills come from, with scans that help catch inconsistencies, suspicious patterns, and drift between what a page claims and what the tree actually contains.
Caveat, plainly stated: automated and human review reduce risk; they do not eliminate it. Your org should still use locked-down secrets, branch protection, and in-house policy for which skills may run in production repos.
## Hardening skills in your own organization
- Pin versions of skills the same way you pin dependencies; re-verify on update.
- Run secret scanning and SAST in CI; assume the model can be nudged to exfiltrate if files are reachable.
- Prefer narrow skills with small file surfaces over giant “do everything” bundles you have not read.
Read next: What are agent skills? A complete guide · Browse verified skills · Snyk ToxicSkills write-up · OWASP Agentic Skills Top 10
arXiv preprints are not journal peer review; treat their numbers and threat models as research signals, not as guarantees about every marketplace on every day. Security is iterative—we update our own process as the ecosystem changes; see explainx.ai/skills for current registry policy.