TL;DR
Anthropic published lessons from using hundreds of Claude Code Skills across its engineering organization, framing Skills as reusable folders rather than saved prompts. The company says verification-focused Skills had the largest measured effect on output quality, though best practices are still developing.
Anthropic has published new lessons from using hundreds of Claude Code Skills across its engineering organization, arguing that the most useful agent instructions are not one-off prompts but versioned folders containing instructions, scripts, references, templates and guardrails.
The source post, “Lessons from building Claude Code: How we use skills” by Thariq Shihipar, describes Skills as discoverable folders that Claude Code can read and act on when a task matches their description. According to Anthropic, a Skill can include a SKILL.md file, reference material, scripts, assets, configuration, hooks and memory.
Anthropic says its internal Skills fell into nine broad categories: library and API reference, product verification, data fetching and analysis, business-process automation, code scaffolding and templates, code quality and review, CI/CD and deployment, runbooks, and infrastructure operations.
The company’s strongest claim is about verification Skills. According to Anthropic’s own measurement, Skills that check whether an agent’s work is correct had the largest impact on output quality. The company also says strong Skills often begin as a short instruction plus one hard-won caveat, then grow as teams add edge cases and reusable tools.
A Skill is a folder, not a prompt
Anthropic published what it learned running hundreds of Skills across its own engineering org. Read as a business memo, the point is bigger than a coding trick. This is how ad-hoc prompting becomes durable institutional capability: the standard operating procedures (SOPs) your agents actually follow, versioned and shared.
Myth: “A Skill is just a clever markdown prompt you save in a file.”
Reality: A Skill is a folder the agent can discover, read and run, containing instructions, scripts, references, templates, config and on-demand hooks.
The knowledge of how your organization actually operates can be captured, versioned, shared and executed, and the thing capturing it is a humble folder with a script and a gotchas list inside. For the builder, that’s context engineering with real tools attached. For whoever owns the budget, it’s the difference between AI that starts from zero every morning and an asset that compounds. Caveats: best practices are still evolving, checked-in Skills cost context, and curation beats accumulation. Start with one Skill, one gotcha, and the category that catches your mistakes.
Why Skills Matter for Teams
The development matters because it reframes agent setup as institutional knowledge management, not prompt tinkering. If a Skill contains the way a team reviews code, verifies product behavior or handles releases, that knowledge can be shared, versioned and reused instead of retyped into each session.
For engineering leaders, the claim points to a practical question: whether AI agent quality improves more from buying stronger models or from giving agents better local operating knowledge. Anthropic’s account suggests that reusable procedures, scripts and checks can help reduce inconsistent output across teams.
The business case remains partly interpretive. Anthropic reports internal experience and measured gains for verification Skills, but the source material does not give enough public detail to compare those gains independently across companies, codebases or agent setups.

From Prompting to Folders
The central correction in Anthropic’s write-up is definitional: a Skill is not just markdown. It is a folder that can hold the lightweight instruction the model sees first, then deeper references or scripts that are loaded only when needed.
That design supports progressive disclosure: the agent does not need every detail at once, but can reach for more specific material when the task requires it. In Anthropic’s framing, the folder itself becomes the knowledge base.
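As a rough illustration of progressive disclosure, the sketch below builds a throwaway Skill folder, reads only its short summary first, and loads a deeper reference file only when asked. It is a minimal sketch under stated assumptions, not Anthropic's implementation: the front-matter fields `name` and `description`, the folder name `verify-checkout`, the file `references/gotchas.md` and all function names are hypothetical choices made for this example.

```python
import tempfile
from pathlib import Path


def read_summary(skill_dir: Path) -> dict:
    """Return only the front-matter fields found at the top of SKILL.md."""
    summary = {"path": skill_dir}
    lines = (skill_dir / "SKILL.md").read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return summary  # no front matter: nothing lightweight to show
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the front-matter block
        key, _, value = line.partition(":")
        summary[key.strip()] = value.strip()
    return summary


def load_reference(skill_dir: Path, relative_name: str) -> str:
    """Load a deeper file only when the task calls for it."""
    return (skill_dir / relative_name).read_text(encoding="utf-8")


def build_demo_skill(root: Path) -> Path:
    """Create a hypothetical Skill folder (assumed layout, for illustration)."""
    skill = root / "verify-checkout"
    (skill / "references").mkdir(parents=True)
    (skill / "SKILL.md").write_text(
        "---\n"
        "name: verify-checkout\n"
        "description: Check that the checkout flow still works after a change\n"
        "---\n"
        "Run the steps in references/gotchas.md before reporting success.\n",
        encoding="utf-8",
    )
    (skill / "references" / "gotchas.md").write_text(
        "Gotcha: the staging cart empties after 10 minutes.\n",
        encoding="utf-8",
    )
    return skill


def demo() -> tuple[dict, str]:
    """Read the summary first, then fetch one reference on demand."""
    with tempfile.TemporaryDirectory() as tmp:
        skill = build_demo_skill(Path(tmp))
        summary = read_summary(skill)
        detail = load_reference(skill, "references/gotchas.md")
        summary.pop("path")  # the temporary path is gone after this block
        return summary, detail
```

Calling `demo()` returns the short summary and the gotcha text separately, which mirrors the idea that the lightweight description is seen first and the detail is fetched later. How Claude Code actually decides when to load deeper files is not described in the source material.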
The July 1 commentary from Thorsten Meyer AI casts the finding as a business memo: ad-hoc prompting can become a durable asset when companies package repeated work into shared operating procedures that agents can follow.
“A Skill is a folder, not a prompt.”
— Thorsten Meyer AI

Limits of the Evidence
Several details remain unclear from the available source material. Anthropic does not provide enough public information here to show how each Skill category was measured, what baseline was used, or how results differed across teams and projects.
It is also unclear how much of Anthropic’s experience transfers to smaller organizations, non-engineering teams or companies without mature internal documentation. The source material says best practices are still evolving, and that checked-in Skills can carry context costs if teams accumulate them without curation.

Teams Will Test Skill Libraries
The next step for companies using coding agents is likely to be smaller pilots: building one Skill around a repeatable workflow, especially a verification or review task, then measuring whether it improves output consistency.
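One simple way to measure such a pilot is to run the same verification check on agent output with and without the Skill and compare pass rates. The sketch below assumes each run is recorded as a pass or fail; the names `PilotResult`, `summarize` and `compare` are hypothetical, and the source material does not describe how Anthropic measured its own results.

```python
from dataclasses import dataclass


@dataclass
class PilotResult:
    label: str
    passed: int
    total: int

    @property
    def pass_rate(self) -> float:
        # Guard against an empty pilot so the ratio is always defined.
        return self.passed / self.total if self.total else 0.0


def summarize(checks: list[bool], label: str) -> PilotResult:
    """Collapse a list of pass/fail outcomes into counts."""
    return PilotResult(label=label, passed=sum(checks), total=len(checks))


def compare(baseline: list[bool], with_skill: list[bool]) -> float:
    """Return the change in pass rate (with Skill minus baseline)."""
    before = summarize(baseline, "baseline")
    after = summarize(with_skill, "with skill")
    return after.pass_rate - before.pass_rate


# Placeholder outcomes for illustration only, not results from the source.
example_baseline = [True, False, True, False]
example_with_skill = [True, True, True, False]
example_delta = compare(example_baseline, example_with_skill)
```

A pilot this small cannot show that a Skill caused the difference, so a team would likely want more runs and a fixed set of tasks before drawing conclusions.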
Anthropic’s own guidance, as reflected in the source material, points toward curated libraries rather than large collections. Teams will need to decide which procedures deserve scripts, templates and hooks, and which are better left as ordinary documentation.

Key Questions
What did Anthropic publish?
Anthropic published a Claude blog post by Thariq Shihipar describing lessons from using hundreds of Claude Code Skills inside its engineering organization.
What is a Claude Code Skill?
A Skill is described as a folder that can include instructions, references, scripts, templates, configuration, hooks and memory that an agent can use for a task.
Which Skills had the biggest reported effect?
According to Anthropic’s own measurement cited in the source material, product verification Skills had the largest impact on output quality.
Why is this relevant beyond developers?
The approach suggests that companies can turn repeated work patterns into versioned operating knowledge for AI agents, making agent behavior more consistent across teams.
What is still unknown?
The available source material does not fully show measurement methods, external benchmarks or how well the approach works outside Anthropic’s own engineering environment.
Source: Thorsten Meyer AI