Claude Code anti-patterns are the rollout failure modes that show up after the procurement story is over and the behavioural story begins. Eight of them recur across the engagements we have audited this year — visible in .claude/ directories, in commit history, in the gap between licence telemetry and actual engineering throughput. None of them are bugs in the tool. All of them are failures of curation, governance, or measurement on the team side, and all of them stall rollouts for a quarter or two before anyone names them.
This framing is contrarian because most rollout retros land on the wrong axis. The team blames the model, the season, the skeptics, the budget. The artefacts in the repo tell a different story: a CLAUDE.md the model has long since stopped reading in full, fifty half-finished skills no one remembers, a permissions allowlist that has been growing for six months without anyone reviewing it, hook configurations that fire on every event and train the team to ignore notifications. Tools do not stall rollouts. The patterns around the tools do.
The structure below is consistent across the eight: a one-line diagnostic signal you can read off the repo in under a minute, the failure shape we see most often, and the corrective pattern we have shipped on at least three engagements. The aim is to give an engineering manager a checklist sharp enough to catch the stall in the first month, not the fourth.
- 01. Adoption is behavioural, not procurement. Seats provisioned in an afternoon say nothing about how engineers work the next day. The signals that matter are skill invocations, subagent calls, and the shape of CLAUDE.md — not the licence dashboard.
- 02. A skill library needs curation, not collection. Fifty skills nobody remembers is worse than five everyone uses. The corrective pattern is a hard library cap (about twenty) and a quarterly retirement review that deletes anything below an invocation threshold.
- 03. CLAUDE.md must stay lean. Past roughly two hundred lines the model starts skimming. Bloated memory correlates strongly with shallow engagement. Prune to the lines the model needs every turn and link out to the rest.
- 04. Permissions need quarterly review. Allow-lists grow as the tool surface grows. Without a quarterly review they drift into over-permission inside two quarters, and the remediation after an incident is always larger than the review would have been.
- 05. Productivity signals must be engagement-weighted. Counting licence-active users tells you nothing the procurement system did not already tell you. The metrics that compound are weighted by depth — subagent invocations, skill invocations, hook coverage, memory hygiene.
01 — Install ≠ Adopt
Adoption is behavioural, not procurement.
The first anti-pattern is the one that contains all the others: mistaking procurement for adoption. A purchasing team can provision seats inside a week. A licensing dashboard can confirm those seats are active. None of that describes whether the engineering team has changed how it works. The most common failure mode in 2026 rollouts is a leadership team running on licence telemetry while the underlying engineering practice has not budged.
The diagnostic signal is straightforward. Open the team's shared repo. Look for .claude/skills/. Look for .claude/agents/. Look for a hook block in settings.json. If all three are empty or absent and the dashboard says ninety-five percent licence engagement, you are looking at a tool that is being used as autocomplete and being reported as transformation. The gap between those two postures is the gap that stalls rollouts.
The corrective pattern is measurement on the behaviour side. We cover the full framework in the 50-point adoption scorecard, but the minimum-viable version is three numbers tracked weekly: subagent invocations per active user, skill invocations per active user, and CLAUDE.md line count trended over time. Those three signals separate the engaged users from the adopted-but-idle ones inside a fortnight.
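For the repo-walk half of that check, here is a minimal sketch, assuming the standard .claude/ layout (skills as .claude/skills/&lt;name&gt;/SKILL.md, subagents as .claude/agents/*.md, hooks in .claude/settings.json); the script name is hypothetical and the layout should be adapted to whatever your repo actually uses:

```python
#!/usr/bin/env python3
"""adoption_audit.py - hypothetical one-minute repo walk for the install-vs-adopt signal."""
import json
from pathlib import Path

def audit(repo: Path) -> dict:
    claude = repo / ".claude"
    skills_dir = claude / "skills"
    agents_dir = claude / "agents"
    settings = claude / "settings.json"
    memory = repo / "CLAUDE.md"

    skills = list(skills_dir.glob("*/SKILL.md")) if skills_dir.is_dir() else []
    agents = list(agents_dir.glob("*.md")) if agents_dir.is_dir() else []
    hooks = json.loads(settings.read_text()).get("hooks", {}) if settings.is_file() else {}
    memory_lines = len(memory.read_text().splitlines()) if memory.is_file() else 0

    return {
        "skills": len(skills),            # empty => the tool is being used as autocomplete
        "subagents": len(agents),
        "hook_events": sorted(hooks),     # lifecycle events with any hook wired
        "claude_md_lines": memory_lines,  # trend this weekly alongside invocation counts
    }

if __name__ == "__main__":
    print(json.dumps(audit(Path(".")), indent=2))
```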
02 — Skill Sprawl
Fifty skills no one remembers.
The second anti-pattern is the failure mode of the team that took adoption seriously and then over-corrected. A motivated tech lead reads the docs, ships skills enthusiastically, the library grows. Six months in there are forty or fifty entries in .claude/skills/, of which the team actively uses six. The remainder are noise — drafts, half-finished experiments, near-duplicates of one another, skills written by engineers who have since rotated off the team.
The diagnostic signal is the ratio between library size and weekly invocations. A healthy library is roughly one skill per two-to-three active users with weekly invocation counts that cluster around the top five entries. A sprawled library has fifty entries and the top five still account for ninety percent of the invocations — the rest sit at zero. The long tail is not dormant capacity; it is overhead.
- Library size grows linearly (diagnostic signal; no retirement process). Skills are added each sprint by the engineer who needed them that sprint. Nothing is ever removed. The library accumulates draft skills, near-duplicates, and skills tied to workflows that have themselves been retired.
- Discovery collapses (behavioural impact; no one remembers what exists). Past about twenty entries, engineers stop browsing the library and re-prompt by hand instead. The skill that would have saved them ten minutes goes uninvoked because they could not find it amongst the noise.
- Hard cap + quarterly retirement (corrective pattern; ≤20 entries, prune on invocation count). Set an explicit library cap (about twenty for most teams) and a quarterly review that retires anything below an invocation threshold. Curation is the work; collection is the trap.

The skill library is a curated catalogue, not an inbox. Every entry has to earn its slot — same discipline as a design system, same discipline as a public API surface. Teams that apply that discipline keep their libraries small, sharp, and heavily used. Teams that treat it as an inbox watch invocation counts collapse past the twenty-entry mark.
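Invocation counts have to come from whatever telemetry the team already collects, but the staleness half of the review can be read straight off git history. A sketch of the quarterly worklist, assuming the standard .claude/skills/ layout; the script name, the cap value, and the last-commit-date proxy are all assumptions:

```python
#!/usr/bin/env python3
"""skill_retirement_review.py - hypothetical worklist for the quarterly skill prune."""
import subprocess
from pathlib import Path

LIBRARY_CAP = 20  # suggested hard cap; tune per team

def last_touched(skill: Path) -> str:
    # Date of the last commit that touched this skill's directory, per git history.
    out = subprocess.run(
        ["git", "log", "-1", "--format=%cs", "--", str(skill.parent)],
        capture_output=True, text=True,
    )
    return out.stdout.strip() or "uncommitted"

skills = sorted(Path(".claude/skills").glob("*/SKILL.md"))
print(f"{len(skills)} skills in the library (cap: {LIBRARY_CAP})")
for skill in sorted(skills, key=last_touched):  # stalest entries surface first
    print(f"{last_touched(skill)}  {skill.parent.name}")
```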
03 — CLAUDE.md Bloat
Instructions Claude skips.
The third anti-pattern is the memory-file equivalent of skill sprawl. CLAUDE.md grows by accretion. Every incident, every convention, every team norm gets added, never removed. By the time the file is two thousand lines long the model is no longer reading it in full — it is sampling, paraphrasing, and on a bad day quietly ignoring sections that no longer match the rest of the repo. The team thinks the memory is comprehensive. The model thinks the memory is noise.
The diagnostic signal is line count plus drift. Two hundred lines is the sweet spot for most repos — the model will read that on every turn without compromise. Five hundred is workable with careful structure and explicit anchors. Past roughly a thousand, behaviour becomes unpredictable. The second half of the signal is drift: when was the file last edited, and does its content still match what the repo actually does. A six-month-old CLAUDE.md telling the model about deprecated patterns is worse than no memory at all.
- Model starts skimming (common in mature repos). The file is large enough that the model is sampling rather than reading. Effective signal density collapses. Engineers think the memory is comprehensive; the model treats it as noise.
- Model reads in full (house-style target). Tight, well-anchored memory the model can hold in context every turn. Most teams need an aggressive prune to get here — content gets moved into linked sub-docs loaded on relevance.
- Stale memory is poison (audit finding). When CLAUDE.md is older than the patterns it describes, the model is being instructed to do work the way the team no longer does it. Quarterly memory-edit cadence is the minimum hygiene.

The corrective pattern has two halves. The first is a prune — move the three sections most likely to be skimmed into linked sub-docs, replace the body with a one-line summary and the link, and re-audit a fortnight later. The second is a cadence — quarterly review, dated entries, a single owner who is accountable for the file's health. Without the cadence the prune is undone within two sprints.
One nuance from the engagements: the orchestrator-style CLAUDE.md (a master file that links out to many relevance-loaded sub-files) consistently outperforms the monolithic version, even at identical total line counts. The model handles small linked context far better than large blocks of inline context, and the linked structure is far easier to keep accurate across a quarter.
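A sketch of the orchestrator shape, with hypothetical section names, owners, and sub-doc paths. The @path references use the documented CLAUDE.md import mechanism for linked files; verify against the current memory docs before relying on the exact syntax:

```markdown
# CLAUDE.md (orchestrator style; paths and owners below are placeholders)

## Always-on rules
- Run the test suite before proposing a commit.
- Never edit files under generated/.

## Loaded on relevance
- API conventions: @docs/claude/api-conventions.md
- Database migrations: @docs/claude/migrations.md
- Release process: @docs/claude/release.md

<!-- Owner: <named engineer> · Last pruned: <date> · Review cadence: quarterly -->
```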
"A bloated CLAUDE.md is a memory file that has stopped being read. The model is the only one who knows."— Field note · Q2 2026 audit cohort
04 — Permission Drift
Allow-lists that grow without review.
Permission drift is the security-side equivalent of skill sprawl. The allow-list in settings.json grows every time an engineer hits a permission prompt during a flow they did not want to interrupt. Two quarters later the list is twice as long, half of it covers tools that nobody on the team actively uses any more, and the gap between intended permissions and granted permissions has widened into something nobody could write down in one go.
The diagnostic signal is age plus turnover. How old are the entries on the list. How many of them were added by engineers who have since left the team. How many of them cover tool surfaces the team has since moved off. A healthy allow-list looks recent and intentional; a drifted one looks like a geological record.
- Catch-all allow (audit risk). A small number of broad allow entries that cover everything anyone might need. Easy to maintain. Useless as a security control. The most common pattern in audited rollouts — and the one that gets discovered after an incident.
- Sprawled, never reviewed (drift in progress). Hundreds of narrow allow entries accumulated over multiple quarters with no retirement process. Most cover surfaces the team no longer uses. Nobody can describe the current effective permissions in one sentence.
- Curated + quarterly review (house standard). A narrow allow-list scoped to current tool surface, owned by a named engineer, reviewed quarterly with explicit retirement of stale entries. Boring to maintain. The only pattern that actually controls drift.

The remediation is calendared review. Once a quarter, an owner walks the allow-list, removes anything that has not been invoked in the previous quarter, and challenges anything that covers a tool surface the team has retired. The work is small. The cost of not doing it accumulates until the quarter the wrong tool fires under the wrong allow-list and the incident review takes a week.
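A sketch of the curated shape in .claude/settings.json. The Tool(specifier) rule style follows the Claude Code permission settings as we understand them, but the specific entries are placeholders and the exact rule grammar should be checked against the current permissions reference rather than copied from here:

```json
{
  "permissions": {
    "allow": [
      "Bash(npm run test:*)",
      "Bash(npm run lint:*)",
      "Read(./docs/**)"
    ],
    "deny": [
      "Read(./.env)",
      "Bash(curl:*)"
    ]
  }
}
```

The point is not the specific entries; it is that the whole effective permission surface fits on one screen and can be read aloud in the quarterly review.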
05 — Hook Spam
Every event triggers something.
Hooks are the workflow-integration surface — the place where Claude Code becomes a participant in the team's shared workflow rather than a private assistant. They are also the surface most prone to over-eager configuration. A team that grasps the value of hooks tends to wire them to every lifecycle event inside a fortnight, which sounds like enthusiasm and ends up being noise.
The diagnostic signal is alert behaviour. If hook-driven notifications outnumber the team's real-time work events by more than a small multiple, engineers will start filtering them. Once filtering starts the hook is dead — even the important events get ignored alongside the routine ones. The corrective pattern is curation, same as the skill library, applied to lifecycle events.
- Stop + SubagentStop only (house standard). Two hooks wired against meaningful end-of-work events. Notifications fire when work completes — auditable, scannable, attended to. Engineers learn to treat the channel as signal. The pattern that survives a quarter.
- Every PreToolUse + Notification (anti-pattern). Hooks wired against every lifecycle event the API exposes. Every bash command, every file edit, every tool invocation produces a notification. Engineers mute the channel within a sprint. The hook is dead — and worse, so is the next hook anyone tries to add.
- Logging hooks, no notifications (advanced). PreToolUse and PostToolUse wired against an append-only audit log rather than a chat channel. No engineer notification. Investigable after the fact. Pairs well with the curated Stop hook above — gives you both the human-attention surface and the machine-readable record.
- Curated routing by severity (engagement+). All lifecycle events captured, but only Stop / SubagentStop / explicit-tagged severity events route to engineer channels. The rest go to logs. Hardest to set up correctly; longest-lived in production.

Hooks are a curation problem dressed as an enthusiasm opportunity. Two well-chosen lifecycle events beat ten indiscriminate ones every time. Teams that take this seriously also build a habit of pruning old hooks alongside new ones — the same quarterly cadence that prunes skills and CLAUDE.md entries.
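A sketch of the curated pattern (a Stop notification plus a PostToolUse audit log) as it would sit in .claude/settings.json. The event names and the matcher/command schema follow the hooks documentation as we understand it, and hook commands receive the event as JSON on stdin; the notification command and log path are assumptions to swap for your own tooling:

```json
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          { "type": "command", "command": "notify-send 'Claude Code' 'Session finished'" }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash|Edit|Write",
        "hooks": [
          { "type": "command", "command": "jq -c . >> ~/.claude/tool-audit.jsonl" }
        ]
      }
    ]
  }
}
```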
06 — Three More
Shared-skill orphans, subagent-governance gaps, agent-blame postmortems.
Three further anti-patterns recur often enough to flag, even though they tend to show up later in the rollout — after the base anti-patterns above have been recognised and partially addressed. Each is a governance failure rather than a curation failure, and the corrective patterns shift accordingly.
- Shared-skill orphans (governance gap; no owner, no review). Skills shared into the repo by engineers who have since rotated off the team. No owner is named, no review cadence is set, and the skill drifts as the underlying workflow evolves. By the time anyone notices it is broken, nobody remembers what it was for. The remedy is an explicit owner field on every committed skill, enforced at PR review.
- Subagent-governance gap (governance gap; .claude/agents/ unreviewed). Subagents are committed to the repo with no review process — anyone can add one, nobody reviews scope, and the directory accumulates agents whose tool permissions or model defaults nobody can defend on a Tuesday morning. Treat subagents as production artefacts. PR review, named owner, quarterly review against current tool surface.
- Agent-blame postmortems (cultural failure; "the model did it"). Postmortems that conclude with the agent as the root cause. The model hallucinated. The agent took a destructive action. The skill misfired. These conclusions feel satisfying and teach nothing — the corrective action is to mature the human-side controls (review, scope, permissions, hooks) the agent operates inside. Blame the system, not the participant.

All three of these surface later because they require the team to have shipped enough artefacts that governance becomes necessary. A team with zero subagents cannot have a subagent-governance gap; a team with no shared skills cannot have skill orphans. The trade-off is real — the cost of engagement is that you now have artefacts to govern. The payoff is that the governance work is small and the engagement gains are large.
For the operational underpinning of the subagent point, our guide on building a Claude skill from scratch covers the SKILL.md format and the scoping discipline that separates a skill that earns its slot from one that becomes an orphan inside a quarter.
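A sketch of what the owner convention looks like in practice. The name and description fields are the core SKILL.md frontmatter; the governance fields underneath are a team convention of the kind recommended here, not part of the skill spec, and the values shown are placeholders a PR check can grep for:

```markdown
---
name: release-notes
description: Draft release notes from merged PRs since the last tag. Use when preparing a release.
# Team governance convention (not part of the skill spec):
owner: "@your-team/platform"
last-reviewed: <date>
---

Skill instructions go here, scoped tightly enough that the owner can still defend them next quarter.
```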
07 — Vanity Metrics
The productivity-signal vanity-metric trap.
The final anti-pattern is the measurement-side equivalent of install-not-adopt. A team that has corrected every other anti-pattern on this list can still stall on this one — measuring the wrong numbers, reporting the wrong success story to leadership, and quietly losing budget two quarters later when the reported metrics turn out not to predict any real engineering outcome.
The diagnostic signal is the gap between reported productivity and felt productivity. If the dashboard claims a meaningful uplift and the engineering managers cannot point to a workflow that has shifted, the dashboard is measuring vanity. The engagement-weighted alternative is harder to compute but is the one that actually compounds.
[Chart: Vanity vs engagement-weighted productivity signals. Source: Digital Applied client audit cohort, Q2 2026.]

The pattern from the audit cohort is consistent. Teams that track the top two signals (licence active, sessions per week) report headline numbers that look healthy and predict almost nothing. Teams that track the bottom four (skill and subagent invocation rates, memory hygiene, hook coverage) get smaller numbers, harder to win an executive-review meeting with, but those numbers actually move when the team improves and stay still when the team plateaus. That is the property a productivity metric needs to earn its place on a dashboard.
The remediation is unglamorous: build the engagement-weighted panel, report it weekly to the same audience that sees the vanity panel, watch which numbers people argue about over a quarter. The arguing is the signal. Numbers no one argues about are numbers no one believes in.
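What "engagement-weighted" can mean in practice, as a sketch. The input fields, weights, and thresholds below are illustrative assumptions rather than a published formula; the property that matters is the one described above, that the number moves with depth of use and ignores seat counts:

```python
"""engagement_score.py - hypothetical weekly roll-up for the engagement-weighted panel."""
from dataclasses import dataclass

@dataclass
class TeamWeek:
    active_users: int
    skill_invocations: int          # from whatever telemetry the team already collects
    subagent_invocations: int
    hook_events_covered: int        # lifecycle events with a wired, unmuted hook
    claude_md_lines: int
    claude_md_days_since_edit: int

def engagement_score(w: TeamWeek) -> float:
    if w.active_users == 0:
        return 0.0
    # Depth: invocations per active user, the signal that actually moves with practice.
    depth = (w.skill_invocations + w.subagent_invocations) / w.active_users
    # Hygiene: penalise bloated or stale memory (thresholds are illustrative).
    hygiene = 1.0 if w.claude_md_lines <= 500 and w.claude_md_days_since_edit <= 90 else 0.5
    # Coverage: a couple of well-chosen hooks is the target, not all of them.
    coverage = min(w.hook_events_covered, 2) / 2
    return round(depth * hygiene * (0.5 + 0.5 * coverage), 2)

print(engagement_score(TeamWeek(12, 140, 38, 2, 240, 30)))
```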
Adoption depth is the only metric that compounds — and the only one most teams don't measure.
The eight anti-patterns rhyme. Install-not-adopt is the procurement-side version of the vanity-metric trap. Skill sprawl, hook spam, permission drift, and CLAUDE.md bloat are all curation failures dressed as enthusiasm. Shared-skill orphans and the subagent-governance gap are the same governance failure applied to two different surfaces. Agent-blame postmortems are the cultural tax the team pays for not having unlearned the single-root-cause instinct yet.
Across all eight, the corrective pattern is the same shape: curation over collection, governance over enthusiasm, engagement-weighted signals over surface counts. None of the corrective patterns are expensive. All of them are boring. The cost of skipping them is invisible until the quarter the rollout has stalled and the licence dashboard still says everything is fine.
The next-quarter move for any engineering manager reading this is mundane. Walk the repo. Count the skills, count the hook entries, count the lines in CLAUDE.md, look at the allow-list, ask who owns the subagents, and check whether the productivity panel measures anything you would argue about. The anti-patterns are easy to find once you go looking. The failure is not knowing they were the thing to look for.