A Claude Code team-adoption audit is the difference between knowing your engineers have the tool installed and knowing what they actually do with it. Engineering leaders can pull a licence-usage report from Anthropic in a single click; what that report cannot tell you is whether the team has skills in .claude/skills, whether CLAUDE.md files are tight enough to be read in full on every turn, whether anyone has wired Stop hooks, and whether the security policy is keeping up with the shape of the tool today versus six months ago.
The 50-point scorecard below is the audit framework we use on client engagements. Five axes — configuration, hooks, skills, memory, security — at ten points each. Every point is checked against an artefact in the repo or in the user's local Claude Code state, not against survey self-reports. The output is a stage on a four-step maturity model and a 60-day uplift roadmap with the highest-ROI gaps prioritised.
This guide walks through what the audit looks for on each axis, how the maturity model maps an aggregate score to a development stage, and a worked example from a 30-engineer shop we audited in Q1 2026. Everything below is the operational method, not the sales pitch — the section at the end on engagement work is the place for the latter.
- 01 — Engagement is the trailing indicator that matters. Daily-active subagent usage and skill-invocation counts predict productivity gains; licence counts only predict spend. Audit the inputs that produce engagement, not the seats.
- 02 — Skills are the highest-ROI investment. One well-scoped skill saved per developer per week compounds into hours back inside a quarter. Most audited teams have zero shared skills — that is the lowest-hanging point.
- 03 — CLAUDE.md hygiene predicts adoption depth. Bloated memory files correlate with surface-level use because the model skims rather than reads. Tight, well-pruned memory correlates with the teams that have wired hooks and skills too.
- 04 — Security policy lags adoption by quarters. Teams adopt new tool surfaces (skills, MCP servers, hooks) faster than policy gets written. The audit catches the drift before an incident does.
- 05 — Quarterly re-audit cadence works. Monthly is too noisy — the artefacts only shift on the quarter scale. Annual misses the policy-drift window entirely. Quarterly is the rhythm that has held across the client base.
01 — Adoption vs Engagement
Installed is not adopted.
The first failure mode in any AI-coding-tool rollout is treating licence telemetry as the success metric. Anthropic's dashboard can confirm a seat is provisioned, a session opened, a token spent. None of those signals describe whether the engineer is using Claude Code as a thin chat interface or as a delegated agentic workflow — and the gap between those two postures is roughly an order of magnitude in productivity terms.
We separate the two with a deliberate vocabulary. Adoption is the floor: the tool is installed, the user has logged in, a session has been started this week. Engagement is the ceiling: subagents are being invoked, skills are being run, hooks are firing, CLAUDE.md is referenced on every turn, memory entries are accumulating with durable value. Adoption is binary and easy. Engagement is a spectrum and is what the audit measures.
Three signals separate engaged users from adopted-but-idle ones. They are also the signals managers most often miss because they sit outside the vendor dashboard:
Daily-active subagent count — .claude/agents/ · /agents
Engaged teams invoke 2-5 subagents per active session. Adopted-but-idle teams invoke zero. Subagent count is the single strongest predictor of week-over-week productivity gains in the audited cohort.

Skill invocations per week — .claude/skills/
Skills are user-defined slash commands. Teams that have written and use even three or four skills are visibly faster on routine tasks (commits, reviews, scaffolding). Teams with zero skills are still hand-prompting every interaction.

Stop / SubagentStop hooks wired — settings.json hooks
Hooks fire on lifecycle events — Stop, SubagentStop, PreToolUse. They are how teams integrate Claude Code into their existing workflow (notifications, audit logs, automated reviews). Zero hooks usually means zero workflow integration.

Licence telemetry sees none of the three. A user who opens a session daily and copy-pastes generated code into their editor scores identically to a user running parallel review subagents with hook-driven CI integration — yet they are doing fundamentally different work. The audit's purpose is to surface that gap, attribute it to the right axis, and convert it into a roadmap rather than a number.
For a deeper look at how subagents change the workflow specifically, our walkthrough on building a Claude Code custom subagent is the operational layer underneath this axis — the audit checks for the presence and quality of those agents; that piece teaches how to build them.
02 — Five Axes
Config, hooks, skills, memory, security.
The scorecard is organised around five axes, ten points each, for a fifty-point total. We picked these axes by working backwards from the engagement signals above — every axis maps to a class of artefact that engaged teams produce and idle teams do not. They are listed in the order we score them, because the earlier axes tend to be prerequisites for the later ones (skills are hard to land without good configuration; security is hard to audit without skills being present at all).
Configuration — settings.json + permissions · 10 points
settings.json hygiene, model defaults, permissions allowlist, MCP server registrations, environment variables, default tool gating. The base layer — every later axis assumes it.

Hooks — lifecycle integration · 10 points
Stop / SubagentStop / PreToolUse / Notification hooks. Hooks turn Claude Code from a chat tool into a workflow component. Zero hooks is the single most common gap.

Skills — slash-command library · 10 points
.claude/skills/ directory. Number of skills, scope quality, documentation, sharing across the team. The leverage axis — the one with the biggest productivity multiplier.

Memory — CLAUDE.md hygiene · 10 points
Length, structure, signal density, cross-references, freshness. Bloated CLAUDE.md correlates with surface-level engagement; tight memory correlates with deep agentic use.

Security — policy + productivity signals · 10 points
Permissions discipline, secret handling, MCP-server review, audit logging, plus the productivity signals (engagement metrics, time-to-value, subagent uptake). The guardrails axis.

Two axes — configuration and hooks — form the base layer. They make the tool behave consistently across the team and create the hooks (literally) that later workflow integration depends on. Two more — skills and memory — form the leverage layer. They are what make Claude Code feel like a senior teammate rather than an autocomplete. The fifth — security — is the cross-cutting concern: it has to keep up with everything else because every new surface area is a new policy gap.
The next three sections cover each layer in turn — what we score, what good looks like, what failure looks like, and what the common remediation pattern is for the gaps we find most often.
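Every one of those checks is an artefact test, which means the first pass can be scripted. Below is a minimal sketch of the inventory step — the paths follow Claude Code's on-disk conventions, but the script itself (function names, output shape) is ours for illustration, not official tooling:

```python
#!/usr/bin/env python3
"""Rough artefact inventory for a Claude Code adoption audit.

Illustrative sketch: the paths follow Claude Code conventions, but the
checks and output shape are this article's scorecard, not anything the
CLI itself produces.
"""
import json
import sys
from pathlib import Path


def inventory(repo: Path) -> dict:
    settings_path = repo / ".claude" / "settings.json"
    settings = {}
    if settings_path.exists():
        settings = json.loads(settings_path.read_text())

    claude_md = repo / "CLAUDE.md"
    skills_dir = repo / ".claude" / "skills"
    agents_dir = repo / ".claude" / "agents"

    return {
        # Base layer: shared settings, permissions, MCP registry, hooks
        "project_settings": settings_path.exists(),
        "permissions_allowlist": bool(settings.get("permissions", {}).get("allow")),
        "hooks_wired": sorted(settings.get("hooks", {})),  # e.g. ["Stop", "SubagentStop"]
        "mcp_registry": (repo / ".mcp.json").exists(),
        # Leverage layer: skills, shared subagents, memory size
        "skill_count": len(list(skills_dir.glob("*/SKILL.md"))) if skills_dir.is_dir() else 0,
        "shared_subagents": len(list(agents_dir.glob("*.md"))) if agents_dir.is_dir() else 0,
        "claude_md_lines": len(claude_md.read_text().splitlines()) if claude_md.exists() else 0,
    }


if __name__ == "__main__":
    repo = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    print(json.dumps(inventory(repo), indent=2))
```

The scoring on top of this inventory is where auditor judgment lives; the inventory itself should be boring and reproducible.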
03 — Config + Hooks
Twenty points on the base layer.
Configuration is the easiest axis to score because the artefacts are deterministic — either ~/.claude/settings.json has an entry or it does not. We check six artefact groups, weighted by impact. A well-configured engineer scores eight or nine; an unconfigured one scores zero to two.
Project + user settings — .claude/settings.json + ~/.claude/settings.json · 2 points · structural
Separate files for repo-specific and user-global settings. Project settings committed to git so the team shares a baseline. User settings stay personal. Many teams have neither — every engineer runs out-of-box defaults.

Permissions allowlist — settings.json permissions · 2 points · safety
Explicit allow / deny patterns for Bash, file paths, network. Teams without an allowlist either over-prompt for permission (friction) or run unrestricted (risk). Two points for explicit lists; one for partial.

MCP server registry — .mcp.json or settings.json mcp · 2 points · integration
Shared MCP servers committed to the repo. Common picks: GitHub, Linear, Slack, project-specific data. The audit checks both presence and freshness — stale registrations are a red flag.

Model + tool defaults — model, autoCompactEnabled, etc. · 2 points · ergonomics
Sensible defaults set at user or project scope. Sonnet for routine work, Opus for hard reasoning, Haiku for fast classification. Teams without defaults pay the configuration tax on every session.

Stop + SubagentStop hooks — settings.json hooks · 6 points · workflow
Lifecycle integration. Notifications, audit logging, automated summaries. Six points across the lifecycle events. Teams with zero hooks score zero across this block — the most common pattern in the audited cohort.

PreToolUse / PostToolUse — fine-grained workflow gates · 4 points · advanced
Hooks firing on specific tool invocations. Used for things like vetting bash commands before execution, logging file edits, or running pre-commit-style checks. Advanced — usually only seen on the most engaged teams.

The most common base-layer gap is the no-hooks profile: solid settings, sensible model defaults, sometimes even a tidy MCP registry — but no lifecycle hooks at all. That team is using Claude Code as a smart text editor rather than as an agentic component of their workflow. The remediation is small and high-ROI: ship a Stop hook that posts to Slack or appends to an audit log, watch engagement rise inside a week as the team starts thinking of Claude Code as a participant in the shared workflow rather than a private assistant.
Configuration gaps, by contrast, are usually a one-afternoon fix — generate a shared .claude/settings.json template, commit it, document the user-scope additions individuals are expected to make, and move on.
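A starting-point sketch for that shared template, assuming a Node-flavoured repo — the model alias and the specific allow / deny patterns are examples to edit for your stack, not a recommended policy:

```json
{
  "model": "sonnet",
  "permissions": {
    "allow": [
      "Bash(npm run lint)",
      "Bash(npm run test:*)",
      "Edit(src/**)"
    ],
    "deny": [
      "Read(./.env)",
      "Read(./secrets/**)",
      "Bash(curl:*)"
    ]
  },
  "env": {
    "NODE_ENV": "development"
  }
}
```

Commit this as .claude/settings.json; personal overrides belong in .claude/settings.local.json or the user-scope ~/.claude/settings.json.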
"Show me your hooks and I will tell you whether Claude Code is integrated into your workflow or sitting beside it."— Field note · Q1 2026 audit cohort
04 — Skills + Memory
Twenty points on the leverage layer.
If the base layer makes Claude Code behave consistently, the leverage layer is what turns it into a multiplier. Skills are the productised leverage — user-defined slash commands that wrap a repeated workflow into one keystroke. Memory is the sustained leverage — the file the model reads on every turn, which means tight memory pays dividends on every interaction.
Skills score ten points across five sub-criteria, two points each. Memory scores another ten with the same structure. The audit is opinionated about what good looks like on both — opinionated enough that scoring is reproducible across auditors, not interpretive.
Library size + quality · 2 pts
Number of skills shipped in .claude/skills/. Six to ten well-scoped skills is the sweet spot — covers the routine tasks (commits, reviews, scaffolding, doc updates) without overlap. Two points for library size; quality is scored on the next four criteria.

Scope + docs + sharing + freshness · 8 pts
Each skill scoped to one job (not five). Each documented inside the skill file. Shared via the repo so the whole team benefits. Updated as the underlying workflow evolves. Eight points total — most audited teams score zero here because the library is empty.

CLAUDE.md size + structure · 4 pts
Lean is better. Most production CLAUDE.md files we recommend land at 200-400 lines. Bigger is a code smell — either pruning is overdue or content belongs in skills, docs, or per-subagent prompts instead. Structured with clear section headers the model can use as anchors.

Signal density + cross-refs · 4 pts
Every section earns its line count. Cross-references point to authoritative artefacts (sub-docs, skills, agents) rather than restating them inline. Auto-generated memory files score badly here — they pad every section with boilerplate the model has to skim past.

Freshness + drift · 2 pts
Memory updated within the last quarter. A stale CLAUDE.md tells the model things that are no longer true, which is worse than no memory at all. Two points: dated entries, recent edits, no obvious drift between memory and current repo state.

The leverage layer is where the audit usually finds the biggest point gap in absolute terms. A team scoring 14 out of 20 on the base layer is doing fine — there is room to improve but the tool is broadly functional. A team scoring 2 out of 20 on the leverage layer (the median for first-time audits in 2026) is leaving most of the value on the table. The remediation is straightforward — start with one skill per engineer per month, prune CLAUDE.md to roughly 300 lines, re-measure the quarter after.
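What a pruned memory file tends to look like in outline. The project details below are invented for illustration; the shape — short sections, runnable commands, pointers instead of inlined content — is the point:

```markdown
# CLAUDE.md — acme-app

## Stack
Next.js front end, Python services, pnpm workspaces.

## Commands
- `pnpm test` — unit tests; run before every commit
- `pnpm lint:fix` — autofix style; CI rejects unformatted code

## Conventions
- Server components by default; mark client components explicitly.
- API handlers live in src/app/api/ — conventions in docs/api-conventions.md, not restated here.

## Pointers (do not inline these)
- Release process: docs/releasing.md
- Review checklist: .claude/skills/code-review/SKILL.md

<!-- Reviewed quarterly; prune anything the repo no longer backs up. -->
```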
For the technique of building a single high-quality skill from scratch, our walkthrough on building a Claude skill from scratch covers the SKILL.md format, trigger phrasing, and the scoping discipline that separates a skill that gets invoked from one that sits idle in the directory.
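For orientation before that walkthrough, here is the shape of a single minimal skill, living at .claude/skills/commit/SKILL.md. The frontmatter fields (name, description) are the ones Claude Code reads; the body is an illustrative sketch, not a prescribed workflow:

```markdown
---
name: commit
description: Stage and commit the current changes with a conventional-commit message. Use when the user asks to commit their work.
---

1. Run `git status` and `git diff` to see what changed.
2. Draft a conventional-commit message (feat / fix / chore) scoped to one change.
3. Stage only the files that belong to that change — never `git add -A` blindly.
4. Commit, then show `git log -1` so the user can verify.
```

The description field doubles as the trigger: it tells the model when to invoke the skill, which is why scoping it to one job matters.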
05 — Security + Signals
Ten points on the guardrails and the productivity signals.
The fifth axis is the smallest in nominal point count but the most cross-cutting. Six points cover security policy — the disciplines that have to keep up with every other axis as it grows. Four cover the productivity signals — the metrics that tell you whether the other forty points are paying off.
Permissions discipline — deny patterns audited · 2 points
Permissions allowlist reviewed against the current tool surface. New tools (skills, MCP servers, hooks) audited as they are added. Forgotten allowlists drift into over-permission inside a quarter.

Secret + PII handling — no secrets in CLAUDE.md / skills / hooks · 2 points
Memory files, skill files, and hook scripts checked for accidentally committed secrets or PII. Cross-referenced with git history. The most common finding: API keys pasted into a skill for convenience.

MCP server review — registered MCPs vetted · 2 points
Every MCP server registered in the project verified — source, permissions, last-update date. Stale or unknown servers are a supply-chain risk. Two points for an active review process; zero for set-and-forget.

Engagement metrics + cadence — weekly subagent + skill counts · 4 points
Daily-active subagent count, weekly skill invocations, time-to-value on new hires, audit cadence in place. Four points for the panel; without metrics the previous forty-six points are unmeasured.

Security drift is the audit's most-cited finding in retros. Teams adopt new tool surfaces — a new MCP server, a new hook, a new skill that runs bash — faster than the security policy catches up. The drift is rarely catastrophic but accumulates: by the time it shows up in a real incident the remediation is much bigger than the quarterly review would have been.
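The MCP review is easiest when the registry is a single committed file. A sketch of a reviewable .mcp.json — the server shown is the reference GitHub MCP server, and the token arrives via environment-variable expansion rather than being pasted inline (inline tokens being exactly the secret-handling finding above):

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_TOKEN}"
      }
    }
  }
}
```

With the registry in git, the review process has something concrete to diff: who added a server, when, and with what permissions.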
The productivity signals close the loop. Without them the other forty-six points are advisory rather than measurable. With them — even simple weekly subagent counts, skill-invocation totals, time-to-first-meaningful-output on new hires — the engagement curve is visible and the audit's next-quarter roadmap is grounded in evidence rather than assertion.
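If the Stop / SubagentStop hooks from the base-layer section are appending their payloads to a JSONL log, the panel can start as a script. A sketch under those assumptions — the log path and aggregation are ours, not built-in Claude Code telemetry, and since the payload carries no timestamp the simplest cadence is to rotate the log file weekly:

```python
#!/usr/bin/env python3
"""Weekly engagement counts from the hook-driven audit log.

Assumes Stop/SubagentStop hooks append their stdin payload to
~/.claude/audit-log.jsonl and that each record carries
hook_event_name and session_id fields; the file and the aggregation
are this article's sketch, not built-in telemetry.
"""
import json
from collections import Counter
from pathlib import Path

LOG = Path.home() / ".claude" / "audit-log.jsonl"


def weekly_counts() -> Counter:
    counts: Counter = Counter()
    sessions = set()
    if not LOG.exists():
        return counts
    for line in LOG.read_text().splitlines():
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate a partially written last line
        counts[event.get("hook_event_name", "unknown")] += 1
        sid = event.get("session_id")
        if sid:
            sessions.add(sid)
    counts["distinct_sessions"] = len(sessions)
    return counts


if __name__ == "__main__":
    for name, n in sorted(weekly_counts().items()):
        print(f"{name:20} {n}")
```

SubagentStop counts approximate subagent runs; counting skill invocations would need an additional hook on the relevant tool events, which is the next increment once the panel exists.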
06 — Maturity
Four-stage maturity model — Ad-Hoc → Optimised.
The aggregate score is plotted against a four-stage maturity model. The stages are deliberately broad — buckets of roughly twelve points rather than fine-grained tiers — because the goal of the model is to set a next-quarter direction, not to fine-tune the current quarter. Most teams should expect to move one stage per quarter if the roadmap is taken seriously.
Stage 1 · Ad-Hoc
Licences in place, individual usage, no shared artefacts. CLAUDE.md is auto-generated. Zero skills, zero hooks, zero shared subagents. Adoption proven, engagement absent. The starting position for most first-audit clients.
Roadmap: ship the base layer

Stage 2 · Configured
Project + user settings.json in place, permissions allowlist, MCP registry. Maybe one or two hooks. Still no skills, CLAUDE.md still under-pruned. The team is taking the tool seriously but has not yet invested in leverage.
Roadmap: ship the leverage layer

Stage 3 · Leveraged
Skills library with six to ten entries, CLAUDE.md pruned and structured, hooks firing on lifecycle events, subagents in regular use. Security policy needs the most work. The most common audit-outcome stage.
Roadmap: ship the guardrails

Stage 4 · Optimised
All five axes scoring strongly. Productivity signals tracked weekly. Quarterly re-audit cadence in place. The team is shipping new skills as workflows emerge and pruning CLAUDE.md as roles evolve. Rare — under 10% of audited teams reach this stage in 2026.
Roadmap: maintain + optimise

One nuance worth stating. The stages are not strictly sequential — it is possible to land at Configured with a partial leverage layer or to skip from Ad-Hoc straight to Leveraged on the strength of a single motivated tech lead shipping skills before settings.json is even committed. The stage label is a summary, not a constraint. The roadmap a given team gets out of the audit is the concrete remediation list — the stage just sets the expectation for cadence and next-quarter ambition.
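The bucket arithmetic is the only mechanical part. Here is one consistent reading of "roughly twelve points per stage" — the exact boundaries are ours for illustration, not a standard, though they do agree with the worked example in the next section (34/50 landing in Leveraged):

```python
def maturity_stage(score: int) -> str:
    """Map a 0-50 aggregate audit score to a stage label.

    Boundaries are illustrative (~12-13 points per bucket); the article
    treats the stage as a summary, not a hard threshold.
    """
    if not 0 <= score <= 50:
        raise ValueError("score must be within 0-50")
    if score <= 12:
        return "Stage 1 · Ad-Hoc"
    if score <= 25:
        return "Stage 2 · Configured"
    if score <= 38:
        return "Stage 3 · Leveraged"
    return "Stage 4 · Optimised"


assert maturity_stage(34) == "Stage 3 · Leveraged"  # matches the worked example
```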
07 — Worked Example
A 30-engineer shop, audited.
One representative engagement, anonymised. A 30-engineer product team, Next.js + Python stack, Claude Code rolled out seven months before the audit. Self-reported adoption was strong — every engineer had used Claude Code in the last fortnight, the eng-leadership team had a positive narrative internally. The audit told a more textured story.
Adoption scorecard · 30-engineer shop · pre-remediation — anonymised client engagement, Q1 2026

The headline finding: the team was Configured on paper but barely past Ad-Hoc in practice. Strong licence-usage telemetry, a passable base layer, almost no leverage layer, and zero infrastructure for measuring whether things were improving. The single highest-ROI remediation was the leverage axis — at 5 out of 20, the gap was the biggest absolute deficit on the scorecard.
The 60-day roadmap that came out of the audit had three workstreams. One: prune CLAUDE.md from 1,100 lines to roughly 350, split the rest into linked sub-docs, set up a quarterly edit cadence. Two: ship six initial skills — commit, code-review, scaffold-new-route, write-tests, update-changelog, spawn-subagent-team. Three: stand up a productivity panel — weekly subagent count, weekly skill invocations, time-to-first-meaningful-output for new hires. The security and hooks gaps were scheduled for the following quarter.
Re-audit at day 90 produced a score of 34 out of 50 — Stage 3 Leveraged, up from Stage 2 Configured, with the leverage axis now the strongest rather than the weakest. The productivity panel showed weekly subagent invocations roughly tripling and skill invocations climbing from a near-zero baseline. The absolute numbers are less interesting than the shape — the team moved from spending time prompting Claude Code to spending time orchestrating it.
If you want the same calibration applied to your team, our AI transformation engagements ship Claude Code adoption audits with a 60-day uplift roadmap scoped to your stack, your team size, and the gaps the scorecard identifies on the first pass.
Adoption depth is the difference between Claude Code as a tool and Claude Code as a platform.
The fifty-point scorecard is deliberately mundane on each individual point — a settings.json check, a CLAUDE.md line count, a skill directory inventory. What it produces in aggregate is the one thing licence telemetry cannot: a credible picture of how the tool is actually being used, which axis is the biggest drag on engagement, and where the highest-ROI remediation lives.
The next-quarter milestones we expect to see across the client base are visible in the scorecard distribution. Most teams move from Configured to Leveraged inside one quarter once the skills library and the memory prune ship — those two interventions together are the most consistent high-ROI step. The harder transition is from Leveraged to Optimised, where the gating factor is usually security policy and signal infrastructure rather than tooling — both of which require a quarterly cadence rather than a one-off project.
The signal underneath all of this: adoption depth is a property of the artefacts a team produces, not of the seats it buys. Audit the artefacts, build the leverage layer, ship the guardrails, and the productivity gains follow. Skip the audit and the seats sit idle while the dashboard says everything is fine.