Skip to content

Claude Code Skill — Authoring Standard (CVNTrade)

Story : CVN-N001… → CVN-N014-EC-S00 (wp#216, GH #1038) — foundation for the skills Epic CVN-N014-EC. Validated against : the official Claude Code Skills docs at https://code.claude.com/docs/en/skills.md, checked 2026-05-22, re-confirmed 2026-05-23 UTC (Claude Code v2.1.147)name optional (basename fallback), 1,536-char description+when_to_use cap, ~500-line body guideline, naming-style unspecified by docs, minimal bundled-file security guidance, no multi-model-testing guidance. Re-validate (and bump this line) when the Skills format evolves — do not author from memory.

This is the canonical standard every SKILL in this repo MUST follow. SCRIPT-type deliverables (S01/S02/S07) follow the normal scripts/ + PR conventions instead — see §1.


1. SCRIPT vs SKILL — pick the right tool first

Choose When Lives in Tested how
SCRIPT Deterministic, parameterised, no judgment, also useful to the operator directly scripts/*.py (+ Make target) unit tests + standard PR/CI
SKILL Process + agent judgment the model executes repeatedly .claude/skills/<name>/SKILL.md dogfood on a real action before close
Hook Automated on a tool/lifecycle event, no user input .claude/settings.json
Subagent Delegated task needing isolated context + specialist tools .claude/agents/<name>/agent.md
Built-in slash-command Fixed, non-prompt behaviour (/help, /run) n/a

If it's a SCRIPT, stop here and follow the repo's normal code conventions. The rest of this doc is the SKILL standard.


2. Location & naming

  • Project skill (codebase-specific): .claude/skills/<name>/SKILL.md — the default for this repo.
  • Personal skill (cross-project): ~/.claude/skills/<name>/SKILL.md.
  • One skill = one directory with a required SKILL.md entrypoint. Override priority: enterprise > personal > project > plugin.
  • name (directory basename, or the name: field if set): the docs constrain only lowercase letters / numbers / hyphens, ≤ 64 chars — they do not prescribe a style (verified 2026-05-23). Repo convention: use domain-action and make it discoverable (pr-review, loki-query — not helper).
  • Creating a brand-new .claude/skills/ dir that didn't exist at session start requires a restart for the watcher to see it; edits to existing skills are picked up live.

3. SKILL.md frontmatter (current fields)

description is the only field you must fill. All others optional. name is optional in Claude Code — it defaults to the directory basename (verified 2026-05-23; note this differs from the broader Agent-Skills/API format, which lists name as required, so don't "fix" it). Hard constraints below are verbatim from the docs.

Field Notes / constraint
name Optional — ≤ 64 chars, [a-z0-9-]. Defaults to dir basename.
description What it does + when to use it; key use case FIRST. Drives auto-invocation. Combined with when_to_use, hard cap 1,536 chars.
when_to_use Extra trigger phrases / synonyms (counts toward the 1,536 cap).
argument-hint Autocomplete hint, e.g. "[wp-number] [to-status]".
arguments Named positional args → $name substitution (e.g. arguments: [wp, status]).
disable-model-invocation trueuser-only (use for side-effect workflows: deploy, commit, merge). Description NOT loaded into context.
user-invocable falseClaude-only (background knowledge). Description IS loaded.
allowed-tools GRANTS tools without prompting — see the warning below. e.g. Read Grep Bash(git add *).
model / effort Per-skill override (inherit to keep session); effort: low\|medium\|high\|xhigh\|max.
context: fork + agent Run the skill in an isolated subagent (Explore/Plan/general-purpose/custom).
paths Glob(s) — auto-invoke only when working files match, e.g. "src/**,dags/**".
shell bash (default) or powershell.
hooks Skill-scoped lifecycle hooks.

⚠️ allowed-tools GRANTS, it does NOT restrict. Listing a tool pre-approves it (no prompt) while the skill is active — it never limits what the skill can call. To restrict, use deny rules + workspace trust (.claude/settings.json); a deny rule always wins. Apply least privilege: grant only the exact patterns needed (Bash(git add *), not Bash). This is a classic source of false confidence.

Who sees the description? (easy to invert) - model-invocable (default) → description loaded → drives auto-invocation. - disable-model-invocation: true (user-only) → description NOT in context → it only feeds the / menu/autocomplete, not auto-trigger. - user-invocable: false (Claude-only) → description IS loaded (background knowledge). Don't combine paths (path-based auto-invoke) with disable-model-invocation: true (user-only) — they contradict.

Subagent mechanisms relate: the Subagent row in §1 (.claude/agents/<name>) is a standalone delegate; context: fork + agent here runs this skill inside such a subagent. Same machinery, two entry points.

Substitutions usable in the body: $ARGUMENTS, $ARGUMENTS[N]/$N, $name, ${CLAUDE_SESSION_ID}, ${CLAUDE_EFFORT}, ${CLAUDE_SKILL_DIR} (use for portable script paths: !`python ${CLAUDE_SKILL_DIR}/scripts/x.py`). Live shell injection: !`git diff HEAD`.

4. Content rules (the part that actually matters)

  1. description is for discovery. Specific + trigger phrases ("Run a CodeRabbit review round to completion. Use when a PR has a CodeRabbit review to address."), not "CR utility". Key use case first.
  2. Keep SKILL.md < ~500 lines. Invoked content stays in context all session — every line is recurring token cost. State what to do, not why/how (same conciseness test as CLAUDE.md).
  3. Progressive disclosure: move bulk reference/examples/scripts to supporting files in the skill dir and link them ([reference.md](reference.md)); Claude loads them on demand.
  4. Inject live data, don't hard-code it (!`gh pr view $1 --json …`). This couples the skill to external commands (gh, git, kubectl) at invocation — write the skill body to degrade gracefully if the injected command fails (empty/error output), not to assume success.
  5. Scope side effects: disable-model-invocation: true for deploy/merge/commit-type skills so only the operator triggers them.
  6. Bake in the gotcha. Every CVNTrade skill MUST encode the specific failure it prevents (the reason it exists) — e.g. loki-query → the OTel-drop stdout-stream fallback; cr-round → CI-exact isort --profile=black.

5. Cross-cutting norms (acceptance for every skill story)

  • Dogfood before close: the skill must be run on a real action and shown to work before its Story moves to Tested/Closed.
  • Multi-model dogfood (repo norm — NOT in the Claude Code docs as of 2026-05-23; we adopt it because skills carry a model/effort field): if a skill targets more than one model/effort, the dogfood MUST cover the weakest one (behaviour that works under Opus/high may need more explicit instructions under a smaller model/lower effort).
  • Bundled-file security (repo policy — the docs say only "review project skills before trusting, since a skill can grant itself broad tool access"; for a trading repo we go further): any bundled scripts//assets MUST be audited before merge, carry no hard-coded secrets (creds via .env / K8s secrets only), and the skill applies least-privilege allowed-tools.
  • Usage note in documentation/ (or the skill's own reference.md) and a one-line pointer where teammates will find it.
  • Provenance: skills referencing Claude Code features must cite the docs version they were validated against (mirror the header of this file).
  • No formal version: field exists yet — record version/author/status in a top comment if needed.

5b. Anti-patterns (don't merge a skill that does these)

  • Catch-all skills (helper, utils, tool-1) — one skill = one focused job.
  • Vague description that won't auto-invoke ("Git stuff") — fails the discoverability test.
  • Critical logic left to the prompt when it should be a deterministic SCRIPT (re-read §1).
  • Hard-coded secrets or broad allowed-tools: Bash "to be safe".
  • Duplicating an existing script/skill instead of calling it.
  • A critical policy enforced only by skill text (e.g. "never auto-merge") with no external gate (see §7).

5c. Lifecycle & deprecation

  • Re-validation: this standard pins a Claude Code docs version (header). Re-validate skills + this standard when Claude Code minor-versions bump or a referenced field changes; owner = the Epic owner (CVN-N014-EC). The provenance line in each skill makes staleness visible.
  • Deprecation: a superseded/obsolete skill is removed (or marked status: deprecated in its top comment + delisted from the skills index), not left to rot — a stale skill that silently misfires is worse than none.

6. Template

Copy documentation/templates/TEMPLATE_skill.md into .claude/skills/<name>/SKILL.md and fill it in.

7. Governance & execution-risk controls (committee d4717e99, PASSED / EXECUTION_RISK)

The plan_review of this standard PASSED (strong consensus) but flagged execution risks for an AI-agent-operated workflow. These are MANDATORY for the Epic:

  • Review gate (committee 51a5f799): side-effect skills (anything that commits/pushes/merges/deploys, transitions OpenProject, or posts externally) require a committee pr_review before merge; read-only / passive skills are exempt. The S00 standard itself was committee-reviewed; conformant read-only skills inherit that approval, but anything taking an action does not.
  • Critical policies are enforced EXTERNALLY, never by skill text alone. A skill instruction ("never auto-merge") is a soft guardrail, not enforcement. Hard rules (no direct push to main, no auto-merge, no prod config in git) MUST be backed by GitHub branch protection + CI gates; side-effect skills (pr-merge, op-story-transition, story-close, any deploy) MUST additionally set disable-model-invocation: true (user-only) and explicitly prompt for operator approval before the side effect.
  • Tests bar (refines §1/§5). SCRIPT deliverables MUST ship unit tests for their logic. SKILL deliverables that wrap a script test the script; the dogfood run is the skill-level acceptance test, not a substitute for unit tests of any logic they call.
  • Versioning + registry. Each skill carries a top-of-file <!-- version / last-validated / owner --> comment (no formal field exists yet). Maintain a skills index (one line per skill: name, type, gotcha, owner, last-validated) so the set is discoverable, auditable, and rollback-traceable.
  • Auditability + runtime observability. Side-effect skills/scripts MUST emit structured logs of what they did (action, args, target, outcome) per ADR-30/31/32, so an agent-run action is reconstructable. Full telemetry contract in §8.
  • Error handling = ADR-25. Skills/scripts MUST define their failure modes and never silently fall back — surface the failure (and, for diagnostics, the no-crash + no-silent-fail rule, memory feedback_no_python_crash_visible).
  • Secrets. Skills/scripts access credentials only via the established path (.env locally, Kubernetes secrets in-cluster) — never inline.
  • Drift management. This standard pins a Claude Code docs version (header). Re-validate on a defined cadence / when Claude Code minor-versions bump; owner = the Epic owner. A CI reminder when the pinned version ages is a follow-up.
  • Routed to the side-effect stories (not this standard): kill-switch + staged/canary rollout for high-blast-radius skills (pr-merge, deploy) — captured as acceptance notes on S08 and any future deploy skill.

8. Observability & performance metrics (Loki → Grafana)

Skill/script runs are agent-initiated actions — they MUST be observable on the same platform path as everything else (ADR-26 Grafana single entry point · ADR-30/31/32 structured logs · ADR-33 closed event catalogue). Don't invent a parallel telemetry channel.

Two runtimes, two sinks: - In-pod skills/scripts (run inside K8s / the platform) emit via the project's log_event → OTel → Loki, exactly like the rest of the platform. - Locally-run skills (on the dev box, e.g. cr-round, pr-merge) have no automatic OTel→Loki path. They MUST use a small bundled emitter (canonical reference: .claude/skills/cr-round/scripts/emit_skill_event.py) that ALWAYS appends a JSONL record to a local audit log (~/.claude/skill-audit/…, override CVN_SKILL_AUDIT_DIR) and best-effort pushes to Loki when CVN_LOKI_PUSH_URL is set — so local runs still surface in Grafana. The emitter is no-crash (a dead push URL / unwritable dir never breaks the skill) and redacts secret-bearing field names.

What to emit — every SKILL/SCRIPT that takes an action emits a start + terminal event (via log_event in-pod, or the bundled emitter locally), structured event=key=value:

Event When Fields (golden)
skill_invoked at entry skill=<name> type=skill\|script args=<sanitised> session=${CLAUDE_SESSION_ID} invoked_by=model\|user
skill_completed at terminal (always) skill=<name> outcome=success\|error\|noop severity=info\|warn\|error duration_s=<float> + skill-specific counts (e.g. pr=1234 findings=5 wp=216)
skill_operator_approval side-effect skills, before the effect skill=<name> action=<merge\|transition\|deploy> target=<…> approved_by=operator outcome=approved\|aborted — the audit trail for the never-auto-merge / external-enforcement rule (§7).

Rules: - Performance metrics = at minimum duration_s + outcome on skill_completed; add the counts that make a run interpretable (rows, PRs, findings, rounds). - severity follows the no-crash / no-silent-fail discipline (memory feedback_no_python_crash_visible): a handled failure is severity=error logged, never a raised crash and never a silent pass. - Sanitise args — never log secrets/tokens (§7 secrets). - Closed catalogue (ADR-33): the skill_* event names above are the canonical set; add new ones to the catalogue, don't free-form.

Retrieval (know the gotcha): query via the loki-query skill/script (S02). The log_event/OTel path drops events (Failed to export … otel-collector), so for run outcomes ALSO line-filter the Airflow/stdout stream {namespace="cvntrade"} |~ "event=skill_completed" — the OTel/event-label stream alone is incomplete (this is the lesson that recovered the S23 group verdict). See memory reference_loki_query.

Surfacing: side-effect + critical skills get a small Grafana view (invocation count, success/error rate, p50/p95 duration_s) so anomalies (a skill suddenly erroring, or running 10× longer) are visible. Dashboard wiring + a CI-generated skills index are Epic follow-ups (committee 55fafa7a), not blockers for an individual skill.


Source of truth: the official Claude Code Skills documentation. This standard is a distillation for CVNTrade conventions; when the two disagree, the official docs win — re-validate and update the header date.