CVN-N012-EA-S02 : Service catalog module on console-next (plan dossier)
Story : CVN-N012-EA-S02 (wp#96) — Service catalog module on console-next
Issue : (to file pre-merge — proposed slug feat: service catalog module on console-next)
Branch : feat/cvn-n012-ea-s02-catalog-module
Author : Dominique (operator) + Claude
Date : 2026-04-30
Status : Draft → committee plan_review pending → impl on PASS
Estimated effort : 5-7 days
1. Mission
Build the first IDP module on console-next : a /catalog route + /catalog/[name] detail page backed by catalog-info.yaml files in git. The module gives operators a single browsable inventory of CVN's ~17 services with owner, lifecycle, links, and dependencies — replacing the current tribal-knowledge-plus-scattered-Helm-values state.
This is the first concrete deliverable on the IDP umbrella (Epic CVN-N012-EA, reframed 2026-04-30 to IDP modules on console-next after the Backstage-vs-console-next decision in S01). Delivery order : catalog (S02), then Grafana embed (S03), then RBAC closure (S04).
2. Hypothesis
If we ship a YAML-driven catalog whose descriptor format mirrors Backstage's catalog-info.yaml, we get :
- A machine-readable service inventory the team can grep / diff / PR-review.
- A portability invariant (per ADR-78 I2 from the IDP choice dossier §9) — the YAML files round-trip through @backstage/catalog-model so the choice of console-next over Backstage stays reversible at the data layer. If 2 years from now we ever need to migrate to Backstage, the catalog content lifts as-is.
- A reusable pattern for the next IDP modules (Grafana embed S03 will reuse the YAML-registry-loaded-at-build-time approach).
The catalog is read-only from console-next — edits go through git PRs (which is itself the audit trail for ADR-78 I7). No SaaS dependency (ADR-78 I5).
3. Scope
In scope
- Schema : catalog-info.yaml structure compatible with the Backstage Component schema : apiVersion: backstage.io/v1alpha1, kind: Component, metadata.name, metadata.description, metadata.tags, metadata.links[], spec.type (service / library / website), spec.owner, spec.lifecycle (production / experimental / deprecated), spec.system, spec.dependsOn[].
- Storage layer : YAML files under documentation/catalog/<service-name>.yaml. One file per service. PR-reviewed like any other docs change.
- Parser (console-next/lib/catalog/) : TypeScript module that reads YAML files at build time (Next.js convention : server-only fs access during build). Exports typed Component[] and getComponent(name).
- Portability gate (ADR-78 I2) : CI step that runs each YAML file through the @backstage/catalog-model validator. Fail the build on schema drift. Concrete : a console-next/scripts/validate-catalog.ts script invoked by console-next/package.json::test:catalog-portability, wired into .github/workflows/console-next-ci.yml.
- Routes :
  - /catalog (list view) : grid of <ServiceCard> components, filter by owner + lifecycle + type, search by name, link to detail.
  - /catalog/[name] (detail view) : full metadata, links, dependencies (rendered as a list of links to other catalog entries ; the dependency-graph SVG is out of scope, deferred to a follow-up).
- Components (per ADR-66) :
  - <ServiceCard> (presentational, in components/catalog/) : 5 Storybook states required (default, with-tags, deprecated, missing-owner, mobile).
  - <CatalogFilters> (client component, 'use client' for URL-state sync). v1 uses native <select> + <input> styled with Tailwind ; migration to shadcn Select + Input primitives deferred to when a second consumer materialises (per dossier §3 trade-off).
  - <DependencyList> (presentational) : 1 Storybook state.
- Anchor entries : 7 catalog-info.yaml files committed for the most-used services : Airflow, MLflow, Grafana, console-next itself, PostgreSQL, Redis, S3. The original target was 5 anchors ; bumped to 7 during CR pass 1 to close the dependency graph (Airflow dependsOn Redis + MLflow dependsOn S3 ; both deps now resolve to a real /catalog/<name> page).
- Runbook : documentation/runbooks/catalog-add-service.md, a step-by-step "how to add a service" (the YAML pattern + the PR review gate). Linked from OPERATIONS.md.
- Tests : vitest unit tests for the parser (valid YAML, malformed YAML, missing required fields, unknown fields tolerated, schema validator integration). Component tests for <ServiceCard> rendering states.
- mkdocs SSoT (ADR-77) : new nav entry under documentation/runbooks/ for the catalog runbook ; link from documentation/missions/index.md (catalog as part of IDP mission).
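To make the schema bullet concrete, here is a minimal sketch of what the local type and a hand-rolled narrowing function could look like. CatalogComponent and parseComponent are illustrative names, not the final schema.ts API, and treating spec.owner as optional is an assumption made only so the missing-owner fallback can be exercised ; the Backstage validator remains the authority on what is required.

```typescript
// Sketch only. CatalogComponent and parseComponent are illustrative names,
// not the plan's final schema.ts API.
const LIFECYCLES = ["production", "experimental", "deprecated"] as const;
const TYPES = ["service", "library", "website"] as const;

interface CatalogComponent {
  apiVersion: "backstage.io/v1alpha1";
  kind: "Component";
  metadata: {
    name: string;
    description?: string;
    tags?: string[];
    links?: { url: string; title?: string }[];
  };
  spec: {
    type: (typeof TYPES)[number];
    lifecycle: (typeof LIFECYCLES)[number];
    owner?: string; // assumption: optional locally, to allow the missing-owner fallback
    system?: string;
    dependsOn?: string[];
  };
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function oneOf(value: unknown, allowed: readonly string[]): boolean {
  return typeof value === "string" && allowed.includes(value);
}

// Narrows an already-parsed YAML document; unknown extra fields are left untouched.
function parseComponent(raw: unknown): CatalogComponent {
  if (!isRecord(raw)) throw new Error("descriptor must be a mapping");
  if (raw["apiVersion"] !== "backstage.io/v1alpha1") throw new Error("unsupported apiVersion");
  if (raw["kind"] !== "Component") throw new Error("kind must be Component");
  const metadata = raw["metadata"];
  if (!isRecord(metadata) || typeof metadata["name"] !== "string" || metadata["name"] === "") {
    throw new Error("metadata.name is required");
  }
  const spec = raw["spec"];
  if (!isRecord(spec)) throw new Error("spec is required");
  if (!oneOf(spec["type"], TYPES)) throw new Error("spec.type is invalid");
  if (!oneOf(spec["lifecycle"], LIFECYCLES)) throw new Error("spec.lifecycle is invalid");
  return raw as unknown as CatalogComponent;
}
```

The function takes the output of a YAML parser rather than raw text, so it can be unit-tested without touching the file system.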
Out of scope
- Multi-user RBAC — operator-only for now ; deferred to S04 (CVN-N012-EA-S04, RBAC closure gate).
- Live service health — catalog is metadata only ; probes / uptime tracking is a separate concern (the existing Grafana stack handles it).
- Editing UI — read-only ; edits via git PRs (audit-trailed by design).
- Dependency graph SVG — list-of-links suffices for v1 ; graph deferred until > 30 services warrant it (currently ~17).
- TechDocs integration — ADR-77 makes mkdocs the SSoT ; the catalog links to mkdocs runbooks instead.
- API endpoints — pure build-time YAML, no runtime API. If we ever need dynamic catalog updates (e.g., from Helm charts), that's a separate Story.
- Audit log table — not in this Story (the audit invariant ADR-78 I7 is satisfied by git history for read-only catalog ; live audit table lands in S04 RBAC).
Explicitly withdrawn vs the original wp#96 framing
Original S03 listed "~17 CVN services + ownership groups". I'm relaxing the gate to 5 anchor services because :
- The remaining 12 can be added incrementally via PR after merge (the runbook makes this trivial).
- Forcing all 17 now turns the Story into a research sprint (cataloguing every microservice's owner / lifecycle / dependency graph) which is uncertain effort and orthogonal to the infrastructure of the catalog module.
- Operator can add the remaining services as ops work, not framework work.
4. Implementation plan
Phase 1 : schema + parser + tests (1.5 days)
- Add yaml@^2.8.3 to console-next/package.json (already in monorepo pnpm-lock as transitive ; add as direct dep).
- Add @backstage/catalog-model as a dev dep (only used by the portability validator script ; not shipped in the Next.js bundle).
- console-next/lib/catalog/schema.ts : TypeScript types matching the Backstage Component schema (subset we use). Single source of truth for the type system.
- console-next/lib/catalog/parser.ts : loadCatalog(): Component[] reads documentation/catalog/*.yaml at build time using fs.readdir + yaml.parse. Validates against the local TS schema (zod-style or hand-rolled ; TBD per committee). Emits a typed list.
- console-next/lib/catalog/parser.test.ts : vitest unit tests : valid file → parsed ; malformed → throws with line number ; missing required field → throws ; extra field → tolerated and logged.
Phase 2 : anchor data + portability validator (1 day)
- documentation/catalog/airflow.yaml : first anchor entry. Owner cvntrade-ops, lifecycle production, links to grafana / loki / kubernetes namespace, dependsOn [postgresql, redis].
- documentation/catalog/{mlflow,grafana,console-next,postgresql}.yaml : 4 more anchors in the initial plan ; redis.yaml and s3.yaml were added during CR pass 1 (see §3), for 7 in total. Each one PR-able as a separate diff but bundled here for the Story.
- console-next/scripts/validate-catalog.ts : Node script, invoked via pnpm test:catalog-portability. Imports @backstage/catalog-model, iterates documentation/catalog/*.yaml, validates each. Exits non-zero on any drift.
- CI wiring : add test:catalog-portability step to .github/workflows/console-next-ci.yml between typecheck and test.
Phase 3 : routes + components (2 days)
- console-next/components/catalog/ServiceCard.tsx : presentational, props : { component: Component }. Renders shadcn Card with name / description / owner badge / lifecycle badge / tags. 5 states for Storybook : default, with-tags, deprecated (lifecycle visual treatment), missing-owner (graceful fallback), mobile.
- console-next/components/catalog/CatalogFilters.tsx : 'use client'. URL-state-driven (search params for owner / lifecycle / type / query). v1 uses native <select> + <input> styled with Tailwind, as decided in §3 ; shadcn Select + Input deferred. 1 Storybook state.
- console-next/components/catalog/DependencyList.tsx : presentational. Renders dependsOn[] as a list of internal <Link> to /catalog/[name]. 1 Storybook state.
- console-next/app/catalog/page.tsx : server component, calls loadCatalog(), renders grid of <ServiceCard> filtered via <CatalogFilters> (client component reading search params).
- console-next/app/catalog/[name]/page.tsx : server component, calls getComponent(params.name), renders detail. 404 if missing (Next.js notFound()).
- Stories : console-next/stories/ServiceCard.stories.tsx, CatalogFilters.stories.tsx, DependencyList.stories.tsx. axe-core a11y green required.
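The URL-state filtering described for /catalog can be sketched as a pure function over the loaded entries. This is a hedged illustration : CatalogEntry, filterCatalog, the flattened entry shape and the q parameter name are all assumptions, and the SearchParams shape mirrors what an App Router page receives in Next.js 14 as far as this sketch assumes.

```typescript
// Sketch only: CatalogEntry, filterCatalog and the "q" parameter are hypothetical names.
interface CatalogEntry {
  name: string;
  owner?: string;
  lifecycle: string;
  type: string;
}

// Assumed shape of the searchParams object handed to a server component page.
type SearchParams = Record<string, string | string[] | undefined>;

// Takes the first value of a possibly repeated parameter; empty strings count as "no filter".
function first(value: string | string[] | undefined): string | undefined {
  const single = Array.isArray(value) ? value[0] : value;
  return single ? single : undefined;
}

function filterCatalog(entries: CatalogEntry[], params: SearchParams): CatalogEntry[] {
  const owner = first(params["owner"]);
  const lifecycle = first(params["lifecycle"]);
  const type = first(params["type"]);
  const query = first(params["q"])?.toLowerCase();
  return entries.filter(
    (entry) =>
      (!owner || entry.owner === owner) &&
      (!lifecycle || entry.lifecycle === lifecycle) &&
      (!type || entry.type === type) &&
      (!query || entry.name.toLowerCase().includes(query)),
  );
}
```

Keeping the filter pure means the server component can apply it before rendering, while the client component only has to write the search params.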
Phase 4 : runbook + nav + OPERATIONS update (0.5 day)
- documentation/runbooks/catalog-add-service.md : runbook with copy-paste YAML template + PR checklist.
- mkdocs.yml nav : register runbooks/catalog-add-service.md.
- documentation/OPERATIONS.md : new sub-section under §16 with link to catalog UI + runbook.
- documentation/missions/index.md : new IDP mission entry pointing to catalog as the first delivery.
Phase 5 : committee pr_review + merge (1 day)
- PR description references Story: CVN-N012-EA-S02, links the plan dossier, links the run.
- CodeRabbit pass(es). Wait full CR cycle per feedback_cr_rounds_before_merge.md.
- Expert Committee pr_review (mandatory per ADR-68 for substantial frontend changes touching the IDP umbrella).
- Squash merge ; OP wp#96 flipped Closed with merge SHA + acceptance criteria checklist.
5. Files to create / modify
Created
documentation/catalog/airflow.yaml
documentation/catalog/mlflow.yaml
documentation/catalog/grafana.yaml
documentation/catalog/console-next.yaml
documentation/catalog/postgresql.yaml
documentation/catalog/redis.yaml
documentation/catalog/s3.yaml
documentation/runbooks/catalog-add-service.md
documentation/reviews/2026-04-30-cvn-n012-ea-s02-catalog-module-plan.md (this file)
console-next/lib/catalog/schema.ts
console-next/lib/catalog/parser.ts
console-next/lib/catalog/parser.test.ts
console-next/scripts/validate-catalog.ts
console-next/components/catalog/ServiceCard.tsx
console-next/components/catalog/CatalogFilters.tsx
console-next/components/catalog/DependencyList.tsx
console-next/stories/ServiceCard.stories.tsx
console-next/stories/CatalogFilters.stories.tsx
console-next/stories/DependencyList.stories.tsx
console-next/app/catalog/page.tsx
console-next/app/catalog/[name]/page.tsx
console-next/app/catalog/loading.tsx
Modified
console-next/package.json # +yaml dep, +@backstage/catalog-model dev dep, +test:catalog-portability script
.github/workflows/console-next-ci.yml # +catalog-portability step
documentation/OPERATIONS.md # +catalog access section
documentation/missions/index.md # +IDP mission entry
mkdocs.yml # +runbook nav entry
6. Test plan
Unit tests
- console-next/lib/catalog/parser.test.ts :
  - parses valid YAML → typed Component
  - throws on malformed YAML with line number context
  - throws on missing required field (metadata.name)
  - tolerates extra fields with console.warn
  - integration : loadCatalog() reads the 7 anchor files end-to-end
- console-next/components/catalog/ServiceCard.test.tsx :
  - renders name / description / owner
  - renders deprecated badge when spec.lifecycle === 'deprecated'
  - graceful fallback when spec.owner missing
Integration tests
- console-next/scripts/validate-catalog.ts runs against the 7 anchor files in CI and passes (validates against @backstage/catalog-model).
Storybook + a11y
- 5 ServiceCard states + 1 CatalogFilters state + 1 DependencyList state, all axe-core color-contrast green per ADR-66.
Smoke (manual)
- pnpm dev ; navigate /catalog → see 7 services ; filter by owner=cvntrade-ops → see Airflow + Grafana ; click Airflow → detail page renders ; click dependsOn link → navigates to PostgreSQL detail.
7. Risks & mitigations
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| @backstage/catalog-model is heavy / pulls Backstage's whole runtime | medium | medium | dev dep only, run only in CI script ; if heavyweight even there, replace with a hand-rolled JSON Schema validator generated from Backstage's published schema |
| YAML schema drift between our subset and Backstage's evolving format | low | low | the portability test catches drift ; if Backstage breaks the schema in a future major, we pin the version of @backstage/catalog-model and revisit |
| 7 anchor entries are too few : review feedback says "not enough to validate the pattern" | medium | low | the runbook makes adding more trivial ; add via follow-up PRs post-merge |
| Build-time YAML loading breaks Next.js dev hot-reload | low | medium | Next.js 14 supports fs reads in server components ; verify in Phase 1 with the parser test ; if dev-mode is broken, fall back to a useState reload-on-change pattern in dev only |
| dependency-graph-as-list looks ugly for services with many deps | low | low | accepted as v1 ; SVG graph deferred ; max ~5 deps per service today |
| Operator regrets the read-only-via-git constraint | low | medium | the constraint is the audit trail (ADR-78 I7) ; revisiting requires ADR + new Story, friction by design |
8. ADR & invariant compliance
| ADR / Invariant | Compliance |
|---|---|
| ADR-66 — UI Stack | Storybook stories with required states ; axe-core a11y green ; DTCG tokens (no inline CSS) ; shadcn primitives used ; CVA variants only |
| ADR-77 — MkDocs SSoT | runbook in documentation/runbooks/ registered in mkdocs nav ; strict-mode build green ; no doc duplication |
| ADR-78 (stub) I1 — Next.js routes inside console-next | /catalog is a route in console-next/app/, not a standalone service |
| ADR-78 I2 — catalog-info.yaml portability | CI portability test using @backstage/catalog-model ; round-trip validation mandatory |
| ADR-78 I3 — TechDocs honored by mkdocs | catalog links point to mkdocs runbooks at docs.cvntrade.eu, not a separate TechDocs site |
| ADR-78 I4 — every IDP module respects ADR-66 | yes, see above |
| ADR-78 I5 — no SaaS dependencies | YAML in git, no external service |
| ADR-78 I6 — IDP kill-switch | catalog is read-only ; trivially killed by removing the route file or scaling console-next to 0 (operator-controlled) ; documented in runbook |
| ADR-78 I7 — Immutable audit trail | git history is the audit trail for catalog edits (read-only catalog ; no runtime mutations) |
| ADR-78 I8 — Blast radius | catalog module imports nothing from app/config/* (the existing Console module's surface) ; isolated by file structure |
9. Out-of-band considerations
- ADR-78 itself : the dossier §13 acceptance checklist marks ADR-78 as a follow-up PR (separate from S01 dossier merge). This Story relies on the stub form of ADR-78 ; the formal ADR-78 lands as part of S04 (RBAC closure gate) per the reframed Epic plan.
- @backstage/catalog-model license : Apache-2.0, compatible with our stack. No lock-in : we pin a version, and if Backstage ever changes direction, we own a YAML format spec that any tool can read.
- Schema versioning : we adopt apiVersion: backstage.io/v1alpha1 (Backstage's current). When it stabilizes to v1, we add a migration script in a follow-up Story.
10. Acceptance gate (mirror of OP wp#96 §Acceptance)
- console-next/lib/catalog/ exists with typed parser + vitest tests green
- /catalog route renders the 7 anchor services with filter (owner / lifecycle / type) + search
- /catalog/[name] detail page renders full metadata + dependency list
- documentation/catalog/ has 7 anchor catalog-info.yaml files (airflow, mlflow, grafana, console-next, postgresql, redis, s3 ; closes the dependency graph)
- CI job test:catalog-portability fails on schema drift (verified by intentionally breaking a YAML in a draft commit, observing failure, fixing)
- <ServiceCard> Storybook : 5 required states + axe-core green + DTCG tokens (no inline CSS)
- <CatalogFilters> + <DependencyList> stories green
- mkdocs runbook documentation/runbooks/catalog-add-service.md (build strict green)
- OPERATIONS.md updated with catalog access link
- documentation/missions/index.md entry for the IDP mission with catalog link
- Plan dossier (this file) committee plan_review PASSED
- PR review : CodeRabbit full cycle + Expert Committee pr_review PASSED
- On merge : OP wp#96 status flipped Closed with merge SHA + acceptance summary
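The "closes the dependency graph" criterion can be checked mechanically. The sketch below is an assumption about how such a check might look, not part of the planned scripts : CatalogNode, refName and findDanglingDependencies are hypothetical names, and references are compared by bare name after stripping an optional kind:namespace/ prefix.

```typescript
// Sketch only: all names here are hypothetical helpers, not existing project code.
interface CatalogNode {
  name: string;
  dependsOn?: string[];
}

interface Dangling {
  from: string;
  missing: string;
}

// Reduces "component:default/redis" or "resource:redis" to "redis"; bare names pass through.
function refName(ref: string): string {
  const afterSlash = ref.split("/").pop() ?? ref;
  return afterSlash.split(":").pop() ?? afterSlash;
}

// Lists every dependsOn target that has no catalog entry of its own.
function findDanglingDependencies(entries: CatalogNode[]): Dangling[] {
  const known = new Set(entries.map((entry) => entry.name));
  const dangling: Dangling[] = [];
  for (const entry of entries) {
    for (const ref of entry.dependsOn ?? []) {
      const target = refName(ref);
      if (!known.has(target)) dangling.push({ from: entry.name, missing: target });
    }
  }
  return dangling;
}
```

Run against the original 5 anchors, a check of this kind would flag the Redis and S3 references that motivated the bump to 7.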
11. Open questions for committee plan_review
- Q1. Validator choice : @backstage/catalog-model direct dev dep vs. a hand-rolled JSON Schema validator generated from Backstage's published schema. Trade-off : ecosystem alignment vs. dep-tree weight. Operator preference ?
- Q2. Schema strictness : strict (reject unknown fields) vs. lenient (console.warn on unknown fields). I picked lenient because Backstage extensions (Spotify-style annotations) are common and we shouldn't break catalogs that adopt them. Committee endorse ?
- Q3. Anchor count : 5 services for v1 vs. all 17 ? I argued 5 in §3 ; want explicit committee endorsement of the relaxation.
- Q4. Client/server split : /catalog page is server component ; <CatalogFilters> is 'use client' reading search params. Alternative : everything client-side with Next 14's 'use client' boundary at the page level. Trade-off : SSR-rendered SEO + first-paint vs. simpler mental model. I picked the split. Endorse ?
- Q5. Dependency rendering : list-of-links for v1, SVG graph deferred. If a reviewer feels strongly that "no graph = no value" speak now ; otherwise defer.
- Q6. Owner field : free-form string vs. enum constrained to known OIDC groups (cvntrade-ops, cvntrade-ml, etc.). Free-form is simpler ; enum catches typos at build time but adds maintenance. I picked free-form for v1 ; committee preference ?
- Q7. mkdocs nav placement : new top-level Catalog section vs. tucked under Runbooks ? I picked under Runbooks (it's a single "how to add a service" page) ; alternative is a new top-level entry once we have multiple catalog-related docs.
13. Committee plan_review triage (session 4bdcdd34, 2026-04-30)
Verdict : PASSED / EXECUTION_RISK — strong consensus across 5 experts (architect 8.5, ops 8.0, ml-engineer 8.0, data-scientist 7.5, crypto-trader 8.5 — avg 8.1/10). 0 blockers. 11 recommendations.
Reason cited : "The plan is architecturally sound, well-scoped, and compliant with ADRs, but carries execution risks related to data quality, operator adoption, and the lack of explicit operational success metrics for the catalog's mission."
Open questions resolution :
- Q1 (validator choice → @backstage/catalog-model dev-dep) : endorsed unanimously ; pin version, consider fallback for schema version mismatches.
- Q2 (lenient schema strictness, console.warn on unknown) : endorsed unanimously.
- Q3 (5 anchor services) : endorsed unanimously.
- Q4 (client/server split) : endorsed unanimously.
- Q5 (dep rendering as list, SVG deferred) : endorsed unanimously.
- Q6 (owner field free-form vs enum) : dissent — 2 experts endorse free-form, 3 want validation. Resolved by combining : free-form in v1 + CI lint warning against a curated list of known OIDC groups (cvntrade-ops, cvntrade-ml, cvntrade-viewer, cvntrade-architect). Hard enum deferred to S04 RBAC closure where the OIDC group catalog is finalized.
- Q7 (mkdocs nav under Runbooks) : endorsed unanimously.
13.1 Recommendations integrated pre-impl (locked into the plan)
| Reco | Source | Integration |
|---|---|---|
| #1 — Define operational success metrics (KPIs) | expert-ops + expert-data-scientist | New §13.2 below ; KPIs measured manually post-merge for first 90 days, automated dashboard deferred to a follow-up Story |
| #3 — Owner field CI lint against curated list | expert-ops + expert-ml-engineer + expert-data-scientist | console-next/scripts/validate-catalog.ts already exists per §4 phase 2 ; add owner-allowlist warning : if spec.owner ∉ {curated list} → emit warning (not error). Curated list lives in console-next/lib/catalog/owners.ts |
| #8 — Failure isolation for YAML parser | expert-architect | loadCatalog() wraps each file parse in a try/catch ; malformed file → log + skip + report in CI summary (not crash the whole build). Documented in runbook §"What happens if my YAML breaks" |
| #10 — Rollback playbook in OPERATIONS.md | expert-ops | Per ADR-68 substantial-FE-change requirement. Adds 1-paragraph rollback section to documentation/runbooks/catalog-add-service.md AND a 3-line entry in OPERATIONS.md §16 (catalog access section) |
| #11 — CI check for YAML diff alerts | expert-ops | Already covered by the portability test in §4 phase 2 ; strengthen : when CI detects modifications to documentation/catalog/*.yaml, post a short summary comment on the PR (filename + owner change + lifecycle change) for review attention. Reinforces ADR-78 I7 audit trail |
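Recommendations #3 and #8 combine naturally in one loader pass. The following is a minimal sketch under stated assumptions : loadIsolated and LoadReport are hypothetical names, the file contents are passed in as a map instead of being read from disk, and the parse function is injected (the plan would pass yaml.parse plus schema validation ; the usage note uses JSON.parse purely as a stand-in).

```typescript
// Sketch only: loadIsolated and LoadReport are hypothetical, not the planned parser.ts API.
// The curated owner list is the one named in the Q6 resolution.
const KNOWN_OWNERS: readonly string[] = [
  "cvntrade-ops",
  "cvntrade-ml",
  "cvntrade-viewer",
  "cvntrade-architect",
];

interface LoadReport<T> {
  loaded: T[];
  skipped: { file: string; reason: string }[];
  warnings: string[];
}

function loadIsolated<T extends { owner?: string }>(
  files: Record<string, string>,
  parse: (text: string) => T,
): LoadReport<T> {
  const report: LoadReport<T> = { loaded: [], skipped: [], warnings: [] };
  for (const [file, text] of Object.entries(files)) {
    try {
      const entry = parse(text);
      // Reco #3: an unknown owner is a warning, never an error.
      if (entry.owner !== undefined && !KNOWN_OWNERS.includes(entry.owner)) {
        report.warnings.push(`${file}: owner "${entry.owner}" is not in the curated list`);
      }
      report.loaded.push(entry);
    } catch (err) {
      // Reco #8: one malformed file is skipped and reported; the rest still load.
      const reason = err instanceof Error ? err.message : String(err);
      report.skipped.push({ file, reason });
    }
  }
  return report;
}
```

For example, `loadIsolated(files, (text) => JSON.parse(text) as { owner?: string })` returns a report whose skipped list can feed the CI summary while the loaded entries still reach the build.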
13.2 KPIs (mission success criteria, measured post-merge)
The catalog module replaces "tribal knowledge + scattered Helm values" with "single browsable inventory". Falsifiability test for that mission :
| KPI | Target (90 days post-merge) | Measure |
|---|---|---|
| Catalog completeness | ≥ 12 services in documentation/catalog/ (currently committing 7 anchors ; expected +5 via ops PRs) | find documentation/catalog -maxdepth 1 \( -name '*.yaml' -o -name '*.yml' \) \| wc -l |
| Catalog freshness | 0 files with last commit > 90 days ago (matches FRESHNESS_THRESHOLD_DAYS in scripts/validate-catalog.ts) | per file git log -1 --format=%ct -- documentation/catalog/<file> (covers both .yaml and .yml) |
| Operator adoption | ≥ 5 distinct PR authors touched a catalog file | git log --pretty=%ae -- 'documentation/catalog/*.yaml' 'documentation/catalog/*.yml' \| sort -u \| wc -l |
| Mission validation | post-90-day operator survey : "Did you use /catalog at least once this week ?" ; yes from ≥ 3 of the team | manual check, doc'd in OPERATIONS §16 |
| Negative falsification | if all 4 KPIs miss → catalog mission failed → revisit (either re-launch comms, simplify entry barrier, or sunset the module and write a post-mortem) | quarterly review |
These are not Story acceptance gates (the Story closes on the §10 acceptance list). They are mission gates evaluated 90 days post-merge to validate the IDP umbrella's first delivery actually works.
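The freshness KPI reduces to a comparison between each file's last commit timestamp (the epoch seconds printed by git log -1 --format=%ct) and a cutoff. A hedged sketch of that comparison follows ; staleFiles is a hypothetical helper, and only the FRESHNESS_THRESHOLD_DAYS name comes from the table above.

```typescript
// Sketch only: staleFiles is a hypothetical helper; timestamps are epoch seconds,
// i.e. the unit printed by `git log -1 --format=%ct`.
const FRESHNESS_THRESHOLD_DAYS = 90;
const SECONDS_PER_DAY = 86_400;

function staleFiles(
  lastCommitEpochSec: Record<string, number>,
  nowEpochSec: number,
  thresholdDays: number = FRESHNESS_THRESHOLD_DAYS,
): string[] {
  const cutoff = nowEpochSec - thresholdDays * SECONDS_PER_DAY;
  return Object.entries(lastCommitEpochSec)
    .filter(([, committedAt]) => committedAt < cutoff) // strictly older than the threshold
    .map(([file]) => file)
    .sort();
}
```

Passing the current time in as a parameter keeps the check deterministic in tests ; a file committed exactly 90 days ago is not counted as stale, matching the "> 90 days" wording.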
13.3 Recommendations applied at impl time (in code)
| Reco | When |
|---|---|
| #2 — Data freshness CI check | Phase 2 (validate-catalog.ts) — adds a stale-warning step (warning only, not failure) |
| #5 — Runtime observability | Phase 3 — basic Next.js route timing logs (no full Prometheus integration ; that's S04) |
| #9 — Monitor @backstage/catalog-model dep size | Phase 2 — CI step measures pnpm why @backstage/catalog-model \| wc -l baseline, alerts if > 2× growth |
13.4 Recommendations deferred (out of scope for S02)
- Reco #4 (Adoption strategy beyond runbook) — proactive comms / training is operator-led work, not impl. Tracked as a checkbox in the wp#96 closure comment.
- Reco #6 (Proactive Backstage schema monitoring) — process not code. Documented in the runbook §"Pin & monitor" with a quarterly check. Real automation deferred to S04.
- Reco #7 (Custom schema extension ADR) — when CVN actually needs an extension (not now). Filed as a follow-up note, no Story until the need surfaces.
13.5 Falsifiability gap closed
Pre-committee, the dossier had no explicit "how do we know the catalog mission succeeded" beyond the technical acceptance gates. §13.2 closes that gap with 4 measurable KPIs + a negative-falsification clause.
14. Linked context
- IDP choice dossier : 2026-04-29-idp-choice-plan.md §3.1 + §8 + §9 (especially I2 portability)
- Need CVN-N012 : wp#75 (deferral comment 203 explains §7 consolidation deferral)
- Epic CVN-N012-EA : wp#77 (reframed 2026-04-30 to IDP modules on console-next)
- Story CVN-N012-EA-S02 : wp#96
- ADR-66 : UI Stack invariants
- ADR-77 : MkDocs SSoT
- ADR-78 (stub) : IDP framework choice + invariants I1-I8 (formal documentation/adr/0078-...md lands as a follow-up PR per CVN-N012-EA-S04)
- Existing console-next CI : .github/workflows/console-next-ci.yml