← Diligence index  ·  View raw .md

Title: Validation Report — Pipeline Version 0.1.0-substrate Version: 0.1.0 Status: Customer-facing — present to the lab's QMS reviewer ahead of LOI signing Owner: Quality Lead (Provider) Last Reviewed: 2026-05-07 Next Review: on next pipeline-version bump (per sops/CHANGE_CONTROL.md) Pipeline Version Validated: 0.1.0-substrate


Validation Report — Pipeline Version 0.1.0-substrate

This is the formal validation dossier the Lab's QMS reviewer signs against to authorize the free-pilot LOI (per ../customer/LOI_ONEPAGER.md).

It is the instantiated, version-locked counterpart to the internal REPORT_TEMPLATE.md. Every quality number it cites is mirrored verbatim from the canonical ../QUALITY_METRICS.md (per the citing rule in §0 of that document). Any drift between this report and the canonical source is a bug — flag and re-issue.


§0 — Intended-use boundary (read first)

The platform delivers secondary analysis only: FASTQ → BAM, VCF, gVCF, QC report, signed manifest, audit-log entries.

The Lab is the CLIA/CAP lab of record and performs all clinical interpretation, variant classification, report sign-out, and patient communication. The Provider does not diagnose, does not classify pathogenicity, does not sign out reports.

Full boundary statement: ../intended-use/INTENDED_USE.md.

Every quality number below is conditional on the platform staying inside this boundary.


§1 — Pipeline version under validation (lock manifest)

Field Value
Pipeline version 0.1.0-substrate
Pipeline source git SHA e4e97da (substrate baseline)
Parabricks image nvcr.io/nvidia/clara/clara-parabricks:4.7.0-1
Parabricks image digest {{TBD — NGC-authenticated pull required to pin; see PIPELINE_LOCK.md §2 runbook}}
DeepVariant model bundled with Parabricks 4.7.0-1 (pbrun deepvariant_germline)
HaplotypeCaller status EXCLUDED per ADR-0005 (Outcome 4b) — see HAPLOTYPECALLER_BENCHMARK_FIX.md
Reference FASTA GRCh38_no_alt_analysis_set
Reference FASTA SHA-256 9cce8b92...8702b7 (full digest in PIPELINE_LOCK §4)
Truth set v4.2.1 SHA-256 adb4d4a5...e81175c
Truth set v5.0q SHA-256 c7f9d7a4...f9c50dc8
Exclusion BED SHA-256 (uncompressed; post-MHC-lift per ADR-0006) 7dc4d16b1d0eb1d171713bc272c9a3f3b881dddb1f305faba02dac25a3932c1c
Exclusion BED file investigations/v5_0q_excluded_regions.bed.gz (gzipped, 30 MB)
Stratifications bundle SHA-256 c5a1eceac54aac2c438af21825223d2a71e64b3db6b1c9e923849babb38063d8

The full lock manifest including parameter values, container digests, and reference indexes lives at ../technical/PIPELINE_LOCK.md. Any field not pinned in that document is invalid for clinical pilot use.


§2 — Headline metrics (HG002 30× WGS)

These numbers are mirrored verbatim from ../QUALITY_METRICS.md. The canonical source updates first; this report follows under change control.

2.1 Against GIAB v4.2.1 truth (full benchmark BED)

Metric Observed Acceptance criterion Verdict
Aggregate F1 0.9954 ≥ 0.99 (Phase 1 pilot positioning) ✅ PASS
Total false negatives 30,084 (no acceptance threshold; informational)

2.2 Against GIAB v5.0q truth (raw — no exclusion)

These are the raw v5.0q numbers, retained for transparency. They are NOT clinical-quality claims on their own — see §2.3.

Metric Observed Note
SNP F1 0.9906 informational; cite ONLY paired with §2.3
Indel F1 0.9408 informational; cite ONLY paired with §2.3
Total false negatives 121,994 81.2% are in v5.0q-only truth-content territory v4.2.1 never asserted

2.3 Against GIAB v5.0q truth (in-scope complement of exclusion BED)

This is the headline clinical-quality posture. The exclusion BED is empirically derived from the per-stratum decomposition (alldifficultregions ∪ chrX/Y non-PAR/XTR/ampliconic; PAR remains in scope) and captures 97.7% of v5.0q false-negatives in regions where the caller architecture has known limits.

Metric Observed Acceptance criterion Verdict
In-scope SNP F1 (post-MHC-lift) 0.9993 ≥ 0.995 ✅ PASS
In-scope Indel F1 (post-MHC-lift) 0.9959 ≥ 0.99 ✅ PASS
Exclusion BED FN capture 119,184 of 121,994 (97.7%) ≥ 95% ✅ PASS
In-scope quality vs v4.2.1 baseline exceeds (0.9993 SNP / 0.9959 Indel vs 0.9954 aggregate; arithmetic estimate per ADR-0006; hap.py confirmation pending) ≥ baseline ✅ PASS

Per-stratum decomposition driving the exclusion BED design is documented at ../investigations/V5_0Q_GAP_ANALYSIS.md v0.3.0+ §5.10.

2.4 Headline acceptance — overall verdict

PASS with the §0 intended-use boundary in force.


§3 — Lab-side reproducibility (the Lab can run this independently)

The Lab confirms the Provider's claims with three offline commands. None require GPU compute, network access, or credentials.

3.1 Verify the example signed manifest

# After `pip install -e .` in the repo, or after extracting customer-bundle.tar.gz
genomics-verify \
  keys/sample-manifest.json \
  keys/sample-signature.json \
  --public-key keys/genomics-public.pem.example

Expected output (verbatim, exit code 0):

OK — signature valid for this manifest.
  algorithm:        ed25519
  public key id:    c45fed5f205aea057efa7314515ec3688109aa4f072aa71bd4a7fd4c48db102d
  signed at:        2026-05-07T12:00:00+00:00
  manifest sha256:  5c15b3d8007f27591de57411393b92d25a3cb2dfa6da63d79e24a887bd9550fd
  job_id: demo-job-0001
  sample_id: HG002-DEMO
  pipeline_version: 1.0.0
  outputs: 2 file(s)

What this proves: the JCS canonicalization, Ed25519 detached signature scheme, and the publicly-published verification key all work end-to-end on the Lab's hardware before any sample is shipped.

3.2 Confirm the example public-key fingerprint

sha256sum keys/genomics-public.pem.example

Expected output: c45fed5f205aea057efa7314515ec3688109aa4f072aa71bd4a7fd4c48db102d keys/genomics-public.pem.example

What this proves: the PEM file in the bundle hashes to the fingerprint documented in ../security/SIGNING_KEY_PUBLISHING.md §3.1. The Lab's pinned trust anchor is valid.

Production-pilot keys. This is the demo-key fingerprint. The production pilot key, when KmsEd25519Signer goes live, gets a new fingerprint pinned in the Lab's executed pilot agreement (Appendix A) and surfaced here in §3.2 of the next-version validation report.

3.3 Confirm the customer-bundle SHA-256

sha256sum customer-bundle.tar.gz

Expected: the value the Provider quoted in the discovery email that delivered this bundle. CI rebuilds the bundle twice on every push and rejects non-deterministic output (per ../../../.github/workflows/clinical-readiness-ci.yml "Customer-bundle determinism check").

What this proves: what the Lab reviewed is bit-identical to what the Provider built and audited internally; nothing was modified in transit.


§4 — Substrate hardening verified

The compute substrate has six hard CI gates; each is enforced on every push to the Provider's repository (workflow runs are public- auditable upon request).

Gate What it enforces Status
lint (ruff check + format-check) code style + dead-code elimination ✅ green
typecheck (mypy --strict) full static type coverage ✅ green
test (pytest --cov-fail-under=87) unit + integration coverage floor ✅ green @ 88%
security (bandit -ll) static security analysis ✅ green
openapi-check API contract drift detector ✅ green
demo-verify end-to-end CLI smoke test ✅ green
customer-bundle-determinism bundle reproducibility ✅ green

300 tests pass at 88% coverage on the validated commit.

Substrate decisions are documented as ADRs at ../decisions/ (5 ADRs covering frozen-dataclass pattern, JCS+detached signatures, hash-chained audit log, file-system audit-log, and HaplotypeCaller excision).


§5 — What's in scope vs out of scope (Phase 1)

In scope (validated):

Out of scope for Phase 1 (no quality claim made):

Hard rules (from ../intended-use/QUALITY_CLAIMS.md):


§6 — Open items (do not gate Phase 1 pilot)

Item Status Owner Resolution path
Parabricks image digest pin TBD Provider Eng NGC API key + docker buildx imagetools inspect; runbook in PIPELINE_LOCK.md §2
Brev BAA execution Pending Provider Compliance HARD GATE for any PHI run that uses Brev burst compute. GB10 / on-prem nodes are unaffected and remain in scope.
KmsEd25519Signer (Cloud KMS) Skeleton; not used for Phase 1 demo signatures Provider Eng ~0.5 day; lands when the first customer requires HSM-rooted signatures
40× / 50× v5.0q HG002 cells Pending GPU compute Provider Quality Non-gating; expected to halve in-scope residual at 50×

Phase 1 entry criteria are met. None of the open items above gate the LOI signing or the first-paid-pilot kickoff.


§7 — Sign-off

By signing below, the Lab's QMS reviewer attests they have:

By signing below, the Provider's Quality Lead attests:

Provider — Quality Lead Lab — QMS Reviewer
_______ _______
{{COMPANY_QUALITY_LEAD_NAME}}, Quality Lead {{LAB_QMS_REVIEWER_NAME}}, {{LAB_QMS_REVIEWER_TITLE}}
Date: ____ Date: ____

Comment block — deviations from §3 expected outputs (if any):

(populate at sign-off; "none" if no deviations)

§8 — References