Title: Quality Metrics — Canonical Source
Version: 0.1.0-draft
Status: Draft
Owner: Quality Lead
Last Reviewed: 2026-05-06
Next Review: 2026-08-06

This document is the single source of truth for the platform's
in-scope quality numbers. Any other dossier doc (PHASE_1_READINESS.md,
customer/PILOT_AGREEMENT.md, customer/DISCLAIMERS.md, validation
reports) that cites quality numbers must either inline these values
verbatim or reference this file by name. When the numbers tighten or
shift, the canonical update happens here first; downstream docs are
synchronized as a single change-controlled act per
sops/CHANGE_CONTROL.md.
Citing rule. No quality number from this doc may be cited in a customer-facing context without the accompanying intended-use disclaimer from
intended-use/INTENDED_USE.md §1 and the reportable-range exclusion from §1.3. The numbers are conditional on the platform staying inside its locked intended use.
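
The citing rule lends itself to an automated check. The following is a minimal sketch, not an existing tool: the function name, the list of values (copied from the tables in this file), and the simplification that a mention of `INTENDED_USE.md` anywhere in the text counts as the disclaimer are all assumptions made for illustration.

```python
import re

# Assumption: values copied by hand from the tables in this file.
CANONICAL_VALUES = ("0.9954", "0.9993", "0.9959")
# Assumption: any mention of this file name is treated as the disclaimer reference.
REQUIRED_REFERENCE = "INTENDED_USE.md"


def uncited_quality_numbers(text: str) -> list[str]:
    """Return canonical values that appear in `text` without the disclaimer reference."""
    if REQUIRED_REFERENCE in text:
        return []
    found = []
    for value in CANONICAL_VALUES:
        # Match the value as a standalone number, not as part of a longer one.
        if re.search(rf"(?<![\d.]){re.escape(value)}(?!\d)", text):
            found.append(value)
    return found
```

A real check would also need to confirm that the §1.3 reportable-range exclusion is present, which this sketch does not attempt.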
These are the substrate-baseline numbers for 0.1.0-substrate per
technical/PIPELINE_LOCK.md §1.
| Metric | Value | Source artifact |
|---|---|---|
| Aggregate F1 | 0.9954 | data/hg002_30x/output/benchmark_deepvariant_v4_2_1/summary.txt |
| Missed truth variants (FN total) | 30,084 | benchmark_deepvariant_v4_2_1/fn.vcf.gz |
| SNP F1 | (split not pinned at 30x; aggregate dominates) | RTG vcfeval snp_roc.tsv.gz |
| Indel F1 | (split not pinned at 30x; aggregate dominates) | RTG vcfeval non_snp_roc.tsv.gz |
Suitable for: Phase 1 pilot positioning ("credible for pilot work").
NOT suitable for: clinical-quality claims that exceed industry
standards (per intended-use/QUALITY_CLAIMS.md F-009).
These are the raw v5.0q numbers. They are lower than the v4.2.1 baseline because v5.0q is an assembly-based truth set that asserts truth in regions where the caller architecture has known limits. Do not cite these values without the §1.3 in-scope numbers in the same sentence.
| Metric | Value | Source artifact |
|---|---|---|
| SNP F1 | 0.9906 | benchmark_deepvariant_v5_0q/snp_roc.tsv.gz |
| Indel F1 | 0.9408 | benchmark_deepvariant_v5_0q/non_snp_roc.tsv.gz |
| Missed truth variants (FN total) | 121,994 | benchmark_deepvariant_v5_0q/fn.vcf.gz |
This is the headline clinical-quality posture. The exclusion BED is
empirically constructed to capture the v5.0q-specific truth content
that the caller architecture cannot meet (alldifficultregions
minus MHC ∪ chrX/Y non-PAR/XTR/ampliconic; PAR remains in scope;
MHC was lifted to in-scope per ADR-0006 on 2026-05-11). See
investigations/V5_0Q_GAP_ANALYSIS.md v0.3.0+ §5.10 for the full
per-stratum decomposition and decisions/0006-mhc-exclusion-lift.md
for the MHC-lift rationale.
| Metric | Value | Source artifact |
|---|---|---|
| In-scope SNP F1 | 0.9993 (arithmetic est.; hap.py confirmation pending) | per-stratum decomposition + ADR-0006 |
| In-scope Indel F1 | 0.9959 (arithmetic est.; hap.py confirmation pending) | per-stratum decomposition + ADR-0006 |
| Exclusion BED capture | 118,748 of 121,994 FNs (97.3 %) | investigations/v5_0q_excluded_regions.bed |
| Exclusion BED region count | 4,571,604 merged intervals | same file |
| Exclusion BED coverage | 747,356,696 bp | same file |
| In-scope quality vs v4.2.1 aggregate | exceeds (0.9993 SNP and 0.9959 Indel vs 0.9954 aggregate) | comparison |
| In-scope range now includes MHC (HLA region) | yes — SNP F1 0.9897 / Indel F1 0.9498 in-stratum | V5_0Q_GAP_ANALYSIS.md §5.10; ADR-0006 |
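
The exclusion BED rows above (interval count, coverage in bp, FN capture share) can be recomputed with a few lines of Python. This is a minimal sketch that assumes the BED file is already merged and non-overlapping, as the table states; the function names are hypothetical and not part of the pipeline.

```python
def bed_stats(lines):
    """Return (interval_count, total_bp) for BED lines assumed merged and non-overlapping."""
    count = 0
    total_bp = 0
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments, and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        start, end = int(fields[1]), int(fields[2])
        # BED coordinates are 0-based and half-open, so length is end - start.
        total_bp += end - start
        count += 1
    return count, total_bp


def capture_fraction(captured_fn, total_fn):
    """Fraction of all false negatives that fall inside the exclusion BED."""
    return captured_fn / total_fn


# FN counts from the table above: 118,748 of 121,994.
capture_percent = round(100 * capture_fraction(118_748, 121_994), 1)
```

Applied to `investigations/v5_0q_excluded_regions.bed` via `with open(path) as fh: bed_stats(fh)`, the result would be expected to match the interval count and coverage rows. The capture figures also imply 121,994 − 118,748 = 3,246 FNs remaining in scope.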
| Rank | Stratum | Total FN | Share | SNP F1 | Indel F1 | v5.0q-only share |
|---|---|---|---|---|---|---|
| 1 | notinrefseq_cds | 121,385 | 99.5 % | 0.9896 | 0.9447 | 81.2 % |
| 2 | HG002_v4.2.1_complexandSVs_alldifficultregions | 120,562 | 98.8 % | 0.9646 | 0.9323 | 81.2 % |
| 3 | alldifficultregions | 118,859 | 97.4 % | 0.9521 | 0.9308 | 81.2 % |
| 4 | AllAutosomes | 115,893 | 95.0 % | 0.9899 | 0.9458 | 80.2 % |
| 5 | notinsegdups | 93,652 | 76.8 % | 0.9930 | 0.9475 | 82.5 % |
alldifficultregions is the dominant stratum and the one driving the
exclusion BED design.
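
The Share column in the stratum table is each stratum's FN count divided by the raw v5.0q FN total (121,994). The sketch below recomputes it from the table values; the variable and function names are illustrative only. Strata overlap, so the shares are not expected to sum to 100 %.

```python
TOTAL_FN = 121_994  # raw v5.0q FN total, from the table of raw v5.0q numbers

# Per-stratum FN counts copied from the table above.
STRATUM_FN = {
    "notinrefseq_cds": 121_385,
    "HG002_v4.2.1_complexandSVs_alldifficultregions": 120_562,
    "alldifficultregions": 118_859,
    "AllAutosomes": 115_893,
    "notinsegdups": 93_652,
}


def fn_share_percent(stratum_fn, total_fn=TOTAL_FN):
    """Share of all FNs that fall in one stratum, as a percentage with one decimal."""
    return round(100 * stratum_fn / total_fn, 1)


shares = {name: fn_share_percent(fn) for name, fn in STRATUM_FN.items()}
```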
The numbers above are reproducible. To verify them, recompute against the pinned artifacts:
| Artifact | SHA-256 |
|---|---|
| HG002 v4.2.1 truth VCF | adb4d4a5...e81175c (see technical/PIPELINE_LOCK.md §5.1) |
| HG002 v5.0q truth VCF | c7f9d7a4...f9c50dc8 (PIPELINE_LOCK.md §5.1.1) |
| Reference FASTA | 9cce8b92...8702b7 (PIPELINE_LOCK.md §4) |
| Per-stratum decomposition TSV | 2badc993243a8807abbe005c5b7800cbe26adacd5bfbfc24353a2c9a95f2383a |
| Exclusion BED (uncompressed; per ADR-0006 post-MHC-lift) | 7dc4d16b1d0eb1d171713bc272c9a3f3b881dddb1f305faba02dac25a3932c1c |
| Exclusion BED (uncompressed; pre-MHC-lift; historical, ADR-0004) | 3c079df0d7a2e40876c7e18a87e8a9d541ae63f18a026b76812df715523ae795 |
| GIAB v3.6 stratifications bundle | c5a1eceac54aac2c438af21825223d2a71e64b3db6b1c9e923849babb38063d8 |
Full SHA-256 manifest pins live in technical/PIPELINE_LOCK.md §4
(reference) and §5 (truth sets, exclusion).
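
Checking a local artifact against its pin can be done with the Python standard library. This is a minimal sketch, assuming the full 64-character digest is available (the abbreviated values in the table must first be looked up in technical/PIPELINE_LOCK.md); the function names are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large VCF/FASTA files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pin(path, pinned_hex):
    """True if the file's SHA-256 equals the pinned hex digest (case-insensitive)."""
    return sha256_of_file(path) == pinned_hex.strip().lower()
```

For example, `matches_pin("investigations/v5_0q_excluded_regions.bed", "7dc4d16b1d0eb1d171713bc272c9a3f3b881dddb1f305faba02dac25a3932c1c")` would be expected to return `True` for the post-MHC-lift exclusion BED.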
The numbers in §1 update on any of:

- sops/CHANGE_CONTROL.md. Material changes per PIPELINE_LOCK.md §6 trigger revalidation; new numbers land here as part of the revalidation report.
- PIPELINE_LOCK.md §5; gap analysis re-runs in V5_0Q_GAP_ANALYSIS.md; new numbers here.
- validation/PROTOCOL_GIAB.md §6.2). Coverage slope (H5) is non-gating but expected to tighten the in-scope residual ~0.06 % (1 − 0.9994 = 0.0006 → expected to halve at 50x).

When any of these triggers fire, the change-control sequence is:

- QUALITY_METRICS.md) with new values, bump front-matter version (e.g. 0.1.0-draft → 0.2.0-draft), and update Last Reviewed.
- PHASE_1_READINESS.md §2 / §4
- customer/PILOT_AGREEMENT.md (success-criteria block)
- customer/DISCLAIMERS.md (quality-claim posture block)
- validation/VALIDATION_REPORT_*.md instances
- customer/RELEASE_NOTES_TEMPLATE.md if the change is material.

Always paired with the intended-use disclaimer (INTENDED_USE.md §1) and the reportable-range exclusion (INTENDED_USE.md §1.3 + the v5_0q_excluded_regions.bed reference):

- 7dc4d16b...3932c1c (post-MHC-lift; ADR-0006)."

What MUST NOT be cited (per intended-use/QUALITY_CLAIMS.md F-009 and related Forbidden rows):

- PIPELINE_LOCK.md §10b).
- INTENDED_USE.md §1).

These would tighten or extend the headline once they land but do NOT gate Phase 1:

- validation/PROTOCOL_REPEATABILITY.md (3 byte-identical replicate runs already exist as initial evidence; a formal repeatability run with provenance manifests + hash verification is pending).
- validation/PROTOCOL_REPRODUCIBILITY.md (pending Brev compute).

| Date | Change | Authority |
|---|---|---|
| 2026-05-06 | Initial canonical metrics doc populated from V5_0Q_GAP_ANALYSIS.md v0.3.0+ §5.10 per-stratum decomposition and benchmark_deepvariant_v4_2_1/summary.txt. | Quality Lead |