8. Scaling study

The scaling study is the manuscript’s main empirical result. The architecture, hyperparameters, and evaluation are held fixed; only the training-set size varies.

Tiers

| Experiment | Tier | n samples | Manifest |
|---|---|---|---|
| (Phase 1 baseline) | 545 | 545 | (legacy) |
| exp 32 | 5K | 5,000 | assets/kpsc_expansion_subset_5k.tsv |
| exp 33 | 10K | 10,000 | assets/kpsc_expansion_subset_10k.tsv |
| exp 34 | 20K | 20,000 | assets/kpsc_expansion_subset_20k.tsv |
| exp 36 | 40K | 40,000 | assets/kpsc_expansion_subset_40k.tsv |

Each manifest is a strict superset of the previous one (see 4. Subset selection).
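The nesting property can be checked mechanically. The following is a minimal sketch, not the project's own tooling: it assumes each manifest is a tab-separated file with a header row and the sample ID in the first column, which this section does not state. The function names `read_sample_ids` and `check_nested` are hypothetical.

```python
import csv


def read_sample_ids(path, id_column=0, has_header=True):
    """Return the set of sample IDs in one column of a TSV manifest.

    Assumption: IDs live in column `id_column` and the first row is a header.
    """
    with open(path, newline="") as handle:
        rows = csv.reader(handle, delimiter="\t")
        if has_header:
            next(rows, None)  # skip the header row if present
        return {row[id_column] for row in rows if row}


def check_nested(manifest_paths, **reader_options):
    """Raise ValueError unless each manifest strictly contains the previous one.

    `manifest_paths` must be given in tier order, smallest first.
    """
    previous_ids, previous_path = None, None
    for path in manifest_paths:
        ids = read_sample_ids(path, **reader_options)
        # `a < b` on sets is the proper-subset test: a is contained in b and a != b.
        if previous_ids is not None and not previous_ids < ids:
            dropped = len(previous_ids - ids)
            raise ValueError(
                f"{path} is not a strict superset of {previous_path} "
                f"({dropped} IDs from the smaller manifest are absent)"
            )
        previous_ids, previous_path = ids, path
    return True
```

Calling `check_nested` with the four manifest paths from the table, in 5K, 10K, 20K, 40K order, would either return `True` or name the first pair of tiers that breaks the nesting.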

Headline results

Note

The results tables below are filled in as experiments complete. Current schedule: exp 32–34 will be retrained at MQ=20 once the 20K download finishes (about 3 days from 2026-05-17); exp 36 follows after that.

Chromosomal blaSHV (extra-copy)

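| Tier | MCC | FNR | PPV | call_rate | n_eval |
|---|---|---|---|---|---|
| 545 (Phase 1) | | | | | |
| 5K | | | | | |
| 10K | | | | | |
| 20K | | | | | |
| 40K | | | | | |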

Plasmid genes (presence)

For each tier we report per-gene MCC across {blaKPC, blaCTX-M, blaNDM, blaOXA-48, qnrB1, aac6-Ib-cr}.

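| Tier | blaKPC | blaCTX-M | blaNDM | blaOXA-48 | qnrB1 | aac6-Ib-cr |
|---|---|---|---|---|---|---|
| 5K | | | | | | |
| 10K | | | | | | |
| 20K | | | | | | |
| 40K | | | | | | |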
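For readers unfamiliar with the column names in the headline tables, here is a hedged sketch of how MCC, FNR, PPV, call_rate, and n_eval could be computed for one gene from per-sample calls. It rests on assumptions this section does not confirm: a call is `True`, `False`, or `None` (no call); call_rate is the fraction of evaluated samples that received a call; n_eval is the total number of evaluated samples; and MCC, FNR, and PPV are computed over called samples only. The function name `binary_metrics` is hypothetical.

```python
import math


def binary_metrics(truth, calls):
    """Compute per-gene metrics from ground truth and model calls.

    truth: sequence of bool (gene truly present or absent).
    calls: sequence of True / False / None, where None means "no call".
    Assumption: no-calls are excluded from MCC, FNR, and PPV.
    """
    if len(truth) != len(calls):
        raise ValueError("truth and calls must have the same length")

    called = [(t, c) for t, c in zip(truth, calls) if c is not None]
    tp = sum(1 for t, c in called if t and c)
    tn = sum(1 for t, c in called if not t and not c)
    fp = sum(1 for t, c in called if not t and c)
    fn = sum(1 for t, c in called if t and not c)

    def ratio(numerator, denominator):
        # Undefined ratios (zero denominator) are reported as NaN, not 0.
        return numerator / denominator if denominator else float("nan")

    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "MCC": ratio(tp * tn - fp * fn, mcc_denominator),
        "FNR": ratio(fn, fn + tp),   # missed positives among true positives
        "PPV": ratio(tp, tp + fp),   # correct positives among positive calls
        "call_rate": ratio(len(called), len(truth)),
        "n_eval": len(truth),
    }
```

Returning NaN for undefined ratios keeps a tier with no positive calls from being mistaken for a tier with MCC = 0.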

Reading the curve

We expect:

  • Monotonic improvement in MCC for under-represented genes (rare STs, rare plasmid carriages) as the training set grows.

  • A plateau at some n* between 10K and 40K; n* is the per-gene saturation point.

  • Little to no headroom for genes already at MCC ≈ 0.9+ at 545 samples (e.g. blaKPC in the Phase 1 cohort), so scaling yields diminishing returns there. Scaling matters most for the long tail.
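One way to make the saturation point n* concrete, once the tables are populated, is to take the smallest tier after which no larger tier improves MCC by more than a small tolerance. The sketch below is an illustrative definition, not the manuscript's: the function name `saturation_point` and the default tolerance `min_gain=0.01` are assumptions.

```python
def saturation_point(mcc_by_n, min_gain=0.01):
    """Return the smallest training-set size n* at which MCC has plateaued.

    mcc_by_n: dict mapping training-set size -> MCC for a single gene.
    min_gain: hypothetical tolerance; a later tier must beat the current
              tier by more than this for scaling to count as still helping.
    Returns None if MCC is still improving at the second-largest tier,
    i.e. no plateau is visible in the tiers supplied.
    """
    tiers = sorted(mcc_by_n.items())
    # The largest tier cannot be judged: there is nothing larger to compare to.
    for index, (n, mcc) in enumerate(tiers[:-1]):
        best_later = max(later_mcc for _, later_mcc in tiers[index + 1:])
        if best_later - mcc <= min_gain:
            return n
    return None
```

Applied per gene, this gives one n* per column of the plasmid table. A result of `None` means the curve has not flattened within the tiers run so far.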

Early observation (MQ=40 baseline, since superseded)

The initial exp 32 and exp 33 runs used MQ=40, which is too strict for the multi-mapping plasmid reads in the extended reference (see 9. Methods (parameter choices)). Even in those runs, where plasmid detection was broken, chromosomal blaSHV call_rate jumped sharply between tiers:

| Tier | blaSHV call_rate (MQ=40 baseline) |
|---|---|
| 5K | 0.44 |
| 10K | 0.91 |

The signal is there: call_rate roughly doubles from 5K to 10K (0.44 to 0.91). The rerun at MQ=20 should produce comparable chromosomal numbers plus working plasmid detection.

Reproducibility note

Every experiment’s evaluation.txt is committed to data/results/{exp}/. The commit hash that produced each result is recorded in the experiment log. See 10. Reproducibility for the exact recipe to regenerate the results.