For proteomics labs & CROs — 5 to 50 people

Mass-spec proteomics on your AWS, in hours, not weeks.

MAPPs Complete chain (Sarek → DeepSomatic → epitopeprediction → Casanovo) verified end-to-end on test fixtures. AlphaDIA cohorts at ~$0.05 per sample, 13 minutes wall-clock. Glyco-MAPPs at ~$0.73 per cohort. Same open-source tools your team already cites. Your data never leaves your VPC.

Sarek + epitopeprediction · Casanovo de novo · Sage + MetaMorpheus · AlphaDIA cohort · MHCflurry / NetMHCpan · Glyco (pGlyco3 / MSFragger-Glyco)

$4K paid pilot, scoped to one workflow · $750/mo BYOC after · First reference customer free

Benchmarks

Full-pipeline cost on real proteomics workloads.

Spot pricing, checkpoint-restart on, us-east-1. Reproducibility bundle on every run.

Workload · Cost · Reference

MAPPs Complete neoantigen discovery
Sarek → DeepSomatic → epitopeprediction → Casanovo · 4-step chain
Cost: ~41 min test fixtures, per-job stamped
Reference: PRIDE PXD019643 HLA-eluted · Casanovo 3.9× Comet, 24.9% predicted HLA binders. Full-patient benchmark in progress.

AlphaDIA cohort, parallel samples
timsTOF Ultra 2 · HeLa 200ng · Mann lab library
Cost: ~$0.05 per sample, 13 min wall-clock
Reference: 8,591 proteins, 105,414 precursors · 50-sample cohort runs in parallel, not serial

Glyco-MAPPs Tn-glycopeptide discovery
pGlyco3 / MSFragger-Glyco → Boltz-2 fold · 3 replicates
Cost: ~$0.73 per cohort
Reference: PXD050580 COSMC-KO MDA-MB-231 · 12 high-confidence Tn-glycopeptide PSMs, 46 min total

TCR-pMHC ternary complex co-folding
5-chain structure prediction · Boltz-2 · A10G
Cost: ~$0.13 per complex
Reference: DockQ 0.59, post-cutoff DockQ 0.62 against published crystals
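
To turn the per-unit figures above into a rough study budget, a small calculator is enough. The sketch below is illustrative only: the unit costs are the approximate benchmark numbers quoted above, while the dictionary keys, the function name, and the example study size are hypothetical. Real spend will vary with spot prices, region, and data size.

```python
# Approximate per-unit costs quoted in the benchmarks above (USD).
# The key names are hypothetical labels chosen for this sketch.
UNIT_COST_USD = {
    "alphadia_sample": 0.05,   # AlphaDIA cohort, per sample
    "glyco_cohort": 0.73,      # Glyco-MAPPs, per cohort
    "tcr_pmhc_complex": 0.13,  # TCR-pMHC co-folding, per complex
}


def estimate_compute_usd(units: dict[str, int]) -> float:
    """Sum unit_count * unit_cost over the requested workloads."""
    total = 0.0
    for workload, count in units.items():
        if workload not in UNIT_COST_USD:
            raise KeyError(f"no benchmark cost for workload {workload!r}")
        if count < 0:
            raise ValueError("unit counts must be non-negative")
        total += count * UNIT_COST_USD[workload]
    return round(total, 2)


# Hypothetical study: 200 DIA samples, 4 glyco cohorts, 30 TCR-pMHC complexes.
example = estimate_compute_usd(
    {"alphadia_sample": 200, "glyco_cohort": 4, "tcr_pmhc_complex": 30}
)
print(f"Estimated compute: ${example:.2f}")
```

The estimate covers AWS compute only. The pilot fee and the monthly control-plane fee are separate.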

Read the MAPPs Complete benchmark in full →

Case studies

Reproducible benchmarks. Fork the repo, run it on your data.

All benchmarks below run on public datasets (PRIDE, GIAB), so any reviewer can verify the numbers. Customer-anonymized case studies will follow once our first reference deployments are complete.

mapps-complete

Neoantigen discovery, end to end

WES + tumor-normal variant calling + epitope prediction + de novo MS/MS. Picks up the HLA-presented peptides that database search misses.

Input: tumor + normal FASTQ + raw MS
Output: ranked neoantigens + reproducibility bundle

alphadia-cohort

DIA at 50–5,000 sample scale

AlphaDIA + MS2Rescore, parallelized across samples. A 50-sample cohort finishes in 13 minutes instead of 10+ hours on a workstation.

Input: raw .d / .raw files + spectral library
Output: protein quant matrix + QC report
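
The gap between 13 minutes and 10+ hours is mostly scheduling arithmetic. The sketch below assumes each sample takes roughly 13 minutes and that every sample gets its own worker. Both are assumptions made for illustration, not measured per-sample timings, and the function name is hypothetical.

```python
import math


def cohort_wall_clock_min(n_samples: int, per_sample_min: float, max_parallel: int) -> float:
    """Wall-clock minutes when samples run in waves of at most max_parallel."""
    if n_samples < 0 or max_parallel < 1:
        raise ValueError("need n_samples >= 0 and max_parallel >= 1")
    waves = math.ceil(n_samples / max_parallel)
    return waves * per_sample_min


N_SAMPLES = 50
PER_SAMPLE_MIN = 13.0        # assumed per-sample runtime
COST_PER_SAMPLE_USD = 0.05   # approximate benchmark figure

parallel_min = cohort_wall_clock_min(N_SAMPLES, PER_SAMPLE_MIN, max_parallel=50)
serial_min = cohort_wall_clock_min(N_SAMPLES, PER_SAMPLE_MIN, max_parallel=1)
cohort_cost = round(N_SAMPLES * COST_PER_SAMPLE_USD, 2)

print(f"parallel: {parallel_min:.0f} min")
print(f"serial:   {serial_min / 60:.1f} h")
print(f"cost:     ${cohort_cost:.2f} per 50-sample cohort")
```

Under these assumptions the serial run takes about 650 minutes, which is consistent with the 10+ hour workstation figure.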

glyco-mapps

Tumor glycopeptide discovery (ADC leads)

Raw → oxonium QC → N/O-glycan search → de novo validation → structure prediction. Surfaces druggable Tn-glycopeptide ADC targets.

Input: raw MS + glycan database
Output: glycopeptide PSMs + folded structures

Proteomics toolchain

Verified containers for the mass-spec stack.

Pinned versions, GPU-tuned where applicable, Slurm-ready. Ask for a tool we don't list and we'll containerize it within a week.

Why BYOC for proteomics

Patient samples, IP-sensitive cohorts, and HIPAA-grade data — on your AWS, not ours.

01 / Data residency

Raw MS, patient FASTQ, and identified peptides never leave your VPC.

Cross-account IAM role, ~10 minutes to set up. Your S3, your KMS, your audit trail. No upload to a vendor cloud.
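
For readers who have not set up cross-account access before, the role's trust policy typically looks like the sketch below. It follows the standard AWS pattern of a trusted account principal plus an external ID condition. The account ID, the external ID, and the function name are placeholders for illustration, not values from our onboarding. The permissions attached to the role are out of scope here.

```python
import json


def build_trust_policy(vendor_account_id: str, external_id: str) -> dict:
    """Build an IAM trust policy letting one external account assume a role.

    vendor_account_id and external_id are placeholders supplied by the caller.
    """
    if not (vendor_account_id.isdigit() and len(vendor_account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Trust the external account; it still needs its own
                # permission to call sts:AssumeRole on this role.
                "Principal": {"AWS": f"arn:aws:iam::{vendor_account_id}:root"},
                "Action": "sts:AssumeRole",
                # The external ID guards against the confused-deputy problem.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


# Placeholder values only.
policy = build_trust_policy("111122223333", "example-external-id")
print(json.dumps(policy, indent=2))
```

Because the role lives in your account, you can revoke access at any time by deleting it or editing this policy.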

02 / No compute markup

The AWS bill is yours.

Uses your Reserved Instances, Savings Plans, and credits. We don't sit in the spend path. Flat $750/mo for the control plane, after the pilot.

03 / Per-job cost stamp

Per-sample / per-patient cost lands in CloudWatch.

CROs invoice by patient. We stamp the per-job AWS spend onto every Slurm job so your finance team can map compute to client invoices.
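
Conceptually, a per-job cost stamp is node count times elapsed hours times the hourly node price, rolled up by patient. The sketch below shows that arithmetic under simplifying assumptions: a flat hourly price per node and made-up job records. The record fields, prices, and function names are hypothetical and do not describe the product's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRecord:
    """Hypothetical accounting record for one Slurm job."""
    job_id: str
    patient_id: str
    nodes: int
    elapsed_seconds: int
    hourly_usd_per_node: float  # assumed flat price for the instance type


def job_cost_usd(job: JobRecord) -> float:
    """Cost of one job: nodes * elapsed hours * hourly price per node."""
    return job.nodes * (job.elapsed_seconds / 3600) * job.hourly_usd_per_node


def cost_by_patient(jobs: list[JobRecord]) -> dict[str, float]:
    """Sum job costs per patient so finance can map compute to invoices."""
    totals: dict[str, float] = defaultdict(float)
    for job in jobs:
        totals[job.patient_id] += job_cost_usd(job)
    return {patient: round(total, 4) for patient, total in totals.items()}


# Made-up jobs and prices, for illustration only.
jobs = [
    JobRecord("1001", "PATIENT_A", nodes=1, elapsed_seconds=1800, hourly_usd_per_node=0.40),
    JobRecord("1002", "PATIENT_A", nodes=2, elapsed_seconds=900, hourly_usd_per_node=0.40),
    JobRecord("1003", "PATIENT_B", nodes=1, elapsed_seconds=3600, hourly_usd_per_node=1.00),
]
totals = cost_by_patient(jobs)
for patient, usd in sorted(totals.items()):
    print(f"{patient}: ${usd:.2f}")
```

Spot prices change over time, so a real implementation would look up the price in effect while the job ran rather than use a constant.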

Who this is for

A real fit, honestly.

Fit

You probably want to talk to us if …

· 5–50 person proteomics CRO or biotech
· Running MAPPs / DIA / glyco / TCR pipelines on a workstation or homebrew Slurm
· One bioinformatician maintaining the pipeline alongside their day job
· Open-source-first software stack (Casanovo, Sage, AlphaDIA, MetaMorpheus, etc.)
· Patient samples or IP-sensitive cohorts that shouldn't go to a vendor cloud

Not fit

Probably not for you if …

· Your pipeline is fully Spectronaut/Proteome Discoverer and you're happy with the licenses
· You already have an in-house platform / DevOps team running Slurm
· You need SOC 2 or HIPAA BAA in month one
· You're a single-person lab running one workload — buy a workstation instead
· You're a major pharma — you have an infra team

$ sbatch mapps-complete --patient PATIENT_ID
# Your AWS, your VPC. Per-job cost stamped automatically.
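
For a cohort, the same submission can be scripted per patient. The sketch below only builds and prints the command lines shown above, so it runs without a Slurm cluster. The patient IDs, the ID validation rule, and the helper name are assumptions made for illustration.

```python
import re
import shlex

# Assumed ID format: letters, digits, underscore, hyphen. Adjust to your scheme.
PATIENT_ID_PATTERN = re.compile(r"^[A-Za-z0-9_-]+$")


def build_sbatch_command(patient_id: str, script: str = "mapps-complete") -> list[str]:
    """Return the argv list for one submission, mirroring the command above."""
    if not PATIENT_ID_PATTERN.match(patient_id):
        raise ValueError(f"unexpected patient ID: {patient_id!r}")
    return ["sbatch", script, "--patient", patient_id]


# Hypothetical patient IDs.
commands = [build_sbatch_command(p) for p in ["PATIENT_001", "PATIENT_002"]]
for argv in commands:
    # To actually submit on a cluster, pass argv to
    # subprocess.run(argv, check=True) instead of printing it.
    print(shlex.join(argv))
```

Passing the arguments as a list rather than a shell string keeps a malformed patient ID from being interpreted by the shell.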

Scope a proteomics pilot on your data.

30 minutes to scope the workflow. 4 weeks of paid pilot ($4K, scoped to one workflow) — or free if you'll be our first reference customer in proteomics. After that, $750/mo BYOC.