Spot HPC for Genomics: How to Save 60-90% on Bioinformatics Compute
AWS Spot instances are perfect for bioinformatics workloads — embarrassingly parallel, fault-tolerant, and bursty. Here's how Clusterra makes Spot reliable enough for production genomics pipelines.
If you're running bioinformatics pipelines on AWS, you're probably paying too much for compute. The most common setups — AWS Batch with On-Demand instances, or Seqera Platform with managed compute — charge 2–7x what the underlying hardware costs. The reason is simple: Spot instances offer 60–90% savings, but most platforms either don't use them reliably or charge a hefty markup on top.
Clusterra passes AWS Spot prices directly to you with zero compute markup — just a 10% orchestration fee on the on-demand list price. For a typical genomics workload, this means paying $0.077/hour for a 4 vCPU / 16 GB node instead of $0.192 on-demand.
Here's why Spot works so well for genomics and how Clusterra makes it production-ready.
Why Bioinformatics Is a Perfect Spot Workload
Spot instances are cheap because AWS can reclaim them with two minutes' notice. This makes them risky for long-running, stateful workloads. But bioinformatics pipelines have three properties that make them ideal for Spot:
1. Embarrassingly Parallel
Most genomics workflows break down into hundreds or thousands of independent tasks. A 30x WGS pipeline with BWA + GATK HaplotypeCaller might spawn 500+ individual Slurm jobs across alignment, sorting, marking duplicates, and variant calling. Each task runs on its own node. If one node is interrupted, only that task needs to retry — not the entire pipeline.
2. Naturally Checkpointed
Nextflow and most bioinformatics tools write intermediate outputs to shared storage (S3 or EFS), and Nextflow caches the outputs of every completed task. When a task is interrupted and requeued, the pipeline picks up from the last completed step rather than starting from scratch. A pipeline that was 80% complete doesn't lose that progress: the staged outputs are preserved, and only the interrupted task reruns.
3. Bursty and Idle-Heavy
Genomics teams don't run pipelines 24/7. Typical patterns: batch 50 samples on Monday, wait for results Tuesday, analyze Wednesday–Friday. With On-Demand or reserved instances, you pay for idle capacity. With Spot + scale-to-zero, you pay nothing when no pipelines are running.
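A rough way to see the effect of idle capacity is to compare an always-on node with one that is billed only while jobs run. The sketch below uses the per-hour prices quoted earlier in this article; the figure of 20 busy hours per week is an illustrative assumption, not a measurement.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours

def weekly_cost(hourly_rate: float, billed_hours: float) -> float:
    """Cost in USD of one node billed for `billed_hours` at `hourly_rate`."""
    return hourly_rate * billed_hours

# Prices from this article for a 4 vCPU / 16 GB node.
ON_DEMAND_RATE = 0.192   # USD/hour, on-demand list price
SPOT_TOTAL_RATE = 0.077  # USD/hour, Spot price plus orchestration fee

# Assumption for illustration only: the node is busy 20 hours per week.
busy_hours = 20

always_on = weekly_cost(ON_DEMAND_RATE, HOURS_PER_WEEK)
scale_to_zero = weekly_cost(SPOT_TOTAL_RATE, busy_hours)

print(f"always-on on-demand:  ${always_on:.2f}/week")
print(f"Spot + scale-to-zero: ${scale_to_zero:.2f}/week")
```

Changing `busy_hours` to match your own weekly usage shows how much of the bill is idle time.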
The Spot Problem: Why Most Teams Don't Use It
Despite the savings, most genomics teams avoid Spot for production pipelines. The reasons:
1. Silent fallback to On-Demand. AWS Batch's default behavior when Spot capacity is unavailable is to silently fall back to On-Demand instances. Teams discover the 3x cost surprise weeks later in their AWS bill.
2. No automatic requeue. When a Spot instance is interrupted, Batch marks the task as failed. Nextflow sees a failure and retries, but without Spot-awareness — the retry might land on On-Demand again.
3. Cold start penalty. Scaling from zero to 20 nodes on Spot can take 5–10 minutes with Batch, adding latency to pipeline starts.
4. Instance diversity. Spot availability varies by instance type. Using only m5.xlarge means competing with everyone else for that type. Diversifying across m5, m6i, m6a, m7i, and Graviton families dramatically improves availability — but configuring this in Batch is complex.
How Clusterra Solves Spot for Genomics
Clusterra uses Karpenter (the Kubernetes node provisioner) and Slurm's native requeue to make Spot reliable:
Transparent Instance Selection
Karpenter automatically selects from dozens of instance types and availability zones based on current Spot pricing and availability. You specify the resource requirements (4 vCPU, 16 GB) and Karpenter finds the cheapest available option. No manual instance type configuration.
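The selection logic can be pictured as a filter followed by a minimum. The following is a simplified sketch of that idea, not Karpenter's actual algorithm. The `Offering` type, the `cheapest_offering` helper, the zone names, and all prices and availability flags are made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

@dataclass(frozen=True)
class Offering:
    instance_type: str
    zone: str
    vcpus: int
    memory_gib: int
    spot_price: float  # USD/hour, hypothetical
    available: bool    # whether Spot capacity exists right now, hypothetical

def cheapest_offering(
    offerings: Iterable[Offering], min_vcpus: int, min_memory_gib: int
) -> Optional[Offering]:
    """Return the cheapest available offering that satisfies the request."""
    candidates = [
        o for o in offerings
        if o.available and o.vcpus >= min_vcpus and o.memory_gib >= min_memory_gib
    ]
    return min(candidates, key=lambda o: o.spot_price, default=None)

# Hypothetical snapshot of the Spot market; none of these are real quotes.
market = [
    Offering("m5.xlarge", "zone-a", 4, 16, 0.070, False),
    Offering("m6i.xlarge", "zone-a", 4, 16, 0.061, True),
    Offering("m6a.xlarge", "zone-b", 4, 16, 0.058, True),
    Offering("m7i.large", "zone-b", 2, 8, 0.030, True),
]

choice = cheapest_offering(market, min_vcpus=4, min_memory_gib=16)
print(choice.instance_type if choice else "no capacity")
```

In this snapshot the cheapest entry is too small and `m5.xlarge` has no capacity, so the request lands on a different family. That is the practical benefit of diversifying across instance types.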
Automatic Interruption Handling
When AWS reclaims a Spot instance:
1. Karpenter detects the 2-minute interruption notice
2. A replacement node is provisioned immediately (usually from a different instance family)
3. Slurm requeues the interrupted task automatically
4. Nextflow resumes from the last checkpoint
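The sequence above can be sketched as a small queue simulation. It models only the requeue behaviour (an interrupted task goes back on the queue and runs again) and does not call Slurm, Karpenter, or Nextflow. The `run_with_requeue` function, the task names, and the `interrupted` set are hypothetical names for this example.

```python
from collections import deque

def run_with_requeue(tasks, interrupted):
    """Run tasks in FIFO order, requeueing any attempt that is interrupted.

    `interrupted` is a set of (task, attempt_number) pairs that simulate a
    Spot reclaim. Returns a dict mapping each task to the attempts it needed.
    Task names are assumed to be unique.
    """
    queue = deque(tasks)
    attempts = {task: 0 for task in tasks}
    while queue:
        task = queue.popleft()
        attempts[task] += 1
        if (task, attempts[task]) in interrupted:
            # Node was reclaimed: put this task back; other tasks are unaffected.
            queue.append(task)
    return attempts

tasks = ["align_s1", "align_s2", "call_variants_s1"]
# Simulate one reclaim hitting the first attempt of align_s2.
result = run_with_requeue(tasks, interrupted={("align_s2", 1)})
print(result)
```

Only the interrupted task records a second attempt; every other task completes once, which is why a single reclaim costs one task's worth of rework rather than a pipeline restart.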
No silent On-Demand fallback. No manual intervention. The pipeline continues with minimal delay.
Scale-to-Zero
When the Slurm job queue is empty, Karpenter terminates all worker nodes. You pay $0 for idle capacity. When a new pipeline is submitted, nodes provision in ~60 seconds.
No Compute Markup
Clusterra charges 10% of the on-demand list price as an orchestration fee. The Spot discount is passed through entirely.
For a 4 vCPU / 16 GB node:
- On-demand list: $0.192/hr
- Spot price (typical, ap-south-1): ~$0.058/hr
- Clusterra fee: $0.019/hr (10% of list)
- You pay: ~$0.077/hr total
- Savings vs on-demand: ~60%
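The arithmetic behind these figures is simple enough to check by hand. A minimal sketch, assuming the fee is always computed on the on-demand list price as described above (the function name `effective_hourly_cost` is ours, not part of any Clusterra API):

```python
def effective_hourly_cost(on_demand_price, spot_price, fee_rate=0.10):
    """Return (fee, total, savings_fraction) for one node-hour, in USD."""
    fee = fee_rate * on_demand_price        # fee is based on the list price
    total = spot_price + fee                # Spot price is passed through as-is
    savings = 1 - total / on_demand_price   # relative to paying on-demand
    return fee, total, savings

fee, total, savings = effective_hourly_cost(on_demand_price=0.192, spot_price=0.058)
print(f"fee ${fee:.3f}/hr, total ${total:.3f}/hr, savings {savings:.0%}")
# prints: fee $0.019/hr, total $0.077/hr, savings 60%
```

Because the fee is tied to the list price, the total can never drop below 10% of on-demand, which caps savings at 90% under this pricing model.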
Real-World Pipeline Costs
Here's what common bioinformatics pipelines cost on Clusterra with Spot:
| Pipeline | Description | Clusterra (Spot) | On-Demand | Seqera Compute |
|---|---|---|---|---|
| 30x WGS | BWA + GATK HaplotypeCaller | $3.70 – $5.10 | $12 – $18 | $15 – $25 |
| RNA-seq | STAR + featureCounts (50M reads) | $0.60 – $1.00 | $2 – $4 | $3 – $6 |
| Exome | BWA + GATK | $1.00 – $1.60 | $3 – $5 | $4 – $8 |
These are full pipeline costs, not per-step — including alignment, variant calling, annotation, and QC.
When Spot Doesn't Work
Spot isn't right for every workload:
- Long-running single-node jobs (>24 hours): The cumulative chance of an interruption grows with runtime. For 72-hour molecular dynamics simulations, consider On-Demand or Reserved Instances.
- Latency-critical tasks where a 60-second cold start is unacceptable: rare in bioinformatics.
- Compliance requirements mandating specific instance types: some clinical workflows require dedicated tenancy.
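To make the point about long-running jobs concrete: if interruptions are modelled as independent with a constant hourly rate p, the chance that a job sees at least one interruption is 1 - (1 - p)^hours. This is a simplifying assumption, and the 2% hourly rate below is a placeholder rather than a measured AWS figure.

```python
def interruption_probability(hourly_rate: float, hours: float) -> float:
    """P(at least one interruption) under a constant, independent hourly rate."""
    if not 0.0 <= hourly_rate <= 1.0:
        raise ValueError("hourly_rate must be between 0 and 1")
    return 1.0 - (1.0 - hourly_rate) ** hours

# Placeholder rate for illustration only; real rates vary by type and region.
ASSUMED_HOURLY_RATE = 0.02

for hours in (1, 24, 72):
    p = interruption_probability(ASSUMED_HOURLY_RATE, hours)
    print(f"{hours:>2} h job: {p:.0%} chance of at least one interruption")
```

Under this model a short task is rarely interrupted, while a multi-day job is more likely than not to be hit at least once, which is why short, retryable tasks suit Spot better.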
For the vast majority of genomics pipelines — alignment, variant calling, RNA-seq, ChIP-seq, metagenomics — Spot is the right choice.
Getting Started
- Sign up at clusterra.cloud — no credit card required
- Submit a test pipeline:
    nextflow run nf-core/fetchngs -profile slurm --input ids.csv
- Check the cost: See per-task and total pipeline cost in the Clusterra console
Your first pipeline runs in under 5 minutes. Spot handling, scale-to-zero, and cost tracking are all built in.
Built by the former Product Manager for AWS Batch and AWS Parallel Computing Service.
Running bioinformatics on AWS? We'd love to hear what you're paying — drop us a line at hello@clusterra.cloud.