Running nf-core Pipelines on Spot: How Clusterra Keeps Compute Costs Low
A transparent look at what it costs to run common nf-core pipelines on managed Slurm with AWS Spot — and how that compares to other options in the ecosystem.
For bioinformatics teams running Nextflow, the ecosystem offers several ways to execute pipelines: AWS Batch, Seqera Platform, AWS HealthOmics, or Slurm-based schedulers. Each has a different cost model, and the differences add up quickly — especially for teams running hundreds of samples per month.
Clusterra's approach is straightforward: managed Slurm with transparent AWS Spot pricing. No compute markup. A 10% orchestration fee on the on-demand list price, and 100% of the Spot discount passed through. This post breaks down what that actually costs for common nf-core pipelines.
Clusterra's Pricing Model
| Detail | Value |
|---|---|
| Compute | AWS Spot price (pass-through, no markup) |
| Orchestration fee | 10% of on-demand list price |
| Effective rate (8 vCPU, 32 GB) | ~$0.153/hr |
| Spot support | Karpenter + Slurm requeue (automatic interruption handling) |
| Scale to zero | Automatic — $0 when no pipelines running |
| Platform fee | None (included in orchestration fee) |
For context, the same 8 vCPU / 32 GB node costs $0.384/hr on-demand, ~$0.115/hr on Spot alone, and $1.60/hr at Seqera Platform's compute pricing ($0.10 per vCPU-hour + $0.025 per GB-hour, i.e. 8 × $0.10 + 32 × $0.025).
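The effective rate in the table can be reproduced from the two numbers quoted above. The following is a minimal sketch of that arithmetic, assuming the fee is applied exactly as described (10% of the on-demand list price, added to the pass-through Spot price); the function name is illustrative, not part of any Clusterra API.

```python
ORCHESTRATION_FEE = 0.10  # 10% of the on-demand list price, as stated in the table


def effective_hourly_rate(spot_price, on_demand_price, fee=ORCHESTRATION_FEE):
    """Spot price passed through, plus the fee charged on the on-demand price."""
    return spot_price + fee * on_demand_price


# Figures quoted in this post for an 8 vCPU / 32 GB node.
on_demand = 0.384
spot = 0.115

rate = effective_hourly_rate(spot, on_demand)
savings = 1 - rate / on_demand

print(f"effective rate: ${rate:.3f}/hr")
print(f"savings vs on-demand: {savings:.0%}")
```

Because the fee is tied to the on-demand price rather than the Spot price, the effective rate moves one-for-one with Spot price changes while the fee itself stays fixed.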
What nf-core Pipelines Actually Cost on Clusterra
We calculated costs for three common pipelines in ap-south-1 (Mumbai), using typical Spot availability.
1. nf-core/sarek — 30x Whole Genome Sequencing
The workhorse of clinical and research genomics. BWA-MEM2 alignment + GATK HaplotypeCaller on a single 30x WGS sample (~100 GB FASTQ).
| Step | Resources | Cost on Clusterra |
|---|---|---|
| Alignment (BWA-MEM2) | 8 vCPU × 32 GB × 2.5 hr | $0.83 |
| Sort + MarkDuplicates | 4 vCPU × 16 GB × 1 hr | $0.08 |
| HaplotypeCaller (scattered, 24 intervals) | 4 vCPU × 8 GB × 0.5 hr × 24 | $0.92 |
| GenotypeGVCFs + VQSR | 4 vCPU × 16 GB × 0.5 hr | $0.04 |
| VEP Annotation | 2 vCPU × 8 GB × 0.3 hr | $0.02 |
| Total per sample | | ~$1.89 |
At 100 samples/month: ~$189 total compute.
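The per-sample and monthly totals are simple sums over the table rows. A small sketch, using only the per-step figures listed above (held in integer cents to avoid floating-point drift); the helper names are hypothetical:

```python
# Per-step costs for one 30x WGS sample, in cents, copied from the table above.
SAREK_STEP_COSTS_CENTS = {
    "Alignment (BWA-MEM2)": 83,
    "Sort + MarkDuplicates": 8,
    "HaplotypeCaller (scattered, 24 intervals)": 92,
    "GenotypeGVCFs + VQSR": 4,
    "VEP Annotation": 2,
}


def per_sample_cost(step_costs_cents):
    """Total cost of one sample in dollars."""
    return sum(step_costs_cents.values()) / 100


def monthly_cost(step_costs_cents, samples_per_month):
    """Total compute cost in dollars for a given monthly sample count."""
    return sum(step_costs_cents.values()) * samples_per_month / 100


print(f"per sample: ${per_sample_cost(SAREK_STEP_COSTS_CENTS):.2f}")
print(f"100 samples: ${monthly_cost(SAREK_STEP_COSTS_CENTS, 100):.2f}")
```

Swapping in the step costs from your own runs gives a quick estimate for other cohort sizes.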
For comparison, the same pipeline on Seqera Platform compute pricing would cost ~$13.90/sample ($1,390 for 100 samples), and on HealthOmics Ready2Run ~$10–$15/sample.
2. nf-core/rnaseq — RNA-seq Analysis
STAR alignment + featureCounts + DESeq2-ready counts. Single sample, 50M paired-end reads.
| Step | Resources | Cost on Clusterra |
|---|---|---|
| STAR alignment | 8 vCPU × 32 GB × 0.4 hr | $0.13 |
| featureCounts | 2 vCPU × 4 GB × 0.1 hr | $0.01 |
| FastQC + MultiQC | 2 vCPU × 4 GB × 0.1 hr | $0.01 |
| Total per sample | | ~$0.15 |
A 200-sample RNA-seq cohort study: ~$30 total compute.
3. nf-core/fetchngs + nf-core/taxprofiler — Metagenomics
Download SRA data + taxonomic profiling with Kraken2/Bracken. 20 samples.
| Step | Resources | Cost on Clusterra |
|---|---|---|
| fetchngs (download) | Network-bound, minimal compute | ~$0.10 |
| Kraken2 classification | 16 vCPU × 64 GB × 0.3 hr × 20 | $1.98 |
| Bracken + reporting | 2 vCPU × 4 GB × 0.1 hr × 20 | $0.04 |
| Total (20 samples) | | ~$2.12 |
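The "Resources" column in these tables reads as vCPUs × memory × wall-clock hours × number of tasks. As a rough illustration of how to turn a row into resource-hours (the quantity that per-vCPU and per-GB pricing models bill against), here is a sketch with a hypothetical helper:

```python
def resource_hours(vcpus, mem_gb, hours, tasks=1):
    """Return (vCPU-hours, GB-hours) consumed by `tasks` identical jobs."""
    vcpu_hours = round(vcpus * hours * tasks, 2)
    gb_hours = round(mem_gb * hours * tasks, 2)
    return vcpu_hours, gb_hours


# Rows from the metagenomics table above (20 samples each).
kraken2 = resource_hours(16, 64, 0.3, tasks=20)
bracken = resource_hours(2, 4, 0.1, tasks=20)

print("Kraken2 (vCPU-hr, GB-hr):", kraken2)
print("Bracken (vCPU-hr, GB-hr):", bracken)
```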
How Spot Makes This Possible
The low costs above come from two things working together:
1. Spot Pricing (60–90% off On-Demand)
Bioinformatics workloads are ideal for Spot: embarrassingly parallel, naturally checkpointed (intermediate outputs saved to S3/EFS), and bursty. Each Nextflow task runs independently — if one node is interrupted, only that task retries.
2. Reliable Interruption Handling
The usual problem with Spot is unreliable recovery. Clusterra solves this with Karpenter (automatic node provisioning across instance families and AZs) and Slurm's native requeue. When a Spot instance is reclaimed:
- Karpenter provisions a replacement from a different instance family
- Slurm requeues the interrupted task
- Nextflow resumes from the last checkpoint
No silent fallback to On-Demand. No manual intervention. Spot utilization stays above 95%.
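To make the requeue behavior concrete, here is a toy simulation of the sequence above. It is not Slurm, Karpenter, or Nextflow code: the task names, the `interrupted_attempts` set, and the `max_attempts` limit are all assumptions made for illustration. The point it shows is that an interrupted task goes back on the queue while every other task is unaffected.

```python
from collections import deque


def run_with_requeue(tasks, interrupted_attempts, max_attempts=3):
    """Process tasks in order; requeue any (task, attempt) marked as interrupted."""
    queue = deque((name, 1) for name in tasks)
    completed, log = [], []
    while queue:
        name, attempt = queue.popleft()
        if (name, attempt) in interrupted_attempts:
            if attempt >= max_attempts:
                raise RuntimeError(f"{name} interrupted {attempt} times, giving up")
            # Node reclaimed: put only this task back on the queue.
            log.append(f"{name}: attempt {attempt} interrupted, requeued")
            queue.append((name, attempt + 1))
        else:
            completed.append(name)
            log.append(f"{name}: attempt {attempt} completed")
    return completed, log


# Hypothetical pipeline where one scattered calling task loses its node once.
tasks = ["align", "markdup", "call_chr1", "call_chr2"]
completed, log = run_with_requeue(tasks, interrupted_attempts={("call_chr1", 1)})

for line in log:
    print(line)
```

Only `call_chr1` runs twice; the other three tasks complete on their first attempt, which is why short, independent tasks keep the cost of an interruption small.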
Beyond Nextflow: Why Managed Slurm
If your team only runs Nextflow, any compute backend works. But many computational biology teams also need:
- Molecular dynamics (GROMACS, AMBER) — multi-node MPI, native Slurm support
- Structure prediction (AlphaFold, ESMFold) — GPU scheduling via Slurm GRES
- Custom batch scripts — non-containerized workloads that don't fit Nextflow
On Clusterra, all of these run on the same cluster. One set of credentials, one cost dashboard, one place to monitor jobs. No need for separate Batch, ParallelCluster, or SageMaker setups.
What Other Platforms Do Well
Every platform has strengths worth acknowledging:
Seqera Platform offers Data Studios (interactive Jupyter/RStudio), a polished GUI pipeline launcher, and deep nf-core marketplace integration. For teams that need interactive analysis environments alongside pipeline execution, Seqera's UX is ahead.
AWS HealthOmics provides Ready2Run pipelines, zero infrastructure management, and built-in compliance certifications (HIPAA, SOC 2). For regulated teams running standard genomics workflows, the convenience is real.
AWS Batch is free to use (you only pay for EC2) and deeply integrated with AWS services. For teams with infrastructure expertise who want full control, it's a solid foundation.
Clusterra's niche is the team that prioritizes low compute costs, Spot reliability, mixed workloads, and Slurm familiarity — without needing to manage the infrastructure themselves.
Who This Is For
| Your Situation | Best Fit |
|---|---|
| Need polished GUI + Data Studios | Seqera Platform |
| Need compliance certifications today | AWS HealthOmics |
| Want full control, have infra expertise | AWS Batch (DIY) |
| Want low cost + Spot + mixed workloads | Clusterra |
| Budget-conscious, know Slurm | Clusterra |
Try It
- Sign up at clusterra.cloud — no credit card required
- Run a test pipeline:
    nextflow run nf-core/rnaseq -profile slurm --input samplesheet.csv
- Check the cost: per-task and total pipeline cost are visible in the console
Your first pipeline runs in under 5 minutes.
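The test command expects a `samplesheet.csv`. A minimal sketch for generating one is shown here, assuming the column layout nf-core/rnaseq documents for its input (`sample`, `fastq_1`, `fastq_2`, `strandedness`); check the documentation for the pipeline version you run. The sample names and S3 paths are placeholders, not real data.

```python
import csv
import io

# Column names as documented for nf-core/rnaseq input (assumption: recent releases).
COLUMNS = ["sample", "fastq_1", "fastq_2", "strandedness"]


def build_samplesheet(samples):
    """Render a list of sample dicts as CSV text with the expected header."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for sample in samples:
        writer.writerow(sample)
    return buf.getvalue()


# Placeholder entries: replace with your own sample IDs and FASTQ locations.
samples = [
    {
        "sample": "CONTROL_REP1",
        "fastq_1": "s3://my-bucket/control_rep1_R1.fastq.gz",
        "fastq_2": "s3://my-bucket/control_rep1_R2.fastq.gz",
        "strandedness": "auto",
    },
]

samplesheet_text = build_samplesheet(samples)
print(samplesheet_text)
```

Writing `samplesheet_text` to `samplesheet.csv` in the launch directory produces the file the command refers to.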
Built by the former Product Manager for AWS Batch and AWS Parallel Computing Service.
Want to compare costs on your specific pipelines? Reach out at hello@clusterra.cloud — we're happy to help.