Fine-tune protein models on one GPU, 1% of parameters, full accuracy

NVIDIA ships ready-to-run LoRA recipes for billion-scale biology models

NVIDIA published BioNeMo Recipes, step-by-step templates for fine-tuning large pretrained biology foundation models using Low-Rank Adaptation (LoRA). The recipes support ESM2, a 3-billion-parameter protein language model, and Evo2, a 1-billion-parameter DNA model. Both run on a single RTX 6000 Blackwell Workstation Edition GPU.

LoRA freezes the pretrained backbone and trains only small low-rank adapter matrices that run in parallel with the frozen weights. For protein secondary structure prediction (PSSP) with ESM2-3B, the team attached LoRA adapters to the query/key/value projections and reached 84.80% Q3 accuracy and 74.30% Q8 accuracy (per company testing). These numbers match or slightly exceed published baselines: Porter 6 reported 84.56% Q3 and 74.18% Q8; NetSurfP-3.0 reported 82.92% Q3 and 71.84% Q8.
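To make the adapter idea concrete, here is a minimal pure-Python sketch of a single LoRA-adapted linear layer. It is not taken from the BioNeMo Recipes; the class name `LoRALinear`, the `alpha / rank` scaling, and the zero-initialized `B` matrix follow the convention of the original LoRA paper and are assumptions about how any given recipe is configured.

```python
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Illustrative LoRA layer: y = W x + (alpha / rank) * B (A x).

    W is the frozen pretrained weight; only A and B would be trained.
    Names and defaults here are hypothetical, for illustration only.
    """

    def __init__(self, weight, rank, alpha, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight            # frozen backbone weight (d_out x d_in)
        self.scale = alpha / rank       # conventional LoRA scaling factor
        # A starts with small random values, B starts at zero, so the
        # adapter contributes nothing until training updates B.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)               # frozen path
        delta = matvec(self.B, matvec(self.A, x))   # low-rank adapter path
        return [b + self.scale * d for b, d in zip(base, delta)]

    def trainable_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


layer = LoRALinear([[1.0, 2.0], [3.0, 4.0]], rank=1, alpha=2.0)
print(layer.forward([1.0, 1.0]), layer.trainable_params())
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly, which is why LoRA fine-tuning begins from the pretrained model's behavior.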

For splice-site classification with Evo2-1B, LoRA plus a lightweight classification head reached 96.6% test accuracy with only 1.42% of model parameters trainable (15.99 million). A head-only baseline with no adapters scored 52.3%, which indicates that the adapters, not the head, account for most of the gain. The recipes use PyTorch, Hugging Face, and Megatron-Bridge patterns familiar to biology teams.
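The parameter figures can be sanity-checked with a few lines of arithmetic. The sketch below uses only the two numbers reported above (15.99 million trainable parameters, 1.42% of the model); the helper `lora_param_count` and the example layer shape are hypothetical and are not Evo2's actual dimensions.

```python
def lora_param_count(d_in, d_out, rank):
    """Parameters added by one rank-r adapter on a d_out x d_in weight:
    A is (rank x d_in) and B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Figures reported in the article.
reported_trainable = 15.99e6
reported_fraction = 0.0142

# Total parameter count implied by those two figures.
implied_total = reported_trainable / reported_fraction
print(f"implied total parameters: {implied_total / 1e9:.2f} billion")

# Hypothetical layer shape, only to show how small an adapter is
# relative to the dense weight it sits beside.
d_in, d_out, rank = 4096, 4096, 16
adapter = lora_param_count(d_in, d_out, rank)
dense = d_in * d_out
print(f"adapter/dense ratio for this example layer: {adapter / dense:.4f}")
```

The implied total lands a little above one billion, consistent with the "1B" label on the model. Whether the reported percentage counts the classification head in the numerator is not stated in the source.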

Sequence packing, which eliminates padding overhead, sped up training. The packed THD format achieved 5.5x the throughput of the padded BSHD format by concatenating non-padding tokens and tracking sequence boundaries with metadata. A full ESM2-3B LoRA workflow completed in under one hour on a single workstation GPU.
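The difference between padded and packed batches can be sketched in plain Python. This is an illustration of the idea only, not the recipes' data loader; the function names are hypothetical, and `cu_seqlens` (cumulative sequence lengths) is borrowed from the naming commonly used by packed-attention kernels as an assumption about what the boundary metadata looks like.

```python
def pad_sequences(seqs, pad_id=0):
    """BSHD-style batch: every sequence padded to the longest one."""
    max_len = max(len(s) for s in seqs)
    return [list(s) + [pad_id] * (max_len - len(s)) for s in seqs]


def pack_sequences(seqs):
    """THD-style batch: real tokens concatenated into one flat list,
    with cumulative offsets marking where each sequence starts and ends."""
    tokens = []
    cu_seqlens = [0]
    for s in seqs:
        tokens.extend(s)
        cu_seqlens.append(cu_seqlens[-1] + len(s))
    return tokens, cu_seqlens


def padding_fraction(seqs):
    """Share of a padded batch that is wasted on pad tokens."""
    max_len = max(len(s) for s in seqs)
    real = sum(len(s) for s in seqs)
    return 1 - real / (max_len * len(seqs))


batch = [[5, 6, 7], [8], [9, 10]]
padded = pad_sequences(batch)
packed, cu_seqlens = pack_sequences(batch)
print(padded, packed, cu_seqlens, padding_fraction(batch))
```

Sequence `i` occupies `packed[cu_seqlens[i]:cu_seqlens[i + 1]]`, so attention can be restricted to each sequence without computing anything for pad positions. The achievable speedup depends on how uneven the sequence lengths in a dataset are.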

Parameter-efficient adaptation removes the compute barrier for biology teams

Fully fine-tuning a billion-parameter model requires storing and updating every parameter along with its gradients and optimizer state, which quickly becomes impractical on a single GPU. LoRA keeps the backbone frozen and trains only the adapter matrices, cutting the trainable parameter count to roughly 1% of the original model and shrinking gradient and optimizer-state memory with it.
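A back-of-the-envelope estimate shows where the savings come from. The sketch below assumes mixed-precision training with Adam at about 16 bytes per trainable parameter (2 for the half-precision weight, 2 for its gradient, 12 for the full-precision master copy and two optimizer moments) and 2 bytes per frozen parameter. These byte counts are assumptions rather than figures from the recipes, and activation memory is ignored entirely.

```python
BYTES_PER_TRAINABLE = 16  # assumed: bf16 weight + bf16 grad + fp32 master, m, v
BYTES_PER_FROZEN = 2      # assumed: bf16 weight only, no grad or optimizer state


def full_finetune_bytes(n_params):
    """Weights, gradients, and optimizer state for every parameter."""
    return n_params * BYTES_PER_TRAINABLE


def lora_bytes(n_params, n_trainable):
    """Frozen backbone weights plus full training state for adapters only."""
    return n_params * BYTES_PER_FROZEN + n_trainable * BYTES_PER_TRAINABLE


n_params = 3_000_000_000      # a 3-billion-parameter backbone
n_trainable = n_params // 100  # roughly 1% trainable

print(f"full fine-tuning: {full_finetune_bytes(n_params) / 1e9:.1f} GB")
print(f"LoRA:             {lora_bytes(n_params, n_trainable) / 1e9:.2f} GB")
```

Under these assumptions the frozen weights dominate the LoRA footprint: the backbone still has to sit in memory, but the gradient and optimizer-state terms shrink in proportion to the trainable fraction.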

The recipes come with working code, not API abstractions. Teams can customize hyperparameters (rank, target modules, dropout) for their datasets and architectures. Both token-classification (PSSP) and sequence-classification (splice sites) examples are included, covering the two common task shapes in computational biology.
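For readers unfamiliar with the PSSP metrics, Q3 and Q8 are per-residue accuracies over three and eight secondary-structure classes. The sketch below shows one way to compute both from label strings. The eight-to-three mapping used here is one common reduction of DSSP states and is an assumption; benchmark datasets may define it differently.

```python
# Assumed 8-state to 3-state reduction: helices -> H, strands -> E, rest -> C.
Q8_TO_Q3 = {
    "H": "H", "G": "H", "I": "H",
    "E": "E", "B": "E",
    "T": "C", "S": "C", "C": "C", "L": "C", "-": "C",
}


def to_q3(labels):
    """Collapse an 8-state label string to 3 states."""
    return "".join(Q8_TO_Q3[c] for c in labels)


def per_residue_accuracy(pred, true):
    """Fraction of residues whose predicted label matches the true label."""
    if len(pred) != len(true):
        raise ValueError("prediction and target must have the same length")
    return sum(p == t for p, t in zip(pred, true)) / len(true)


true_q8 = "HHGEEBTS"
pred_q8 = "HHHEEETC"

q8 = per_residue_accuracy(pred_q8, true_q8)
q3 = per_residue_accuracy(to_q3(pred_q8), to_q3(true_q8))
print(f"Q8 = {q8:.3f}, Q3 = {q3:.3f}")
```

Q3 is always at least as high as Q8 for the same predictions, because confusing two states within the same coarse class counts as an error only under the eight-class scheme.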

The benchmarks are honest: results are compared against published baselines on standard datasets (Porter 6 for PSSP, Nucleotide Transformer for splice sites), not against vendor-only internal tests. LoRA does not outperform full fine-tuning; it matches it while using far less compute and storage.

Test LoRA on your own biology downstream task

If you currently skip fine-tuning foundation models because full parameter updates are too expensive, LoRA reduces that friction. Clone the BioNeMo Recipes repository, prepare your own dataset in the same format as the Porter 6 or Nucleotide Transformer splits, and run the ESM2 or Evo2 recipes to measure accuracy on your task.

LoRA hyperparameters (rank, target modules) will vary by dataset size and task complexity. The recipes include validation curves and top-five-checkpoint selection logic, standard practices for avoiding overfitting on small biology datasets.
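Keeping the best few checkpoints by validation score is simple to implement. The following is a minimal sketch of a top-k tracker using Python's `heapq`; the class `TopKCheckpoints` and its methods are hypothetical and do not reflect how the recipes implement their selection logic.

```python
import heapq


class TopKCheckpoints:
    """Keep the k checkpoints with the highest validation metric."""

    def __init__(self, k=5):
        self.k = k
        self._heap = []  # min-heap of (metric, step, path); worst kept entry on top

    def update(self, metric, step, path):
        """Register a checkpoint. Returns the path that can be deleted
        (an evicted older checkpoint or the new one), or None."""
        entry = (metric, step, path)
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, entry)
            return None
        if metric > self._heap[0][0]:
            evicted = heapq.heapreplace(self._heap, entry)
            return evicted[2]
        return path  # new checkpoint is not good enough to keep

    def best(self):
        """Kept checkpoints, best first."""
        return sorted(self._heap, reverse=True)


tracker = TopKCheckpoints(k=2)
for step, acc in enumerate([0.50, 0.70, 0.60, 0.40], start=1):
    to_delete = tracker.update(acc, step, f"ckpt_{step}.pt")
    print(step, acc, "delete:", to_delete)
print([path for _, _, path in tracker.best()])
```

Selecting on validation accuracy rather than the final training step guards against keeping a checkpoint from after the model has begun to overfit, which matters most on small datasets.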
