Our Take
Solid engineering with rigorous safety gates, but success depends on whether clinicians trust locally run AI over established decision trees.
Why it matters
Hospital IT teams can finally deploy clinical AI without cloud dependencies, addressing the primary barrier to oncology AI adoption in privacy-sensitive environments.
Do this week
Hospital IT teams: evaluate AMD MI300X procurement costs against cloud API expenses before Q3 budget cycles so you can present on-premises AI options.
Researchers built a privacy-preserving oncology AI system
OncoAgent routes clinical queries through an additive complexity scorer to either a 9B-parameter model for simple cases or a 27B-parameter model for complex presentations. The system runs entirely on AMD Instinct MI300X hardware with 192GB of HBM3 memory, eliminating the cloud API dependencies that block hospital deployment under patient data sovereignty requirements.
The complexity router assigns weighted scores based on cancer type (rare cancers +0.40 points), staging (Stage IV +0.25), mutation count (multiple mutations +0.30), and prior treatments (+0.10). Cases scoring above 0.5 route to the deeper reasoning model. A Stage IV pancreatic carcinoma case with KRAS and BRCA2 mutations scored 0.80, correctly routing to Tier 2 (per the research paper).
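The additive scoring described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the four weights and the 0.5 threshold come from the text, while the `Case` fields, the function names, and the tier labels are assumptions made for the example. Because it uses only the four quoted weights, it is not expected to reproduce the paper's 0.80 score for the pancreatic case, which may depend on scoring terms not listed here.

```python
from dataclasses import dataclass

# Weights and threshold as quoted in the text; all names are illustrative.
WEIGHTS = {
    "rare_cancer": 0.40,
    "stage_iv": 0.25,
    "multiple_mutations": 0.30,
    "prior_treatment": 0.10,
}
THRESHOLD = 0.5


@dataclass
class Case:
    """Hypothetical case record; the real input schema is not given in the text."""
    rare_cancer: bool
    stage: str                 # e.g. "II" or "IV"
    mutations: list[str]
    prior_treatments: int


def complexity_score(case: Case) -> float:
    """Sum the weight of every factor that applies to the case."""
    score = 0.0
    if case.rare_cancer:
        score += WEIGHTS["rare_cancer"]
    if case.stage.upper() == "IV":
        score += WEIGHTS["stage_iv"]
    if len(case.mutations) > 1:
        score += WEIGHTS["multiple_mutations"]
    if case.prior_treatments > 0:
        score += WEIGHTS["prior_treatment"]
    # Round to two decimals so float noise cannot flip a borderline comparison.
    return round(score, 2)


def route(case: Case) -> str:
    """Scores above the threshold go to the larger model."""
    return "tier2_27b" if complexity_score(case) > THRESHOLD else "tier1_9b"
```

Under these assumptions, a Stage IV case with two mutations and no other factors scores 0.25 + 0.30 and still crosses the threshold, so staging plus mutation burden alone is enough to reach Tier 2.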
Training used QLoRA fine-tuning on 266,854 oncological cases from PMC patient reports, Asclepius medical QA data, and synthetic cases generated by Qwen 3.6-27B. The AMD MI300X hardware completed full-dataset fine-tuning in approximately 50 minutes, delivering 56× throughput acceleration over API-based generation (company-reported).
The retrieval pipeline grounds responses in 77 physician-grade NCCN and ESMO guidelines. Document relevance grading achieved a 100% success rate, with mean RAG confidence scores of 2.3+, after the grading component was switched from Qwen 3.5 to Qwen 2.5 Instruct (per the technical preprint).
On-premises deployment removes the primary adoption barrier
Most clinical AI systems fail hospital adoption because they require sending patient data to cloud APIs, which can conflict with HIPAA compliance policies and institutional data governance frameworks. OncoAgent's full on-premises deployment addresses this directly while maintaining clinical safety through deterministic validation layers.
The three-layer safety cascade runs formatting checks, rule-based scans for prohibited patterns, and LLM entailment verification before any output reaches clinicians. The system enforces mandatory human-in-the-loop interrupts for complex cases and low-confidence outputs, with fallback nodes returning clinical refusals rather than hallucinated recommendations.
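A cascade of this shape can be sketched as three checks run in order, with a refusal returned as soon as one fails. The sketch below is an assumption-laden illustration rather than OncoAgent's code: the prohibited patterns, the format rule, the refusal text, and all function names are hypothetical, and the LLM entailment step is passed in as a plain callable so the example runs without a model.

```python
import re
from typing import Callable

# Hypothetical prohibited patterns; the real rule set is not given in the text.
PROHIBITED = [
    re.compile(r"\bguaranteed cure\b", re.IGNORECASE),
    re.compile(r"\bstop all medication\b", re.IGNORECASE),
]

# Hypothetical refusal message returned by the fallback path.
REFUSAL = "Unable to provide a recommendation; please consult the treating oncologist."


def check_format(draft: str) -> bool:
    """Layer 1 (assumed rule): draft is non-empty and has a 'Recommendation:' section."""
    return bool(draft.strip()) and "Recommendation:" in draft


def check_rules(draft: str) -> bool:
    """Layer 2: rule-based scan; passes only if no prohibited pattern matches."""
    return not any(p.search(draft) for p in PROHIBITED)


def run_cascade(draft: str, evidence: str,
                entails: Callable[[str, str], bool]) -> str:
    """Run the layers cheapest-first; return the draft only if all three pass."""
    if not check_format(draft):
        return REFUSAL
    if not check_rules(draft):
        return REFUSAL
    # Layer 3: entailment check (evidence must support the draft), injected
    # as a callable standing in for the LLM verifier.
    if not entails(evidence, draft):
        return REFUSAL
    return draft
```

Ordering the layers from cheapest to most expensive means the model-based check only runs on drafts that have already passed the deterministic ones.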
Per-patient memory isolation using unique thread IDs prevents cross-contamination between clinical sessions while enabling multi-turn consultations within individual cases.
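The isolation idea can be illustrated with a store that keys every conversation by its thread ID, so a lookup for one case can never return another case's turns. This is a simplified in-memory sketch; the class and method names are hypothetical, and the text does not say how OncoAgent actually persists session state.

```python
import uuid


class SessionMemory:
    """Illustrative in-memory store: one message list per thread ID."""

    def __init__(self) -> None:
        self._threads: dict[str, list[dict[str, str]]] = {}

    def new_thread(self) -> str:
        """Create an empty thread and return its unique ID."""
        thread_id = uuid.uuid4().hex
        self._threads[thread_id] = []
        return thread_id

    def append(self, thread_id: str, role: str, content: str) -> None:
        """Add one turn to an existing thread; unknown IDs raise KeyError."""
        self._threads[thread_id].append({"role": role, "content": content})

    def history(self, thread_id: str) -> list[dict[str, str]]:
        """Return a copy of one thread's turns so callers cannot mutate the store."""
        return [dict(turn) for turn in self._threads[thread_id]]


memory = SessionMemory()
case_a = memory.new_thread()
case_b = memory.new_thread()
memory.append(case_a, "user", "Stage IV, KRAS and BRCA2 positive.")
memory.append(case_a, "user", "Any second-line options?")
# case_b's history stays empty: turns are only reachable through their own ID.
```

Multi-turn consultation works because repeated calls with the same thread ID accumulate in one list, while a different ID always starts from an empty history.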
Evaluate hardware costs against API expenses
Hospital IT teams should calculate AMD MI300X procurement and operational costs against projected cloud API volumes. The 192GB HBM3 specification supports both model tiers simultaneously, but it requires substantial upfront capital investment compared with pay-per-query cloud alternatives.
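One simple way to frame that comparison is a break-even query volume: the number of queries at which cumulative cloud API spend equals the total on-premises cost over a planning horizon. The helper below is a rough sketch under stated assumptions. Every figure in the usage lines is a placeholder, not a quoted price, and the model ignores factors such as depreciation, staffing, and utilization.

```python
def onprem_total_cost(capex: float, monthly_opex: float, months: int) -> float:
    """Upfront hardware cost plus operating cost over the horizon."""
    return capex + monthly_opex * months


def break_even_queries(capex: float, monthly_opex: float,
                       months: int, cost_per_query: float) -> float:
    """Total query count at which cloud spend matches on-premises spend."""
    if cost_per_query <= 0:
        raise ValueError("cost_per_query must be positive")
    return onprem_total_cost(capex, monthly_opex, months) / cost_per_query


# Placeholder inputs for illustration only; substitute real vendor quotes.
total_queries = break_even_queries(
    capex=100_000.0,       # hypothetical hardware and installation cost
    monthly_opex=2_000.0,  # hypothetical power, cooling, and support
    months=36,             # planning horizon
    cost_per_query=0.05,   # hypothetical blended cloud API price
)
queries_per_month = total_queries / 36
```

If the projected monthly volume sits well above `queries_per_month`, the on-premises option is cheaper over the horizon under these assumptions; if it sits well below, pay-per-query remains cheaper on cost alone.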
Clinical teams should audit existing oncology decision support workflows to identify integration points where dual-tier routing could reduce cognitive load without disrupting established protocols. The complexity scoring system may require local calibration based on institutional case mix and specialist availability.
Privacy officers should review the Zero-PHI policy implementation and on-premises deployment architecture against current data governance requirements before pilot deployment approval.