Use Case · June 25, 2026 · 3 min read

Rare disease diagnoses jump 5% when AI reruns old genetic tests monthly

Talos, an open-source system co-developed by Microsoft Research, recovered 241 new diagnoses by automatically reanalyzing stored genomic data as scientific knowledge evolves. You can deploy it yourself; here is why monthly cycles beat one-time testing.

Our Take

Talos tackles a real bottleneck, manual review time, by being deliberately conservative rather than permissive: it returns a median of 1.3 candidate variants per family instead of a long ranked list. That design choice is what makes continuous reanalysis sustainable.

Why it matters

Over half of rare disease patients remain undiagnosed after their first genetic test, but re-examining stored data as gene-disease knowledge accumulates can yield answers years later. Automation has been proposed for years; Talos demonstrates it works at scale (4,735-patient cohort, 5.1% additional yield) and at speed (32 days from new knowledge to diagnosis on average).

Do this week

Diagnostic labs: audit which patient cohorts you've tested but never reanalyzed, then pilot monthly Talos runs on the largest undiagnosed group so you can establish baseline yield and staffing costs before full deployment.

Talos delivered 241 new rare disease diagnoses by automating genomic reanalysis

Talos is an open-source tool that reruns genomic analysis on stored patient data whenever new gene-disease relationships or variant classifications become public. Developed through collaboration between the Centre for Population Genomics, Australian Genomics, the Broad Institute, and Microsoft, it was benchmarked against two independent cohorts totaling 1,089 patients who had already undergone manual expert review. On the Australian Acute Care Genomics cohort, Talos recovered 90% of in-scope diagnoses while returning a median of just 1.3 candidate variants per family for expert review. On the U.S. Rare Genomes Project cohort, it recovered 87% of diagnoses at the same operating point, showing consistency across different patient populations.

In a prospective deployment across 4,735 undiagnosed patients from Australian Genomics research studies and a single diagnostic laboratory, Talos produced 241 new diagnoses (5.1% additional yield). The variant behind each of those diagnoses was later confirmed as pathogenic or likely pathogenic by accredited labs. The breakdown of where the diagnoses came from shows why reanalysis matters: 32% emerged from new gene-disease relationships discovered since the original test, 22% from variant reclassifications, and 45% from improved filtering and analysis, including copy number variants and structural variants not examined in the initial test. (The shares sum to 99% because of rounding.)
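The headline figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the per-source counts are estimates derived by applying the rounded percentages to the 241 diagnoses, not figures reported by the Talos team, and the function names are illustrative.

```python
# Figures quoted in the article.
NEW_DIAGNOSES = 241
COHORT_SIZE = 4735

# Reported shares of new diagnoses by source (they sum to 99% due to rounding).
SOURCE_SHARES = {
    "new gene-disease relationships": 0.32,
    "variant reclassifications": 0.22,
    "improved filtering and analysis": 0.45,
}


def additional_yield(diagnoses: int, cohort: int) -> float:
    """Return additional diagnostic yield as a percentage of the cohort."""
    return 100.0 * diagnoses / cohort


def approximate_counts(total: int, shares: dict[str, float]) -> dict[str, int]:
    """Estimate diagnoses per source by applying each rounded share to the total."""
    return {source: round(total * share) for source, share in shares.items()}


if __name__ == "__main__":
    print(f"Additional yield: {additional_yield(NEW_DIAGNOSES, COHORT_SIZE):.1f}%")
    for source, count in approximate_counts(NEW_DIAGNOSES, SOURCE_SHARES).items():
        print(f"  roughly {count} from {source}")
```

Because the published shares are rounded, the estimated counts will not add up to exactly 241.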

To test whether reanalysis could operate continuously rather than as a one-off event, the team ran 29 monthly cycles. Later cycles required minimal human effort: analysts reviewed an average of one new variant per 200 cases per month. Turnaround was fast too. On average, 32 days passed between new evidence appearing in public databases and a patient receiving a diagnosis, and the fastest case turned around in a single day.
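One way to picture why later cycles are so light is to treat each cycle as a set of flagged variants and send analysts only what earlier cycles did not already surface. The sketch below is a minimal illustration of that idea under stated assumptions: the `Flag` record, its fields, and `review_queue` are hypothetical names, not part of Talos.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Flag:
    """One flagged variant for one family (hypothetical record shape)."""
    family_id: str
    variant: str   # illustrative identifier, e.g. "chr1:12345:A>G"
    evidence: str  # the knowledge-base entry that triggered the flag


def review_queue(cycles: Iterable[set[Flag]]) -> Iterator[tuple[int, set[Flag]]]:
    """Yield (cycle number, flags not seen in any earlier cycle)."""
    seen: set[Flag] = set()
    for number, flagged in enumerate(cycles, start=1):
        fresh = flagged - seen  # only evidence analysts have not reviewed yet
        seen |= flagged
        yield number, fresh


if __name__ == "__main__":
    a = Flag("FAM1", "chr1:12345:A>G", "ClinVar: pathogenic")
    b = Flag("FAM2", "chr7:67890:C>T", "new gene-disease link")
    # Three monthly cycles: nothing new in month 2, one new flag in month 3.
    for number, fresh in review_queue([{a}, {a}, {a, b}]):
        print(number, len(fresh))
```

Because the evidence string is part of the record, a variant that is re-flagged for a different reason would appear in the queue again, which seems the safer default for a sketch like this.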

Human review time is the bottleneck, not algorithmic recall

More than half of rare disease patients are still without answers after sequencing. The reason is not technical but institutional: our understanding of the genome improves constantly (hundreds of new gene-disease associations and thousands of new variant classifications are reported yearly), but existing genomes are rarely revisited. A meta-analysis of nearly 9,500 undiagnosed patients found that reanalysis lifted diagnostic yield by about 10% over two years. Yet reanalysis today depends on motivated clinicians and scarce laboratory staff, and reimbursement for it is inconsistent. Most stored genomes never get a second look.

Talos breaks this pattern by optimizing for specificity, not sensitivity. It deliberately returns a short, high-confidence list rather than a ranked list of candidates. When benchmarked head-to-head against Exomiser, a widely used prioritization tool, Talos and Exomiser performed similarly when every variant was reviewed. But when review was limited to realistic budgets (top five or top one ranked variants), Talos significantly outperformed Exomiser (p = 0.017 for top five, p < 0.0001 for top one). The two tools surfaced different variants, suggesting they are complementary.
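The review-budget comparison comes down to a recall-at-k style metric: of the families with a known diagnosis, how many have the causal variant among the first k candidates an analyst would look at? The sketch below shows that calculation on made-up toy data. The variant labels, family IDs, and both candidate lists are invented for illustration and do not represent Talos or Exomiser output.

```python
def recall_at_k(ranked: dict[str, list[str]], causal: dict[str, str], k: int) -> float:
    """Fraction of solved families whose causal variant is in the top k candidates."""
    if not causal:
        raise ValueError("need at least one solved family")
    hits = sum(
        1
        for family, variant in causal.items()
        if variant in ranked.get(family, [])[:k]
    )
    return hits / len(causal)


if __name__ == "__main__":
    # Toy data only: the known causal variant for each solved family.
    causal = {"F1": "v1", "F2": "v9", "F3": "v4"}
    # A short, conservative list: few candidates, some families get none.
    short_list = {"F1": ["v1"], "F2": ["v9", "v3"], "F3": []}
    # A longer ranked list: the answer is always present, but often not first.
    long_list = {
        "F1": ["v2", "v5", "v1"],
        "F2": ["v9", "v3", "v6"],
        "F3": ["v7", "v8", "v4"],
    }
    for k in (1, 3):
        print(k, recall_at_k(short_list, causal, k), recall_at_k(long_list, causal, k))
```

In this toy setup the longer list wins when every candidate is reviewed, while the shorter list does better when only the first candidate is checked, which is the general shape of the trade-off described above.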

The economics are also viable. Annotating 1,000 genomes costs roughly $11, and a monthly reanalysis pass runs for a few cents per cohort. This cost structure makes continuous reanalysis feasible for health systems, not just academic research projects.

Deploy on monthly cycles, starting with your largest undiagnosed cohort

Talos is open source and straightforward to deploy in cloud environments like Azure. The system draws on two continuously updated public resources: PanelApp Australia for gene-disease relationships and ClinVar for variant-level pathogenicity. It filters and prioritizes variants using family structure (mode of inheritance, de novo status) and phenotype when available.
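To make the filtering idea concrete, here is a deliberately simplified sketch of combining a gene panel, a ClinVar-style classification, and family structure. It is not Talos's rule set or API. The `Variant` fields, the `panel` mapping, the significance labels, and the two inheritance rules are all assumptions chosen for illustration, and real logic would also need to handle cases such as compound heterozygotes, X-linked inheritance, and phenotype matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical, minimal variant record for one affected individual."""
    gene: str
    clinvar: str   # assumed labels: "pathogenic", "likely_pathogenic", "uncertain"
    zygosity: str  # "het" or "hom"
    de_novo: bool  # absent from both parents (only knowable for trios)


PATHOGENIC = {"pathogenic", "likely_pathogenic"}


def is_candidate(variant: Variant, panel: dict[str, str]) -> bool:
    """Keep a variant only if gene, classification, and inheritance all agree.

    `panel` maps gene symbol to an assumed mode of inheritance,
    either "dominant" or "recessive".
    """
    mode = panel.get(variant.gene)
    if mode is None or variant.clinvar not in PATHOGENIC:
        return False
    if mode == "dominant":
        return variant.de_novo           # toy rule: dominant genes need a de novo hit
    if mode == "recessive":
        return variant.zygosity == "hom"  # toy rule: ignores compound heterozygotes
    return False


if __name__ == "__main__":
    panel = {"GENE_A": "dominant", "GENE_B": "recessive"}  # placeholder gene names
    variants = [
        Variant("GENE_A", "pathogenic", "het", de_novo=True),
        Variant("GENE_B", "likely_pathogenic", "het", de_novo=False),
        Variant("GENE_C", "pathogenic", "hom", de_novo=False),
    ]
    print([v.gene for v in variants if is_candidate(v, panel)])
```

Strict rules like these are what keep the candidate list short. The price is that any variant falling outside them is never shown to an analyst.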

Begin with a cohort of singletons or trios with neurodevelopmental, cardiac, renal, or neurological indications, as these showed a consistent 5–6% additional yield across the deployment. Run Talos once to establish a baseline, then move to monthly cycles. Because later runs return only newly actionable evidence, the workload per cycle is minimal (one variant per 200 cases per month in the deployment trial). Expect about 92% of new diagnoses to come from the first pass. The monthly cycles earn their keep with the rest, ensuring no family waits years for an answer that becomes possible weeks after a scientific discovery is published.
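For staffing estimates, the two rates quoted above can be turned into rough planning numbers. The sketch below is back-of-envelope arithmetic only. It assumes the deployment's rates (one new variant per 200 cases per month, about 92% of diagnoses on the first pass) carry over to your cohort, which may not hold, and the function names are illustrative.

```python
def monthly_review_load(cohort_size: int, variants_per_case: float = 1 / 200) -> float:
    """Expected number of new variants to review per monthly cycle."""
    return cohort_size * variants_per_case


def split_first_pass(total_diagnoses: int, first_pass_share: float = 0.92) -> tuple[int, int]:
    """Estimate (first-pass diagnoses, diagnoses from later cycles)."""
    first = round(total_diagnoses * first_pass_share)
    return first, total_diagnoses - first


if __name__ == "__main__":
    # Using the deployment's own figures as an example.
    print(f"{monthly_review_load(4735):.1f} variants to review per month")
    first, later = split_first_pass(241)
    print(f"about {first} diagnoses on the first pass, about {later} from later cycles")
```

Swapping in your own cohort size gives a first estimate of analyst time per cycle before a pilot produces real numbers.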

#Healthcare AI #Open Source #Research