Penn AI Framework Identifies GPNMB as Multi-Cancer CAR T Target

Penn team uses LLM-guided screening to surface GPNMB

Researchers at the Perelman School of Medicine and Abramson Cancer Center, working with teams at Mount Sinai and RWTH Aachen, developed a human-in-the-loop AI framework to nominate antigens for CAR T cell therapy. The system integrates large language models with single-cell RNA sequencing data from skin cancer and healthy tissue to generate candidate target lists that expert scientists then evaluate experimentally.

The team filtered more than 10,000 potential antigens against CAR T design criteria: tumor composition, tissue specificity, and clinical feasibility. Multiple LLMs then ran 1,000 independent nomination simulations, so that a candidate had to recur across runs rather than appear once, which reduces noise and mitigates hallucinations. The consensus list ranked GPNMB (glycoprotein non-metastatic melanoma protein B) as the top candidate. The researchers engineered a GPNMB-directed CAR T cell and tested it in preclinical models.
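The consensus step described above can be illustrated with a minimal Python sketch. This is not the Penn team's code: the function names, the placeholder antigen labels, the weights, and the mock "nomination" (a weighted random draw standing in for an LLM call) are all assumptions made for illustration. It shows only the general idea of ranking candidates by how often they recur across many independent runs.

```python
import random
from collections import Counter


def mock_nomination_run(candidates, weights, rng, n_picks=3):
    """Hypothetical stand-in for one LLM nomination call.

    Draws up to n_picks distinct candidates, favouring higher weights.
    A real pipeline would replace this with a model call; none is made here.
    """
    pool = list(candidates)
    pool_weights = list(weights)
    picks = []
    for _ in range(min(n_picks, len(pool))):
        # Weighted draw of one index, then remove it so picks stay distinct.
        idx = rng.choices(range(len(pool)), weights=pool_weights, k=1)[0]
        picks.append(pool.pop(idx))
        pool_weights.pop(idx)
    return picks


def consensus_ranking(runs, top_k=5):
    """Rank candidates by the fraction of runs that nominated them."""
    counts = Counter()
    for nominated in runs:
        counts.update(set(nominated))  # count each candidate once per run
    n_runs = len(runs)
    # Sort by descending count, breaking ties alphabetically for stability.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(name, count / n_runs) for name, count in ranked[:top_k]]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    # Placeholder labels and weights, not real antigens or real scores.
    candidates = ["ANTIGEN_A", "ANTIGEN_B", "ANTIGEN_C", "ANTIGEN_D"]
    weights = [5.0, 2.0, 1.0, 1.0]
    runs = [mock_nomination_run(candidates, weights, rng) for _ in range(1000)]
    for name, freq in consensus_ranking(runs, top_k=3):
        print(f"{name}: nominated in {freq:.1%} of runs")
```

A candidate that a single run produces by chance ends up with a low nomination frequency, while one that recurs across most runs rises to the top. That is the sense in which repeated sampling dampens noise before experts review the shortlist.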

In mouse studies, the CAR T cells eliminated tumors in models of melanoma, monoblastic leukemia, and colorectal adenocarcinoma. The findings suggest GPNMB is expressed across multiple tumor types. The Penn team has published the full framework in the methods section of their Cell paper so that other groups can adopt it, and plans to extend the approach to additional cancer types.

Solid tumors need safer targets than blood cancers

CAR T therapies have delivered durable remissions in hematologic malignancies but have struggled to expand into solid tumors. The bottleneck is target selection: identifying antigens that are present on tumor cells but absent or low on healthy tissue requires screening thousands of candidates. This is slow, manual work that has limited clinical progress.

The Penn framework addresses this by automating the initial triage. LLMs scan and rank the large candidate pool; human experts then evaluate the top candidates experimentally. Lead author Daniel Baker described the problem: "Discovering a good CAR target is like trying to find a needle in a haystack, except the haystack keeps growing as more sequencing data becomes available." A reproducible, modular workflow that accelerates this triage could unblock new CAR T programs and shorten the path to clinical validation.

Framework is open; start with your tumor type

The Penn team has published the complete framework in Cell and designed it to be modular and disease-agnostic. CAR T program leads should download the methods, request or reproduce the code, and test the pipeline on their target tumor type before launching traditional target discovery efforts. This is particularly valuable for solid tumor programs, where the haystack is largest and the expert bottleneck is most acute.

The framework depends on publicly available single-cell RNA-seq datasets, so adoption will be fastest for cancers with robust reference cohorts (skin, lung, colorectal, ovarian). Programs working on rarer tumors may need to generate or integrate additional datasets first. The Penn team plans to apply the approach to additional cancer types itself, so the field will accumulate both worked examples and refined protocols.
