Meta's Privacy Classifier Keeps LLMs Out of Production Decisions

Meta Built a Two-Lane Classification System

Meta published a case study on privacy-aware infrastructure (PAI) that details how it classifies data assets—tables, columns, nested fields, ML features, embeddings, log keys—before privacy controls can enforce retention, access, purpose, or sharing policies.

The system runs two lanes, and each request takes one of them. The deterministic lane handles routine cases: approximately 85% of classification requests go through versioned, auditable rules that execute in single-digit milliseconds (roughly 40 ms end to end once context assembly is included). The remaining 15%, the novel or ambiguous assets, go to an LLM fallback, which runs on a separate budget and returns results in seconds. A nightly offline loop samples served decisions, compares them against human-reviewed ground truth, and feeds validated patterns back into the deterministic ruleset as new versioned rules. Rule promotions take effect only after humans sign off.
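The routing described above can be sketched roughly as follows. This is a minimal illustration, not Meta's implementation: the `Rule` and `Decision` types, the `classify` and `llm_fallback` functions, and the two example rules are all hypothetical names invented here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical versioned rule: a predicate over asset metadata plus a label."""
    rule_id: str
    version: int
    matches: Callable[[dict], bool]
    category: str


@dataclass(frozen=True)
class Decision:
    category: str
    lane: str                      # "deterministic" or "llm_fallback"
    rule_id: Optional[str]         # set only when a rule matched
    rule_version: Optional[int]


def llm_fallback(asset: dict) -> str:
    """Stand-in for the slower LLM lane; a real system would call a model here."""
    return "needs_review"


def classify(asset: dict, rules: list[Rule]) -> Decision:
    # Lane 1: the first matching versioned rule wins, so the result is replayable.
    for rule in rules:
        if rule.matches(asset):
            return Decision(rule.category, "deterministic", rule.rule_id, rule.version)
    # Lane 2: nothing matched, so the asset is treated as novel or ambiguous.
    return Decision(llm_fallback(asset), "llm_fallback", None, None)


# Illustrative rules only; real rules would be far richer than name checks.
RULES = [
    Rule("email_column", 3, lambda a: a.get("name", "").endswith("_email"), "user_data"),
    Rule("cache_ttl", 1, lambda a: a.get("name") == "ttl_seconds", "operational_data"),
]

known = classify({"name": "contact_email"}, RULES)
novel = classify({"name": "emb_v7_slice"}, RULES)
print(known.lane, known.category)
print(novel.lane, novel.category)
```

Recording the rule id and version on every deterministic decision is what would make a later replay possible: the same asset against the same rule version should give the same answer.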

The system separates context from prompting. Before the LLM sees an asset, the system builds an "evidence brief" from multiple sources: source-code resolution, ownership metadata, semantic annotations, data lineage, ML heuristics, and code search results. The evidence brief pre-ranks signals by reliability, highlights supporting and contradicting signals separately, and masks circular fields (like pre-existing labels) to prevent the model from grading its own homework. The model reasons over the curated evidence, not raw context dumps.
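One way to picture the evidence brief is a small function that drops circular fields, ranks what remains by source reliability, and splits supporting from contradicting signals. The sketch below is an assumption-laden illustration: the `Signal` type, the `build_brief` function, the reliability weights, and the `existing_label` source name are hypothetical, and the case study does not publish its actual weights or schema.

```python
from dataclasses import dataclass

# Hypothetical reliability weights per signal source (higher means more trusted).
RELIABILITY = {
    "source_code": 0.9,
    "lineage": 0.8,
    "semantic_annotation": 0.7,
    "ownership": 0.6,
    "code_search": 0.5,
    "ml_heuristic": 0.4,
}

# Hypothetical sources that would leak a prior answer to the model.
CIRCULAR_SOURCES = {"existing_label"}


@dataclass(frozen=True)
class Signal:
    source: str
    claim: str
    supports: bool  # True if the signal supports the hypothesis under test


def build_brief(signals: list[Signal]) -> dict:
    # Mask circular fields before anything else sees them.
    usable = [s for s in signals if s.source not in CIRCULAR_SOURCES]
    # Pre-rank by reliability so the strongest evidence comes first.
    ranked = sorted(usable, key=lambda s: RELIABILITY.get(s.source, 0.0), reverse=True)
    return {
        "supporting": [s.claim for s in ranked if s.supports],
        "contradicting": [s.claim for s in ranked if not s.supports],
        "masked_sources": sorted({s.source for s in signals} & CIRCULAR_SOURCES),
    }


# Hypothesis under test: the column "age" holds user data.
signals = [
    Signal("ml_heuristic", "column name 'age' resembles a demographic attribute", True),
    Signal("existing_label", "asset was previously labeled user_data", True),
    Signal("source_code", "written by cache eviction code as a TTL in seconds", False),
    Signal("ownership", "table is owned by an infrastructure team", False),
]
brief = build_brief(signals)
print(brief)
```

In this toy case the pre-existing label never reaches the model, and the two higher-reliability signals that contradict the "user data" hypothesis are listed ahead of the weak name-based heuristic.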

Each classifier owns one scoped question (e.g., "Is this user data or operational data?" or "Is this eligible for AI training?") and returns a structured contract: a category from a domain-specific taxonomy, a confidence score, a decision trace showing which evidence influenced the result, the rule that matched (if deterministic), and version information for context, rules, and prompt.
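The structured contract could be modeled as a small immutable record. The following is a sketch under assumptions: the field names, the `DataOrigin` taxonomy, and the version strings are hypothetical and chosen only to mirror the elements listed above.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Optional


class DataOrigin(str, Enum):
    """Hypothetical taxonomy for one scoped question."""
    USER_DATA = "user_data"
    OPERATIONAL_DATA = "operational_data"


@dataclass(frozen=True)
class Versions:
    context: str
    rules: str
    prompt: str


@dataclass(frozen=True)
class ClassificationResult:
    question: str
    category: DataOrigin
    confidence: float
    decision_trace: tuple[str, ...]   # evidence that influenced the result
    matched_rule: Optional[str]       # None when the LLM lane answered
    versions: Versions

    def __post_init__(self) -> None:
        # Reject malformed results at construction time.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")

    def to_json(self) -> str:
        # Sorted keys keep the serialized form stable for audit logs.
        return json.dumps(asdict(self), sort_keys=True)


result = ClassificationResult(
    question="Is this user data or operational data?",
    category=DataOrigin.OPERATIONAL_DATA,
    confidence=0.97,
    decision_trace=("source_code: written as cache TTL",),
    matched_rule="cache_ttl@1",
    versions=Versions(context="ctx-12", rules="rules-48", prompt="prompt-5"),
)
payload = json.loads(result.to_json())
print(payload["category"], payload["versions"])
```

Pinning the context, rules, and prompt versions alongside the answer is the piece that lets a reviewer ask which inputs produced a given decision.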

Enforcement Demands Reproducibility, Not Just Accuracy

Asset classification sits at the foundation of the privacy control stack. Every downstream capability (discovery, enforcement, demonstrating compliance) depends on understanding what the data actually is. A wrong classification ripples through the entire stack.

The tension is acute in AI-native systems. Multimodal inputs, fast iteration cycles, derived features, embeddings, and evolving policy interpretations create constant schema drift and novelty. Manual review cannot keep pace with volume and speed. Yet privacy enforcement cannot depend on a black box, because regulators, auditors, and courts will ask: Why did you protect this data this way? What evidence did you use? Can you replay that decision?

Meta's pattern decouples two competing demands. LLMs handle ambiguity, cold start, and novel patterns during learning and discovery. Deterministic, versioned rules handle production enforcement—low-latency, replayable, explainable. The LLM's surface area shrinks over time as stable patterns crystallize into rules. In the common case, logic, not learning, makes the call.
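The way stable patterns might crystallize into rules can be sketched as an offline job that compares LLM-lane decisions with human review and proposes only consistent patterns. Everything here is illustrative: the `promotion_candidates` function, the record fields, and the `MIN_SAMPLES` and `MIN_AGREEMENT` thresholds are assumptions, since the case study does not state its promotion criteria.

```python
from collections import defaultdict

MIN_SAMPLES = 3        # hypothetical: minimum reviewed decisions per pattern
MIN_AGREEMENT = 0.95   # hypothetical: required agreement with human review


def promotion_candidates(reviewed: list[dict]) -> list[dict]:
    """Group reviewed LLM-lane decisions by pattern and propose stable ones.

    Each row carries "pattern", "llm_category", and "human_category".
    """
    by_pattern: dict[str, list[dict]] = defaultdict(list)
    for row in reviewed:
        by_pattern[row["pattern"]].append(row)

    candidates = []
    for pattern, rows in sorted(by_pattern.items()):
        agree = sum(r["llm_category"] == r["human_category"] for r in rows)
        labels = {r["human_category"] for r in rows}
        rate = agree / len(rows)
        # Propose a rule only when reviewers were unanimous and the LLM agreed.
        if len(rows) >= MIN_SAMPLES and rate >= MIN_AGREEMENT and len(labels) == 1:
            candidates.append({
                "pattern": pattern,
                "category": labels.pop(),
                "agreement": rate,
                "status": "pending_human_signoff",  # never promoted automatically
            })
    return candidates


reviewed = [
    {"pattern": "*_email", "llm_category": "user_data", "human_category": "user_data"},
] * 3 + [
    {"pattern": "age", "llm_category": "user_data", "human_category": "user_data"},
    {"pattern": "age", "llm_category": "user_data", "human_category": "operational_data"},
    {"pattern": "age", "llm_category": "operational_data", "human_category": "operational_data"},
]
candidates = promotion_candidates(reviewed)
print(candidates)
```

The ambiguous "age" pattern stays in the LLM lane because reviewers disagreed about it, while the consistent one becomes a proposal that still waits for human sign-off.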

Focus on Context Before Prompting

The case study identifies four recurring failure modes in asset classification: noisy and weak signals (a field called "age" could be a user attribute or a cache TTL), distributed context (code, lineage, ownership, docs, usage patterns live in different systems), evolving requirements (product teams move faster than policy reviews), and error propagation (false positives and false negatives both hurt downstream enforcement).

Meta's takeaway is direct: most classification failures are not prompt failures. They are context failures. Hours of prompt optimization produced marginal gains when the classifier was reasoning over raw, unstructured fields. Structuring the evidence brief—assembling relevant signals, suppressing circular references, weighting reliability—produced much larger accuracy improvements.

The implication for teams building privacy, compliance, or data governance systems is clear. Before optimizing how you ask an LLM to classify, invest in what you feed it. Build lineage. Resolve code references. Surface ownership and annotations. Mask labels that would let the model cheat. Let the evidence do the work. Then use the LLM to reason, not to encode the final enforcement rule. Move validated patterns into deterministic, auditable logic as soon as they stabilize.
