Law Firms Now Train Custom AI Models on Their Own Workflows

Harvey Confirms Open Source Model Training with Law Firms

Harvey CEO Winston Weinberg told Artificial Lawyer the company is running proof-of-concept studies with law firms to fine-tune open source large language models on firm workflows and client-specific practices. The goal is not just to encode how work happens internally, but to encode the entire experience from law firm through to the client, so that automation can be tailored to recurring needs.

This mirrors moves elsewhere in legal tech. Kirkland & Ellis, after announcing a $500 million AI investment with Palantir, has begun hiring AI infrastructure experts with GPU cluster experience, suggesting plans for in-house open source training. Thomson Reuters has also been training open source LLMs on its legal research corpus to augment its commercial AI tools.

Harvey co-founder Gabe Pereyra outlined the ambition on X: a legal foundation model series designed to serve frontier-quality intelligence at lower cost with stronger security, and to let law firms "own their own intelligence." The models will target complex client matters spanning months and involving dozens of associates, orchestrating legal tech tools, sub-agents, and escalations to frontier models or humans. Harvey has open-sourced benchmarks representing associate and in-house lawyer work and reports "promising results" when post-training open source models on legal tasks.

The Secret Sauce Bet Is Weaker Than It Sounds

The marketing hook, that law firms can "bottle" their proprietary methods and lock in competitive advantage, rests on a shaky premise. Most legal work product has no durable moat. A contract drafted by a Manhattan elite firm may look different from a High Street rival's version, but the differences are legible to competitors. Documents circulate. Methods leak. Tax structures that are genuinely unique to one partner at one firm are the exception, not the rule.

What firms may legitimately own is narrower: the relationship itself. How one firm's lawyers interact with a specific client, the client's playbooks, and the orchestration of past work may all differ from what another firm would do. But even that difference comes down to human factors, not secret methodology. As one expert put it to Artificial Lawyer, the difference between how Ford and Tesla build cars is real but not planetary. Law firms operate within an even narrower margin.

The real driver of this shift is not exceptionalism but pragmatism. Data stays on-premises or in controlled environments. Firms believe they can get better performance from models fine-tuned on their own patterns than from generic frontier models alone. And agentic workflows, systems that combine reference data, client playbooks, process orchestration, and a specialized LLM, do create tighter automation for recurring work. That's defensible. The mythology of irreplicable expertise is not.

What Firms Should Do Now

Stop assuming your workflows are unique unless you can defend that claim to a peer at a competitor. Audit your recurring client work for three things: (1) data sensitivity that justifies on-premises training, (2) process complexity that generic models handle poorly, and (3) client relationships where the experience difference (not the output) is defensible. Invest in custom model training only in those buckets. For the rest, use frontier models with better prompting and retrieval. The cost and operational burden of maintaining custom fine-tuned models are real, and they are not worth incurring to defend a claim you cannot actually prove.
