News · May 8, 2026 · 2 min read

Mozilla finds 271 Firefox bugs with AI, reports no false positives

Custom harness around Anthropic's Mythos model delivered vulnerability reports that Mozilla engineers could act on without the usual AI hallucination cleanup.

By Agentic Daily · Verified Source: Ars Technica

Our Take

Mozilla built serious engineering around Mythos rather than just prompting it, which explains why their results actually held up under human review.

Why it matters

Security teams have burned cycles on AI tools that generate plausible-sounding but fake vulnerability reports. Mozilla's harness approach offers a template for making AI vulnerability detection operationally useful.

Do this week

Security teams: audit your current AI vulnerability tools for false positive rates before expanding usage beyond proof-of-concept work.

Mozilla found 271 Firefox vulnerabilities in two months

Mozilla engineers used Anthropic's Mythos AI model to identify 271 security vulnerabilities in Firefox source code over two months, with what they describe as "almost no false positives" (per Mozilla's engineering blog). The key difference from previous AI vulnerability detection attempts was Mozilla's custom "harness" that wrapped around Mythos to guide it through specific tasks.

Mozilla Distinguished Engineer Brian Grinstead described the harness as "the code that drives the LLM in order to accomplish a goal." The harness provided Mythos with the same tools Mozilla's human developers use, including file read/write access, test case evaluation capabilities, and the special Firefox build used for testing. This represented significant engineering investment to customize the system for Mozilla's specific codebase, tooling, and development processes.
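Mozilla has not published the harness itself, but the pattern Grinstead describes, a loop that hands the model developer tools and feeds the results back until it reaches a goal, can be sketched roughly as follows. Every name here (`query_model`, the tool set, the file names) is hypothetical, not Mozilla's actual API:

```python
# Minimal sketch of an agent "harness": a loop that gives a model
# access to developer-style tools and feeds results back into its
# context. All names are illustrative, not Mozilla's implementation.

def query_model(history):
    # Stand-in for a real LLM call; a real harness would send `history`
    # to the model and parse a tool request out of its reply.
    if not any(step[0] == "read_file" for step in history):
        return ("read_file", "parser.c")
    return ("report", "possible out-of-bounds read in parser.c")

TOOLS = {
    "read_file": lambda path: f"<contents of {path}>",
    "run_tests": lambda case: "tests passed",
}

def run_harness(goal, max_steps=10):
    history = [("goal", goal)]
    for _ in range(max_steps):
        action, arg = query_model(history)
        if action == "report":          # model is done: emit a finding
            return arg
        result = TOOLS[action](arg)     # execute the requested tool
        history.append((action, result))
    return None                         # step budget exhausted

finding = run_harness("find memory-safety bugs")
```

The step budget and the tool whitelist are the levers that keep the model on task; Mozilla's version would additionally wire in its test-case evaluator and the special Firefox build mentioned above.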

Previous attempts at AI-assisted vulnerability detection had produced what Mozilla engineers called "unwanted slop": plausible-sounding bug reports that proved to be largely hallucinated when human developers investigated them.

False positives have killed AI security tools

Security teams have repeatedly encountered AI vulnerability detection tools that generate impressive-looking reports at scale, only to discover that human verification reveals most findings to be fabricated. This wastes developer time and erodes confidence in AI-assisted security tooling.

Mozilla's approach suggests that the breakthrough comes not from the AI model alone, but from substantial engineering work to create project-specific infrastructure around the model. The harness approach provides a template for organizations that want operationally useful AI vulnerability detection rather than proof-of-concept demos.

The timing matters because Mozilla's CTO recently claimed that AI-assisted vulnerability detection means "zero-days are numbered." The detailed engineering disclosure provides evidence for what would otherwise sound like typical AI hype.

Harness engineering is the real work

Organizations considering AI vulnerability detection need to budget for significant custom engineering work, not just API calls to AI models. Mozilla's success required building infrastructure that integrates the AI model with existing development tools, testing pipelines, and codebase-specific knowledge.

Security teams should audit their current AI vulnerability tools for false positive rates before expanding usage. The "almost no false positives" claim from Mozilla suggests a sharp distinction from earlier generations of AI security tooling that required extensive human cleanup.
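A starting point for such an audit is to hand-triage a sample of a tool's recent findings and measure what fraction were fabricated. A minimal sketch, with illustrative field names and made-up sample data:

```python
# Sketch: estimate an AI scanner's false positive rate from a
# hand-triaged sample of its findings. Data and field names are
# illustrative, not from any real tool.

findings = [
    {"id": 1, "confirmed": True},    # reproduced by a developer
    {"id": 2, "confirmed": False},   # hallucinated report
    {"id": 3, "confirmed": True},
    {"id": 4, "confirmed": False},
]

def false_positive_rate(triaged):
    false_pos = sum(1 for f in triaged if not f["confirmed"])
    return false_pos / len(triaged)

rate = false_positive_rate(findings)
```

If the measured rate means most findings need human cleanup, expanding usage multiplies wasted triage time rather than coverage.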

The harness approach also indicates that successful AI security tools will be highly customized to specific codebases and development workflows, rather than generic solutions that work across all projects.

#DeveloperTools #EnterpriseAI #LLM #Agents