Analysis · June 11, 2026 · 3 min read

Soccer coaches now use AI to spot why kicking the ball away wins games

Jesse Davis's Sports Analytics Lab at KU Leuven has spent over a decade building machine-learning models that reveal hidden tactical patterns pro clubs are adopting. Here's what the data actually shows.

Our Take

Davis's lab has moved soccer analytics past individual-player metrics into tactical discovery that clubs actively deploy, but the sport still lags rival sports in standardizing data collection.

Why it matters

Soccer has resisted statistical analysis longer than baseball or basketball because goals are rare and actions are fluid. Davis's work proves the sport's complexity is tractable with the right ML framework, and European clubs are already building workflows around his open-source tools.

Do this week

Analytics teams: audit your event-data annotation pipeline this week. If manual tagging approaches the six hours per match that Davis describes, follow his group's experiments with transformer-based auto-tagging, which aim to reduce that manual overhead but remain a work in progress.

A decade of research surfaces tactical patterns that shaped professional soccer

Jesse Davis, a computer science professor at KU Leuven in Belgium, heads the Sports Analytics Lab, which has spent more than a decade applying machine learning to soccer. The lab's work has shifted how European clubs evaluate rosters, assess strategy efficiency, and scout recruits.

One concrete example: Davis's team used tree ensemble models to analyze 1.4 million passes and 60,000 throw-ins from the 2022 World Cup. They found that, for a team in the middle third, kicking the ball out of bounds on the opponent's side of the pitch can put it within 10 actions of scoring. That matters in a sport with 1,500+ actions per match and very few goals. The tactic has begun appearing in top European leagues.
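To make the "within 10 actions of scoring" idea concrete, here is a minimal sketch of how such a label might be computed from an ordered list of match actions. The field names (`team`, `is_goal`) and the function `scores_within` are hypothetical, chosen for illustration; this is not the lab's code. Labels of this kind are the sort of target a tree ensemble could then be trained to predict.

```python
def scores_within(actions, horizon=10):
    """Label each action: does the acting team score in the next `horizon` actions?

    `actions` is a chronologically ordered list of dicts with the
    (hypothetical) keys "team" and "is_goal".
    """
    labels = []
    for i, action in enumerate(actions):
        # Look only at the actions that follow this one, up to the horizon.
        window = actions[i + 1 : i + 1 + horizon]
        labels.append(
            any(a["team"] == action["team"] and a["is_goal"] for a in window)
        )
    return labels


# Invented mini-sequence: team A kicks the ball out, wins it back, and scores.
sequence = [
    {"team": "A", "is_goal": False},  # kick out of bounds
    {"team": "B", "is_goal": False},  # throw-in
    {"team": "A", "is_goal": False},  # interception
    {"team": "A", "is_goal": False},  # pass
    {"team": "A", "is_goal": True},   # shot, goal
]
labels = scores_within(sequence, horizon=10)
```

In this made-up sequence the kick-out is labeled positive because team A scores four actions later; shrinking `horizon` to 2 would flip that label.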

The lab has also published research suggesting that teams should take long shots more often than traditional soccer doctrine recommends. In a 2021 analysis presented at the MIT Sloan Sports Analytics Conference, researchers modeled English Premier League team behavior using Markov decision processes and concluded that Chelsea could gain 1.6 more goals per season by shooting from distance 20% more often.
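A toy version of that reasoning can be sketched as policy evaluation in a very small Markov decision process. Every state, action, and probability below is invented for illustration and has nothing to do with the lab's fitted model or with Chelsea's actual numbers; the point is only to show how changing a shooting rate changes the chance that a possession ends in a goal.

```python
# Toy MDP. All states, actions, and probabilities are invented.
# (state, action) -> list of (next_state, probability)
TRANSITIONS = {
    ("midfield", "advance"): [("edge_of_box", 0.40), ("lost", 0.60)],
    ("edge_of_box", "advance"): [("in_box", 0.30), ("lost", 0.70)],
    ("edge_of_box", "shoot"): [("goal", 0.04), ("lost", 0.96)],
    ("in_box", "shoot"): [("goal", 0.12), ("lost", 0.88)],
}
TERMINAL_VALUE = {"goal": 1.0, "lost": 0.0}


def goal_probability(policy, start="midfield"):
    """Probability that a possession starting in `start` ends in a goal.

    `policy` maps state -> {action: probability of choosing it}.
    The toy MDP has no cycles, so plain recursion terminates.
    """
    def value(state):
        if state in TERMINAL_VALUE:
            return TERMINAL_VALUE[state]
        total = 0.0
        for action, p_action in policy[state].items():
            for next_state, p_next in TRANSITIONS[(state, action)]:
                total += p_action * p_next * value(next_state)
        return total

    return value(start)


def with_long_shot_rate(rate):
    """Policy that shoots from the edge of the box with probability `rate`."""
    return {
        "midfield": {"advance": 1.0},
        "edge_of_box": {"shoot": rate, "advance": 1.0 - rate},
        "in_box": {"shoot": 1.0},
    }


baseline = goal_probability(with_long_shot_rate(0.25))
more_shots = goal_probability(with_long_shot_rate(0.30))  # 20% relative increase
```

With these made-up numbers a long shot is worth slightly more than trying to work the ball into the box, so `more_shots` comes out above `baseline`. With different transition probabilities the comparison could go the other way, which is why the real analysis depends on a model fitted to match data.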

Davis's approach differs from that of the internal analytics teams now common across pro clubs. He publishes most research freely and maintains open-source tools: VAEP (which assesses the effect of all on-ball actions), xG (expected goals) models, and packages that synchronize event data with player tracking data. Those tools log thousands of downloads monthly and are used in daily workflows across the industry.
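The core idea behind VAEP (Valuing Actions by Estimating Probabilities) can be sketched in a few lines: an action is credited with how much it raises the team's chance of scoring, minus how much it raises the chance of conceding. The snippet below is a simplified illustration, not the library's API; the `GameState` class and `action_value` function are hypothetical names, and the probabilities are typed in by hand, whereas in practice they would come from trained models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GameState:
    p_score: float    # estimated probability the team scores in the near future
    p_concede: float  # estimated probability the team concedes in the near future


def action_value(before: GameState, after: GameState) -> float:
    """Value of an action: change in scoring chance minus change in conceding chance."""
    delta_score = after.p_score - before.p_score
    delta_concede = after.p_concede - before.p_concede
    return delta_score - delta_concede


# Invented example: a forward pass raises the scoring chance from 2% to 5%
# while nudging the conceding chance from 1% to 1.5%.
value = action_value(GameState(0.02, 0.01), GameState(0.05, 0.015))
```

Here the pass is worth roughly 0.025 expected goals. Because every on-ball action gets a value this way, defensive clearances and simple sideways passes can be rated on the same scale as shots.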

Jan Van Haaren, an engineering student Davis hired in 2010, is now director of football intelligence at Club Brugge KV and helps translate the lab's research into measurable outputs for his club's tactical philosophy.

Soccer's data problem is structural, not computational

Baseball and basketball lend themselves to statistical isolation: a jump shot or at-bat is a discrete unit with clear outcome attribution. Soccer does not. Most actions do not lead to a goal or even a shot. The sport's fluidity and speed made it seem like a poor candidate for Moneyball-style analysis until Davis and Van Haaren proved otherwise.

The real bottleneck is not model design but data collection. Every team employs people to watch video and manually annotate tactics using software. This process takes up to six hours per match and is, in Davis's words, "a complete nightmare as a data analyst to work with."

Davis is now collaborating across institutions to standardize annotation. The group is experimenting with transformers (the neural network architecture behind ChatGPT) to train models on a few human-tagged examples of a tactic (a three-on-two breakaway, for instance) so the model can auto-tag subsequent instances. This work remains hard but has shown progress.

Meanwhile, Davis's open-source outputs have already made analytics workflows materially easier for clubs that lack the resources to build custom infrastructure.

Build or audit your data pipeline now

If your analytics team relies on manual video annotation for event data, measure it against the up-to-six-hours-per-match figure Davis cites and treat that time as the bottleneck to attack. Watch whether transformer-based tagging (the direction Davis's group is exploring) can reduce that burden as the work matures. If you lack internal tooling, Davis's open-source VAEP and xG tools are downloaded thousands of times a month and are already part of daily workflows across the industry. Synchronize your event data with tracking data so you can assess how well individual players fulfill their roles; that is how clubs like Club Brugge evaluate roster strength and scout talent. Do not wait for vendor solutions to standardize your format; the standardization effort is academic and open.
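As a rough picture of what synchronizing event data with tracking data involves, the sketch below attaches each annotated event to the tracking frame closest to it in time. It is a deliberately naive nearest-timestamp match, not the method used in the lab's packages, and the field name `t`, the 25 Hz frame rate, and the `max_gap` tolerance are all assumptions made for the example.

```python
from bisect import bisect_left


def nearest_frame_index(frame_times, t):
    """Index of the tracking frame whose timestamp is closest to `t`.

    `frame_times` must be a non-empty list sorted in ascending order.
    """
    if not frame_times:
        raise ValueError("frame_times must not be empty")
    i = bisect_left(frame_times, t)
    if i == 0:
        return 0
    if i == len(frame_times):
        return len(frame_times) - 1
    before, after = frame_times[i - 1], frame_times[i]
    return i if (after - t) < (t - before) else i - 1


def sync_events_to_frames(events, frame_times, max_gap=0.5):
    """Attach a "frame" index to each event, or None if no frame is close enough."""
    synced = []
    for event in events:
        idx = nearest_frame_index(frame_times, event["t"])
        gap = abs(frame_times[idx] - event["t"])
        synced.append({**event, "frame": idx if gap <= max_gap else None})
    return synced


# Invented data: tracking frames at 25 Hz and three hand-annotated events.
frame_times = [0.0, 0.04, 0.08, 0.12]
events = [
    {"type": "pass", "t": 0.05},
    {"type": "shot", "t": 0.11},
    {"type": "foul", "t": 5.0},  # far outside the tracked window
]
synced = sync_events_to_frames(events, frame_times)
```

Real feeds are messier: annotators' timestamps can be off by a second or more and clocks can drift, so a production approach would typically also use the ball and player positions to pin down the right frame rather than timestamps alone.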
