Soccer Analysis Moves Toward Smarter Scouting And More Accessible Data

Association football turned up unfashionably late to the sports statistics party.

Numbers became a more prominent part of the coverage of the game only gradually, until fantasy football and betting, both of which run on them, propelled them into the public eye.

Around the same time, Opta data was made available on a number of online platforms for free, leading to the creation of more stats-based output across various mediums by amateur bloggers and analysts, many of whom went on to turn a hobby into a career.

Once the predictions made by the data and the people analysing it began to play out and pay out, the popularity surged to the point where soccer clubs were, and are, hiring people who may be alien to the game but are experts in analysing the data it produces.

Soccer stats went from the simplicity of goals and assists, to expected goals and algorithms which process data to produce useful and often extremely valuable pieces of information.

Amateur and indeed professional scouts can scour the various public data offerings to compile their own information on soccer players and teams.

But it’s one thing having the data; it’s another thing entirely for someone to work out how to use it to their or their employer’s advantage: how to apply raw numbers to soccer players, their characteristics, and their characters.

Thinking that the numbers can replace traditional scouting methods would be as big a mistake as not using the numbers at all.

“Traditional scouting methods are still absolutely critical,” says Dan Altman, founder of North Yard Analytics and its associated platform, Smarterscout.

“When I was advising clubs I privately commissioned dossiers on players’ personalities as well as providing my advanced analytics. The dossiers always offered valuable input, particularly for players who were unknown to the coaches’ and scouts’ networks.

“To this day, I find that various forms of traditional scouting can often answer the questions provoked by the data—for example, why a player’s performance has dipped or improved from season to season.

“But sometimes coming to a consensus on a player’s narrative is more difficult, and that’s when a signing is likely to carry more risk.”

Given the amount of data which is now available publicly, good football analysis can be done by anyone who possesses gigabytes of patience, and an eye for where to find the relevant numbers among the myriad of websites and apps displaying data.

They all display it in different ways, and navigating the definitions can be as much of a task as making sense of the numbers themselves, but with a combination of statistics and tactics, and the presentation of both in an understandable form, conclusions can begin to be reached.

The style and format of this presentation are often the key to understanding the data.

“It’s always a challenge,” adds Altman when asked about such visualization.

“You can be great at mathematics and statistics, with a phenomenal understanding of the game—but if you can’t make your metrics engaging and beautiful, you won’t get anywhere.

“Fortunately, as more media outlets become interested in football analytics, they’re bringing their visual skills to bear, and the people producing new metrics are placing more importance on presentation as well.”

To the soccer-watching public, a lot of this side of the game can come across as esoteric, and it can sometimes feel as if this type of information is being made inaccessible on purpose.

At the same time, it’s easy for media outlets with a bigger reach to throw out what they believe are more basic and easy to understand numbers, ticking a box labelled ‘include some stats’, only for those numbers to be so out of context that they are themselves useless and impossible to understand.

Statistics like how many shots a goalkeeper has saved or how many passes a holding midfielder completed in their last game don’t reveal much on their own, but other numbers, the type which paint a picture of a player’s place in their team and in the wider soccer world, are more valuable and easier to understand.

As Altman alludes to, finding a balance is key and it is a balance he is striving to reach through Smarterscout.

The platform is free to access, with paid options available for professionals and, in between, subscription plans for the keen amateur that cost less than a Netflix subscription.

“We offer 45 leagues to the public and have another dozen or so leagues for private clients, but we use the same metrics across all our leagues, and they’re all explained in our site’s FAQ,” adds Altman.

“So in theory people could try to reconstruct our metrics, yet they’ll probably find it more convenient just to use our site. I think we’re the first site to offer explanations for such a broad slate of metrics, and that’s intentional.

“I hope knowing what goes into our metrics will engender trust and encourage uptake, even among people without technical backgrounds. I want Smarterscout to be the opposite of a black box.”

The only worthwhile conclusions on soccer players are reached when data is applied to a wider context, and behind the numbers Smarterscout’s models and algorithms attempt to do this.

In a sport such as soccer which has so many variables, it has taken some time for the numbers a player produces to be applied properly, but with models such as expected goals (xG)—which assigns a likelihood of a shot resulting in a goal based on numerous factors including its position on the pitch, the player and the situation—more useful judgements can be made.
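As a rough illustration of the idea, a minimal xG sketch can be written as a logistic function of shot location. Everything here is an assumption made for the example: the function names, the choice of distance and goal-mouth angle as the only inputs, and above all the coefficients, which are invented rather than fitted to real shot data. It is not Smarterscout's model or any provider's.

```python
import math

# Illustrative coefficients only (assumed for this sketch, not fitted to data).
INTERCEPT = -1.0
DISTANCE_COEF = -0.12   # effect per metre of distance to the goal centre
ANGLE_COEF = 1.5        # effect per radian of visible goal mouth
GOAL_WIDTH = 7.32       # metres between the posts


def goal_angle(x, y):
    """Angle in radians subtended by the goal mouth as seen from (x, y).

    x is the distance from the goal line and y the sideways offset
    from the centre of the goal, both in metres (x must be positive).
    """
    to_near_post = math.atan2(y + GOAL_WIDTH / 2, x)
    to_far_post = math.atan2(y - GOAL_WIDTH / 2, x)
    return to_near_post - to_far_post


def shot_xg(x, y):
    """Estimated probability that a shot from (x, y) becomes a goal."""
    distance = math.hypot(x, y)
    z = INTERCEPT + DISTANCE_COEF * distance + ANGLE_COEF * goal_angle(x, y)
    return 1 / (1 + math.exp(-z))  # logistic function keeps the value in (0, 1)


def total_xg(shots):
    """Sum of shot probabilities: the goals a team 'should' have scored."""
    return sum(shot_xg(x, y) for x, y in shots)
```

Calling `total_xg([(6, 0), (25, 0), (6, 10)])` adds up one close central chance, one long-range effort and one tight-angle shot. Real models add the situational factors the article mentions, such as the type of chance and the players involved.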

From xG comes expected assists (xA), and other offshoots using the same or similar data such as xGBuildup and xGChain which attempt to measure the contribution of all players in a team to a goal, even if they aren’t the ones scoring or assisting.
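One common public formulation of these two metrics can be sketched in a few lines: every player involved in a possession that ends in a shot is credited with that shot's xG (xGChain), and xGBuildup does the same while leaving out the shooter and the player who made the final pass. The data layout and names below are hypothetical, chosen only for illustration, and providers differ in the details.

```python
from collections import defaultdict


def chain_and_buildup(possessions):
    """Return (xg_chain, xg_buildup) dictionaries keyed by player.

    Each possession is a dict (hypothetical layout) with:
      "players":  everyone who touched the ball in the possession
      "shooter":  the player who took the shot
      "assister": the player who made the final pass, or None
      "shot_xg":  the xG value of the shot that ended the possession
    """
    chain = defaultdict(float)
    buildup = defaultdict(float)
    for possession in possessions:
        xg = possession["shot_xg"]
        excluded = {possession["shooter"], possession.get("assister")}
        # set() so a player involved twice is credited once per possession
        for player in set(possession["players"]):
            chain[player] += xg
            if player not in excluded:
                buildup[player] += xg
    return dict(chain), dict(buildup)
```

A deep-lying midfielder who starts many moves but rarely shoots or assists would show up strongly in the second dictionary, which is the kind of contribution goals and assists alone miss.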

They can also be applied to goalkeepers, a notoriously difficult position to judge, to begin to gauge how good they are at saving shots other goalkeepers might not, especially when post-shot xG is used.
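For goalkeepers, the arithmetic usually amounts to comparing what the shots on target were expected to produce with what actually went in. A minimal sketch, assuming a list of post-shot xG values for the on-target shots a goalkeeper faced (the function name and inputs are made up for this example):

```python
def goals_prevented(post_shot_xg_faced, goals_conceded):
    """Post-shot xG faced minus goals conceded.

    post_shot_xg_faced: post-shot xG value of each on-target shot faced
    goals_conceded:     goals actually let in from those shots
    A positive result suggests the goalkeeper saved more than expected;
    a negative one suggests fewer.
    """
    return sum(post_shot_xg_faced) - goals_conceded
```

A goalkeeper who faces shots worth 12.5 post-shot xG over a season and concedes 10 goals would come out at plus 2.5 on this measure, although small samples make such figures noisy.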

Like all statistics, they are fairly useless on their own. But xG and its cousins, plus the models built on other data, are now becoming more accessible, although it has taken some time to get past the over-complication and exclusivity which often led to an immediate rejection of such concepts.

“It’s a two-way street,” Altman adds. “You don’t have to engage in a lot of technical explanation if people already know what your metrics mean.

“The simplest measure of expected goals—what we call shot creation xG at Smarterscout—has basically gotten over that hurdle.

“Other metrics still require a lot of explanation, which is another important skill for their producers to have.

“And yes, especially for people with technical backgrounds, there can be a temptation to show off how mathematically elegant your formulas are, but all of that will land with a dull thud among most people who actually work in the sport.”

The level of the league, the level of team-mates, the style of play of the team and their opposition all need to be taken into account.

A combination of data, traditional scouting, and tactical analysis is the way forward, especially when combined with effective visualization of all three. One without the other two can lack something, but thanks to the work of sites like Smarterscout, they are all there for everyone to use. Just remember to actually watch the football, too.

Source: Forbes/ James Nalton

