Taking Expected goals (xG) to the next level
I’ve been hearing a lot about xG. Can you tell me again what exactly it is?
Expected goals, or xG, is a predictive football analytics metric that reflects the probability (on a scale of 0 to 1) that a shot will result in a goal. The model is built on historical shot data and typically takes into account variables such as the distance and angle to the goal, the shot type (header or foot), the assist type, the opponents beaten on the way, and the nature of the possession (was it a rebound? a shot after a dribble?), among others. For clarity’s sake: if, at the moment of the shot, a player is one-on-one with the goalkeeper and shooting from well inside the 18-yard box at a good angle, the model may indicate that the striker has a high likelihood of scoring, let’s say a 70% chance (meaning that, judging by historical data, a shot like this is converted 70% of the time). The xG of that shot would therefore be 0.7. The common way to calculate a team’s xG in a match is to add up the xG of every shot its players took in that match. Comparing the two totals gives fans, analysts, and coaches an easy way to evaluate which side dominated: the one with the higher expected goals total.
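To make the arithmetic concrete, here is a minimal Python sketch of how a match-level xG total can be built from per-shot values. The shot list, team names, and xG numbers are made up for illustration, and the function name team_xg is our own; this is not the output of any particular xG model.

```python
# Hypothetical per-shot xG values for one match (illustrative numbers only).
shots = [
    {"team": "Home", "player": "A", "xg": 0.70},  # the one-on-one from the example
    {"team": "Home", "player": "B", "xg": 0.05},
    {"team": "Away", "player": "C", "xg": 0.30},
    {"team": "Away", "player": "D", "xg": 0.10},
    {"team": "Away", "player": "D", "xg": 0.08},
]


def team_xg(shots):
    """Sum the xG of every shot, grouped by the team that took it."""
    totals = {}
    for shot in shots:
        totals[shot["team"]] = totals.get(shot["team"], 0.0) + shot["xg"]
    return totals


totals = team_xg(shots)
for team, xg in totals.items():
    print(f"{team}: {xg:.2f} xG")
```

In this toy match the home side took fewer shots (2 vs 3) but ends up with the higher total, because one of its shots was a very good chance.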
xG is a better indicator than shot-count, but it doesn’t come without its limitations. And one of them is pretty substantial
xG earned a large share of its glory owing to the fact that it’s a better method of evaluating who was the better team in a match than just counting which team took more shots. As explained, this is because xG takes into account the likelihood of each shot being converted, based on the circumstances in which it was taken. But what if a team generates lots of shots, and the majority of them are in fact difficult opportunities? Does that mean it generated good-quality chances? Probably not. This is why using the number of shots alone to judge how many goals a team should have scored can be very misleading.
Real Madrid took 11 shots compared to Atalanta’s 13 during their Champions League clash. However, Real Madrid’s xG was significantly higher (3.0 vs 1.3), indicating that the shots they took had a far greater chance of being converted.
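A quick way to see the gap is to divide each side's xG by its shot count. The short Python sketch below does that with the figures quoted above, reading them as Real Madrid 3.0 xG from 11 shots and Atalanta 1.3 xG from 13 shots; the helper name xg_per_shot is our own.

```python
def xg_per_shot(total_xg, shot_count):
    """Average chance quality: total xG divided by the number of shots."""
    if shot_count == 0:
        return 0.0  # no shots means no shot-based chance quality to report
    return total_xg / shot_count


real_madrid = xg_per_shot(3.0, 11)
atalanta = xg_per_shot(1.3, 13)

print(f"Real Madrid: {real_madrid:.2f} xG per shot")  # 0.27
print(f"Atalanta: {atalanta:.2f} xG per shot")        # 0.10
```

On these numbers, the average Real Madrid shot was worth roughly two and a half times as much as the average Atalanta shot, which a plain shot count hides completely.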
In addition, it wouldn’t be smart to use xG on a single-match basis. Not only can the game state affect the match’s final xG (teams that lead may decide to sit back and conserve energy, potentially sacrificing xG) - football is also a very low-scoring sport with a lot of variance and luck, and certain match events may dramatically affect xG. For instance, teams may create lots of opportunities to score by taking up dangerous positions on the pitch, but if a shot is not taken, nothing is counted towards xG as it is a shot-based model.
xG covers only 1.5% of match events. What about the other 98.5%? Meet “Dangerousity” - the next evolution of performance analysis
And here lies the biggest weakness of xG. Yes, it can serve as a good indicator of a team’s performance (especially over many matches). However, because it is a shot-based model and shots account for only 1.5% of all events in a football match, the other 98.5% of the game goes unaccounted for. This problem can be addressed partially by using additional metrics in conjunction with xG, but what if we don’t want to use a blend of numerous stats and still want to understand how dominant and dangerous a team was?
This challenge - learning how to evaluate how specific actions change the probability of scoring a goal, regardless of whether a shot was taken at the end - is one of the issues that professionals in the field of football analytics have been trying to tackle in the last few years. This article by John Muller on FiveThirtyEight reviews some of the more popular possession-based models that are used for this purpose. Allow us to add one more member to the list.
Dangerousity is the name of a deep-learning-based metric that might supply a whole new way of evaluating an attacking performance in football. It aims at the same purpose as xG: defining the probability of a goal being scored. However, while xG focuses on shots taken, dangerousity uses a groundbreaking formula that produces a quantitative representation of that probability for every point in time at which a player is in possession of the ball. The origins of this formula are actually quite fascinating.
Sounds interesting. Tell me more about Dangerousity and how it works
It all started with a study published in December 2016 under the title “Real-Time Quantification of Dangerousity in Football Using Spatiotemporal Tracking Data”. Researchers from the University of Munich were responsible for the study, which openly questioned the real significance of traditional indicators for performance evaluation in football (shots on goal among them) and analysts’ heavy reliance on them.
At the core of the study was the belief that situations in which there is a true danger of a goal being scored (and the prevention of such situations) should have a central place in analyzing tactical success or performance in football, regardless of whether they ended with a shot. By calculating different parameters, all relating to a player’s chance of scoring (depending on his position and the ball’s) and to the probability of the defending team stopping the attacking action and recovering the ball, the study laid the basis for a corresponding football analytics metric.
As of today, the only provider that has made use of the study and has a dangerousity metric of its own is Track160. The company’s fully AI-automated football analytics solution, Coach160, uses the most advanced tracking system there is: optical tracking. This technology supplies its users with accurate positional data. Together with another of Coach160’s unique capabilities, fully AI-automated event tagging and analysis, it creates the perfect basis for applying the dangerousity principles.
How does it work? Using deep learning algorithms that rely on positional parameters (such as zone, control, density, dominance), the system quantifies every moment in the match and rates it with a dangerousity score from 0 to 1 (e.g. 0.9 is super dangerous, 0.3 much less, etc.). Every user can then filter out desired moments and the events they contain by creating their own threshold.
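The threshold idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the Moment record, its fields, and the filter_moments function are hypothetical names, not Coach160's actual data model or API, and the scores are invented.

```python
from dataclasses import dataclass


@dataclass
class Moment:
    time_s: float        # match clock in seconds (hypothetical field)
    team: str            # team in possession (hypothetical field)
    dangerousity: float  # score between 0 and 1


def filter_moments(moments, threshold):
    """Keep only the moments whose dangerousity reaches the user's threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return [m for m in moments if m.dangerousity >= threshold]


# Invented sample data for one team.
moments = [
    Moment(12.0, "Home", 0.30),
    Moment(47.5, "Home", 0.90),
    Moment(63.0, "Home", 0.65),
    Moment(88.2, "Home", 0.10),
]

dangerous = filter_moments(moments, threshold=0.6)
print([m.time_s for m in dangerous])  # [47.5, 63.0]
```

Raising the threshold narrows the selection to the most threatening moments, while lowering it gives a broader picture of where the team built pressure.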
But that’s not all. Coach160 also monitors the maximum dangerousity value of a team every 5 seconds and adds up these values to provide its users with a calculated “Dominance score” - a real gift to every coach or analyst who wants a broader and more insightful look into how dominant their team was, without relying only on a model that covers just 1.5% of the events in the match (the shots).
Below you can see a video of five different attacking chances with relatively high dangerousity rates of 0.65-0.80, as set by Coach160. All of them would have gotten a “0” xG value, as no shot was taken.
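One possible reading of that description, sketched in Python: split the match into 5-second windows, keep the highest dangerousity value a team reached in each window, and sum those maxima. This is our interpretation of the sentence above, not Coach160's published algorithm; the function name, the input format, and the sample values are all assumptions.

```python
from collections import defaultdict


def dominance_score(samples, window_s=5.0):
    """Sum of per-window maximum dangerousity for one team.

    samples: iterable of (time_s, dangerousity) pairs (assumed input format).
    """
    window_max = defaultdict(float)  # windows with no samples contribute 0
    for time_s, value in samples:
        window = int(time_s // window_s)  # index of the 5-second window
        window_max[window] = max(window_max[window], value)
    return sum(window_max.values())


# Invented samples: three windows with maxima 0.4, 0.65 and 0.8.
samples = [(0.5, 0.1), (3.0, 0.4), (6.0, 0.2), (9.9, 0.65), (12.0, 0.8)]
score = dominance_score(samples)
print(f"Dominance score: {score:.2f}")  # 1.85
```

Taking the maximum per window, rather than summing every sample, keeps one long dangerous spell from being counted many times over just because the tracking data is sampled frequently.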
The world of football analytics is an ever-changing one. xG was born out of an understanding that ball possession percentage is an inefficient indicator when it comes to evaluating how dangerous a football team is in a match. Out of the same necessity, and with the understanding that even xG is not a perfect metric, new possession-based models were created. Alongside them, with the help of advanced AI and deep learning algorithms, backed by academic studies, Track160's dangerousity came to life. Hopefully, it will give football coaches, analysts, and other relevant stakeholders access to a new era of tactical insights and informed decisions.