Now that part of the advanced data set has been released by Manchester City's performance analysis department, it's a good time to start delving into the data to see what kind of analysis can be done. Although the advanced data set covers only one game (Bolton vs. Manchester City from last season), there is still A LOT of data to look at.
The advanced data contains the (x,y) location of every statistic that is kept. This is valuable information, as it tells us exactly where on the field each event happened. I was interested in how this information can be used, specifically to look at momentum and passing trends.
Some work has already been done in the soccer analytics community on trying to quantify and analyze momentum. Analyse Football looked at momentum shifts from this same game, although in a different way. The Soccer by the Numbers blog looks at momentum in football in a much more general way.
By looking at the exact position of each team's passes during the game, one can tell in what area of the field the game is being played at any given point in the match. My aim was to break this down using the y location of each pass. In the data set, each pass is given an (x,y) location tag, where the y value is the position of the pass from end line to end line, on a scale from 0 to 100. In other words, a y value of 100 would be a pass on the other team's end line, and a y value of 50 would be a pass on the halfway line. The x location is also given, but for the purpose of this analysis I ignored it.
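Since each team's y value is measured toward the end line it is attacking, passes from both teams have to be put on a common axis before they can be averaged together. Here is a minimal sketch of one way to do that: re-center each y value on the halfway line and flip the sign for one team. The function name, the team labels, and the choice of Manchester City as the positive direction are illustrative assumptions, not something taken from the data set.

```python
def signed_position(y, team, positive_team="Manchester City"):
    """Map a pass's y value (0-100, measured toward the end line the
    passing team is attacking) to a signed distance from the halfway line.

    Assumption: positive results mean the pass was in positive_team's
    attacking half; negative results mean it was in the opponent's
    attacking half. A result of 0 is the halfway line.
    """
    offset = y - 50.0  # 0 at the halfway line, +50 at the end line being attacked
    # Flip the sign for the other team so both teams share one axis.
    return offset if team == positive_team else -offset
```

With this convention, a City pass at y = 80 maps to 30.0 and a Bolton pass at y = 80 maps to -30.0, so the sign alone says whose attacking half the ball is in.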
Just graphing the y location of every pass is too noisy to break down and understand. Instead, I took the simple moving average of the y location of the previous 25 passes by both teams. This gives a clearer picture of where the passes are being made throughout the game, and I take it as a proxy for momentum.
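For anyone who wants to try this on their own data, a trailing simple moving average can be sketched in a few lines. This is only an illustration of the technique, under two assumptions of mine: the window includes the current pass, and for the first few passes of the game (before 25 have been played) the average is taken over however many passes are available. The function and parameter names are made up for the example.

```python
from collections import deque


def simple_moving_average(values, window=25):
    """Return the trailing simple moving average of `values`.

    The i-th output is the mean of the last `window` values up to and
    including values[i]. Assumption: while fewer than `window` values
    have been seen, the mean is taken over the values seen so far.
    """
    recent = deque()      # the values currently inside the window
    running_sum = 0.0     # sum of the values in `recent`
    averages = []
    for v in values:
        recent.append(v)
        running_sum += v
        if len(recent) > window:
            # Drop the oldest value once the window is full.
            running_sum -= recent.popleft()
        averages.append(running_sum / len(recent))
    return averages
```

Passing in the pass locations in chronological order gives one smoothed value per pass. A larger window gives a smoother line but reacts more slowly to shifts in play.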
Below is the graph plotting the simple moving average of the pass location throughout the game.
The straight black line down the middle marks a value of 0, which corresponds to the halfway line. When the simple moving average (the jagged black line) intersects the straight black line, the y location of the previous 25 passes by both teams averages out to the halfway line. The blue and red bars are the actual y location of every pass, for reference. The red bars are the passes in the Bolton offensive half and the blue bars are the passes in the City offensive half. I also marked each team's goals with blue and red dots on the simple moving average line. As you can see, the game ended 3-2 in favor of Manchester City.
What does this graph tell us about the game? It seems that City dominated momentum for most of the game. The first 3 goals of the game were scored when there was relatively little momentum either way, though. After Bolton's first goal (which made the game 2-1 in favor of City), City gained momentum, as the ball was consistently in Bolton's defensive half. Possibly as a result of this momentum, City was able to score a 3rd goal, making the score 3-1. After momentum evened out and City had another spell of dominance, Bolton was able to gain back momentum, as most passes were in City's defensive half. Again, possibly as a result of this momentum, Bolton was able to get a 2nd goal to make the game 3-2. Finally, Bolton ended the game with most of the momentum, but City was able to hold off the attack and finish the game 3-2.
The location data of passes can give us a good proxy for momentum. Additionally, at least in this game, momentum does seem to matter in terms of scoring goals. While the first 3 goals of the game were scored with seemingly no momentum for either team, the final 2 goals were each scored while the scoring team was dominating momentum.
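To make "no momentum" versus "dominating momentum" less of a judgment call, one could read off the moving average at the moment of each goal and compare it to a threshold. The sketch below assumes the moving average is a signed value with 0 at the halfway line, as in the graph above. The function name, the labels, and the neutral band of 5 are all hypothetical choices for illustration, not values I used in the analysis.

```python
def momentum_at_events(sma, event_indices, neutral_band=5.0):
    """Label the momentum at each event (for example, a goal).

    sma: signed moving-average values, one per pass (0 = halfway line).
    event_indices: index of the last pass before each event.
    neutral_band: hypothetical threshold; an absolute value below it
        is treated as no clear momentum for either team.
    """
    labels = []
    for i in event_indices:
        value = sma[i]
        if abs(value) < neutral_band:
            labels.append("neutral")
        elif value > 0:
            labels.append("positive side")
        else:
            labels.append("negative side")
    return labels
```

Which team counts as the "positive side" depends on how the y values were signed, and the width of the neutral band would need some experimenting across more than one game.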
When the full advanced data set is available, it will be interesting to do this analysis for more games. Obviously this is an imperfect measure of momentum, but I think it does its job as a good indicator. I'd be interested to hear suggestions and comments on the methodology.