The main point of this blog is to test whether commonly held notions in football actually hold up. After watching the US women lose to Japan yesterday, I started to think about shots on goal. I don't have the exact numbers, but I'm pretty sure the US crushed Japan in the shots on goal category. This made me wonder: do shots on goal matter? Most people would quickly say yes. It would make sense that more shots on goal mean more chances to score and thus more goals. The only problem is that some things in football just don't make sense. I wanted to see if shots on goal equate to success in two categories: 1.) Do more shots on goal mean more success for a team as a whole? 2.) Do more shots on goal mean more goals for a specific player? To test these questions I used data from the MLS website. As an aside, mls.com has extensive statistics for every season in a bunch of categories. Great to see. Anyways, the data is from the 2010 MLS season.
First question: Do more shots on goal mean more success for a team as a whole?
If this were true, we would expect points to increase as shots on goal increase at the team level. In other words, teams that have more shots on goal would be more successful. The graph below tells us a different story.
The graph shows almost no relationship between shots on goal and points. Most teams cluster just under 140 shots on goal for the season. The line of best fit slopes upward, but the relationship is not strong at all: the correlation is r = .1311. As a reminder, the correlation coefficient (the value of r) measures how strong the linear relationship between two variables is. It ranges from -1 to 1. A value of 0 means there is no linear relationship at all, a value of 1 means there is a perfect positive linear relationship, and a value of -1 means there is a perfect negative one. In this case, the value is .1311, telling us there is a very weak linear relationship.
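To make the correlation coefficient a bit more concrete, here is a minimal Python sketch of how r can be computed by hand. The function name `pearson_r` and the numbers in the example are my own made-up illustrations, not the 2010 MLS data, so the value it prints will not match the r above.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (numerator of r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable (denominator of r).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable never changes")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical team totals, for illustration only (NOT real MLS numbers).
shots_on_goal = [121, 133, 137, 138, 139, 144, 158]
points = [46, 31, 50, 39, 36, 59, 41]

print(f"r = {pearson_r(shots_on_goal, points):.4f}")
```

Feeding the real team totals from the MLS stats pages into the two lists should give the same kind of number as the r reported for the graph.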
Second question: Do more shots on goal mean more goals for a specific player?
Similar story for this question: does the number of goals increase linearly as the number of shots on goal increases? The graph below gives us the answer.
This graph shows a stronger relationship than the one above, with r = .4722. However, the relationship is still not very strong. By a common rule of thumb, a correlation under .5 is considered weak. This means that for individual players, shots on goal are not a very good indicator of goals.
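One common way to get a feel for the two r values is to square them. In a simple straight-line fit, r squared is the share of the variation in one variable that the line accounts for. This is an extra reading I'm adding on top of the graphs, and the helper name `variance_explained` is just for illustration.

```python
def variance_explained(r):
    """Square a correlation coefficient to get r^2 for a simple linear fit."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must be between -1 and 1")
    return r * r


# The two r values reported for the graphs in this post.
results = [
    ("team: shots on goal vs. points", 0.1311),
    ("player: shots on goal vs. goals", 0.4722),
]

for label, r in results:
    print(f"{label}: r = {r:.4f}, r^2 = {variance_explained(r):.3f}")
# team: shots on goal vs. points: r = 0.1311, r^2 = 0.017
# player: shots on goal vs. goals: r = 0.4722, r^2 = 0.223
```

Read this way, shots on goal account for roughly 2% of the variation in team points and roughly 22% of the variation in player goals.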
Here's my best explanation for why shots on goal are not a very indicative statistic: not all shots on goal are the same. There are 40-yard weak rollers that the goalie easily saves, and there are 5-yard shots that the keeper barely gets a hand on. There are weak attempts by a center back getting forward and there are breakaways by forwards. The shots on goal statistic counts all of these as equivalent. Obviously this makes no sense. A statistic that would better indicate goals scored for both questions I looked at above would be shots on goal inside the box. Shots on goal inside the box would get rid of most of the shots on goal that have no chance of going in. Not all shots inside the box are the same either, so some of the same problem remains. However, I assume there would be a much stronger correlation between shots on goal inside the 18 and points, and between shots on goal inside the 18 and goals by an individual player. Unfortunately, I don't have the data to back up this claim (working on it). If/when I do get the data on shots inside the box I'll post the graph and the correlation between shots on goal in the box and goals.
Even without the data, the point I'm making is still clear: more shots on goal do not equate to more success for a team, and they do not correlate with goals for individual players nearly as strongly as most people assume. There are better statistics than shots on goal. This means statements like "New England had 5 more shots on goal than New York, they dominated the game" and "Donovan had 4 shots on goal in the game, he was due for a goal" are not necessarily valid. What if New England had a bunch of shots on goal from outside the 18 that never had a chance of going in? And what if Donovan's shots on goal were all weak rollers? Shots on goal are often misleading.
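If shot-location data ever turns up, counting shots on goal inside the box could look something like the rough Python sketch below. Everything here is an assumption for illustration: the `Shot` record, its field names, and the sample shots are all made up, and real data would almost certainly be laid out differently. The only fixed facts are the penalty area dimensions (18 yards deep, 44 yards wide).

```python
from collections import Counter
from dataclasses import dataclass

PENALTY_AREA_DEPTH_YD = 18.0       # the box extends 18 yards from the goal line
PENALTY_AREA_HALF_WIDTH_YD = 22.0  # the box is 44 yards wide, centered on goal


@dataclass(frozen=True)
class Shot:
    """Hypothetical shot record; the field names are invented for this sketch."""
    player: str
    yards_from_goal_line: float
    yards_from_center: float  # sideways distance from the middle of the goal
    on_goal: bool


def is_inside_box(shot):
    """True if the shot was taken from within the penalty area."""
    return (0 <= shot.yards_from_goal_line <= PENALTY_AREA_DEPTH_YD
            and abs(shot.yards_from_center) <= PENALTY_AREA_HALF_WIDTH_YD)


def count_sog_inside_box(shots):
    """Count shots on goal taken inside the box, per player."""
    return Counter(s.player for s in shots if s.on_goal and is_inside_box(s))


# Made-up shots, for illustration only.
sample_shots = [
    Shot("Player A", 40.0, 3.0, True),    # long-range roller: on goal, outside the box
    Shot("Player A", 5.0, -2.0, True),    # close range: on goal, inside the box
    Shot("Player B", 12.0, 25.0, True),   # on goal, but wide of the box
    Shot("Player B", 10.0, 0.0, False),   # inside the box, but off target
    Shot("Player B", 16.0, 10.0, True),   # on goal, inside the box
]

print(count_sog_inside_box(sample_shots))
```

The per-player counts from a function like this could then be correlated against goals in the same way as the plain shots on goal totals.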