Yesterday I looked at referee bias in the EPL this past season. It turned out that while referees favored the home team overall in areas like fouls, yellow cards, and red cards, that is more likely due to the advantage the home team has in a game. One statistic I did not look at, though, is the amount of extra time given.
Extra time has nothing to do with the relative abilities of the teams or the score of the game, unlike many other parts of soccer. In theory, it should be an objective amount that does not depend on whether the home or away team is leading. In almost every game, though, you see the home crowd jeering for the ref to end the game if their team is ahead, or cheering even louder for their team to come back if they are trailing. Based on this, referee bias would be present if games in which the home team is leading are shorter than games in which the away team is leading. The logic is that the referee gives in to the home team's fans and unconsciously adjusts the extra time he gives.
To test this, I compared the length of games in which the home team ended up ahead with the length of games in which the away team ended up ahead. If there is indeed a referee bias, then we should see that games are shorter when the home team is leading than when the away team is leading.
Below are histograms (graphs showing how often each game length occurred) of the length of the game for the two categories above.
We can see the graphs are very similar, except for the tail on the right end of the away win time. This is in accordance with our hypothesis that away teams that are leading face more extra time. It seems refs gave trailing home teams more than 10 minutes of stoppage time more often than they gave that much to trailing away teams.
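For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of the grouping and the tail check described above. The match records are made up for illustration (they are not the EPL data), the helper names are hypothetical, and it assumes "more than 10 minutes of stoppage time" means a total game length above 100 minutes.

```python
from collections import Counter

# Hypothetical records: (side that ended up ahead, total game length in minutes).
# These numbers are invented for illustration only.
games = [
    ("home", 95), ("home", 96), ("home", 97), ("home", 96), ("home", 98),
    ("away", 95), ("away", 97), ("away", 96), ("away", 101), ("away", 102),
]


def lengths_for(side):
    """Return the game lengths for games where `side` ended up ahead."""
    return [minutes for winner, minutes in games if winner == side]


def histogram(lengths):
    """Count how many games lasted each number of minutes, sorted by length."""
    return dict(sorted(Counter(lengths).items()))


def share_over(lengths, threshold=100):
    """Fraction of games longer than `threshold` minutes (the right-hand tail)."""
    return sum(1 for minutes in lengths if minutes > threshold) / len(lengths)


home = lengths_for("home")
away = lengths_for("away")

print("home wins:", histogram(home), "share over 100 min:", share_over(home))
print("away wins:", histogram(away), "share over 100 min:", share_over(away))
```

Comparing the two shares is a quick way to put a number on the tail that shows up in the away-win histogram.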
As in the previous post on referee bias, I did a statistical analysis to see if the difference was actually statistically significant (in other words, unlikely to be due to randomness alone). The mean length of game for leading home teams was 96.36 minutes, while the mean length of game for leading away teams was 96.56 minutes.
Using the data, I ran a two-sample t-test. Basically, a t-test takes into account the number of observations, mean, and standard deviation (a measure of spread) of each group and tests whether the two means are equal. In the end, the test gives a p-value between 0 and 1. A p-value answers the question: if the two means were actually the same (the time given for leading teams was the same for home and away), what is the probability that we would see a difference in the means at least as large as the one we actually saw? A p-value close to 0 suggests that the means are different, while a larger p-value means the data are consistent with the means being the same. Generally, a p-value of .05 or lower is considered statistically significant, meaning we can reasonably say the means are not the same.
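To make the mechanics concrete, here is a rough sketch of a two-sample t-test using only the Python standard library. It assumes Welch's version of the test (which does not require equal variances); the post does not say which variant was used. The sample lengths are invented for illustration, and the function names are hypothetical. The p-value is found by numerically integrating the t-distribution's density.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t-distribution with `df` degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: probability of a |t| at least this large by chance."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    # Simpson's rule on [0, |t|]; `steps` must be even.
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)


def welch_t_test(a, b):
    """Return (t statistic, degrees of freedom, two-sided p-value)."""
    # Squared standard error of each group's mean.
    se_a = variance(a) / len(a)
    se_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1))
    return t, df, two_sided_p(t, df)


# Hypothetical game lengths in minutes (not the real EPL data).
home_leading = [95, 96, 97, 96, 98, 96]
away_leading = [95, 97, 96, 101, 102, 96]

t_stat, dof, p_value = welch_t_test(home_leading, away_leading)
print(f"t = {t_stat:.3f}, df = {dof:.1f}, p = {p_value:.4f}")
print("significant at .05" if p_value <= 0.05 else "not significant at .05")
```

With real data you would pass in the full lists of game lengths for each group; a statistics package would give the same kind of answer with less code.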
After doing the test, the p-value I got was 0.2013. While the means hint that referees are giving more time to trailing home teams, the difference is not statistically significant. In other words, we cannot conclude that referees give more extra time to trailing home teams compared with trailing away teams. A bias may seem evident from the means alone, but the data do not support it at a statistically significant level.
All in all, referees are doing a good job in terms of not favoring home teams over away teams. Next time someone complains that the ref is favoring the home team, you can just tell them to look at the data.