
Power Laws and Goal Scoring

Ford Bohrmann

Is there a normal number of goals scored in a season for a striker? To answer this, one may be tempted to just take the mean of the goals scored by every player in a season. If we do this for last season, the mean is 1.83. Of course, this is misleading. There isn't really such a thing as a "normal" number of goals scored in a season. The reason is that goals scored does not have a normal distribution, the bell curve we are used to. For example, if you looked at the distribution of heights in a population, you would see a nice bell curve. Most people are right around the average height, and as you go towards the extremes either way (really short or really tall) you find fewer and fewer people. Therefore, the mean of heights in the population is instructive because it gives us the "normal" or "typical" height. The problem is, goals scored in a season does not follow a normal distribution. Instead, most players score no goals at all. The next most common number of goals scored last season? Just one goal, of course. This pattern continues, with each higher goal total less common than the one before it, and it follows a power law distribution.
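To see why the mean can mislead here, a quick sketch helps. The goal totals below are made up for illustration (they are not last season's data); the point is only that when most players sit at zero and a few score a lot, the mean lands well above what a typical player actually scores.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical goal totals for 20 players (invented for illustration, not real data).
goals = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 3, 5, 18]

counts = Counter(goals)                      # how many players scored each total
avg = mean(goals)                            # pulled upward by the one big scorer
mid = median(goals)                          # the "middle" player
most_common_total = counts.most_common(1)[0][0]

# Share of players who scored fewer goals than the mean.
share_below_mean = sum(1 for g in goals if g < avg) / len(goals)

print(f"mean={avg}, median={mid}, most common total={most_common_total}")
print(f"share of players below the mean: {share_below_mean:.0%}")
```

In this toy sample, three quarters of the players sit below the mean, which is the sense in which the mean is not a "normal" value for this kind of distribution.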

Why We Shouldn't Put Much Value in Assists

Ford Bohrmann

Last week I wrote a post on why shots on goal are a misleading statistic. In keeping with the analysis of the problems with some commonly kept statistics in football, I decided to look at assists. 

If you think about it, assists are highly misleading. Simply playing with good players boosts your assist total. And, as with shots on goal, not all assists are the same. There are the assists where a player makes a short pass in the midfield that leads to a teammate dribbling through all the opposing defenders and finishing, and the assists where a player makes a beautiful cross and their teammate simply has to tap the ball into the open net. These obviously shouldn't be counted as having the same value to the team, yet they are. Hell, I could probably record an assist eventually in the EPL if I played for one of the top teams (OK, maybe an exaggeration, but you get the point).

First, let's look at the assists data for all the teams in the EPL. As the graph below shows, as a team's point total increases (basically, the better the team is), its assist total also generally increases. This is no surprise. We would expect better teams to score more goals and thus have more assists.

Basically, this means the assist statistic favors players on better teams. Players on better teams play with better teammates and should therefore have more opportunities for assists. Below is a screenshot from the EPL website of the players with the top 20 assist totals.

Nine players from top 5 clubs are in the top 20 for assist totals. The only player from a bottom 3 club in the top 20 is Blackpool's Charlie Adam, who was just signed by Liverpool. It's easy to see that assist totals are higher for players on better clubs.

A better statistic, one that is less influenced by the quality of your teammates, is chances created. A chance created is defined as a pass that leads to a shot. These are obviously not as dependent on your teammates, since the shot doesn't have to go in, and they give a fairer and truer assessment of how much of a playmaker that player is for their team.
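The difference between the two statistics is easy to see in a small sketch. The event log below is hypothetical (the tuple format and player names are invented for illustration), and it makes the simplifying assumptions that all events belong to one team and that a pass "leads to" a shot only when the shot is the very next event.

```python
from collections import Counter

# Hypothetical event log for one team: (event type, player, outcome).
# The format and names are invented for illustration only.
events = [
    ("pass", "A", None),
    ("shot", "B", "saved"),
    ("pass", "A", None),
    ("shot", "C", "goal"),
    ("pass", "D", None),
    ("pass", "A", None),
    ("shot", "B", "off_target"),
    ("shot", "C", "goal"),  # no pass directly before this shot
]

def tally(events):
    """Count chances created and assists per passer.

    Assumes a chance created is a pass immediately followed by a shot,
    and an assist is a chance created where that shot was a goal.
    """
    chances, assists = Counter(), Counter()
    for prev, curr in zip(events, events[1:]):
        if prev[0] == "pass" and curr[0] == "shot":
            chances[prev[1]] += 1
            if curr[2] == "goal":
                assists[prev[1]] += 1
    return chances, assists

chances, assists = tally(events)
print("chances created:", dict(chances))
print("assists:", dict(assists))
```

Player A sets up the same kind of opportunity each time, but only gets an assist when the teammate happens to finish. Chances created credits the pass regardless of what the shooter does with it.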

The next time a club is looking to sign a player based solely on their assist totals, they should take a more in-depth look. Assists can tell an inaccurate, or at the least biased, story.

Win Probability Graphs and Regressions

Ford Bohrmann

Earlier in this blog I wrote a post on Win Probability in every possible game situation. I posted the Excel files, but they aren't as informative as a graph. I made graphs for home and away teams at +2, +1, 0, -1, and -2 goal differentials for every minute. I didn't make graphs for goal differentials bigger than that because there is basically no point. The fact that a team has a 99.9% win probability when they are up 4-0 isn't that exciting.

Each graph has a scatter plot of the data and the line of best fit. The equation for each line is also on the graph, along with its r^2 value. The graphs are below to look at. Some interesting things I noticed:

-Most graphs show a very strong relationship between minute and win probability. The only ones that don't are when teams are away and tied, when teams are home and up by 2, and when teams are away and down by 2. Not really sure why these three stick out.

-Some of the graphs have linear relationships, while others are quadratic. Again, not really sure why this is. Why does the win probability when you are at home and tied follow a quadratic curve while the win probability of a team at home and down by 1 is linear? Maybe people have ideas as to why this happens.

-For some of the scenarios (the +2 and -2 goal differentials for home and away) I didn't start the graph at minute 1 because the data points were a little all over the place. This happens because there are so few data points that the win probabilities are skewed. Example: There aren't many times when a team has a 2-0 lead in the 5th minute.

-I added the graphs of all the goal differentials together for comparison, one for home and one for away. They're interesting to look at.

-Finally, with these lines of best fit we now have some basic equations to model a team's chance of winning a game. Feel free to use them and check them out.
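For anyone who wants to try the linear versus quadratic comparison on their own data, here is a rough sketch of fitting both and comparing r^2. It is not the method used to make the graphs (those fits came from the spreadsheet); it is a plain least-squares fit written with the standard library only. The win probabilities are synthetic, generated from an invented quadratic curve, and the minute is rescaled to 0..1 purely to keep the arithmetic well behaved.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [c0, c1, ...] for c0 + c1*x + c2*x^2 + ...
    """
    n = degree + 1
    # Normal equations A c = b, where A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / A[i][i]
    return coeffs

def predict(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))

def r_squared(xs, ys, coeffs):
    """1 - (residual sum of squares) / (total sum of squares)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - predict(coeffs, x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Synthetic data: an invented quadratic win-probability curve, not real results.
minutes = list(range(1, 91))
t = [m / 90 for m in minutes]               # minute rescaled to 0..1
win_prob = [0.40 + 0.486 * x ** 2 for x in t]

lin = polyfit(t, win_prob, 1)
quad = polyfit(t, win_prob, 2)
lin_r2 = r_squared(t, win_prob, lin)
quad_r2 = r_squared(t, win_prob, quad)
print(f"linear r^2 = {lin_r2:.4f}, quadratic r^2 = {quad_r2:.4f}")
```

One caveat when comparing the two: a quadratic can never fit worse than a line on the same data, so a slightly higher r^2 alone doesn't prove the relationship is really curved. It is the size of the gap that matters.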