Thursday, August 30, 2012

A Simplified Football Prediction Model

I recently wrote a blog post for the Betting Expert site about a simple model I built to predict the outcome of football matches using only very basic statistics.

You can read the full blog post here.

I wanted to point out here something interesting that I found while working on the model: betting odds do a relatively poor job of predicting football match outcomes. In other words, the win, draw, and loss probabilities for the home team implied by the odds bookmakers set are surprisingly inaccurate.
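For readers unfamiliar with how a probability is "implied" by odds, here is a minimal sketch. It assumes decimal odds and removes the bookmaker's margin by simple proportional scaling, which is one common convention and not necessarily the one used in the full post. The function name and the example odds are made up for illustration.

```python
def implied_probabilities(home_odds, draw_odds, away_odds):
    """Convert decimal odds into implied probabilities that sum to 1.

    Assumption: decimal odds, with the bookmaker's margin removed by
    proportional scaling (one common convention among several).
    """
    raw = [1.0 / home_odds, 1.0 / draw_odds, 1.0 / away_odds]
    total = sum(raw)  # exceeds 1 by the bookmaker's margin
    return [p / total for p in raw]


# Hypothetical odds for a single match (home win, draw, away win).
probs = implied_probabilities(2.10, 3.40, 3.60)
print([round(p, 3) for p in probs])
```

The raw inverse odds add up to more than 1, so the scaling step matters when comparing them against a model's probabilities.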

My hypothesis for why this happens is that football is very unbalanced, especially in the EPL. It is very hard to predict when an upset is going to happen, mostly because these upsets are (seemingly) random.

Using just four factors in my model (the home team's goal differential for the season up to that game, the away team's goal differential for the season up to that game, the home team's point total from the previous season, and the away team's point total from the previous season), I could create a model that was as accurate as the bookmakers.
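To make those four inputs concrete, here is a rough sketch of how they could be assembled from a list of results. The data layout, function name, and team names are assumptions for illustration only; this is not the code behind the model, and it says nothing about how the model itself is fitted.

```python
from collections import defaultdict


def build_features(matches, prev_points):
    """Build the four inputs for each match of one season.

    matches: (home, away, home_goals, away_goals) tuples in chronological
             order (hypothetical layout).
    prev_points: dict mapping team -> point total from the previous season.
    """
    goal_diff = defaultdict(int)  # running goal differential per team
    rows = []
    for home, away, home_goals, away_goals in matches:
        # Record the features BEFORE the match so no result leaks in.
        rows.append({
            "home_gd": goal_diff[home],
            "away_gd": goal_diff[away],
            "home_prev_pts": prev_points[home],
            "away_prev_pts": prev_points[away],
        })
        goal_diff[home] += home_goals - away_goals
        goal_diff[away] += away_goals - home_goals
    return rows


# Made-up results for three teams.
matches = [("A", "B", 2, 0), ("B", "A", 1, 1), ("A", "C", 0, 3)]
prev_points = {"A": 70, "B": 50, "C": 40}
rows = build_features(matches, prev_points)
print(rows[2])
```

A promoted team has no previous-season total in the same division, so a real dataset would need some rule for that case; the sketch simply assumes every team has an entry in `prev_points`.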

The question that remains is how much more accurate can the model become with the introduction of new variables? Beyond that, what variables should be used?

I am not sure I know the answers to those questions, but I am going to keep playing around with the data.

Reader Comments (6)

Could you possibly re-run the scoring of the betting odds? In my opinion, it would be preferable to score a match prediction with one point when the game outcome matches the betting site's highest-likelihood-outcome, and zero points if it doesn't.

August 30, 2012 | Unregistered Commenter Michael M.

Likewise, could you apply this scoring to all the models involved? It could be interesting to see what difference it makes.

August 30, 2012 | Unregistered Commenter Michael M.
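The scoring suggested in the two comments above could be sketched as follows: one point when the highest-probability outcome matches the actual result, zero otherwise. The same function works for bookmaker-implied probabilities or for any model's output. The names and the sample numbers are hypothetical.

```python
OUTCOMES = ("H", "D", "A")  # home win, draw, away win


def top_pick_score(probabilities, results):
    """Count matches where the most likely predicted outcome occurred.

    probabilities: (p_home, p_draw, p_away) tuples, one per match.
    results: actual outcomes as "H", "D" or "A", in the same order.
    """
    points = 0
    for probs, result in zip(probabilities, results):
        best = max(range(3), key=lambda i: probs[i])
        if OUTCOMES[best] == result:
            points += 1
    return points


# Made-up predictions and results for three matches.
predictions = [(0.50, 0.30, 0.20), (0.20, 0.30, 0.50), (0.40, 0.35, 0.25)]
results = ["H", "D", "H"]
score = top_pick_score(predictions, results)
print(score, "out of", len(results))
```

Under this rule a draw is rarely the top pick, so a source can score well without ever predicting one; that is worth keeping in mind when comparing it with a probability-based score.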

I am currently running data analysis (through regressions) on what it takes to win in different leagues. While our project is not yet finished and we have not run a final model, one thing we have noticed is that different factors affect winning in different leagues (we are running our regression to find out which factors drive total points over a season). For example, in the EPL the number of years a manager has been with a club is statistically significant, whereas manager experience is insignificant in La Liga and MLS. So if you're going to add a large number of variables, I would advise building a separate model for each league, as there does seem to be some difference.

November 12, 2012 | Unregistered Commenter Lance

I found http://goalograph.com/. It has a lot of stats and predictions. I really enjoy it.

July 14, 2013 | Unregistered Commenter George

First, I recommend that you use football-data.co.uk as a data source rather than RSSSF because the data is in a much better format - CSV and Excel.
With regard to variables to apply, I think season-to-date figures should be complemented with form data, for example the last 6 home/away matches.

April 16, 2014 | Unregistered Commenter Nick
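A form variable of the kind suggested above might be computed along these lines. This is only a sketch: it uses points from each team's most recent matches regardless of venue, with 3 points for a win and 1 for a draw, and the data layout and names are assumptions. Splitting form into home-only and away-only would need separate queues per venue.

```python
from collections import defaultdict, deque


def form_points(matches, window=6):
    """Return (home_form, away_form) for each match: the points each team
    took from its previous `window` matches, not counting the current one.

    matches: (home, away, home_goals, away_goals) tuples in chronological
             order (hypothetical layout).
    """
    recent = defaultdict(lambda: deque(maxlen=window))
    rows = []
    for home, away, home_goals, away_goals in matches:
        rows.append((sum(recent[home]), sum(recent[away])))
        if home_goals > away_goals:
            home_pts, away_pts = 3, 0
        elif home_goals < away_goals:
            home_pts, away_pts = 0, 3
        else:
            home_pts, away_pts = 1, 1
        recent[home].append(home_pts)
        recent[away].append(away_pts)
    return rows


# Made-up results; a window of 2 keeps the example short.
sample = [("A", "B", 1, 0), ("A", "C", 2, 2), ("B", "A", 0, 1), ("A", "C", 3, 0)]
form = form_points(sample, window=2)
print(form)
```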
