Blog Index

World Cup Performance by Continent (Lots of graphs)

Much has been made of the inter-continental games so far this World Cup, especially with 3 of the 4 CONCACAF countries making it past the group stage, including the US getting out of the Group of Death and Costa Rica going much farther than anyone predicted.

To see how the various (FIFA-defined) continents have done compared to past World Cups, I looked at past World Cup results (here is an example from the United States’ page). These results include all World Cup and World Cup qualifying games, which is what I limited my analysis to. World Cup qualifying games are a little different from World Cup games, but since they are almost always between countries in the same continent, I think it’s OK, because I drop intra-continent games anyway. What defines a continent is pretty hazy, so I just stuck with FIFA’s definitions. This means that Australia is actually part of Asia, along with some other anomalies. This division of the world is the best way to stay consistent, though. The continents I ended up using were Africa, Asia, CONCACAF, Europe, Oceania and South America.
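As a rough illustration of the "drop intra-continent games" step, here is a minimal Python sketch. It is not the code linked from this post: the `CONFEDERATION` lookup, the `inter_continental` helper, and the fixture list are all made up for the example, and only a handful of countries are included.

```python
# Toy confederation lookup; these few entries are illustrative, not the full FIFA list.
CONFEDERATION = {
    "United States": "CONCACAF",
    "Costa Rica": "CONCACAF",
    "Australia": "Asia",  # FIFA places Australia in the Asian confederation
    "Germany": "Europe",
    "Brazil": "South America",
    "Ghana": "Africa",
}


def inter_continental(games):
    """Keep only games whose two teams belong to different continents."""
    kept = []
    for home, away in games:
        if CONFEDERATION[home] != CONFEDERATION[away]:
            kept.append((home, away))
    return kept


# Hypothetical pairings (no results attached), just to exercise the filter.
games = [
    ("United States", "Costa Rica"),  # both CONCACAF, so this one is dropped
    ("United States", "Ghana"),
    ("Germany", "Brazil"),
    ("Australia", "Germany"),
]

print(inter_continental(games))
```

Because most qualifying games pair teams from the same confederation, a filter like this removes nearly all of them, which is why mixing qualifiers in with World Cup games matters less than it might seem.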

If you want to look at the code I wrote to do the analysis (the data scraping, the actual analysis, and the visualization) head over to here 

There’s nothing too crazy going on in the analysis, just a lot of graphs to look at.

Click to read more ...


Underdogs and Inefficiencies

Odds makers tend to do a fairly good job in sports. While they may not be perfect, it tends to be tough to find any consistently exploitable inefficiencies. In other words, it is rare that the odds of "Liverpool winning at home", or some other event like that, are consistently over- or underestimated. You may think that the odds in an individual game are incorrect, but in the long run inefficiencies like that rarely persist. Why? Because bookies would lose money on them. If they realize they are starting to lose money, they adjust the odds to better reflect the probability of each result occurring.

While I am not really interested in betting on soccer myself, odds do provide an interesting estimate of the probability of an outcome occurring. For example, take Arsenal's home game against Chelsea this past year. Bet365 put the odds of an Arsenal victory at 2.38. These decimal odds imply a probability of an Arsenal victory of about 42% (1 divided by 2.38). Odds makers usually lower the payouts so that they make money, which means the implied probabilities of all the outcomes add up to more than 100%. Taking that into account, the adjusted probability of an Arsenal victory is just over 41.1%.
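The arithmetic can be sketched in a few lines of Python. Only the 2.38 for an Arsenal win comes from the text above; the draw and Chelsea odds are hypothetical placeholders, and rescaling the probabilities proportionally so they sum to 1 is one common way of removing the bookmaker's margin, not necessarily the adjustment used for the figure quoted here.

```python
def implied_probability(decimal_odds):
    """Raw probability implied by decimal odds (still includes the bookmaker's margin)."""
    return 1.0 / decimal_odds


def remove_margin(odds):
    """Rescale the raw implied probabilities so they sum to 1.

    This is simple proportional normalization, used here as an assumption.
    """
    raw = [implied_probability(o) for o in odds]
    overround = sum(raw)  # greater than 1 when the bookmaker has built in a margin
    return [p / overround for p in raw]


arsenal_odds = 2.38   # from the post
draw_odds = 3.40      # hypothetical, for illustration only
chelsea_odds = 3.20   # hypothetical, for illustration only

raw_home = implied_probability(arsenal_odds)
adjusted = remove_margin([arsenal_odds, draw_odds, chelsea_odds])

print(f"raw implied probability of an Arsenal win: {raw_home:.3f}")
print(f"margin-adjusted probability of an Arsenal win: {adjusted[0]:.3f}")
```

Since the draw and away odds are invented, the adjusted number it prints will not match the figure quoted above exactly. The point is only that the adjusted probability always comes out a little below the raw one.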

This is all pretty standard stuff. The odds for relatively evenly matched games like the one above are probably pretty accurate, or at least more accurate than the average person's guess. But what about significant underdogs? What about City against Cardiff? These are a little more difficult to assess. It's clear that Cardiff is an underdog in this game, but how much of an underdog? And do odds makers do a good job of assigning implied probabilities to these lopsided games?
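One way to frame that last question is as a calibration check: group games by the underdog's implied probability and compare the average implied probability in each group with how often the underdog actually won. The sketch below is a generic illustration under that assumption, not necessarily the method used in the full post. The `calibration_table` helper is hypothetical and the records are made up purely to show the mechanics.

```python
def calibration_table(records, edges):
    """Compare implied probabilities with observed win rates, bin by bin.

    records: list of (implied_probability, won) pairs
    edges:   sorted bin edges, e.g. [0.0, 0.1, 0.2]
    Returns one row per non-empty bin:
    (low, high, games, mean implied probability, observed win rate).
    """
    rows = []
    for low, high in zip(edges[:-1], edges[1:]):
        in_bin = [(p, won) for p, won in records if low <= p < high]
        if not in_bin:
            continue  # skip bins with no games
        mean_implied = sum(p for p, _ in in_bin) / len(in_bin)
        win_rate = sum(1 for _, won in in_bin if won) / len(in_bin)
        rows.append((low, high, len(in_bin), mean_implied, win_rate))
    return rows


# Made-up (implied probability, underdog won?) records, not real match data.
records = [
    (0.05, False), (0.08, False), (0.07, True),
    (0.15, False), (0.18, True), (0.12, False),
]

table = calibration_table(records, [0.0, 0.1, 0.2])
for low, high, n, mean_implied, win_rate in table:
    print(f"{low:.1f}-{high:.1f}: n={n}, implied={mean_implied:.3f}, observed={win_rate:.3f}")
```

If the odds were well calibrated, the implied and observed columns would be close in every bin given enough games. A persistent gap in the low-probability bins would point to underdogs being systematically over- or underpriced.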

Click to read more ...


Sloan Sports Analytics Conference Overview: The State of Analytics in Soccer

The Sloan Sports Analytics Conference was this past weekend. I attended the 2012 conference and was looking forward to seeing how much the soccer analytics community had progressed. Unfortunately, the soccer panel was very similar to the one two years ago. While I'm not quite as pessimistic as Howard Hamilton, I understand where his viewpoint is coming from. I think the reason for this lack of progress in the soccer analytics community is threefold:

Click to read more ...


Outcome Probability Calculator (Updated)

I've updated the site's Outcome Probability Calculator. I added more games of data, changed the methodology somewhat, and created a new online app. The first iteration of the app was featured on the Wall Street Journal's website in a blog post called "Arsenal Beats Reading and Math".

Click to read more ...


The Numbers Game: My Thoughts

I just finished The Numbers Game: Why Everything You Know About Soccer is Wrong, and really enjoyed it. I've been lucky enough to meet Chris at the MIT Sports Analytics Conference, and have also met a number of the other people featured in the book. I even played pickup soccer last summer in New York City with Ramzi Ben Said, the Cornell undergrad tasked with collecting some of the data for the book. All in all, the names that come up are very similar to the names on my Twitter timeline. If you're reading this blog and have read the book, you probably recognize a lot of the names also.
Here are some of my thoughts:

Click to read more ...