<?xml version="1.0" encoding="UTF-8"?>
<!--Generated by Squarespace V5 Site Server v5.13.156 (http://www.squarespace.com) on Sun, 19 May 2013 00:07:58 GMT--><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0"><channel><title>Blog</title><link>http://www.soccerstatistically.com/blog/</link><description></description><lastBuildDate>Sun, 20 Jan 2013 18:44:45 +0000</lastBuildDate><copyright></copyright><language>en-US</language><generator>Squarespace V5 Site Server v5.13.156 (http://www.squarespace.com)</generator><item><title>Power Laws and Goal Scoring</title><dc:creator>Ford Bohrmann</dc:creator><pubDate>Sun, 20 Jan 2013 18:35:41 +0000</pubDate><link>http://www.soccerstatistically.com/blog/2013/1/20/power-laws-and-goal-scoring.html</link><guid isPermaLink="false">1199755:14017085:32600678</guid><description><![CDATA[Is there a normal number of goals scored in a season for a striker? To answer this, one may be tempted to just take the mean of the goals scored by every player in a season. If we do this for last season, the mean is 1.83. Of course, this is misleading. There isn't really such a thing as a "normal" number of goals scored in a season.

The reason for this is that goals scored does not follow a normal distribution, the bell curve we are used to. For example, if you looked at the distribution of heights in a population, you would see a nice bell curve. Most people are right around the average height, and as you go towards the extremes either way (really short or really tall) you find fewer and fewer people. Therefore, the mean of heights in the population is instructive because it gives us the "normal" or "typical" height.
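
To make that contrast concrete, here is a minimal Python sketch. The two samples are made up purely for illustration (they are not real height or goal data), and the helper name summarize is my own. It only shows how the mean behaves on a roughly symmetric sample compared to a heavily skewed one.

```python
from statistics import mean, median

# Hypothetical, made-up samples for illustration only (not real data).
heights_cm = [165, 170, 172, 175, 175, 178, 180, 185]  # roughly symmetric
goals = [0, 0, 0, 0, 0, 0, 1, 1, 2, 3, 5, 20]          # most players score nothing

def summarize(values):
    """Return (mean, median, share of values at or below the mean)."""
    m = mean(values)
    share_at_or_below = sum(v <= m for v in values) / len(values)
    return m, median(values), share_at_or_below

height_summary = summarize(heights_cm)
goal_summary = summarize(goals)
print(height_summary)
print(goal_summary)
```

In the symmetric sample the mean and the median coincide, so the mean is a fair picture of a typical value. In the skewed sample one high scorer drags the mean well above what most of the players actually scored.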

The problem is, goals scored in a season does not follow a normal distribution. Instead, most players score no goals at all. The next most common number of goals scored last season? Just one goal, of course. This pattern continues, and it follows a power law distribution.]]></description><wfw:commentRss>http://www.soccerstatistically.com/blog/rss-comments-entry-32600678.xml</wfw:commentRss></item><item><title>Momentum in Bolton vs. Manchester City</title><dc:creator>Ford Bohrmann</dc:creator><pubDate>Sun, 21 Oct 2012 19:39:58 +0000</pubDate><link>http://www.soccerstatistically.com/blog/2012/10/21/momentum-in-bolton-vs-manchester-city.html</link><guid isPermaLink="false">1199755:14017085:29972390</guid><description><![CDATA[<p class="p1">Now that some of the advanced data set has been released by Manchester City's performance analysis department, it's a good time to start delving into the data to see what kind of analysis can be done. Although the advanced data set is only for one game (Bolton vs. Manchester City from last season), there is still A LOT of data to look at.</p>
<p class="p2">The advanced data contains (x,y) location information for every event that is recorded. This is valuable information, as it tells us exactly where each event happened in the game. I was interested in how this information can be used, specifically to look at momentum and passing trends.</p>
<p class="p2"><em>Previous Work</em></p>
<p class="p1">Some work has already been done in the soccer analytics community on trying to quantify and analyze momentum. The Analyse Football blog looked at <a href="http://analysefootball.com/2012/09/17/visualizing-momentum-shifts-in-bolton-vs-man-city-mcfcanalytics/">momentum shifts from this same game</a>, although in a different way. The Soccer by the Numbers blog looks at momentum in football in a <a href="http://www.soccerbythenumbers.com/2011/07/big-mo-little-mo-or-no-mo-evidence-of.html">much more general way</a>.</p>]]></description><wfw:commentRss>http://www.soccerstatistically.com/blog/rss-comments-entry-29972390.xml</wfw:commentRss></item><item><title>Dealing with the MCFC Analytics Advanced Data Release</title><dc:creator>Ford Bohrmann</dc:creator><pubDate>Mon, 17 Sep 2012 17:07:56 +0000</pubDate><link>http://www.soccerstatistically.com/blog/2012/9/17/dealing-with-the-mcfc-analytics-advanced-data-release.html</link><guid isPermaLink="false">1199755:14017085:29027814</guid><description><![CDATA[<p>I wanted to point out an <a href="http://profpeppersassistant.blogspot.com/2012/09/r-code-for-managing-f24-dataset.html">excellent blog post</a> from the blog Professor Pepper's Assistant.</p>
<p>If you're an R user and are having trouble dealing with the Advanced MCFC Analytics XML data file, the link above provides the code to pull the data into a data frame in R. After this, it is easy to perform whatever analysis you want on it.</p>
<p>I'll admit the code above is beyond my limited R skill level, but I know that it works. I'm excited to start doing some analysis, although the advanced data set is only for one game from last season at this point.</p>]]></description><wfw:commentRss>http://www.soccerstatistically.com/blog/rss-comments-entry-29027814.xml</wfw:commentRss></item><item><title>A Simplified Football Prediction Model</title><dc:creator>Ford Bohrmann</dc:creator><pubDate>Thu, 30 Aug 2012 15:04:19 +0000</pubDate><link>http://www.soccerstatistically.com/blog/2012/8/30/a-simplified-football-prediction-model.html</link><guid isPermaLink="false">1199755:14017085:26385250</guid><description><![CDATA[<p>I recently wrote a blog post for the <a href="http://www.bettingexpert.com/">Betting Expert site</a>&nbsp;about a simple model I created attempting to predict the outcome of football matches using only very simple statistics.</p>
<p>You can read the full blog post <a href="http://www.bettingexpert.com/blog/how-to-build-a-football-game-prediction-model">here</a>.</p>
<p>I wanted to point out here something interesting that I found while working on the model: betting odds do a relatively poor job of predicting football match outcomes. In other words, the percentage likelihood of a win, draw and loss for the home team implied by the odds set by bookmakers is surprisingly inaccurate.</p>
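<p>For readers unfamiliar with how a percentage likelihood is implied by odds, here is a minimal Python sketch. It assumes decimal odds and removes the bookmaker's margin by simple proportional scaling, which is one common approach and not necessarily the one used for the model. The function name and the example odds are hypothetical.</p>

```python
def implied_probabilities(home_odds, draw_odds, away_odds):
    """Convert decimal odds into probabilities that sum to 1."""
    raw = [1.0 / home_odds, 1.0 / draw_odds, 1.0 / away_odds]
    overround = sum(raw)  # exceeds 1 because of the bookmaker's margin
    return [p / overround for p in raw]

# Hypothetical decimal odds for a home win, draw and away win.
probs = implied_probabilities(2.0, 3.5, 4.0)
print([round(p, 3) for p in probs])
```

<p>The raw inverse odds add up to more than 1, so dividing each by their total gives three probabilities that can be compared directly against a model's predictions.</p>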
<p>My hypothesis for why this happens is that football is very unbalanced, especially in the EPL. It is very hard to predict when an upset is going to happen, mostly because these upsets are (seemingly) random.</p>
<p>Using just 4 factors in my model (the home team's goal differential for the season up to that game, the away team's goal differential for the season up to that game, the home team's point total from the previous season, and the away team's point total from the previous season), I could create a model that was as accurate as the bookmakers.</p>
<p>The question that remains is how much more accurate can the model become with the introduction of new variables? Beyond that, what variables should be used?</p>
<p>I am not sure I know the answers to those questions, but I am going to keep playing around with the data.</p>]]></description><wfw:commentRss>http://www.soccerstatistically.com/blog/rss-comments-entry-26385250.xml</wfw:commentRss></item><item><title>Visualizing Twitter Data</title><dc:creator>Ford Bohrmann</dc:creator><pubDate>Thu, 19 Jul 2012 16:37:11 +0000</pubDate><link>http://www.soccerstatistically.com/blog/2012/7/19/visualizing-twitter-data.html</link><guid isPermaLink="false">1199755:14017085:19339289</guid><description><![CDATA[<p>Inspired by <a href="http://www.r-bloggers.com/plotting-the-frequency-of-twitter-hashtag-usage-over-time-with-r-and-ggplot2/?utm_source=feedburner&amp;utm_medium=email&amp;utm_campaign=Feed%3A+RBloggers+%28R+bloggers%29">this post</a>&nbsp;on plotting the frequency of Twitter hashtags over time, I was interested in trying to apply this to soccer in some way. While not the most technical analysis, I thought it would be interesting to use this tool to analyze transfer rumors.<span class="full-image-float-right ssNonEditable"><span><img src="http://www.soccerstatistically.com/storage/twitter-soccer-bird.jpg?__SQUARESPACE_CACHEVERSION=1342717116180" alt="" /></span></span></p>
<p>To summarize the process quickly, there is a package in <a href="http://www.r-project.org/">R (open source statistical software)</a>&nbsp;called <a href="http://cran.r-project.org/web/packages/twitteR/">TwitteR</a>&nbsp;which allows you to pull Twitter data. It's actually a fairly easy process, especially if you follow the tutorial in the link at the beginning of this post.</p>
<p>As most Twitter users know, there is a seemingly unlimited number of transfer rumors circulating on Twitter. These range from fairly plausible to pretty ridiculous ("Ronaldo to the Philadelphia Union???"). As a Manchester City supporter, I was curious to look at a few popular transfer rumors related to City.</p>
<p><strong>Robin van Persie to Manchester City:</strong></p>
<p>Yes, this is definitely a rumor, and yes, it is probably not going to happen. But I was still curious. Below is a plot of the number of tweets over time that include "Robin van Persie" and "Manchester City". Of course, this is an imperfect method, but it still gives us an idea of what is going on in the Twitter transfer rumor world.</p>
<p>To explain, the graph below counts the tweets described above in 2-hour intervals over the past week. This means the height of every line gives us the number of tweets referencing RVP and City in that 2-hour interval.<span class="full-image-inline ssNonEditable"><span><img style="width: 650px;" src="http://www.soccerstatistically.com/storage/rvp.png?__SQUARESPACE_CACHEVERSION=1342716595073" alt="" /></span></span></p>
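<p>The 2 hour counting itself is simple to reproduce. The plots here were made in R, but as a rough sketch of the idea, the Python below groups timestamps into fixed 2 hour intervals. The tweet times and the function name are hypothetical, and fetching tweets is left out entirely.</p>

```python
from collections import Counter
from datetime import datetime, timedelta

def count_by_interval(timestamps, hours=2):
    """Count timestamps per fixed-width interval, keyed by interval start."""
    width = timedelta(hours=hours)
    origin = datetime(1970, 1, 1)
    counts = Counter()
    for ts in timestamps:
        # Floor each timestamp to the start of its interval.
        bucket_index = (ts - origin) // width
        counts[origin + bucket_index * width] += 1
    return dict(sorted(counts.items()))

# Hypothetical tweet times (naive datetimes), for illustration only.
tweets = [
    datetime(2012, 7, 19, 8, 15),
    datetime(2012, 7, 19, 9, 40),
    datetime(2012, 7, 19, 10, 5),
    datetime(2012, 7, 19, 13, 59),
]
bins = count_by_interval(tweets)
for start, n in bins.items():
    print(start, n)
```

<p>Each key is the start of an interval and each value is the height of the corresponding line in a plot like the one above.</p>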
<p>&nbsp;</p>
<p><strong>Carlos Tevez to AC Milan:</strong></p>
<p>After Tevez's past season with the club, there are obviously transfer rumors concerning Tevez all over the place. Because of this, it was hard not to want to look at the data on Tevez. I picked AC Milan because it seemed like the club he had the highest likelihood of going to. Like above, I searched for tweets that included "Carlos Tevez" and "AC Milan". The frequency of these tweets, in 2 hour intervals, is plotted below.</p>
<p><span class="full-image-block ssNonEditable"><span><img style="width: 650px;" src="http://www.soccerstatistically.com/storage/tevez.png?__SQUARESPACE_CACHEVERSION=1342716770579" alt="" /></span></span>You can try to analyze these graphs to find some meaning, but they are more just a fun exercise than anything else. The TwitteR package lets you do other cool things, like plot the frequency of Twitter mentions for a user. I did this for another site I write for, <a href="http://www.eplindex.com/">EPL Index</a>. They tend to get a lot more mentions than @SoccerStatistic does, so I thought it would be more interesting to plot the frequency of @EPLIndex mentions. Again, the intervals are every 2 hours.</p>
<p><span class="full-image-block ssNonEditable"><span><img style="width: 650px;" src="http://www.soccerstatistically.com/storage/eplindex.png?__SQUARESPACE_CACHEVERSION=1342716952186" alt="" /></span></span></p>
<p>As I said before, this analysis is not very insightful or ground-breaking, but it is still pretty cool. The possibilities for future analysis like this are almost endless, so if people have good ideas for Twitter data to visualize, I'd love to hear them.</p>]]></description><wfw:commentRss>http://www.soccerstatistically.com/blog/rss-comments-entry-19339289.xml</wfw:commentRss></item><item><title>Possession Analysis: A Closer Look</title><dc:creator>Ford Bohrmann</dc:creator><pubDate>Wed, 02 May 2012 15:09:39 +0000</pubDate><link>http://www.soccerstatistically.com/blog/2012/5/2/possession-analysis-a-closer-look.html</link><guid isPermaLink="false">1199755:14017085:16094530</guid><description><![CDATA[There has been no shortage of recent analysis showing that possession statistics tend to be misleading. A while ago, I <a href="http://www.soccerstatistically.com/blog/2011/7/27/does-more-possessionmore-wins-in-the-mls.html">looked at</a> how teams with higher rates of possession in the MLS do <em>not</em> tend to win more games. Similarly, the Climbing the Ladder blog on the MLS website recently did analysis and found <a href="http://www.mlssoccer.com/news/article/2011/08/17/climbing-ladder-truth-behind-possession-game">very similar results</a>. Devin Pleuler (<a href="http://twitter.com/devinpleuler">@devinpleuler</a>) has done <a href="http://www.mlssoccer.com/news/article/2012/04/03/central-winger-why-possession-stats-are-misleading">even more analysis</a> on why possession stats are misleading for his Central Winger blog on the MLS website. On his personal blog, Devin has also looked at possession efficiency and <a href="http://www.centralwinger.com/unexpected-findings-of-possession-efficiency/">how it relates to winning</a>.
In addition, the 11tegen11 blog (<a href="http://twitter.com/11tegen11">@11tegen11</a>) has written about some <a href="http://11tegen11.wordpress.com/2011/08/01/possession-analysis-in-football/">interesting points</a> on how to better analyze possession. I'm sure there are even more that I have forgotten to list here, but you get the point.]]></description><wfw:commentRss>http://www.soccerstatistically.com/blog/rss-comments-entry-16094530.xml</wfw:commentRss></item><item><title>Strength and Imbalance- A Comparison of European Leagues</title><dc:creator>Ford Bohrmann</dc:creator><pubDate>Fri, 16 Mar 2012 17:10:01 +0000</pubDate><link>http://www.soccerstatistically.com/blog/2012/3/16/strength-and-imbalance-a-comparison-of-european-leagues.html</link><guid isPermaLink="false">1199755:14017085:15462682</guid><description><![CDATA[<p class="p1">How can we effectively compare the strength of different European leagues? Which country has a stronger top flight, England or Spain? Which country has a more balanced top flight, Italy or Germany? How do the imbalance and strength of the EPL change across the different divisions? These questions are not easily answered, and do not even necessarily have definitive answers. With the help of data from Euro Club Index and Infostrada Live (powered by HyperCube) we can begin to analyze Europe's top leagues.</p>
<p class="p1">The idea for this post originally came from another blog post written by Chris Anderson (@soccerquant), the writer of the Soccer By the Numbers blog. In this post, Chris compares both the strength and imbalance of 6 of the top European leagues. You can read the post <a href="http://www.soccerbythenumbers.com/2011/06/high-quality-and-low-imbalance-which.html">here</a>. My idea was to expand upon this analysis using the extensive and accurate Euro Club Index data, while also looking at more European leagues. This analysis looks at the top leagues of 10 different European countries. The analysis will be split into two posts. The first looks at only the top division of each of the 10 countries. The second, which will be posted later, will compare strength and imbalance within each country's league structure.</p>]]></description><wfw:commentRss>http://www.soccerstatistically.com/blog/rss-comments-entry-15462682.xml</wfw:commentRss></item><item><title>EPL Table Visualization: A Different Perspective</title><dc:creator>Ford Bohrmann</dc:creator><pubDate>Thu, 26 Jan 2012 15:11:24 +0000</pubDate><link>http://www.soccerstatistically.com/blog/2012/1/26/epl-table-visualization-a-different-perspective.html</link><guid isPermaLink="false">1199755:14017085:14740931</guid><description><![CDATA[<p class="p1">After the positive comments and interest in the <a href="http://www.soccerstatistically.com/blog/2012/1/13/scoreline-visualization.html">scoreline visualization chart</a>&nbsp;I posted last week, I decided it would be interesting to do another type of data visualization. Processing, the software I've been using for these visualizations, lets you do some cool things to make a visualization interactive. This week, I decided to make a more complete and informative visualization of the English Premier League table.&nbsp;</p>
<p class="p1">I tried to make it as stand-alone as possible. In other words, I wanted people to understand it just by looking at it, without other information. One point: it's interactive, in that you can move your mouse over a club's circle and it will give you information on that club. If you are interested in more analysis and how I created it, read below.</p>]]></description><wfw:commentRss>http://www.soccerstatistically.com/blog/rss-comments-entry-14740931.xml</wfw:commentRss></item><item><title>Scoreline Visualization</title><dc:creator>Ford Bohrmann</dc:creator><pubDate>Fri, 13 Jan 2012 15:53:24 +0000</pubDate><link>http://www.soccerstatistically.com/blog/2012/1/13/scoreline-visualization.html</link><guid isPermaLink="false">1199755:14017085:14565876</guid><description><![CDATA[<div id="_mcePaste">The idea for a scoreline visualization originally came from Devin Pleuler (@devinpleuler on Twitter). He had the idea to create a graph that represents how soccer scorelines tend to progress, showing both how often games end with a certain scoreline and how often games pass through a certain scoreline.</div>
<div id="_mcePaste"></div>
<div id="_mcePaste">Using data from 1000 EPL games from the <a href="http://www.rsssf.com/">RSSSF</a>, I've created the chart below using <a href="http://processing.org/">Processing</a>.</div>]]></description><wfw:commentRss>http://www.soccerstatistically.com/blog/rss-comments-entry-14565876.xml</wfw:commentRss></item><item><title>New Site!</title><dc:creator>Ford Bohrmann</dc:creator><pubDate>Thu, 12 Jan 2012 20:26:56 +0000</pubDate><link>http://www.soccerstatistically.com/blog/2012/1/12/new-site.html</link><guid isPermaLink="false">1199755:14017085:14555163</guid><description><![CDATA[<p>I've redesigned the Soccer Statistically site! Instead of using Blogger in the domain name, the site is now www.soccerstatistically.com, which is nice. I've also redesigned the entire website with a new banner design. Here are some of the features on the site:</p>
<p>Blog: The blog is exactly the same, and is also the home page of the site. Nothing new here.</p>
<p>Statistical Applets: There is now a menu option called Statistical Applets. Under this are two options, Expected Points Added and Outcome Probability Calculator. The first is a table of the EPL leaders in Expected Points Added, a metric I created a while ago that takes into account the true value of each goal when ranking goal scorers. For more information, you can read <a href="http://www.soccerstatistically.com/blog/2011/6/23/wpa-and-agw-van-persie-is-overrated.html">here</a>. The Outcome Probability Calculator lets you enter information about a team in a game, and then gives you the probability of each type of outcome. For example, you could enter the 34th minute, at home, leading by 1, and see the probability of the team winning, drawing, and losing.</p>
<p>About Me: Just an about me page, with a contact us form link.</p>
<p>&nbsp;</p>
<p>If you have any comments or suggestions on the design for the new site, I'd love to hear them. I'm working on adding some more statistical applets to that section for the future, which I'm excited about. Hope you like it!</p>]]></description><wfw:commentRss>http://www.soccerstatistically.com/blog/rss-comments-entry-14555163.xml</wfw:commentRss></item></channel></rss>