Yesterday we spoke about the basic approach that we’re going to take in our definition of player valuations. Today we’re going to start implementing it.

I thought a lot about the ideal approach, and ultimately settled on a soccer version of With Or Without You (WOWY). The method has been used in various forms in several settings, but the implementation I’m going to adopt is closely related to the one first used by Tom Tango in The Hardball Times Annual 2008 (his explanation starts on page 147, but it deals strictly with baseball). Tango believed that the best way to measure a player’s impact is to compare how his teammates and opponents do while he is on the field with how they do while he is off it, a simple yet powerful concept. On a career level, it works wonderfully. On a seasonal level, well, it’s debatable because of the small sample size. Then again, in baseball things take a long time to stabilize, while in soccer an argument (well, more like a hypothesis really) could be made that results are generally more predictable, so the sample doesn’t need to be as big to be reliable. I’m going to proceed, but this disclaimer had to be stated.

So, why would WOWY work well (alliteration!)? Intuitively we know that a lot of things contribute to winning a soccer game. Having more possession, gaining more free kicks up front, gaining more corner kicks, winning tackles, shooting more, etc.: these are all good things. How good and how effective is still up for debate. But ultimately they are all means to score more goals. That is the only aim of the game: score more than you allow. Players can impact a game in several different ways. A star player might require an extra man to cover him, which creates more chances for his teammates; that makes him valuable, but it wouldn’t be easy to see in the individual stats. It would show up in the team stats, however, as his team would end up doing better with him on the field than with him off it. Every time there is a move on the transfer market, fans think of how many points in the standings the team is going to gain or lose. Better players will, or should, lead to more points, simply enough. On a career level, the bigger contributions would shine through, while on a seasonal level there may be a little extra noise, but ultimately a well-built system should see top players on top.

Let’s start working through our system. Thanks to MCFC Analytics, I am in possession of all data regarding individual players in the 2011/12 EPL. I have combined that data to recreate the correct results for every game, and the final standings. Then, for every player, I have analyzed how his team did while he was on the field. What does this mean? Let’s use Abou Diaby to illustrate it (especially convenient, because he’s the first in the spreadsheet, in alphabetical order):

– He played 91 minutes in the season, 46 at home, 45 away, over the course of 4 games.

– In all 4 appearances, he came on as a substitute, and in one of those occasions he was later substituted himself.

More specifically, he played in:

– Stoke City-Arsenal 1-1. He came on in the 73rd minute, with the score already 1-1 and finished the game. In his 17 minutes there was no score.

– Liverpool-Arsenal 1-2. He came on in the 53rd minute and came off in the 81st. There was no score in his 28 minutes.

– Arsenal-Fulham 1-1. He came on in the 69th minute. Arsenal scored a goal while he was on the pitch, and allowed none (which means they were losing and managed to tie the game).

– Arsenal-Chelsea 0-0. He came on in the 65th minute and obviously there was no score in his 25 minutes.

Over the course of his 91 minutes, his team scored 1 goal and allowed none. He was one of 11 players on the pitch, so he gets 1/11th of the credit for this performance. In other words, he gets a “share” of the performance. Everybody else who played gets a share in the same way, for the time they played and depending on how their teams performed in those games. I know this is going to sound bad to some people: how can I give the same share to every player? Some players clearly bear more responsibility than others for scoring or allowing certain goals. True, but a player’s presence might affect the performance of others in subtle ways which aren’t easy to see. In the long term, if a team plays better with a specific player on the field, evidently his contributions are felt despite everything else. Even if the player is just lucky to be on the pitch at the right time a few times (e.g. Diaby on the pitch when Arteta decides to invent a fantastic goal all by himself), that’s not going to last, and his real value is going to shine through in the long run. Always beware of small sample size.
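To make the bookkeeping concrete, here is a minimal Python sketch that adds up Diaby’s four appearances by venue. The tuple layout and the `season_totals` helper are my own illustration, not the actual spreadsheet, and I’m assuming every game ends at minute 90 (no stoppage time), which is what the minute counts above imply.

```python
from collections import defaultdict

# (match, venue, minute on, minute off, goals scored, goals allowed)
# Values are taken from the four appearances listed above.
# Assumption: full time is minute 90, stoppage time is ignored.
diaby_stints = [
    ("Stoke City-Arsenal", "away", 73, 90, 0, 0),
    ("Liverpool-Arsenal", "away", 53, 81, 0, 0),
    ("Arsenal-Fulham", "home", 69, 90, 1, 0),
    ("Arsenal-Chelsea", "home", 65, 90, 0, 0),
]

def season_totals(stints):
    """Sum minutes and on-pitch goals per venue (hypothetical helper)."""
    totals = defaultdict(lambda: {"minutes": 0, "scored": 0, "allowed": 0})
    for _match, venue, minute_on, minute_off, scored, allowed in stints:
        totals[venue]["minutes"] += minute_off - minute_on
        totals[venue]["scored"] += scored
        totals[venue]["allowed"] += allowed
    return dict(totals)

totals = season_totals(diaby_stints)
for venue, row in totals.items():
    print(venue, row)
```

This gives 46 minutes at home, 45 away, and a single goal scored (at home) with none allowed, matching the figures quoted for him.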

Back to Diaby: obviously in three of the four games, he gets 0.00 shares for his team scoring or allowing goals, so let’s focus on the Fulham one. He didn’t play the whole game, but “only” 21 minutes, so he should get 1/11 of what happened in that time. Let’s see how the entire Arsenal team for that match stacks up, in alphabetical order:

– Abou Diaby: 21 minutes played, partial score 1-0.

– Andre Santos: 90 mins, 1-1.

– Arshavin: 76 mins, 0-1.

– Arteta: 90 mins, 1-1.

– Chamakh: 14 mins, 1-0.

– Djourou: 90 mins, 1-1.

– Gervinho: 22 mins, 1-0.

– Mertesacker: 69 mins, 0-1.

– Ramsey: 68 mins, 0-1.

– Song: 90 mins, 1-1.

– Szczesny: 90 mins, 1-1.

– Vermaelen: 90 mins, 1-1.

– Walcott: 90 mins, 1-1.

– van Persie: 90 mins, 1-1.

Well, that’s the Opta-defined alphabetical order, so van Persie is at the bottom because he has a small “v”.

The guys who played the entire game get the share for 1 goal scored and 1 goal allowed. Mertesacker, Ramsey and Arshavin get the share for allowing 1 goal and scoring none. Gervinho, Diaby and Chamakh get the share for scoring 1 goal and allowing none.
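As a rough sketch of how the shares for this one match could be computed, here is a small Python example. The data is copied from the list above; the `match_shares` function and the variable names are hypothetical, chosen only for illustration.

```python
PLAYERS_ON_PITCH = 11  # every player on the pitch gets an equal share

# (player, minutes played, goals scored, goals allowed while on the pitch),
# copied from the Arsenal-Fulham list above
arsenal_fulham = [
    ("Abou Diaby", 21, 1, 0),
    ("Andre Santos", 90, 1, 1),
    ("Arshavin", 76, 0, 1),
    ("Arteta", 90, 1, 1),
    ("Chamakh", 14, 1, 0),
    ("Djourou", 90, 1, 1),
    ("Gervinho", 22, 1, 0),
    ("Mertesacker", 69, 0, 1),
    ("Ramsey", 68, 0, 1),
    ("Song", 90, 1, 1),
    ("Szczesny", 90, 1, 1),
    ("Vermaelen", 90, 1, 1),
    ("Walcott", 90, 1, 1),
    ("van Persie", 90, 1, 1),
]

def match_shares(rows, players_on_pitch=PLAYERS_ON_PITCH):
    """Return {player: (scored_share, allowed_share)} for one match."""
    return {
        name: (scored / players_on_pitch, allowed / players_on_pitch)
        for name, _minutes, scored, allowed in rows
    }

shares = match_shares(arsenal_fulham)
for name, (scored_share, allowed_share) in shares.items():
    print(f"{name:<13} scored {scored_share:.2f}  allowed {allowed_share:.2f}")
```

A nice sanity check: summed over all fourteen players, the shares come back to the one goal Arsenal scored and the one they allowed in this match.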

I have calculated this for every game of the season, keeping home and away separate. Diaby contributed to 0.09 home goals scored (his 1/11 share of the only goal Arsenal scored while he was playing) and 0.00 home goals allowed, away goals scored and away goals allowed. I then pro-rated this over 90 minutes: he played 46 minutes at home, so Diaby contributed to 0.18 home goals scored per 90 minutes (1/11/46*90) and 0.00 in the other 3 categories. Naturally Diaby didn’t play much. For the sake of illustration, these are the values for Thomas Vermaelen:

– 0.18 Home Goals Scored/90

– 0.15 Away Goals Scored/90

– 0.09 Home Goals Allowed/90

– 0.07 Away Goals Allowed/90

Again, these are shares, so this means that Arsenal scored about 2 goals every 90 minutes at home while Vermaelen played (0.18*11) and allowed about 1 (0.09*11). Please note that here I’m only showing 2 decimal places, while the spreadsheet carries the full figures, so the rounding doesn’t carry over into later calculations. Vermaelen played 1306 minutes at home and 1172 away. Arsenal scored 29 goals with him on the pitch at home, so 29/1306/11*90 comes out to 0.181679 (6 significant figures, yay!), which is the pro-rated figure mentioned above. By the way, Arsenal allowed 15 at home with him, while they scored 22 and allowed 10 away with Vermaelen playing, and that’s how I came up with those numbers for him.
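If you want to check the arithmetic yourself, here is a short Python sketch of the pro-rating step. The `share_per_90` name is mine, not something from the spreadsheet; the inputs are the minutes and goal counts quoted above.

```python
def share_per_90(goals, minutes, players_on_pitch=11):
    """One player's equal share of the team's goals, pro-rated to 90 minutes."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return goals / minutes / players_on_pitch * 90

# Vermaelen: 1306 home minutes (29 scored, 15 allowed),
#            1172 away minutes (22 scored, 10 allowed)
vermaelen = {
    "Home Goals Scored/90": share_per_90(29, 1306),
    "Away Goals Scored/90": share_per_90(22, 1172),
    "Home Goals Allowed/90": share_per_90(15, 1306),
    "Away Goals Allowed/90": share_per_90(10, 1172),
}
for label, value in vermaelen.items():
    print(f"{value:.2f} {label}")

# Diaby: 46 home minutes, 1 goal scored while he was on the pitch
diaby_home_scored = share_per_90(1, 46)
print(f"{diaby_home_scored:.2f} Home Goals Scored/90 (Diaby)")
```

Multiplying any of these by 11 gives back the team’s goals per 90 minutes with that player on the pitch.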

All this is well and good, but we haven’t really done anything other than set up a starting point: we haven’t adjusted for how strong the opponents were or how good a player’s teammates were. The player we are evaluating might have played against stronger opponents than average, while resting against the weaker ones. This would bring down his stats and make him appear worse than he really was. We can and must correct for that and for much, much more. But we’ll do that next time. In the meantime I’ll let you digest what we’ve done so far.