Busting the Myth of Moneyball in Soccer Statistics?
Over the past month or so Tim, @7amkickoff, and I have been having some great discussions about soccer, statistics, and how to use statistics to better communicate what may be happening on the pitch beyond what supporters normally see.
I’m not sure we’ve cracked the nut completely but these discussions have spurred me to come up with some other ways to show the strengths and weaknesses of statistics in soccer and what key indicators may better tell the story of a team exclusive of Goals Scored or Goals Against.
My article today is an attempt to do that.
In setting the stage, I feel it is worth reinforcing that the pioneering of soccer statistics is not just about one or two people; I’m aware of many folks trying to help others better understand the nuance of soccer in a variety of different ways.
But with all that hard work, by people across the pond, and now here, recently in the US, I think some of the well-intended efforts have strayed off the mark.
Why? As much as it pains me to say this, I blame Moneyball: baseball’s statistical thinking, and the attempt to apply its event-based approach to statistical measurement in soccer.
Soccer is not a game played in series (like baseball); it’s a fluid game played with continuous, sometimes random decision making, all with the intent to possess the ball, retain and move the ball, penetrate, create, take shots, put them on target and score goals.
And at any time a decision, be it a Coaching decision, Referee decision, Assistant Referee decision, or a split-second decision by any player, either with or without the ball, can influence the outcome of a game.
Therefore, statistics, single statistics, simply miss the mark on translating the nuance of soccer to the general supporter, and as such, are – on the surface – flawed if used (alone) to evaluate the market value of a player.
To put this into perspective, ignoring Coaching or Referee decisions, here’s a rundown of the three strongest Attacking Correlations (r’s) for each team in the English Premier League.
Caveat: Each statistic is measured either by volume (quantity) or by percentage of accuracy (quality) and correlated to Points Earned in the League Table over the span of 21 games, one game at a time; these are not Aggregate r’s.
Said another way, this is NOT a measurement relative to winning or losing… it’s a measurement relative to winning, drawing, or losing (points earned).
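As a minimal sketch of what a game-by-game r against Points Earned involves: the article does not say which coefficient was used, so this assumes a plain Pearson correlation, and the six-game series below is made up for illustration rather than taken from any Premier League team.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length series of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-game values, one entry per game (NOT real data).
goals_scored = [2, 0, 1, 3, 1, 0]
# Points earned per game: 3 for a win, 1 for a draw, 0 for a loss.
points_earned = [3, 0, 1, 3, 0, 1]

r = pearson_r(goals_scored, points_earned)
print(f"Goals Scored vs Points Earned: r = {r:.2f}")
```

Because the second series is points (3, 1 or 0) rather than a win/loss flag, draws pull the r in their own direction, which is the distinction drawn above between measuring against winning or losing and measuring against points earned.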
- Chelsea: Goals Scored (.46) Shots on Goal per Shots Taken (.30) Shots on Goal (.18)
- Burnley: Shots on Goal per Shots Taken (.46) Goals Scored (.44) Goals Scored per Shots on Goal (.40)
- Man City: Opponent Total Passes (.59) Goals Scored (.58) Opponent Total Passes Completed (.54)
- Newcastle: Goals Scored per Shots on Goal (.60) Goals Scored (.52) Passes Completed Final 1/3 per Passes Completed Entire Pitch (.49)
- Southampton: Goals Scored (.58) Goals Scored per Shots on Goal (.58) Shots Taken per Passes Completed Final 1/3 (.46)
- Liverpool: Goals Scored (.66) Shots Taken per Passes Completed Final 1/3 (.52) Opponent Possession Percentage (.43)
- Crystal Palace: Goals Scored (.67) Shots Taken per Passes Completed Final 1/3 (.62) Shots Taken (.53)
- Arsenal: Goals Scored (.60) Goals Scored per Shots on Goal (.41) Shots Taken per Passes Completed Final 1/3 (.20)
- Spurs: Goals Scored (.67) Goals Scored per Shots on Goal (.65) Shots on Goal (.46)
- West Ham: Goals Scored (.76) Shots on Goal (.44) Goals Scored per Shots on Goal (.43)
- Sunderland: Goals Scored (.61) Goals Scored per Shots on Goal (.40) Passes Completed Final 1/3 per Passes Completed Entire Pitch (.38)
- West Brom: Shots on Goal per Shots Taken (.45) Passes Completed Final 1/3 per Passes Completed Entire Pitch (.45) Goals Scored (.44)
- Aston Villa: Goals Scored (.76) Goals Scored per Shots on Goal (.46) Shots on Goal per Shots Taken (.41)
- Stoke City: Goals Scored per Shots on Goal (.81) Goals Scored (.68) Opponent Possession Percentage (.50)
- Hull City: Goals Scored (.63) Goals Scored per Shots on Goal (.59) Shots on Goal (.36)
- QPR: Goals Scored (.68) Passes Completed Final 1/3 per Passes Completed Entire Pitch (.56) Shots on Goal (.44)
- Everton: Shots Taken per Passes Completed Final 1/3 (.71) Goals Scored (.60) Goals Scored per Shots on Goal (.34)
- Leicester City: Goals Scored per Shots on Goal (.74) Goals Scored (.53) Opponent Possession Percentage (.31)
- Swansea: Shots on Goal per Shots Taken (.52) Goals Scored (.48) Shots on Goal (.40)
- Man United: Goals Scored per Shots on Goal (.69) Goals Scored (.62) Shots on Goal per Shots Taken (.28)
What’s that mean?
For the most part what this means is that no two teams show the same consistency of pattern in what single (game to game) quantity or quality indicators best represent team performance in Attacking.
Therefore – the individual player statistics behind these values have a different meaning (amount of influence) in whether a team wins, draws, or loses.
In addition, while Goals Scored appears as a relevant indicator it is not the most relevant indicator for every team, which reinforces that teams, in attacking, behave differently with respect to earning points in the League Table.
Of additional note is that the Goals Scored r for eight of those teams is less than .60 and only two teams show an r greater than .70.
Finally, the single indicators (either by volume or by ratio) that fit into the top three, exclusive of Goals Scored, are:
- Goals Scored per Shots on Goal (thirteen times)
- Shots on Goal per Shots Taken (six times)
- Shots on Goal (six times)
- Shots Taken per Passes Completed Final 1/3 (five times)
- Passes Completed Final 1/3 per Passes Completed Entire Pitch (four times)
- Opponent Possession Percentage (three times)
- Opponent Total Passes (once)
- Opponent Total Passes Completed (once)
- Shots Taken (once)
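The tally above can be reproduced with a short script. This is only a sketch: the dictionary is transcribed by hand from the team-by-team attacking list, using the indicator names as they appear there (with "Final Third" written as "Final 1/3"), and the short variable names are there purely for readability.

```python
from collections import Counter

# Indicator names as used in the team-by-team attacking list.
G = "Goals Scored"
G_SOG = "Goals Scored per Shots on Goal"
SOG_ST = "Shots on Goal per Shots Taken"
SOG = "Shots on Goal"
ST_PF = "Shots Taken per Passes Completed Final 1/3"
PF_PP = "Passes Completed Final 1/3 per Passes Completed Entire Pitch"
OPP_POSS = "Opponent Possession Percentage"
OPP_TP = "Opponent Total Passes"
OPP_TPC = "Opponent Total Passes Completed"
ST = "Shots Taken"

# Top three attacking indicators per team, in the order listed.
top_three = {
    "Chelsea": [G, SOG_ST, SOG],
    "Burnley": [SOG_ST, G, G_SOG],
    "Man City": [OPP_TP, G, OPP_TPC],
    "Newcastle": [G_SOG, G, PF_PP],
    "Southampton": [G, G_SOG, ST_PF],
    "Liverpool": [G, ST_PF, OPP_POSS],
    "Crystal Palace": [G, ST_PF, ST],
    "Arsenal": [G, G_SOG, ST_PF],
    "Spurs": [G, G_SOG, SOG],
    "West Ham": [G, SOG, G_SOG],
    "Sunderland": [G, G_SOG, PF_PP],
    "West Brom": [SOG_ST, PF_PP, G],
    "Aston Villa": [G, G_SOG, SOG_ST],
    "Stoke City": [G_SOG, G, OPP_POSS],
    "Hull City": [G, G_SOG, SOG],
    "QPR": [G, PF_PP, SOG],
    "Everton": [ST_PF, G, G_SOG],
    "Leicester City": [G_SOG, G, OPP_POSS],
    "Swansea": [SOG_ST, G, SOG],
    "Man United": [G_SOG, G, SOG_ST],
}

# Count every top-three appearance except Goals Scored itself.
tally = Counter(
    indicator
    for indicators in top_three.values()
    for indicator in indicators
    if indicator != G
)

for indicator, count in tally.most_common():
    print(f"{count:2d}  {indicator}")
```

Keeping the team lists as data makes it easy to re-run the count when another round of games is added.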
What’s intriguing is that three Defending Indicators appear: Opponent Possession Percentage, Opponent Total Passes and Opponent Total Passes Completed.
With all that variety in attacking r values, it’s pretty clear it simply isn’t all about scoring goals (getting a man on base and moving them forward)… therefore the market value assigned to a player should be questioned if it doesn’t consider outside factors that influence output…
In other words, it’s about a variety of different ways and means to do well – even (in a small way) about not possessing the ball, so even passing accuracy is influenced – somewhat – by a head coach’s tactical decision.
But wait, there’s more:
All those indicators above show the top three r’s for a team when attacking.
There’s a whole side of the game that is missed with those – and that’s defending.
So here’s the top three, best negative (inverse) r’s compared to Points Earned in the League Table, for each team in the English Premier League:
- Chelsea: Opponent Goals Scored (-.51) Opponent Shots on Goal (-.43) Opponent Percentage of Successful Passes Final 1/3 (-.42)
- Burnley: Opponent Goals Scored (-.59) Total Passes Completed (-.55) Total Passes (-.54)
- Man City: Opponent Goals Scored per Shots on Goal (-.67) Opponent Goals Scored (-.53) Opponent Passes Completed Final 1/3 per Passes Completed Entire Pitch (-.43)
- Newcastle: Opponent Goals Scored (-.57) Passing Accuracy (-.44) Total Passes (-.43)
- Southampton: Opponent Goals Scored (-.72) Opponent Goals Scored per Shots on Goal (-.63) Opponent Shots on Goal per Shots Taken (-.35)
- Liverpool: Opponent Goals Scored per Shots on Goal (-.67) Opponent Goals Scored (-.60) Passing Accuracy (-.46)
- Crystal Palace: Opponent Goals Scored (-.42) Opponent Shots on Goal per Shots Taken (-.41) Opponent Shots on Goal (-.37)
- Arsenal: Opponent Goals Scored (-.86) Opponent Goals Scored per Shots on Goal (-.64) Opponent Shots Taken (-.47)
- Spurs: Opponent Goals Scored (-.52) Opponent Shots on Goal (-.43) Opponent Shots on Goal per Shots Taken (-.42)
- West Ham: Opponent Goals Scored (-.66) Opponent Goals Scored per Shots on Goal (-.50) Opponent Shots on Goal (-.48)
- Sunderland: Opponent Shots Taken (-.50) Total Passes Completed (-.40) Total Passes (-.39)
- West Brom: Opponent Goals Scored (-.80) Opponent Goals Scored per Shots on Goal (-.64) Opponent Shots on Goal (-.57)
- Aston Villa: Opponent Goals Scored (-.60) Opponent Goals Scored per Shots on Goal (-.55) Passing Accuracy (-.37)
- Stoke City: Opponent Goals Scored per Shots on Goal (-.70) Total Passes (-.60) Total Passes Completed (-.60)
- Hull City: Opponent Goals Scored per Shots on Goal (-.60) Opponent Goals Scored (-.57) Opponent Total Passes (-.40)
- QPR: Opponent Goals Scored (-.55) Opponent Goals Scored per Shots on Goal (-.42) Opponent Shots on Goal (-.35)
- Everton: Opponent Goals Scored (-.57) Passes Completed Final 1/3 per Passes Completed Entire Pitch (-.56) Opponent Goals Scored per Shots on Goal (-.52)
- Leicester City: Opponent Shots on Goal per Shots Taken (-.54) Opponent Goals Scored (-.47) Opponent Goals Scored per Shots on Goal (-.42)
- Swansea: Opponent Goals Scored (-.72) Opponent Goals Scored per Shots on Goal (-.66) Opponent Shots on Goal (-.59)
- Man United: Opponent Goals Scored per Shots on Goal (-.57) Opponent Goals Scored (-.47) Passes Completed Final 1/3 per Passes Completed Entire Pitch (-.36)
What’s that mean?
Again, for the most part, no two teams show the same consistency of pattern in what single (game to game) quantity or quality indicators best represent team performance in Defending.
Therefore – the individual player statistics behind these values have a different meaning (amount of influence) in whether a team wins, draws, or loses.
In addition, while Opponent Goals Scored appears as a relevant indicator it is not the most relevant indicator for every team, which reinforces that teams, in defending, behave differently with respect to earning points in the League Table.
Of additional note is that the Opponent Goals Scored r for eleven of those teams is weaker than -.60 and only four teams show an r stronger than -.70.
Also, Opponent Goals Scored does not appear in the top three single defending indicators for two teams, Stoke City and Sunderland.
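Since every one of these r values is negative, the two thresholds are easiest to check on strength (absolute value). The sketch below assumes that reading and uses the Opponent Goals Scored values transcribed by hand from the defending list.

```python
# Opponent Goals Scored r per team, transcribed from the defending list.
# Stoke City and Sunderland are left out because the indicator is not in
# their top three.
opponent_goals_r = {
    "Chelsea": -0.51, "Burnley": -0.59, "Man City": -0.53,
    "Newcastle": -0.57, "Southampton": -0.72, "Liverpool": -0.60,
    "Crystal Palace": -0.42, "Arsenal": -0.86, "Spurs": -0.52,
    "West Ham": -0.66, "West Brom": -0.80, "Aston Villa": -0.60,
    "Hull City": -0.57, "QPR": -0.55, "Everton": -0.57,
    "Leicester City": -0.47, "Swansea": -0.72, "Man United": -0.47,
}

# Compare on absolute value: a "weaker" r sits closer to zero.
weaker_than_60 = sorted(
    team for team, r in opponent_goals_r.items() if abs(r) < 0.60
)
stronger_than_70 = sorted(
    team for team, r in opponent_goals_r.items() if abs(r) > 0.70
)

print(len(weaker_than_60), "teams weaker than -.60:", ", ".join(weaker_than_60))
print(len(stronger_than_70), "teams stronger than -.70:", ", ".join(stronger_than_70))
```

Teams sitting exactly on -.60 (Liverpool and Aston Villa) fall into neither group under this reading.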
Finally, the single indicators (either by volume or by ratio) that fit into the top three, exclusive of Opponent Goals Scored, are:
- Opponent Goals Scored per Shots on Goal (fourteen times)
- Opponent Shots on Goal (seven times)
- Opponent Shots on Goal per Shots Taken (four times)
- Total Passes (four times)
- Total Passes Completed (three times)
- Passing Accuracy (three times)
- Passes Completed Final 1/3 per Passes Completed Entire Pitch (twice)
- Opponent Shots Taken (twice)
- Opponent Percentage of Successful Passes Final 1/3 (once)
- Opponent Total Passes (once)
- Opponent Passes Completed Final 1/3 per Passes Completed Entire Pitch (once)
A few thoughts here to go with some of these indicators:
Most recognize that a negative r means there is an inverse relationship – in other words you get more with less or you get less with more.
What is intriguing is that Attacking Total Passes appears four times while Attacking Total Passes Completed and Attacking Passing Accuracy each appear three times.
Meaning, as those teams have fewer overall Passes Attempted, fewer Passes Completed or lower Accuracy they are more likely to earn points. Imagine that sort of logic applying to baseball – where a team who, sometimes, puts fewer men on base is more likely to win!
Finally, with the variety of defending r values it also seems pretty clear that earning points is not just about putting a man on base and moving them forward, and in some cases it may even be about not possessing the ball!?!
In Closing:
Single statistics have value – but they should be offered up, in context, with relation to other things that occur in the game of soccer.
Not enough writers do that – they simply offer up individual statistics as if they are the panacea of greatness… the more they do this the more ingrained most soccer supporters become in individual statistics that over-value a player.
And the more the media does it the more likely the supporters will become disenchanted with front office decisions that don’t make sense based upon those high-visibility individual statistics…
I’m not a Moneyball guy for soccer – never have been – and to me that line of thinking is flawed (as it applies to individual statistics in baseball).
What’s that mean??? (Editorial)
After a great question offered up in the comments section I think I should clarify what I mean by that with respect to soccer.
When I read Moneyball I was more focused on the individual statistics part of the game that were used to generate market value than the ‘economic state’ of buying and selling players that might lead to more wins…
That being said, I am not saying that you can’t measure the value of a player in soccer – it can be done but it needs to be done after considering teammates, opposing players, and at least the Head Coach of the team the player plays for.
Modern day soccer statistics, for the most part, don’t measure the appropriate level of influence of teammates, opposing players, and Head Coaching tactics – as such when I say I’m not a Moneyball guy when it comes to soccer it really means I don’t buy all that crap about tackles, clearances, goals scored, etc…
I value players relative to team outputs and I strongly feel and think the more media and supporters who understand this about soccer the less frustration they will have in blaming or praising one individual player over another player.
I hope that makes sense???
Anyhow, an example if you will…
A player with many tackles or clearances is simply a player with many tackles or clearances – it doesn’t mean they are better or worse than another player with fewer tackles or fewer clearances.
And… actually, I could make a reasonable argument that a player with many tackles or clearances is actually a worse player… why?
For one reason – if an opposing head coach knows that a player on the other side is weak – what do you think that head coach will want his players to do?
Drive or pass the ball towards the weaker player – as such – the increased volume of attacks aimed at that player will naturally inflate that weaker player’s tackles or clearances simply because of increased volume!!!!
Bottom line here is that individual tackles or clearances can be over-valued or under-valued – as such – as an individual statistic its relevance to a player being better or worse than another player is flawed…
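A bit of arithmetic shows how volume alone can flatter the weaker defender. Every number here is invented for illustration (the attacks faced and the success rates are assumptions, not measured data), and the function names are hypothetical.

```python
def expected_tackles(attacks_faced, success_rate):
    """Tackles won if a defender stops `success_rate` of the attacks faced."""
    return attacks_faced * success_rate


def expected_times_beaten(attacks_faced, success_rate):
    """Attacks that get past the defender under the same assumptions."""
    return attacks_faced * (1 - success_rate)


# Made-up scenario: the opposing coach steers play at the weaker defender.
defenders = {
    "Weaker defender": {"attacks_faced": 20, "success_rate": 0.50},
    "Stronger defender": {"attacks_faced": 8, "success_rate": 0.75},
}

for name, d in defenders.items():
    tackles = expected_tackles(d["attacks_faced"], d["success_rate"])
    beaten = expected_times_beaten(d["attacks_faced"], d["success_rate"])
    print(f"{name}: {tackles:.1f} tackles, beaten {beaten:.1f} times")
```

Under these assumptions the weaker defender finishes with more tackles and is also beaten more often, so the raw tackle count on its own points the wrong way.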
However viewed…
I would offer that more individual statistics need to be created for players that better reflect how those statistics relate to points earned.
It’s that type of reporting and analyses that should help others better understand the nuance of soccer and that it isn’t just all about scoring goals.
Best, Chris
COPYRIGHT, All Rights Reserved. PWP – Trademark
You can follow me on twitter @chrisgluckpwp