Expected Wins V3 MLS, EPL, Bundesliga, LaLiga, WC 2014

If you’ve read these two previous articles, Expected Wins and Expected Wins 2, you know I look at how teams perform, on average, (win, lose, or draw) with respect to my primary data collection points for Possession with Purpose.

What will be added, in Version 3 (V3), will be a compare and contrast between all the leagues I evaluate in my Family of Indices.

Looking at the diagrams and reading through my observations should help clarify why analyses of passing patterns (ABAC, ABCB) don’t really have relevance to whether teams win, lose or draw – at least not this year.  (Note – two links – two different sites published roughly the same analysis)…

Don’t get me wrong – I’m not taking a personal dig at the grueling work associated with the analyses.

It has great value, but more from a tactical viewpoint in how passing is executed, not from a (bell curve) – volume/success of passing rate – relative to possession and penetration into the Final Third, that helps a team create and generate shots taken leading to goals scored; or… when flipped, leading to goals not scored.

And as pointed out by a (shomas) on the article, that surfaced on MIT, if anything, it adds predictability to what a team will do – and the more predictable a team, the more likely the opponent can defend against them better… 

For me – I would have thought the GREATER the variation in that cycle (ABAC, etc…) the better… others may view that differently?

In addition, I think there could be more value, to the information, if it was segregated by league – more later on that…

To begin – here’s a reminder of what Expected Wins looked like in Major League Soccer after 92 games (184 events): 

MLS AFTER 184 EVENTS

The term ‘event’ is used, as opposed to game, to clarify that each team’s attacking data is included in this analysis (two events per game) – and that the greater the volume of data points the stronger the overall statistical analysis is; i.e. sampling 15 data-stream points is not the same as sampling 1000 data-stream points.
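To make the game/event distinction concrete, here is a minimal Python sketch. The match records and the `to_events` helper are invented for illustration and are not the PWP data set; the only thing taken from the article is that each game yields two events, one per team.

```python
from collections import Counter

# Hypothetical match records: (home team, away team, home goals, away goals).
matches = [
    ("Team A", "Team B", 2, 1),
    ("Team C", "Team D", 0, 0),
    ("Team B", "Team C", 1, 3),
]

def to_events(matches):
    """Expand each game into two events, one per team, tagged by result."""
    events = []
    for home, away, home_goals, away_goals in matches:
        if home_goals > away_goals:
            home_result, away_result = "win", "loss"
        elif home_goals < away_goals:
            home_result, away_result = "loss", "win"
        else:
            home_result = away_result = "draw"
        events.append({"team": home, "result": home_result})
        events.append({"team": away, "result": away_result})
    return events

events = to_events(matches)
result_counts = Counter(event["result"] for event in events)
print(len(events), dict(result_counts))
```

Under this assumption, 92 games expand to the 184 events quoted above, and every win event is paired with a loss event from the same game.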

Biggest takeaway here is the strength of correlation these seven data points have to each other (i.e. their representation – in my opinion – of the primary bell curve of activities that occur in a game of soccer)…

In every case, in every diagram that follows, all the Exponential trends exceed .947; and in every case the relationship for the winning teams is higher than the relationship for losing teams… speaking to consistency of purpose and lower variation in my view.
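For readers who want to see what an exponential trend and its fit value look like in practice, here is a rough sketch. It assumes the trend is a least-squares fit of y = a·exp(b·x) on the natural log of the averages, with the categories numbered 1 to 7; whether the diagrams were produced this way is an assumption, and the averages below are invented for illustration, not PWP data.

```python
import math

# Hypothetical per-event averages for the seven categories, in funnel order.
# These numbers are invented for illustration; they are not PWP data.
categories = [
    "Passes Attempted", "Passes Completed",
    "Final Third Passes Attempted", "Final Third Passes Completed",
    "Shots Taken", "Shots on Goal", "Goals Scored",
]
averages = [430.0, 330.0, 120.0, 80.0, 13.0, 5.0, 1.8]

def exponential_trend(values):
    """Fit y = a * exp(b * x) by least squares on ln(y), with x = 1..n.

    Returns (a, b, r_squared), where r_squared is measured on the
    log-transformed values.
    """
    n = len(values)
    xs = list(range(1, n + 1))
    log_ys = [math.log(v) for v in values]
    mean_x = sum(xs) / n
    mean_log_y = sum(log_ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (ly - mean_log_y) for x, ly in zip(xs, log_ys))
    b = sxy / sxx                      # slope of ln(y) against x
    ln_a = mean_log_y - b * mean_x     # intercept of ln(y) against x
    ss_res = sum((ly - (ln_a + b * x)) ** 2 for x, ly in zip(xs, log_ys))
    ss_tot = sum((ly - mean_log_y) ** 2 for ly in log_ys)
    return math.exp(ln_a), b, 1.0 - ss_res / ss_tot

a, b, r_squared = exponential_trend(averages)
print(f"a={a:.1f}, b={b:.3f}, R^2={r_squared:.3f}")
```

Running the same function separately on winners' and losers' averages would give the two fit values being compared in the diagrams, under the assumptions stated above.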

In general terms, this is my statistical way of showing that a goal scored is tantamount to a 5th or 6th standard deviation to the right from the normal bell curve of activities that occur in a game of soccer.

Said another way – I don’t evaluate the tail – when measuring the dog’s motion – I evaluate the dog; recognizing that the tail will follow, to some degree, what the motion of the dog will be…  and… that even if the motion of the dog is somewhat different, the tail will normally behave in the same way.

Therefore, it’s not the tail that should be analyzed – it’s the dog… others may view that differently.

Here’s the same diagram for the MLS after 366 events:

MLS AFTER 366 EVENTS

Oh… the green shaded areas are meant to show those data points that are higher for those particular categories; in other words the Volume of Shots Taken for winning teams (after 366 events) was higher than that of losing teams – but the volume of passes completed in the Final Third was higher for losing teams than winning teams…  more on that later.

Here’s the diagram after 544 events in MLS:

MLS AFTER 544 EVENTS

Note the shift – only the volume of Final Third Passes Attempted is now higher for losing teams – all other data categories see the winning teams with greater volume.
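The shading logic can be sketched in a few lines of Python. The averages below are invented so that they reproduce the pattern just described (only Final Third Passes Attempted higher for losing teams); they are not the actual MLS numbers, and `higher_side` is a hypothetical helper.

```python
# Hypothetical average volumes per event; invented for illustration only.
winners = {
    "Passes Attempted": 425.0,
    "Passes Completed": 330.0,
    "Final Third Passes Attempted": 118.0,
    "Final Third Passes Completed": 79.0,
    "Shots Taken": 13.5,
    "Shots on Goal": 5.6,
    "Goals Scored": 2.1,
}
losers = {
    "Passes Attempted": 420.0,
    "Passes Completed": 322.0,
    "Final Third Passes Attempted": 121.0,
    "Final Third Passes Completed": 78.0,
    "Shots Taken": 12.4,
    "Shots on Goal": 3.9,
    "Goals Scored": 0.6,
}

def higher_side(winners, losers):
    """For each category, report which side has the higher average volume."""
    flags = {}
    for category, win_avg in winners.items():
        lose_avg = losers[category]
        if win_avg > lose_avg:
            flags[category] = "winners"
        elif win_avg < lose_avg:
            flags[category] = "losers"
        else:
            flags[category] = "equal"
    return flags

for category, side in higher_side(winners, losers).items():
    print(f"{category}: higher for {side}")
```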

For me, what this reinforces is the issue of time and space as well as patience – three statistics never measured in soccer (publicly at least)…  again, reinforcing, for me, that shot location only has value relative to the time, space, and patience of the team in creating that time and space for that shot.

Statistically speaking, what that means, to me, is that Expected Goals; a very popular (and worthy) statistical calculation, needs to be refined if it’s to have greater value as a predictive tool/model…  I’d be interested to hear / read the views of those who work Expected Goals efforts…

Now here’s the European Leagues I’ve added to my PWP Family of Indices analyses; first up the English Premier League:

EPL AFTER 100 EVENTS

Note that the pattern, here, after 100 events, resembles the same pattern for MLS after 544 events… worthy.

Moving on to the Bundesliga:

BUNDESLIGA AFTER 72 EVENTS

A pattern similar to MLS after 366 events; will this pattern morph into something different as the league continues?  Possibly – the MLS pattern has changed so perhaps this one will too?

Now for La Liga:

LALIGA AFTER 78 EVENTS

A completely new pattern has taken shape – here “volume” speaks volumes! 

Is this unique?  Nope…  It also happens to be the same pattern as the World Cup 2014 pattern – below:

WORLD CUP AFTER 128 EVENTS

Will that pattern show itself in the UEFA Champions League?  I don’t know but we’ll find out…

So what’s it all mean? The “so-what”?

Before attempting to answer that, here’s two different diagrams plotting these data points for winners and losers  (in reverse order) for the leagues I evaluate:

LOSERS EXPECTED LOSSES

WINNERS EXPECTED WINS

Now the grist:

The red shaded areas are where the losing teams’ average exceeds the winning teams’ average in the volume of those activities – the green shaded areas are highlighted for effect.  Green shaded areas for the volume of Shots on Goal and Goals Scored indicate that those numbers are virtually the same, for winning teams, in all the activities measured…

Now, back to the so-what and what’s it all mean?

For me this reinforces that the “pattern” of passing (ABAC, ABCB, etc…) that gets you into the Final Third has no relevance to the volume of Goals Scored.

And, it also reinforces that different motions of the ‘dog’ will generate the same tail wagging outputs – therefore it’s the analysis of the dog’s activities that drives greater opportunities for improvement.

The averages for winners in the activities measured all behave somewhat differently – granted some patterns might be the same but the volumes are different.

And when volumes change, the game changes, and when the game changes, the strategic or tactical steps taken will change – but… the overall target should still remain the same (on average) – put at least 5-6 shots on goal and you ‘should’ score at least two goals… getting to that point remains the hard part!

Bottom line here: 

These leagues are different leagues – and the performances, of the teams, in those leagues are different when it comes to winning.

Therefore, I’d offer that a striker’s ability to score in one league says little about what an organization should expect from that striker in another league.

Said another way – a striker who scores 20 goals in the Bundesliga, a league that shows winning teams play to a more counter-attacking style, might not perform as well in a league like the EPL; which looks to offer that winning teams play a more possession-based style.

Perhaps another good example… a striker playing for a team that counter-attacks is more likely to have greater time and space to score a goal than one playing in a possession-based team, where time and space come at a premium because the opponents play far tighter within their own 18 yard box.

But, as mentioned before – since no-one statistically measures (publicly) the amount of time and space associated with passing, and shot taking, we can’t peel that onion back further.  I have suggested two new statistics that may help ‘intuit’ time and space – that article is “New Statistics? Open Shots and Open Passes”: here.

In Closing:

For the future…  I’m interested in seeing how these analyses play out when separating out teams who show patterns of counter-attacking, and perhaps direct play, over teams that show patterns of possession-based football.

In addition, I’m also keen to see how these take shape when reversing the filter and organizing this data based upon whether or not a team is defending deeper, or more shallow.

The filter there will come from looking at the opponent averages for passing inside and outside the Final Third…
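One way such a filter might look is sketched below. Everything here is an assumption: the team figures are invented, the helper names are hypothetical, and the split at the median is just one possible cutoff. The sketch only computes each team's share of opponent passes attempted inside the Final Third and groups teams by it; which group corresponds to defending deeper or more shallow is left open.

```python
from statistics import median

# Hypothetical opponent passing averages per team: opponent passes attempted
# per game outside and inside the Final Third. Invented numbers, not PWP data.
opponent_passing = {
    "Team A": {"outside": 310.0, "inside": 95.0},
    "Team B": {"outside": 280.0, "inside": 130.0},
    "Team C": {"outside": 350.0, "inside": 110.0},
    "Team D": {"outside": 300.0, "inside": 150.0},
}

def final_third_share(averages):
    """Share of opponent passes that are attempted inside the Final Third."""
    total = averages["outside"] + averages["inside"]
    return averages["inside"] / total

def split_by_share(opponent_passing):
    """Group teams by whether their opponent Final Third share is above the median."""
    shares = {team: final_third_share(avg) for team, avg in opponent_passing.items()}
    cutoff = median(shares.values())
    groups = {"above_median": [], "at_or_below_median": []}
    for team, share in shares.items():
        key = "above_median" if share > cutoff else "at_or_below_median"
        groups[key].append(team)
    return shares, groups

shares, groups = split_by_share(opponent_passing)
for team, share in shares.items():
    print(f"{team}: {share:.1%} of opponent passes inside the Final Third")
print(groups)
```

The Expected Wins averages could then be recomputed separately for each group to see whether the patterns differ.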

It seems reasonable to me (others may view this differently?) that if a team lacks goal scoring they need to find the right midfielders and fullbacks that are good enough to create the additional time and space the strikers need in order to score more goals.

And that doesn’t even begin to address the issues in defending – which statistics continue to prove year in and year out as being more critical to winning than attacking.

Given all this information, I may have missed something – I’m always looking for questions/clarifications so please poke and prod the diagrams and analyses and comment as time permits.

Best, Chris

COPYRIGHT, All Rights Reserved.  PWP – Trademark

You can follow me on twitter @chrisgluckpwp


9 thoughts on “Expected Wins V3 MLS, EPL, Bundesliga, LaLiga, WC 2014”

  1. Christopher,

    I’ve been reading your Expected Wins analysis and I have a suggestion:

    Should you be drawing comparisons between attempted passes, completed passes, final third passes, final third completed passes, shots and shots on target for each league?

    By definition, the number of each of those must be lower than the category before, so the relationship between each would remain relatively stable and predictable, wouldn’t it?

    I would expect all of those statistics to move towards the norm over time.

    However, this kind of data might have a real-world application in predicting an individual player’s likelihood to succeed in a given league: a player used to playing in a competition with an above average final-third passing success rate, may seem even better in a league where that success rate is below average (and where I would predict that rate is about to increase).

    I would appreciate your thoughts.


    1. Ian, Thanks for adding your thoughts and questions. A short few comments/questions that perhaps will generate more thoughts than you expect 🙂 I will do my best to respond…

      In considering the need to compare leagues in these categories against each other: for me, the answer is yes – especially when you consider the real world application on the likelihood of one player succeeding or failing in another league.

      In addition, I think this also points out that comparing individual players to each other, from different leagues, is an apples to oranges comparison because the leagues clearly don’t behave in the same manner – i.e. nor do the coaches.

      In terms of the statistics moving towards the norm over time – agreed – but I don’t see that as a bad thing – if you don’t collect statistics like these you don’t know what the norm is… for each team – and they ARE different for each team…

      The other compelling piece here is that the data is stable and the stats will be lower as the progression occurs (but to what degree lower?). Reinforcing for me that individual defensive statistics really have no meaning – for example clearances, tackles, interceptions, etc… while of benefit for a player… the more reliable ‘norm’ is the pattern of percentages in ‘unsuccessful passes’ simply because when adding up those individual defensive statistics it never-ever equals the volume of unsuccessful passes…

      I’m not sure if that makes sense or not – sometimes it’s hard to explain – the best article of late speaking to that statistical issue is the one I wrote on Hurried Passes – followed by “sometimes what doesn’t happen…” plus the latest one on New Statistics (Open Shots and Open Passes)…

      I did see the article by Ted Knutson about baseball stats and soccer. I non-violently disagree with his view – you don’t need detailed individual statistics about the game to draw conclusions because the game is fluid – more amoeba like – where multiple individual actions of players in groups of three or four drive some defensive scenarios – this isn’t the ABAC stuff that was published on the MIT site – this is the controlled pack movement of players. Baseball is a game played in ‘series’ (electronically speaking – so step by step statistics are great) – football is a game played in parallel (electronically) where multiple actions by multiple players affect one event – because of that you NEED multiple statistics of multiple events to analyze team results… granted there are individuals like Messi, and others, who impact the game – but PSG just beat Barcelona 3-2 – they did that as a team!

      In adding some additional thoughts on my approach in PWP – without identifying individual players this analysis will, eventually, tell you most everything you need to know about how one team behaves versus another – and as a coach – that is far more valuable to me than knowing some one-off defensive statistics about tackles or interceptions…

      I also think my approach has real-world application with respect to scouting what players you need to fit into a system you run – sadly I don’t have millions of dollars and access to PROZONE data to prove that point. Perhaps I can land a wealthy sponsor or get hired by one of those rich team owners in Europe to prove the point…

      Anyhow, I do know, with what I’ve seen in my two years of research, that I could collect this data for (lower division teams or academies) across Europe, interpret it in my own way, and come up with great individual talent to fit the needs of top division clubs… all at a fraction of the cost that is now devoted to scouting.

      In regards to your final thoughts on the above average final third passing success rate – I think you need to ask this question before going any further – why is that league having an above average final third passing success rate – is it because there are many teams who play counter-attacking, direct attacking, possession-based, a high defending line or a lower defending line?

      In a league where many teams play a lower defending line I would offer that an individual player’s pass success rate might be overrated compared to playing against teams that play a higher defending line in the final third… higher defending lines usually generate unsuccessful passes more often – whereas a lower defending line usually generates fewer unsuccessful passes.

      I’m not sure that scratches your itch Ian – but am always willing to chat – simply love the game and as someone who enjoys chess it’s this approach in analyzing a team that has me far more intrigued about soccer than expected goals… I’d offer there is no golden formula – hence my throw the net wide and catch what you catch and then make sense of what it tells you not what you want to see given some data set…


Comments are closed.
