Yes, yet another model
looking at the quality of chances and finishing of football teams.
This is something that I had
hoped to have finished before last season ended, but unfortunately life got in the way
and it got delayed. Since then there have been a number of very interesting
analyses done, including those by @colinttrainor (like
this) and @11tegen11 (like
this), which have continued the good work done by @footballfactman (like
this), where they have put in a significant amount of work to look at where
shots are taken from and what the conversion rates are for shots from those
areas. Hopefully I can hang on to their coattails.
Personally, I am far too
lazy to collect all that data, so I have let the experts (Opta) decide upon
chance quality for me, and I hope to make the model as simple as possible. In
my blog so far I have looked at how Liverpool
and Tottenham have performed in terms of finishing and creativity, and to add
some context, I compared them to the league and Top 4 average. Whilst compiling
the numbers, I noticed that the League averages were quite consistent year on
year over each of the past 3 seasons, and realised that I could create a
theoretical average team that I could use as a benchmark to compare the
performance of all the Premier League teams.
I am sure I am not telling
anyone anything new when I say that the number of goals a team scores is
essentially dependent on 3 things: the number of shots they take, the quality
of chances they create, and the quality of their finishing. It’s these
metrics that I will be comparing teams against.
Rate of Attack
This is very simply the
number of shots a team takes, and can be measured on a per game (SpG) or per
season (SpS) basis. Yes, I know that not all attacks end with a shot and I am
basically just using total shots, but I wanted the model to have a 'racy' acronym, so Rate of Attack it is.
On average, each team takes
about 14.5 SpG, or about 550 SpS. Between
9% and 10% of all shots end up as a goal, and this has been found to be consistent
season upon season and across different leagues. For those that don’t know,
this is called the Reep Ratio, after an amateur statistician named Charles Reep, who looked at
various stats, including the conversion rate of shots, in the 1950s.
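As a rough sketch of the arithmetic above, here is how the per-game and per-season figures relate in Python. The function names are made up for illustration, the 38-game season is the standard Premier League schedule, and the 9.5% rate is simply an assumed midpoint of the 9% to 10% range.

```python
GAMES_PER_SEASON = 38   # each Premier League team plays 38 games
REEP_RATIO = 0.095      # assumption: midpoint of the 9-10% range quoted above

def shots_per_season(shots_per_game, games=GAMES_PER_SEASON):
    """Convert shots per game (SpG) into shots per season (SpS)."""
    return shots_per_game * games

def reep_goals(shots, conversion=REEP_RATIO):
    """Goals you would expect if every shot had the same chance of going in."""
    return shots * conversion

sps = shots_per_season(14.5)
print(sps)                        # 551.0, i.e. "about 550 SpS"
print(round(reep_goals(sps), 1))  # 52.3
```

This treats every shot as equal, which is exactly the simplification the rest of the model tries to improve on.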
Creative Efficiency (%CCC)
This is a measure of the
creativity of the team and quality of chances they have, and this is where I am
relying on Opta to decide upon what is a good chance, as I am using their Clear
Cut Chance (CCC) for this. A CCC is one of Opta’s few subjective statistics,
and whilst a full description is not given, a brief description is given by
Opta in their Event Definitions under Big Chance (here)
“A situation where a player
should reasonably be expected to score usually in a one-on-one scenario or from
very close range.”
Creative Efficiency (%CCC)
is measured as the proportion of Total Shots that are Clear Cut Chances.
A team with a high %CCC
will, over time, create chances that are easier to score from than the average
team. Whilst CCCs make up only about 13% of all shots in the Premier League, they are vitally
important, as for each of the last 3 seasons, around 52% of all goals have been
scored from a CCC. It should be noted that CCCs include penalties, and whilst I
did consider removing them from the analysis as they have their own average
conversion rate, I decided to include them for a few reasons. First, there will be some
open play CCCs that will be easier to score from than a penalty. Second, I think
that teams that attack more or are more creative will tend to get more
penalties, at least over the long term, and that should be included in their Creative Efficiency. Finally,
I want to keep the model simple and with as few adjustments as
possible.
Obviously when you multiply
a team’s Rate of Attack by their %CCC, you will get the number of shots which
are CCCs. The remaining shots will be what I will call, as I can’t think of a
more appropriate term, the Non-CCCs. The two types of shot have their own
average conversion rate, and the model analyses the quality of finishing of
both types of chance by comparing the goal expectancy (number of chances
multiplied by the average conversion rate) to actual goals scored for each type
of chance.
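The split and the comparison described above can be sketched in a few lines of Python. This is only an illustration: the function names are invented, and the team figures and conversion rates in the example are hypothetical placeholders, not real data.

```python
def split_shots(total_shots, pct_ccc):
    """Split total shots into (CCCs, Non-CCCs) using the team's %CCC."""
    ccc = total_shots * pct_ccc
    return ccc, total_shots - ccc

def goal_expectancy(chances, avg_conversion):
    """Number of chances multiplied by the average conversion rate."""
    return chances * avg_conversion

def finishing_delta(goals, chances, avg_conversion):
    """Actual goals minus goal expectancy for one type of chance."""
    return goals - goal_expectancy(chances, avg_conversion)

# Hypothetical team: 600 shots, 15% of them CCCs
ccc, non_ccc = split_shots(600, 0.15)

# Hypothetical league-average conversion rates for each type of chance
avg_ccc_conv, avg_non_ccc_conv = 0.40, 0.05

# Hypothetical goals scored from each type of chance
print(finishing_delta(40, ccc, avg_ccc_conv))          # above or below average on CCCs
print(finishing_delta(22, non_ccc, avg_non_ccc_conv))  # above or below average on Non-CCCs
```

A positive delta means the team finished that type of chance better than the average team would have, a negative one means worse.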
CCC Conversion
To give an indication of the
average difficulty of a CCC compared to the average shot, a CCC is on average about 4x
as likely to be scored, with an average conversion rate of just under
38%. It should be remembered though that there is a large range in the
probability of a CCC being scored. Sam Green of Opta has said (here) he considers the
base probability to start at about 20%, and it of course goes up to 100%.
Non-CCC Conversion
The average conversion rate
of Non-CCCs is slightly above 5%. The reason why I won’t classify them along
the lines of a ‘difficult’ chance is that with the goal expectancy range for
individual shots being between 0% and 20%, anything with an expectancy above
10% will still be easier than average.
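As a quick sanity check, the rounded figures quoted so far (13% of shots are CCCs, roughly 38% CCC conversion, roughly 5% Non-CCC conversion) can be combined to see whether they hang together. This is a back-of-the-envelope sketch with rounded inputs, so the results will only approximately match the exact league numbers.

```python
PCT_CCC = 0.13        # share of all shots that are CCCs
CCC_CONV = 0.38       # "just under 38%", rounded
NON_CCC_CONV = 0.05   # "slightly above 5%", rounded

# Goals per shot contributed by each type of chance
ccc_part = PCT_CCC * CCC_CONV
non_ccc_part = (1 - PCT_CCC) * NON_CCC_CONV

blended_conv = ccc_part + non_ccc_part    # overall goals per shot
ccc_goal_share = ccc_part / blended_conv  # share of goals that come from CCCs
ccc_vs_average = CCC_CONV / blended_conv  # how much easier a CCC is than the average shot

print(f"{blended_conv:.1%}")    # 9.3%, inside the 9-10% Reep Ratio range
print(f"{ccc_goal_share:.0%}")  # 53%, close to the 52% quoted
print(round(ccc_vs_average, 1)) # 4.1, i.e. "about 4x"
```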
The Numbers
Here are the hard numbers I
have collected for the past 3 seasons.
And these are the benchmark
ratios/rates that I have either mentioned or will be using for the theoretical
average team.
So, how did each team perform
last season? In terms of number of shots, Liverpool
led the way by far with 740 shots over the season, 59 more than Tottenham,
the next best team, and not far off double the number of shots that Stoke had.
It may not come as much of a surprise to see that Manchester United had the best %CCC, with 21% of the efforts being from a CCC, compared to 18.3% for 2nd placed
In terms of shot conversion,
the team with the best conversion rate for CCCs was, yes you’ve guessed it, the
team who scored the most goals, Manchester United, with 44.1% of them scored.
The team with the worst conversion of CCCs was, yes you’ve guessed it, the team
who scored the least… oh, it was actually Manchester City,
with only 28.9%. I didn’t guess it either. So, City had 3 more CCCs than United, but scored
17 fewer goals from them, a significant amount.
The team with the best
conversion rate of Non-CCCs was Chelsea
at 7.4% leading to goals, and this time we do find the expected QPR at the
bottom of the pile with only 3.1%.
I’ll admit that the table
above is a little hard to read though. We’ve got different units and magnitudes
of measurement and it’s hard to see how well each team is doing overall, so let’s
add some context and measure each team’s performance as the percentage change
from our benchmark team.
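The percentage change itself is a one-liner. A minimal sketch, using figures already quoted in this post (Liverpool’s 740 shots against the benchmark of about 550, and Manchester United’s 21% %CCC against the benchmark of about 13%); the function name is just for illustration, and because the benchmark values here are rounded the results are approximate.

```python
def pct_change_from_benchmark(value, benchmark):
    """Percentage change of a team's metric relative to the benchmark team."""
    return (value - benchmark) / benchmark * 100

# Liverpool's shots vs the benchmark of about 550 shots per season
print(round(pct_change_from_benchmark(740, 550), 1))  # 34.5

# Manchester United's %CCC vs the benchmark of about 13%
print(round(pct_change_from_benchmark(21.0, 13.0)))   # 62
```

Putting every metric on the same percentage scale is what makes shots, %CCC and the two conversion rates comparable side by side.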
Now things become a bit
clearer. We can see that despite only taking 3% more shots than the
benchmark, Manchester United’s %CCC was a whopping 62% higher than average,
which goes some way to explaining why their total shot conversion was so much
stronger than everyone else’s at 14.2%. However, they also significantly
outperformed the benchmark on both conversion rate metrics, meaning they scored almost 13 goals
more than expected if they had average finishing. If they had scored at average
rates, their total conversion rate would still have been the highest in the
league though, at a touch under 12%.
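The Manchester United figures can be roughly reproduced from the rounded benchmark rates. This is a sketch under assumptions: it uses 550 shots, 38% and 5% as the benchmark values and a %CCC of 21%, all rounded, so it will not match the exact figures to the decimal.

```python
# Benchmark values quoted in the post (rounded, so results are approximate)
BENCHMARK_SHOTS = 550
CCC_CONV = 0.38
NON_CCC_CONV = 0.05

def conversion_at_average_rates(pct_ccc):
    """Total shot conversion if both types of chance are finished at average rates."""
    return pct_ccc * CCC_CONV + (1 - pct_ccc) * NON_CCC_CONV

shots = BENCHMARK_SHOTS * 1.03                     # 3% more shots than the benchmark
expected_conv = conversion_at_average_rates(0.21)  # %CCC of 21%
actual_conv = 0.142                                # actual total shot conversion

extra_goals = shots * (actual_conv - expected_conv)

print(f"{expected_conv:.1%}")  # 11.9%, "a touch under 12%"
print(round(extra_goals, 1))   # 12.9, "almost 13 goals" given the rounded inputs
```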
Only 2 teams managed to beat the benchmark for all 4 metrics, Man Utd and Arsenal. Of the other top teams, Chelsea and Tottenham had a relatively poor %CCC, Man City were poor at converting their CCCs, Liverpool were poor at converting their Non-CCCs, and Everton were poor at converting both types of chances.
At the other end of the
table, only 2 teams performed worse than the benchmark on all 4 metrics:
unsurprisingly QPR, with the other team being Newcastle. Reading were very good at
finishing their chances, it’s just that they struggled to create any.
So what does this all look
like when we convert these metrics to expected goals, and how did the teams
compare? There were 3 big outperformers, Chelsea (15.7 goals above expected), Man
Utd (+12.6), and Arsenal (+10.4), whilst there were 2 big underperformers in QPR
(-12.3) and Everton (-10.1). For those
of you who are into your ‘proper’ statistics, I’ve calculated the Mean Absolute
Percentage Error for the model over the last 3 seasons as 10% and the Root Mean
Squared Error as 7 goals. It’s been a loooong time since I studied statistical
methods, so I may have used the wrong error measurements, but I think that
shows that the model isn’t too bad.
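For anyone wanting to check those two error measures on their own numbers, here is a minimal sketch using the standard textbook definitions. The goal totals in the example are hypothetical placeholders, not the data behind the 10% and 7 goals figures, and this may not be exactly how they were originally calculated.

```python
import math

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    errors = [abs(a - p) / a for a, p in zip(actual, predicted)]
    return 100 * sum(errors) / len(errors)

def rmse(actual, predicted):
    """Root Mean Squared Error, in the same units as the inputs (goals)."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical season totals for three teams: goals scored vs goals the model expected
actual_goals = [50, 60, 40]
expected_goals = [45, 66, 40]

print(round(mape(actual_goals, expected_goals), 1))  # 6.7
print(round(rmse(actual_goals, expected_goals), 1))  # 4.5
```

MAPE tells you how far off the model is relative to each team’s total, whilst RMSE is in goals and punishes the big misses more heavily.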
I’ll finish with how my
model differs from those I’ve mentioned which look at shot location. I’ll start
with the weaknesses. The first is that my model is far less granular, as I have
lumped together the 87% of all shots that are Non-CCCs and given them the same goal expectancy,
which means that the type of analysis that I can do with my model probably
can’t go quite as deep as the others. Due to the creative efficiency element, I
think the model is only applicable to teams and won’t be able to do player
analysis. There is an element of trust in Opta that they are consistent when
collecting the CCC data as it is subjective, particularly as we do not know
their precise definition, although having read this (here),
I think it’s fair to assume they are consistent. And because we do not know
exactly how Opta define a CCC, and the info of when a CCC occurred is
not available, I think it will be very difficult to see how or if
the metrics change depending on the game state. On average there are just under 4 CCCs per game, so it
might be possible to figure it out for most games by watching the highlights or reading the match reports,
but in some cases, as shown by @analysesport (here),
it would be very difficult. Another issue is the relative lack of CCC data: it is not freely available (you need to pay for a subscription at www.eplindex.com for the data), it only goes back 3 years, and as far as I am aware, there is no CCC data publicly available for leagues other than the Premier League.
The positives are that it is
very easy to collect and analyse the data: you only need the number of games
played by a team, their total shots and the number of CCCs they’ve had to be able to estimate the number of goals they should have scored. One of the
issues with simply using shot location, as discussed by @mixedknuts (here),
is that it does not take into account the positioning of the defenders. For
instance, a player may take a shot in the central area of the box but have 4 defenders and the keeper between him and the goal, so the probability of a goal
would be low. Equally, a player may break an offside trap and have the ball
outside the area but be 1 on 1 with the keeper, so the likelihood of scoring
would be quite high. This model at least separates out those chances where the
defenders are not making a significant difference to the difficulty of a goal
being scored, and whilst these only make up 13% of the chances, they do make up
52% of the goals.
Hopefully, if I get enough
time, I’ll look at how repeatable these metrics are and if they could be of use
for predicting matches and also look at how the teams performed on these
metrics from a defensive point of view.
You can follow me on twitter at @The_Woolster
Data taken from www.eplindex.com