It has been a
while since I have written anything. This is partly due to life just getting
that little bit busier, both at work and at home, but also because, after one
too many spillages, the keyboard on my laptop stopped working. This is a piece
that I actually had 95% completed prior to the keyboard giving up the ghost on
me, so I didn't quite get it done, and in this business your data becomes out
of date very quickly! This was meant to be posted soon after I presented at the
OptaPro Forum last February, and is a follow up to the last piece I wrote that you can read here.
Recently I
posted the slides from my presentation at the Opta Forum, and in this piece I will look
at one of the metrics that I introduced in the presentation, the Big Chance
Ratio (“BCR”). If you don’t know already, Big Chances are one of Opta's few
subjective stats, described as “A situation where a player should
reasonably be expected to score usually in a one-on-one scenario or from very
close range.” Big Chances have only been measured by Opta for 4 full seasons
now, and this gives us 80 observations to check the relationship with points,
but due to this season not yet being completed and teams getting relegated, we
only have 51 observations to test the year-on-year relationship. I wrote in
more detail here about how many Big Chances a team gets on average over the
season, and here about the rate at which they are converted, and there has been little
change, with the average team over the past 4 seasons taking 535 shots, of which
about 13%, or an average of 68, are Big Chances.
As I did in my
last post, I am going to be using the Total Shot Ratio (“TSR”), which measures
the proportion of shots that a team takes compared to its opposition as the
baseline to compare the different metrics. Although we have many more
observations for TSR, I will use the same period to compare the differences
between the metrics so that I am comparing like for like.
Below are the
two charts showing TSR’s relationship to points and TSR
year-on-year. The R2 values are 0.65 and 0.70 respectively, and
it’s against these that I’ll be comparing the new metrics.
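For anyone wanting to reproduce this kind of figure, here is a minimal Python sketch. It assumes the R2 values quoted come from a simple straight-line fit, in which case R2 is just the squared Pearson correlation between the two series. The function name `r_squared` and the sample numbers are made up for illustration and are not the data behind the charts.

```python
from math import fsum

def r_squared(xs, ys):
    """Squared Pearson correlation, equal to R2 for a straight-line fit of ys on xs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(xs)
    mean_x = fsum(xs) / n
    mean_y = fsum(ys) / n
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    syy = fsum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return (sxy * sxy) / (sxx * syy)

# Invented TSR and points values, only to show the call
tsr = [0.40, 0.45, 0.50, 0.55, 0.62]
points = [35, 41, 50, 58, 77]
print(round(r_squared(tsr, points), 2))
```

The same function works for the year-on-year charts if `xs` holds one season's values and `ys` the following season's values for the same teams.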
Moving on to
the BCR, the graph below is the one I used in the presentation to show the
relationship between the BCR and points, although with last season’s data also
included. I don’t think there is anything groundbreaking in looking at the BCR,
as it is essentially using the same method used for TSR, but applying it to Big
Chances. I have seen the BCR used by others to compare
teams, although as far as I am aware, no one has written about it before to
show just how meaningful it can be.
The R2
was 0.75 over the 4 full seasons that Big Chances have been recorded by Opta. Just
like TSR, the average team has a BCR of 0.5, but you can see from the graph
that the range in BCRs is larger, from the 0.3 achieved by Reading two seasons
ago up to the 0.77 for Manchester City, also from two seasons ago.
So, why do Big
Chances, with an average of only 68 per team each season, have such a strong
relationship with points won? Well, it may partly be a case of correlation
rather than causation. Teams that are winning tend to be more conservative and sit
back, restricting the opposition to more difficult shots, whilst also being
able to hit teams on the break to create better chances or be more patient and
wait for easier opportunities to come along. As I showed in my presentation,
this comes through when looking at BCRs by game state: teams that are winning by 1
goal have an average BCR of 0.53, and this increases with each additional goal of
the lead. Needless to say, teams that are in the lead more tend to
win more points. However, as shown by Mark Taylor (here), the ability to create
better chances can also be an important factor in who wins the game, even if
the expected goals for the two teams in a game are equal.
But what about
repeatability, is there ‘skill’ in a team’s ability to be able to both create
and restrict Big Chances? Well, the graph below shows there is a positive
relationship between the BCR in one year and the following year, but this is
not as strong as for TSR, with the R2 for BCR at 0.60 as opposed to
0.70 for TSR. However, as one of the strengths of TSR is that there are a large
number of shots, we should remember that by looking at Big Chances only, we
have significantly reduced the number of observations, and when you consider
this, the repeatability is actually quite high.
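The year-on-year comparison can only use teams that appear in consecutive seasons, which is why relegation shrinks the sample. With 20 teams a season and three relegated each year, that leaves 17 pairs per transition, and three transitions across four seasons gives the 51 observations mentioned earlier. A rough sketch of the pairing step in Python, assuming the metric is stored in a dictionary keyed by (team, season start year); the layout, names and values are hypothetical.

```python
def year_on_year_pairs(metric_by_team_season):
    """Return (this_season, next_season) value pairs for teams present in both."""
    pairs = []
    for (team, season), value in sorted(metric_by_team_season.items()):
        following = metric_by_team_season.get((team, season + 1))
        if following is not None:
            pairs.append((value, following))
    return pairs

# Hypothetical values, not real BCRs
bcr = {
    ("Team A", 2011): 0.60, ("Team A", 2012): 0.58, ("Team A", 2013): 0.63,
    ("Team B", 2011): 0.42,                          # relegated after 2011
    ("Team C", 2012): 0.47, ("Team C", 2013): 0.45,  # promoted for 2012
}
pairs = year_on_year_pairs(bcr)
print(len(pairs), "usable year-on-year pairs")
```

Splitting `pairs` into two lists (first and second elements) gives the two series needed for a year-on-year R2.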
There are still
a lot of shots left over however, so how much information is there in shots
which aren’t Big Chances? I'll refer to these as Normal Chances, and to give
a bit of extra detail, whilst Big Chances are converted at a rate of about 38%,
Normal Chances are converted at a rate of slightly over 5%. Well, from the
graphs below, we can see that the relationship between the Normal Chance Ratio (“NCR”) and points is not as strong as
for BCR, whilst the year-on-year correlation is slightly higher, with R2 values
of 0.55 and 0.63 respectively. As you would expect with Normal Chances making
up 87% of all shots, the range is similar to what we see for TSR, going from
0.36 up to 0.67. The average NCR, as with both TSR and BCR, is also 0.5.
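As a back-of-envelope check, the rounded averages quoted so far (535 shots, 68 Big Chances, conversion of about 38% and a little over 5%) can be turned into an approximate split of goals by chance type. This is only a sketch in Python using those rounded figures, so the result is indicative rather than exact.

```python
shots_per_team = 535
big_chances = 68
normal_chances = shots_per_team - big_chances  # shots that are not Big Chances

big_conversion = 0.38     # "about 38%"
normal_conversion = 0.05  # "slightly over 5%", so 0.05 is a lower bound

big_goals = big_chances * big_conversion
normal_goals = normal_chances * normal_conversion
big_share = big_goals / (big_goals + normal_goals)

print(f"Goals from Big Chances per team-season:    {big_goals:.1f}")
print(f"Goals from Normal Chances per team-season: {normal_goals:.1f}")
print(f"Share of goals from Big Chances:           {big_share:.0%}")
```

With these inputs the Big Chance share comes out a little over half, and it moves closer to an even split once the Normal Chance conversion rate is nudged above 5%.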
As about 50% of
goals are scored from Big Chances, with of course the other 50% from Normal
Chances, I thought it would make sense to see what happens if we add the team’s
BCRs and NCRs together. As they both have an average of 0.5 across all
teams, the combined metric will have an average of 1 so it will also be nice and
easy to tell which teams are above or below average. Of course this metric
needs a confusing name and acronym, and as it is two ratios based on the quality of
chances added together, I’ve called it Chance Quality Ratios Added Together
(“CQR+”).
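To make the definitions concrete, here is a short Python sketch of how the four metrics could be calculated from a team's season totals. It assumes every Big Chance is also counted as a shot, so Normal Chances are simply shots minus Big Chances. The function names and the example totals are invented for illustration.

```python
def ratio(made, conceded):
    """Share of the combined total that belongs to the team."""
    total = made + conceded
    if total == 0:
        raise ValueError("no chances recorded for either side")
    return made / total

def chance_metrics(shots_for, shots_against, big_for, big_against):
    """Return TSR, BCR, NCR and CQR+ from four season totals."""
    normal_for = shots_for - big_for              # Normal Chances created
    normal_against = shots_against - big_against  # Normal Chances conceded
    bcr = ratio(big_for, big_against)
    ncr = ratio(normal_for, normal_against)
    return {
        "TSR": ratio(shots_for, shots_against),
        "BCR": bcr,
        "NCR": ncr,
        "CQR+": bcr + ncr,
    }

# Invented totals for an above-average side
m = chance_metrics(shots_for=600, shots_against=450, big_for=80, big_against=55)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```

A perfectly average side, with identical totals for and against, comes out at 0.5 for each ratio and 1.0 for CQR+.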
We can see from
the graph that the relationship between CQR+ and points is very strong, and has
an R2 of 0.78. The reason for the strong improvement over TSR I
think can be explained by thinking of TSR as a weighted average of the BCR and
NCR, and by separating them out and adding them back together, we have given
each an equal weighting, which is in line with their average contributions to
goals. As Normal Chances make up the vast majority of a team’s shots, then
their TSR and NCR will always be relatively close. If a team is more efficient
at creating and restricting Big Chances than they are at shots in general, then
their BCR will be higher than their TSR, whilst their NCR will be lower. However,
the change in BCR will be larger in absolute terms, and the higher conversion
rate associated with Big Chances should in general translate into more points
won.
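The weighted-average point can be checked numerically. In the Python sketch below, the weight on BCR is the share of all shots in a team's matches (for and against) that were Big Chances, which is typically only around 13%, and the weight on NCR is the remainder. The function name and the totals are hypothetical.

```python
def tsr_as_weighted_average(shots_for, shots_against, big_for, big_against):
    """Rebuild TSR from BCR and NCR, weighting each by its share of all shots."""
    total_shots = shots_for + shots_against
    total_big = big_for + big_against
    total_normal = total_shots - total_big
    bcr = big_for / total_big
    ncr = (shots_for - big_for) / total_normal
    weight_big = total_big / total_shots  # Big Chances as a share of all shots
    rebuilt_tsr = weight_big * bcr + (1 - weight_big) * ncr
    return rebuilt_tsr, weight_big

# Invented season totals
rebuilt, weight = tsr_as_weighted_average(600, 450, 80, 55)
direct = 600 / (600 + 450)
print(f"TSR directly: {direct:.4f}, rebuilt from BCR and NCR: {rebuilt:.4f}")
print(f"Weight on BCR inside TSR: {weight:.1%} (CQR+ effectively uses 50%)")
```

Halving CQR+ gives the equally weighted average of the two ratios, so the only difference from TSR is how much say the Big Chances get.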
How about the
repeatability? The addition of BCR and NCR together also has a positive effect on
repeatability, and the R2 of 0.75 is actually higher than for TSR, which
is 0.70 over the same period.
To summarise
the differences between the four metrics that I have covered, the table below
shows the R2 for each one. As we know, TSR is a good predictor of points and is repeatable. BCR has a stronger relationship to points than TSR but is not as stable year-on-year. By adding a team's BCR and NCR together, however, we get a metric with both higher explanatory power and greater predictive power.
I thought it
would be interesting to check how teams are performing by these metrics this
season so far. The table below shows each team's TSR, NCR, BCR and CQR+, with
the ranking in the league for each metric, and the table has been sorted by
CQR+.
If we look at
TSR compared to league position we can see that, as we might expect, it is
performing relatively well, with the majority of TSR rankings within 3 places
of the league position. We can also see how the NCR is never more than 0.02
away from the TSR, although even these small changes do
shuffle the rankings a little. It's when we start to look at the BCRs that we
start to see the real differences. Chelsea
have the highest BCR at 0.71, on the back of having the meanest defence for
conceding Big Chances and creating the 2nd most (behind Arsenal),
which compares to their TSR of 0.61. Moving in
the other direction we have Liverpool, who
have been good at dominating the shot count in their matches and have the 4th
highest TSR of 0.59, however they are not as efficient when it comes to Big
Chances and have a BCR of 0.51, ranking them 8th.
In terms of the
CQR+ metric, and on the back of their high BCRs, Chelsea
and Arsenal are the leaders of the pack. Man City
have been the most dominant team in terms of TSR this season, but like
Liverpool they are not efficient when it comes to Big Chances so rank 3rd
by CQR+. Then come Southampton, Liverpool, Manchester United and, after a bit of a gap, Tottenham, meaning that the top 7 by CQR+ are the top 7 teams in the league. Down at the other end, 5 of the bottom 6 teams in the league are the 5 worst CQR+ teams, so it does seem to be working at first sight. Overall, all but 3 of
the teams have a CQR+ rank within 3 places of where they are in the league.
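The "within 3 places" check is easy to automate. The Python sketch below ranks teams by a metric (highest first), compares each rank with league position, and flags anything outside a tolerance. The team names, positions and values are placeholders rather than this season's numbers, and ties in the metric are broken by dictionary order.

```python
def rank_gaps(league_positions, metric_values):
    """Map each team to (metric rank, league position, rank minus position)."""
    ordered = sorted(metric_values, key=metric_values.get, reverse=True)
    result = {}
    for rank, team in enumerate(ordered, start=1):
        position = league_positions[team]
        # Positive gap: the team sits higher in the table than the metric ranks it
        result[team] = (rank, position, rank - position)
    return result

def outliers(gaps, tolerance=3):
    """Teams whose metric rank is more than `tolerance` places from league position."""
    return sorted(team for team, (_, _, gap) in gaps.items() if abs(gap) > tolerance)

# Placeholder data for a six-team league
positions = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6}
cqr_plus = {"A": 1.30, "B": 1.10, "C": 0.90, "D": 0.95, "E": 1.20, "F": 0.80}

gaps = rank_gaps(positions, cqr_plus)
print(outliers(gaps, tolerance=2))
```

Here a positive gap corresponds to a team outperforming its CQR+ and a negative gap to a team underperforming it.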
That does mean there are 3 outliers however. The first two are West Ham and Swansea, who are
outperforming their CQR+, where they rank 14th and 15th
respectively. For both teams, particularly West Ham, it may well be the case
that their numbers are being affected by game states. West Ham have so far
spent the 5th highest amount of time winning in the league this
season which is likely to be having some downward pressure on their shot
numbers, and whilst Swansea haven’t spent as much time winning, they did start
the season very strongly. Going in the other direction, by far the biggest
underperformers compared to their CQR+ are QPR, currently sitting outside of the relegation positions
by goal difference, yet ranked 12th in TSR, and 10th in
CQR+ thanks to having the 7th best BCR in the league. Have they been
a little unfortunate, or is it the consequence of playing in very open matches
when you are not actually very good? I’m afraid I haven’t seen enough of them
to know.
I’m not too sure how this metric stacks up against some of the others out there,
particularly the expected goals model, although I did use it to enter a prediction in Simon Gleave's Premier League prediction analysis, which is kindly updated by James Grayson throughout the season to show how everyone is doing, and where it is performing
relatively well (by points at least, although not by position).
However, compared to some of the models out there, it has the benefit of being very simple to calculate. All you need is the total shots taken and faced by each team, and the total number of Big Chances they have taken and faced. Unfortunately Big Chance data isn’t as readily available as most other stats. You can get it from FantasyFootballScout, where you have to pay a subscription, and I have also recently been directed to the AllThingsFPL website, which also has it, although the two sites do show slight differences in the Big Chances for each team here and there. I do know that Big Chances get reviewed in the week following the game, which may account for the differences, but I don’t know which site is the most up to date (I have been using the FantasyFootballScout numbers).
I hope that
I have shown here, as well as in my previous work, just how useful the Big
Chance stat can be and how we can use it to make some simple metrics. Due to
not having a precise definition of what a Big Chance actually is, the lack of
detailed information on all the Big Chances, as well as its subjective nature,
some in the ‘fanalyst’ community have their doubts about Big Chances. Whilst I
agree that there may be cases where a shot is recorded as a Big Chance when
possibly it should not be, that is always going to be the case when subjectivity
is involved. However, I believe that these cases will be in the minority and, as we see
consistency each season in the number of Big Chances and their conversion
rates, they are not having a big impact. I think
that we should embrace subjective stats as they can add more context to our
analysis, and the benefits can outweigh the concerns.