UPDATE (2019/10/30): Since Jérôme Garcès has been appointed to the final, this post needed a little updating. I have refreshed the data on the interactive report, so you can see the effect that Sunday’s semi-final win had on the various statistics and figures in this blog post. Additionally, the expected win % for the Boks is now calculated for Saturday…
ORIGINAL (2019/10/24): I stumbled across this Reddit post yesterday, discussing the effect that Jérôme Garcès has on Springbok Rugby games. The results are a little scary, and to make myself feel a bit better about the possibility of South Africa winning on Sunday and making the Rugby World Cup Final, I thought I’d try to dig a little deeper to see if I could find some mitigating factors and a ray of light.
My initial reaction to the analysis done in the post was that too little weight was placed on whether the Springboks were expected to win a specific game or not. So I decided to build a model which created an expected win % for each game, and apply that to the games for which Jérôme Garcès was the referee. I thought this was an appropriate course of action because the Springboks would be far more likely to win a game against Namibia at home than away against New Zealand. So the mix of games that Garcès refereed had the potential to explain at least part of the poor raw win %.
TL;DR – The Springboks have a far lower expected win % under Garcès than in the entire data set. However, their actual win % under him is still well below the expected win %, even when the opposition and venue are taken into account.
Just by looking at the games which he refereed, I could see that expecting the win % to be equivalent to a random selection of games would be unrealistic.
The question became, how to determine an expected win %.
I initially thought of scraping some betting odds off the internet for all Springbok games, but I found it difficult to locate historical odds. So I had to create my own win expectation metric.
NB: There is no scientific or statistical methodology behind this, just my trying to have a bit of fun playing with the data, and seeing if it made me feel better.
I decided to take 4 sets of historical results, each made up of 5 matches, and calculate a win % for each set.
Firstly, I just looked at the 5 most recent results for the Springboks. Obviously, a team in form is more likely to win than a team devoid of confidence.
I then looked at the 5 most recent results against the opposition. This gives us a more realistic view of the relative strengths of the two teams, although it can be misleading due to the frequency with which certain games take place relative to others (5 games against New Zealand could be just over 12 months of rugby, whereas there are only 2 games against Fiji in my entire dataset – 145 matches, since 2010).
I then added a metric which looked at the 5 most recent results at an equivalent location to the game (Home / Away / Neutral Ground), as the win % is markedly different depending on the location of the game.
Finally, I combined the second and third metrics to get an even more specific metric, the last 5 games against the opposition at an equivalent venue.
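For anyone who wants to reproduce the four metrics, here is a minimal Python sketch of how they could be calculated. The `Match` type, the function names and the sample results are all made up for illustration; they are not taken from my dataset. It also assumes that when fewer than 5 matching games exist, the win % is simply taken over however many there are.

```python
from typing import NamedTuple

class Match(NamedTuple):
    opponent: str
    venue: str   # "Home", "Away" or "Neutral"
    won: bool

def last5_win_pct(history, opponent=None, venue=None, n=5):
    """Win % over the n most recent matches that pass the filters.

    history is ordered oldest first. Returns None when no earlier
    match passes the filters (assumption: no value rather than 0%).
    """
    pool = [m for m in history
            if (opponent is None or m.opponent == opponent)
            and (venue is None or m.venue == venue)]
    recent = pool[-n:]
    if not recent:
        return None
    return 100.0 * sum(m.won for m in recent) / len(recent)

def four_metrics(history, opponent, venue):
    """The four win % metrics for an upcoming game."""
    return {
        "last5": last5_win_pct(history),
        "last5_venue": last5_win_pct(history, venue=venue),
        "last5_opp": last5_win_pct(history, opponent=opponent),
        "last5_opp_venue": last5_win_pct(history, opponent=opponent, venue=venue),
    }

# Made-up results, oldest first (not real Springbok data).
history = [
    Match("New Zealand", "Away", False),
    Match("Australia", "Home", True),
    Match("Argentina", "Home", True),
    Match("New Zealand", "Home", True),
    Match("Australia", "Away", False),
    Match("England", "Neutral", True),
]

metrics = four_metrics(history, opponent="New Zealand", venue="Home")
print(metrics)
```

With so few games in the sample, the opposition-specific metrics rest on only one or two results, which is exactly the small-sample problem mentioned above for teams like Fiji.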
I then got to the following numbers
The only thing left to do at this point is to weight each of those 4 metrics to calculate an expected win % for each and every match.
I weighted the 4 metrics as follows.
|Metric|Weight|
|---|---|
|Win % Last 5|1|
|Win % Last 5 (Home / Away / Neutral)|4|
|Win % Last 5 vs Opposition|5|
|Win % Last 5 vs Opposition (Home / Away / Neutral)|3|
The weighting doesn’t have to add up to a specific number; it just assigns relative weights to the categories. If I’d rated the categories 2/8/10/6, the results would have been identical.
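As a rough sketch of the weighting step, the expected win % can be treated as a plain weighted average of the four metrics. The dictionary keys and the sample metric values below are invented for illustration; only the weights 1/4/5/3 come from this post. The second calculation shows why doubling every weight changes nothing.

```python
# Weights from the post; the key names are hypothetical labels.
WEIGHTS = {"last5": 1, "last5_venue": 4, "last5_opp": 5, "last5_opp_venue": 3}

def expected_win_pct(metrics, weights):
    """Weighted average of the metrics (all values in %)."""
    total_weight = sum(weights[name] for name in metrics)
    weighted_sum = sum(metrics[name] * weights[name] for name in metrics)
    return weighted_sum / total_weight

# Invented metric values for a single game.
sample_metrics = {
    "last5": 60.0,
    "last5_venue": 80.0,
    "last5_opp": 40.0,
    "last5_opp_venue": 20.0,
}

base = expected_win_pct(sample_metrics, WEIGHTS)

# Doubling every weight (2/8/10/6) leaves the result unchanged,
# because the doubling cancels out in the division.
doubled_weights = {name: 2 * w for name, w in WEIGHTS.items()}
doubled = expected_win_pct(sample_metrics, doubled_weights)

print(round(base, 2), round(doubled, 2))
```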
So after all that setup work, where did we end up?
Initially, my expected win % metric wasn’t even vaguely accurate. The win % seemed to be depressed due to the limited number of certain matches, and the effect that has on the weighting. To compensate for this, I added in a couple of modifiers. In particular, if the Springboks had a 100% record against an opposition over the last 5 matches, I rated the Expected Win % at 95%, regardless of any other metrics.
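The 100% record modifier could be sketched like this. The function and parameter names are hypothetical, and this covers only the one modifier described above, not the others.

```python
def apply_perfect_record_override(expected_pct, last5_opp_pct, override_pct=95.0):
    """Return 95% when the last-5 record against the opposition is perfect.

    expected_pct is the weighted expected win % before modifiers;
    last5_opp_pct is the 'Win % Last 5 vs Opposition' metric.
    """
    if last5_opp_pct == 100.0:
        return override_pct
    return expected_pct

# Invented inputs: a perfect record overrides the weighted figure...
perfect = apply_perfect_record_override(70.0, 100.0)
# ...while anything less than perfect leaves it alone.
imperfect = apply_perfect_record_override(70.0, 80.0)

print(perfect, imperfect)
```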
After this, I got a fairly close overall model, with a predicted Springbok win % of 57.5% and an actual win % of 60%.
I then looked at only the games that Garcès refereed, to see if the expected win % on those games was considerably lower than the average.
Over 14 games, the expected win % under Garcès is a little shy of 48%, so it does appear that the Springboks having a worse record under his whistle is to be expected. However, the actual win % of under 29% is still WAY below the expected outcome.
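To put that gap in terms of games rather than percentages, here is a quick back-of-the-envelope calculation. It uses 48% as a rounded-up stand-in for "a little shy of 48%", and assumes 4 actual wins, which is the only whole number of wins out of 14 that gives a win % between 22% and 29% (4/14 is roughly 28.6%).

```python
games = 14
expected_win_rate = 0.48   # rounded from "a little shy of 48%"
actual_wins = 4            # assumption: 4 / 14 is about 28.6%, i.e. "under 29%"

expected_wins = games * expected_win_rate
actual_win_rate = actual_wins / games
shortfall = expected_wins - actual_wins

print(f"Expected wins: {expected_wins:.1f}")
print(f"Actual win %: {actual_win_rate:.1%}")
print(f"Shortfall: {shortfall:.1f} games")
```

On these numbers the Springboks are between two and three wins short of what the model expects over the 14 games.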
So does this exercise give me any more confidence heading into Sunday? Well, yes and no. It’s clear from the data that the Springboks should do better than they do in games where Garcès is at the whistle. However, it’s also apparent from the results that a number of those games have been extremely close defeats (6 defeats by 4 or fewer points). So I’m going to put on my half-full glasses and say that he definitely doesn’t blow us out of games entirely, and it’s probably about time we were on the right side of a close result under his officiating.
If you’d like to play with the data a little bit, you can do so on this interactive report.