The Draw Specialists
In my previous article, I highlighted a scenario where data-driven football predictions went as well as they possibly could. Unfortunately, in this article I will have to highlight just how badly data-driven betting can go wrong.
Regular readers will be aware that I’ve been testing my predictive model ‘PremBot’ throughout this English Premier League season, and I would have to say with somewhat mid-table fortunes.
The previous attempt saw ‘PremBot’ return just three out of ten correct picks, although there was a caveat: a lot of relative shocks took place almost three weeks ago, and if the form guide had come good, it would have been a more than respectable six out of ten.
Undeterred by such a meek showing, it was time to try my luck again yesterday (Saturday), and that meant another eight games were thrown into the predictive ring, although with the predicted outcomes I was definitely on a hiding to nothing.
LOOK AT THIS MUCK
What is the first thing that stands out to you? Is it the number of draws that the model came back with? Yes, it was for me too, and I thought to myself “this is not going to be a good Saturday”, and lo and behold, it certainly was not.
The games that were picked as outright winners came back as mainly draws, while the draws came back as wins. The only way it could have been any worse was if there were no correct picks at all, so small mercies I guess.
But it does lend itself to a rather pertinent question: why on earth did ‘PremBot’ throw up so many draws? To try and answer that, let’s add a bit more context to the predictive model and highlight what it works off.
THE BACKGROUND CHECKS
Fundamentally, this is a model that works off comparing a combination of home form vs away form and also the difference in league positions. If Liverpool have perfect form they are given a ranking of 1, while if a team lost their last six in a row, they would be given a ranking of 729.
Why 729? Because each match has three possible results (win, draw or loss), so there are three to the power of six, or 729, different six-match result combinations that can take place. These are then ranked from best to worst, with an example of Saturday’s data looking like this.
As mentioned, if we take the second line, Liverpool were the perfect form team with six straight wins and therefore they were given 1, while Southampton were not all that bad away from home and were given a ranking of 53.
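To show where the 729 comes from, here is a minimal Python sketch that lists every six-match combination and ranks them from best to worst. The ordering used here (most points first, ties broken by reading the results left to right as win, then draw, then loss) is purely an assumption for illustration and is not necessarily the ordering ‘PremBot’ uses, so a ranking such as Southampton’s 53 will not necessarily be reproduced. The names `POINTS` and `build_form_ranking` are made up for this example.

```python
from itertools import product

# Standard league points per result.
POINTS = {"W": 3, "D": 1, "L": 0}

# Tie-break order within equal points totals (an assumption, not PremBot's rule).
TIE_ORDER = {"W": 0, "D": 1, "L": 2}


def build_form_ranking(matches=6):
    """Map every W/D/L sequence of the given length to a rank (1 = best)."""
    combos = ["".join(c) for c in product("WDL", repeat=matches)]
    # Sort by points descending, then by the assumed tie-break order.
    combos.sort(
        key=lambda form: (
            -sum(POINTS[r] for r in form),
            [TIE_ORDER[r] for r in form],
        )
    )
    return {form: rank for rank, form in enumerate(combos, start=1)}


ranking = build_form_ranking()
print(len(ranking))        # 729 combinations (3 ** 6)
print(ranking["WWWWWW"])   # 1, six straight wins
print(ranking["LLLLLL"])   # 729, six straight defeats
```

Whatever tie-break is chosen, six straight wins always come out as 1 and six straight defeats as 729, because they are the only combinations with 18 points and 0 points respectively.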
LIVERPOOL DRAWING, SURELY NOT
=IF(AND(C3-S3<=5,AE3>51),V3,IF(AND(S3-C3<=-5,AE3<0),Y3,"Draw"))
What does the above mean? This is the fundamental principle of how the model works, and I will now break it down.
It is saying that if Liverpool’s league placing minus Southampton’s is five or less (in a purely mathematical sense) and the form index comparison is more than 51, then it is a Liverpool win.
If Southampton’s league placing minus Liverpool’s is minus five or lower (again in a purely mathematical sense) and the form index comparison is a negative number, then the Saints would win. If it’s neither of these combinations, it is a draw.
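For anyone who prefers code to spreadsheet cells, the rule above can be sketched in Python. This is only a rough translation under some assumptions: that C3 and S3 hold the home and away league positions, that V3 and Y3 hold the home and away team names, and that AE3 (the form index comparison) is the away form ranking minus the home form ranking. The function name, parameter names and the league positions in the example calls are hypothetical; only the form rankings of 1 and 53 come from the data above.

```python
def prembot_pick(home_team, away_team, home_pos, away_pos,
                 form_index, home_margin=51):
    """Return the predicted winner's name, or "Draw".

    Mirrors the spreadsheet formula:
    =IF(AND(C3-S3<=5,AE3>51),V3,IF(AND(S3-C3<=-5,AE3<0),Y3,"Draw"))
    """
    # Home win: position gap within range and form index above the margin.
    if home_pos - away_pos <= 5 and form_index > home_margin:
        return home_team
    # Away win: away side at least five places better off and negative form index.
    if away_pos - home_pos <= -5 and form_index < 0:
        return away_team
    # Anything else falls through to a draw.
    return "Draw"


# Form index assumed to be away ranking minus home ranking: 53 - 1 = 52.
# League positions 1 and 12 are illustrative only.
print(prembot_pick("Liverpool", "Southampton", 1, 12, 53 - 1))
# Liverpool

# The same fixture with a higher, purely illustrative margin of 60.
print(prembot_pick("Liverpool", "Southampton", 1, 12, 53 - 1, home_margin=60))
# Draw
```

Written this way, it is easy to see how narrow the two winning branches are: any fixture that misses both conditions lands on a draw by default.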
Now usually, the 51 number is a lot higher and a lot more emphasis is placed on a home win. However, with Southampton having such good form away from home, I had to tweak the margin so as to still show a Liverpool win.
But apart from that, it seemed as if the permutations involved created a sense of deadlock and this means a couple of things off the top of my head.
1. The model is too rigid to be league wide (one size does not fit all)
2. The league position factor was too wide (5 places too much?)
3. They were games that were tight to call anyway (clutching at straws)
For example, the ‘PremBot’ model looks at Chelsea’s poor away form and then nudges it to a Leicester win. However, it doesn’t really take into account that it is a clash between a team in third and a team in fourth.
Therefore, there may need to be some additional rules for the rest of the season, especially as teams have certain objectives that still need to be reached: winning the league, qualifying for Europe and avoiding relegation.
Don’t forget this same logic has returned 7 out of 10 before, so it is not a complete bust, but it is certainly not working at its optimum at the moment. If anything, it is far from it, so the question is: how do we get out of this rut?
If there is any positive to take from this, it is that the disaster weeks at least highlight what needs to be done, and this one certainly has. Therefore, I will now be able to pick this apart and add some more layers of rules into the model, so it can be even more flexible.
In doing so, it should hopefully bring some more success in this project of data-driven football predictions. Therefore, it is back to the drawing board and I will see what I have up my sleeve over this split gameweek of Premier League action.
Happy punting and thanks for reading. Dan
If this has grabbed your interest and you would like to discuss it or give feedback, then please feel free to drop me a message at dan@realfootballman.com. I am also always looking for new football/data projects to work on, and if you feel that my skills would be of use, I can be contacted at the same address.
Follow me on Facebook at Dan The Stat Man
You can also check out my Premier League Podcast on Soundcloud