Back To Belarus
In my previous article, I reviewed how my first raft of data-driven football predictions fared in Belarus. Although data-driven excellence is yet to be achieved, that week has laid the groundwork for the rest of the season.
With not a lot of data to work with, the scope of the first week was somewhat limited. Finding anything more than surface details is still proving tough, but I can now roll out not one but two predictive models for this particular league.
After reading some forums, I decided that a Form Index of six matches might be too long a basis on which to judge upcoming matches. I’m not going to ditch it, but I’m going to run something in parallel alongside it.
Whereas the previous six matches have been the bedrock of my predictions, I’m going to trim the fat and also work off the previous three, to see whether truly recent form is an even better indicator.
In theory, that should stand to reason: six matches is a long time in any league, and perhaps earlier results are dictating too much of the choices that end up on the betting slips.
Therefore, with the ground set for this week, let’s first take a look at the head-to-head infographic for Week 7 of the season.
As always, the orange block is the best form during this period (although as of right now, it will just mimic the league table). The blue block is the worst form, and the green block marks the higher points tally of the two teams in each fixture.
Now, I was going to do a smaller version of this for just three matches, but I have a feeling it would be a little inconclusive and, to be honest, you can use the above image for the same purpose (and do some basic sums in your head).
This may give you an initial insight into who might come out on top this weekend. Using that form, I’ve put together my first full set of data-driven football predictions, and it looks like this:
THREE WAS THE MAGIC NUMBER
Going back to last week, I set the wheels in motion with a blind pick for an 8-fold. Come the end of the sixth week of the season, I had managed to get three correct. Although this was rather paltry, these were not data-driven football predictions.
Therefore, this will be the real acid test and we will use last week’s three correct predictions as the benchmark for this week, in order to see whether data can be a better driver of success than random choice.
For those of you who don’t know about the form index, it’s a guide that takes every possible sequence of results across six consecutive matches (three outcomes per match, so 3^6 = 729 sequences) and lists them from best to worst, with the best obviously being six straight wins and the worst being six straight losses.
However, there is an element of subjectivity, and that is how you rank the other 727 permutations in between. In this instance, I have worked on the basis that winning your previous game is obviously stronger than drawing or losing it, so those runs are ranked higher.
You could base it on total points as well, but there’s always the risk of early matches in the run carrying too much weight, so I’ve avoided that for now. That said, we could run another sub-model alongside it, ranked on points.
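For anyone who wants to tinker with the idea, here is a minimal sketch of how such an index could be built. It is not my exact ranking: it assumes the most recent match is the first character of each run and that runs are compared result by result from the latest game backwards. The function names (all_runs, recency_index, points_index) are made up for illustration, and the points version is the sub-model mentioned above, using the usual 3/1/0 scoring.

```python
from itertools import product

# Position 0 of every run is the MOST RECENT match (an assumption for this sketch).
STRENGTH = {"W": 0, "D": 1, "L": 2}   # lower sorts first, so wins rank highest
POINTS = {"W": 3, "D": 1, "L": 0}     # standard league scoring


def all_runs(n_matches):
    """Every possible sequence of results over n_matches games (3 ** n_matches)."""
    return ["".join(run) for run in product("WDL", repeat=n_matches)]


def recency_index(n_matches=6):
    """Rank runs by the latest result first, then the one before it, and so on."""
    return sorted(all_runs(n_matches), key=lambda run: [STRENGTH[r] for r in run])


def points_index(n_matches=6):
    """Hypothetical sub-model: rank by total points, recency only breaks ties."""
    return sorted(
        all_runs(n_matches),
        key=lambda run: (-sum(POINTS[r] for r in run), [STRENGTH[r] for r in run]),
    )


index6 = recency_index(6)
print(len(index6))             # 729
print(index6[0], index6[-1])   # WWWWWW LLLLLL

# The subjective part: a win last time out followed by five defeats outranks
# a defeat followed by five wins on recency, but not on points.
print(index6.index("WLLLLL") < index6.index("LWWWWW"))     # True
points6 = points_index(6)
print(points6.index("LWWWWW") < points6.index("WLLLLL"))   # True
```

Calling recency_index(3) gives the shorter three-match version of the same list, with 27 runs instead of 729.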
THE TESTING GROUND
Ultimately the Belarus Premier League is going to be the most fertile testing ground for any ideas, because we are letting data and theory guide us, rather than pre-conceived notions of which teams are perceived to be better.
With that in mind, here are the predictions for Week 7 if I only worked on a form index of the last three matches, for which there would be far fewer permutations of previous results: just 27, to be exact.
If you look closely, there is some variation from one model to the next, and that is exactly what we want: it gives us a perfect testing platform, something that could not be achieved if both produced the same outcomes.
Therefore, I will test both of these models, “Belarus 3” and “Belarus 6”, against last week’s random efforts and see how we get on. Eventually we will fine-tune the two and persevere with the better-performing one for the rest of the season.
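The scoring itself is simple enough to sketch. The snippet below assumes each model's picks and the actual results are written as one letter per fixture ("H" home win, "D" draw, "A" away win). The strings are placeholders, not this week's selections or real results; only the benchmark of three correct comes from last week.

```python
BENCHMARK_CORRECT = 3   # last week's blind picks: three correct


def count_correct(picks, results):
    """Count fixtures where the predicted outcome matches the actual one."""
    if len(picks) != len(results):
        raise ValueError("picks and results must cover the same fixtures")
    return sum(1 for pick, actual in zip(picks, results) if pick == actual)


def compare_models(models, results, benchmark=BENCHMARK_CORRECT):
    """Return {model name: (correct picks, margin over the benchmark)}."""
    summary = {}
    for name, picks in models.items():
        correct = count_correct(picks, results)
        summary[name] = (correct, correct - benchmark)
    return summary


# Placeholder data only: eight fixtures, one letter each.
results = "HADHHDAA"
models = {
    "Belarus 3": "HHDAHDHA",
    "Belarus 6": "HAAHDDHH",
}

for name, (correct, margin) in compare_models(models, results).items():
    print(f"{name}: {correct} correct ({margin:+d} against the benchmark)")
```

A positive margin means the model beat the blind picks, and a negative one means random choice did better that week.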
In addition to that, there are lots of other things we can consider, such as adding weight to the rank of home wins and of teams playing at home, testing how much of an advantage that would actually give them, and looking into the percentage of teams that have won at home this season.
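The home win percentage is the easiest of those to check. A rough sketch, assuming the season's results are stored as a list of "H", "D" and "A" outcomes; the sample list is made up and is not real league data.

```python
def home_win_rate(results):
    """Share of matches won by the home side; results use "H", "D", "A"."""
    if not results:
        raise ValueError("need at least one result")
    return sum(1 for r in results if r == "H") / len(results)


# Placeholder outcomes for illustration only.
sample = ["H", "A", "D", "H", "H", "A", "D", "H"]
print(f"{home_win_rate(sample):.1%}")   # 50.0%
```

A figure like this could then help decide how much extra weight, if any, a home side deserves in the form index.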
These are all things we can roll out and, to be honest, there is plenty of time in which to do so. Therefore I’m going back to the ‘office’ to crack on with some more ideas. I’ll be back early next week to see how I fared.
If this has grabbed your interest and you would like to discuss it or give feedback, please feel free to drop me a message at firstname.lastname@example.org. I am also always looking for new football/data projects to work on, and if you feel that my skills would be of use, I can be contacted at the same address.
Follow me on Facebook at Dan The Stat Man
You can also check out my Premier League Podcast on Soundcloud