January 27, 2013

Using Risk-Equalising Probabilities for the Margin Predictors

January 27, 2013/ Tony Corke

With the exception of Combo_NN_1, all of the Margin Predictors rely on an algorithm that takes Bookmaker Implicit Probabilities as an input in some form:

Bookie_3 and Bookie_9 use Bookmaker Implicit Probabilities directly
ProPred_3 and ProPred_7 use the outputs of the ProPred algorithm, which uses a log transform of Bookmaker Implicit Probabilities as one input
WinPred_3 and WinPred_7 use the outputs of the WinPred algorithm, which also uses a log transform of Bookmaker Implicit Probabilities as one input
H2H_U3, H2H_U10, H2H_A3 and H2H_A7 use the outputs of the Head-to-Head algorithm, which uses Bookmaker Implicit Probabilities as one input
Combo_7 uses Bookmaker Implicit Probabilities directly as well as via its use of the outputs of the Head-to-Head Algorithm
Combo_NN_2 uses Bookmaker Implicit Probabilities directly as well as via its use of the outputs of the ProPred, WinPred and H2H algorithms

For this short blog I've substituted, in all of the underlying algorithms, the Implicit Probabilities calculated using the Risk-Equalising Approach for those calculated using the Overround-Equalising Approach, and then compared the resulting MAPEs for seasons 2007 to 2012 for all the Margin Predictors.

Overall, all Margin Predictors except Bookie_3 benefit from the switch, however modestly. Bookie_9, which now will serve as a co-predictor in the MAFL Margin Fund, benefits most, knocking over one quarter of a point per game off its MAPE.

The uniformity of these improvements is made slightly more remarkable by the realisation that the Margin Predictors, built using Eureqa, were optimised for the probability outputs of the underlying algorithms when those algorithms were using Overround-Equalising Implicit Probabilities. So, for example, the equation for Bookie_9, which is:

Predicted Home Team Margin = 2.2205129 + 17.729506 * ln(Home Team Bookmaker Probability/(1-Home Team Bookmaker Probability)) + 2*Home Team Bookmaker Probability

was created by Eureqa to minimise the historical MAPE of this equation when the Home Team Bookmaker Probabilities being used were those calculated assuming Overround-Equalisation. The 0.26 points per game reduction in the MAPE is being achieved without re-optimising this equation but, instead, simply by replacing the Home Team Probabilities with those calculated using a Risk-Equalising Approach.
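As a minimal sketch, the Bookie_9 equation quoted above can be written as a small Python function. The function name and the input check are my own additions for illustration; only the coefficients come from the equation itself. Swapping between Overround-Equalising and Risk-Equalising probabilities then amounts to changing the value passed in, with the equation left untouched.

```python
import math


def bookie_9_margin(home_prob: float) -> float:
    """Predicted home team margin from the Bookie_9 equation quoted in the text.

    home_prob is the Home Team Bookmaker Probability, strictly between 0 and 1.
    (The function name and the range check are illustrative assumptions.)
    """
    if not 0.0 < home_prob < 1.0:
        raise ValueError("home_prob must lie strictly between 0 and 1")
    # Log-odds of the home team's implicit probability
    log_odds = math.log(home_prob / (1.0 - home_prob))
    return 2.2205129 + 17.729506 * log_odds + 2.0 * home_prob
```

For an even-money game (a probability of 0.5) the log-odds term vanishes, leaving just the constant plus 2 * 0.5.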

Bookie_3 is the one Margin Predictor that responds poorly to the switch of probabilities without an accompanying re-optimisation in Eureqa. When I performed such a re-optimisation, Eureqa came up with this remarkably simple equation:

Predicted Home Team Margin = 21 * ln(Home Team Bookmaker Probability/(1-Home Team Bookmaker Probability))

This predictor has an MAPE of 29.22 points per game, which is extraordinarily low for such an easy-to-use predictor.
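To show just how easy to use it is, here is a minimal sketch of the re-optimised equation in Python, along with a helper that scores predictions. I'm assuming here that MAPE means the mean absolute prediction error in points per game; the function names and the sample inputs are hypothetical and are not MAFL data.

```python
import math


def reoptimised_bookie_3_margin(home_prob: float) -> float:
    """Predicted home team margin: 21 * ln(p / (1 - p)), as quoted in the text."""
    return 21.0 * math.log(home_prob / (1.0 - home_prob))


def mean_absolute_prediction_error(predicted, actual):
    """Average of |predicted - actual| across games (assumed meaning of MAPE)."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical inputs, purely for illustration
home_probs = [0.50, 0.75, 0.30]
actual_margins = [10, 20, -35]
predictions = [reoptimised_bookie_3_margin(p) for p in home_probs]
mape = mean_absolute_prediction_error(predictions, actual_margins)
```

Note that the equation is symmetric: a team rated a 30% chance is predicted to lose by exactly as much as a team rated a 70% chance is predicted to win.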

CONCLUSION

Virtually every algorithm used in MAFL has now been shown to benefit, however slightly, from using Implicit Probabilities calculated using the Risk-Equalising instead of the Overround-Equalising Approach. Naturally, this makes me wonder if there's an even better way ...


Maybe next year I'll look for it.

January 18, 2013

Bookmaker Implicit Probabilities: Empirical Value of the Risk-Equalising Approach

January 18, 2013/ Tony Corke

A few blogs back I developed the idea that bookmakers might embed overround in each team's price not equally but instead such that the resulting head-to-head market prices provide insurance for a fixed (in percentage point terms) calibration error of equivalent size for both teams. Since then I've made only passing comment about the empirical superiority of this approach (which I've called the Risk-Equalising Approach) relative to the previous approach (which I've called the Overround-Equalising Approach).
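One way to read the two approaches in code is sketched below. This is my interpretation of the descriptions above rather than a quote of the earlier blog: under Overround-Equalisation each price is assumed to be 1 / (p * (1 + overround)), while under Risk-Equalisation each price is assumed to be 1 / (p + e) for the same fixed e on both teams. Decimal prices, the function names and the example prices are all assumptions.

```python
def overround_equalising_prob(home_price: float, away_price: float) -> float:
    """Home probability when the same proportional overround is in both prices."""
    inv_home = 1.0 / home_price
    inv_away = 1.0 / away_price
    # Normalise the inverse prices so that the two probabilities sum to 1
    return inv_home / (inv_home + inv_away)


def risk_equalising_prob(home_price: float, away_price: float) -> float:
    """Home probability when the same percentage-point buffer is in both prices."""
    total_overround = 1.0 / home_price + 1.0 / away_price - 1.0
    # Each team's inverse price carries half of the total overround
    return 1.0 / home_price - total_overround / 2.0


# Hypothetical head-to-head market: home team at 1.50, away team at 2.60
oe = overround_equalising_prob(1.50, 2.60)
re = risk_equalising_prob(1.50, 2.60)
```

Under these assumptions the two approaches agree when the teams are equally priced, and the Risk-Equalising Approach assigns the favourite a slightly higher probability otherwise.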

December 28, 2012

Does an Extra Day's Rest Matter in the Home and Away Season?

December 28, 2012/ Tony Corke

Whenever the draw for a new season is revealed there's much discussion about the teams that face one another only once, about which teams need to travel interstate more than others, and about which teams are asked to play successive games with fewer days rest. There is in the discussion an implicit assumption that more days rest is better than fewer days rest but, to my knowledge, this is never supported by empirical analysis. It is, like much of the discussion about football, considered axiomatic. In this blog we'll assess how reasonable that assumption is.

September 20, 2012

Does An Extra Day's Rest Matter in the Finals?

September 20, 2012/ Tony Corke

This week Collingwood faces Sydney having played its Semi-Final only 6 days previously while Adelaide take on Hawthorn a more luxurious 8 days after their Semi-Final encounter. The gap for Sydney has been 13 days while that for the Hawks has been 15 days. In this blog we'll assess what, if any, effect these differential gaps between games for competing finalists might have on game outcome.

April 21, 2012

Predicting the Final SuperMargin Bucket In-Running

April 21, 2012/ Tony Corke

On Friday night, while watching the progress of the Saints v Freo game knowing that Investors had a SuperMargin wager on the Saints to win by 20-29, I was wondering how to react to the changes in the scoreline as the game progressed. Should I want the Saints to lead early? By a little? By a lot? By about 5 points at Quarter Time and 10 points at Half Time?

April 12, 2012

The Increased Importance of Predicting Away Team Scores

April 12, 2012/ Tony Corke

In an earlier blog we found that the score of the Home team carried more information about the final game margin than did the score of the Away team. One way of interpreting this fact is that, given the choice between improving your prediction of the Home team score or your prediction of the Away team score, you should opt for the former if your goal is to predict the final game margin. While that's true, it turns out that it's less true now than it once was.

April 08, 2012

Finding Non-Linear Relationships Between AFL Variables : Alternative Measures to MIC

April 08, 2012/ Tony Corke

The Maximal Information Coefficient (MIC) that we explored in the previous blog is not the only non-linear measure of the pairwise relationship between any two continuous variables.

April 02, 2012

Finding Non-Linear Relationships Between AFL Variables : The MINER Package

April 02, 2012/ Tony Corke

It's easy enough to determine whether or not one continuous variable has a linear relationship with another, and how strong that relationship is, by calculating the Pearson product-moment correlation coefficient for the two variables. A value near +1 for this coefficient indicates a strong, positive linear relationship between the variables in question, so that high values of one tend to coincide with high values of the other, and vice versa for low values; a value near -1 indicates a strong, negative linear relationship; and a value of 0 indicates a lack of any linear relationship at all. But what if we want to assess more generally if there's a relationship between two variables, linear or otherwise, and we don't know the exact form that this relationship takes? That's the purpose for which the Maximal Information Coefficient (MIC) was created, and recently made available in an R package called MINER.
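The limitation of the Pearson coefficient can be seen in a small sketch using made-up numbers (not AFL data): a perfectly linear relationship gives a coefficient of +1, while a perfectly deterministic but symmetric quadratic relationship gives a coefficient of 0. The `pearson` helper below is a plain-Python illustration, not the MINER package.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up data, for illustration only
xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]   # y is a straight-line function of x
parabola = [x * x for x in xs]     # y depends entirely on x, but not linearly

r_linear = pearson(xs, linear)
r_parabola = pearson(xs, parabola)
```

In the second case y is completely determined by x, yet the coefficient reports no linear relationship at all, which is the sort of gap a more general measure is meant to fill.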

March 25, 2012

Predicting the Final Margin In-Running (and Does Momentum Exist)?

March 25, 2012/ Tony Corke

Just a short post tonight while we wait for the serious footy to begin. For this blog I've again called upon the services of Formulize, this time to find for me equations that predict the final victory margin for the Home team (which might be negative or zero) purely as a function of the scores at the various quarter breaks.

February 16, 2012

Specialist Margin Prediction: Epsilon Insensitive Loss Functions

February 16, 2012/ Tony Corke

In the last blog we looked at Margin Prediction using what I called "bathtub" loss functions. For the current blog I've extended the range of loss functions to include what are called epsilon-insensitive loss functions, which are similar to the "bathtub" loss functions except that they don't treat absolute errors of size greater than M points equally.
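The difference between the two families of loss function can be sketched as follows. The function names are mine, and the fixed penalty of 1 for the "bathtub" loss is an assumption for illustration: the only property that matters is that every miss larger than M points costs the same.

```python
def bathtub_loss(error: float, m: float) -> float:
    """Zero loss within m points of the result; a flat penalty (assumed 1) beyond."""
    return 0.0 if abs(error) <= m else 1.0


def epsilon_insensitive_loss(error: float, epsilon: float) -> float:
    """Zero loss within epsilon points; beyond that, loss grows with the excess."""
    return max(0.0, abs(error) - epsilon)


# With a threshold of 10 points, compare misses of 5, 11 and 110 points
for miss in (5, 11, 110):
    print(miss, bathtub_loss(miss, 10), epsilon_insensitive_loss(miss, 10))
```

Both functions ignore errors inside the threshold, but only the epsilon-insensitive version distinguishes a miss of 11 points from a miss of 110.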

February 09, 2012

Specialist Margin Prediction: "Bathtub" Loss Functions

February 09, 2012/ Tony Corke

We know that we can build quite simple, non-linear models to predict the margin of AFL games that will, on average, be within about 30 points of the actual result. So, if you found a bet type for which general margin prediction accuracy was important - where every point of error contributed to your loss - then this would be your model. This year we'll be moving into margin betting though, where the goal is to predict within X points of the actual result and being in error by X+1 points is no different from being wrong by X+100 points. In that environment, our all-purpose model might not be the right choice. In this blog I'll be describing a process for creating margin predicting models that specialise in predicting within X points of the final outcome.

September 15, 2011

Explaining More of the Variability in the Victory Margin of Finals

September 15, 2011/ Tony Corke

This morning while out walking I got to wondering about two of the results from the latest post on the Wagers & Tips blog. First that teams from higher on the ladder have won 20 of the 22 Semi Finals between 2000 and 2010, and second that the TAB bookmaker has installed the winning team as favourite in only 64% of these contests. Putting those two facts together it's apparent that, in Semi Finals at least, the bookmaker's often favoured the team that finished lower on the ladder, and these teams have rarely won.

August 18, 2011

The 2011 Performance of the MARS, Colley and Massey Ratings Systems

August 18, 2011/ Tony Corke

I was curious - and that's rarely a portent of lazy evenings - as to which of the three ratings systems we've been tracking since Round 14 is best, so I set about finding out.

August 03, 2011

Predicting the Home Team's Final Margin: A Competition Amongst Predictive Algorithms

August 03, 2011/ Tony Corke

With fewer than half-a-dozen home-and-away rounds to be played, it's time I was posting to the Simulations blog, but this year I wanted to see if I could find a better algorithm than OLS for predicting the margins of victory for each of the remaining games.

July 28, 2011

Projecting the Favourite's Final Margin

July 28, 2011/ Tony Corke

In a couple of earlier blogs I created binary logit models to predict the probability that the favourite would win given a specified lead at a quarter break and the bookmaker's assessed pre-game probability for the favourite. These models allow you to determine what a fair in-running price would be for the favourite. You might instead want to know what the favourite's projected victory margin is given the same input data, so in this blog I'll be providing some simple linear regressions that provide this information.

March 15, 2011

A Friendly Wager on the Margin

March 15, 2011/ Tony Corke

You're watching the footy with a mate who leans over and says he reckons the Cats will win by 15 points. How much leeway should you give him to make it a fair even money bet? Surprisingly - to me anyway - the answer is 24 points either way. So, if the Cats were to record any result between a loss by 9 points and a win by 39 points you should pay out.
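The arithmetic of the payout window can be sketched in a few lines. The second function goes beyond what's stated above: it assumes, purely for illustration, that prediction errors are normally distributed with a mean of zero, and asks what standard deviation would make a 24-point window a 50:50 proposition. Both function names are hypothetical.

```python
from statistics import NormalDist


def even_money_window(predicted_margin: float, leeway: float = 24.0):
    """Range of results on which the bet pays out, as (lowest, highest) margin."""
    return (predicted_margin - leeway, predicted_margin + leeway)


def implied_error_sd(leeway: float = 24.0) -> float:
    """Error standard deviation implied by a 50% window of +/- leeway.

    Assumption (not from the text): errors are normal with mean zero, so half
    of all errors fall between the 25th and 75th percentiles.
    """
    return leeway / NormalDist().inv_cdf(0.75)


window = even_money_window(15)   # the mate's tip: Cats by 15
sd = implied_error_sd()
```

For a tip of Cats by 15 the window runs from a loss by 9 points to a win by 39 points, matching the figures above.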

March 08, 2011

Introducing MAFL's First Neural Network

March 08, 2011/ Tony Corke

I've been leery of neural networks for some time because of their perhaps undeserved reputation for overfitting data and because of the practical difficulties that have existed in using them for prediction. Phil Brierly's Tiberius software includes an implementation of neural networks that has, at least for now, converted me. As a consequence, I'm adding one final margin predictor to the mix for 2011.

March 05, 2011

Margin Prediction for 2011

March 05, 2011/ Tony Corke

We've fresh tipsters for 2011, fresh Funds for 2011, so now we need fresh margin predictors for 2011. This year, all of the margin predictors are based on models that produce probability forecasts, which includes the algorithms powering ProPred, WinPred and the Head-to-Head Fund and the "model" that is the TAB Sportsbet bookmaker. The process for creating the margin predictors was to let Eureqa loose on the historical data for seasons 2007 to 2010 to produce equations that fitted previous home team margins of victory as a function of these models' probabilities.

October 01, 2010

Grand Final Margins Through History and a Last Look at the 2010 Home-and-Away Season

October 01, 2010/ Tony Corke

A couple of final charts before GF 2.0.

The first chart looks at the history of Grand Finals, again. Each point in the chart reflects four things about the Grand Final to which it pertains ...

September 27, 2010

Drawing On Hindsight

September 27, 2010/ Tony Corke

When sports journos wait until after a contest has been decided before declaring a group of winning punters to be "savvy", I find it hard not to be at least a little cynical about the aptness of the label.

So when, on Sunday, I read in the online version of the SMH that a posse of said savvy punters had foxed the bookies and cleaned up on the draw, collectively winning as I recall about $1m at prices ranging from $34 to $51, I did wonder how many column-inches would have been devoted to those same punters had the margin been anything different when the final siren sounded on Saturday. I'm fairly certain it would have been the number that has '1' as its next-door, up the road neighbour on Integer Street.
