September 24, 2013

The Relative Importance of Class and Form in AFL

September 24, 2013/ Tony Corke

Today's blog is motivated by a number of things, the first of which is alluded to in the title: the quantitative exploration of the contribution that a team's underlying class or skill makes to its success in a given game relative to its more recent, more ephemeral form. Is, for example, a top-rated team that's been a little out of form recently more or less likely to beat a less-credentialled team that's been in exceptional form?

August 10, 2013

Game Margins and the Generalised Tukey Lambda Distribution

August 10, 2013/ Tony Corke

The Normal Distribution often turns up, like the Spanish Inquisition, in places where you've no a priori reason to expect it. For example, I've shown before that bookmaker handicap-adjusted margins appear to be distributed Normally.

August 06, 2013

Measuring Momentum in Game Margins

August 06, 2013/ Tony Corke

The topic of momentum is one I've explored before: in terms of game-to-game results for different teams, from the perspective of quarter-to-quarter outcomes, and even by examining scoring shot to scoring shot progressions within a game.

July 24, 2013

The Predictability of Game Margins

July 24, 2013/ Tony Corke

In a recent blog post I described how the results of games in 2013 have been more predictable than game results from previous seasons in the sense that the final victory margins have been, on average, closer to what you'd have expected them to be based on a reasonably constructed predictive model. In short, teams have this year won by margins closer to what an informed observer, like a Bookmaker, would have expected.

July 16, 2013

The Predictability of 2013

July 16, 2013/ Tony Corke

Friend of MAFL, Michael, e-mailed me earlier to ask about my claim that 2013 was on track to be the most predictable MAFL season ever, pointing out, quite correctly, that bookmaker favourites have been winning at about the same rate - perhaps even at a slightly higher rate - as they had been at the same time last year.

June 29, 2013

Game Statistics and Game Outcomes

June 29, 2013/ Tony Corke

My first Matter of Stats blog looked at how game statistics, averaged across an entire season for each team, are predictive of key season outcomes like ladder position, competition points and MARS Ratings. This post summarises similar analyses, but here performed on a per-game basis.

June 29, 2013

Simulating SuperMargin Wagering

June 29, 2013/ Tony Corke

Season 2013 has been a good one, so far, for SuperMargin wagering, which led me to ponder why that might be the case. More generally, I wondered if we could define the characteristics of a season and of the predictive algorithm that we're using for selecting wagers, which are most propitious for this form of wagering.

May 28, 2013

Really Simple Proves Remarkably Effective

May 28, 2013/ Tony Corke

The Really Simple Margin Predictors (RSMPs), which were purpose-built for season 2013, have shown themselves to be particularly accurate at forecasting game margins. So much so, in fact, that they're currently atop the MAFL Leaderboard, ahead of the more directly Bookmaker-derived Predictors like Bookie_3 that have excelled in previous years.

May 11, 2013

The Value of MARS Ratings Points Across Time Revisited

May 11, 2013/ Tony Corke

When last I considered the issue of the value of MARS Ratings across time I assessed their value in terms of a team's victory probability. Perhaps a more intuitive approach would be to, instead, value them in terms of a team's victory margin.

February 24, 2013

Are the Victory Margins for Some Games Harder to Predict than for Others?

February 24, 2013/ Tony Corke

It's unarguable that the winner of some games will be harder to predict than the winner of others. When genuine equal-favourites meet, for example, you've only a 50:50 chance of picking the winner, but you can give yourself a 90% chance of being right when a team with a 90% probability of victory meets a team with only a 10% chance. The nearer to equal-favouritism the two teams are, the more difficult the winner is to predict, and the further away we are from this situation the easier the game is to predict.

February 14, 2013

The Probability of a Draw

February 14, 2013/ Tony Corke

Lately it seems I've been specialising in blogs on topics that I've covered before, and tonight's blog is no exception. It's on estimating the probability of a draw.

February 02, 2013

Building Simple Margin Predictors

February 02, 2013/ Tony Corke

Having a new - and, it seems, generally superior - way to calculate Bookmaker Implicit Probabilities is like having a new toy to play with. Most recently I've been using it to create a family of simple Margin Predictors, each optimised in a different way.

January 27, 2013

Using Risk-Equalising Probabilities for the Margin Predictors

January 27, 2013/ Tony Corke

With the exception of Combo_NN_2, all of the Margin Predictors rely on an algorithm that takes Bookmaker Implicit Probabilities as an input in some form:

Bookie_3 and Bookie_9 use Bookmaker Implicit Probabilities directly
ProPred_3 and ProPred_7 use the outputs of the ProPred algorithm, which uses a log transform of Bookmaker Implicit Probabilities as one input
WinPred_3 and WinPred_7 use the outputs of the WinPred algorithm, which also uses a log transform of Bookmaker Implicit Probabilities as one input
H2H_U3, H2H_U10, H2H_A3 and H2H_A7 use the outputs of the Head-to-Head algorithm, which uses Bookmaker Implicit Probabilities as one input
Combo_7 uses Bookmaker Implicit Probabilities directly as well as via its use of the outputs of the Head-to-Head Algorithm
Combo_NN_2 uses Bookmaker Implicit Probabilities directly as well as via its use of the outputs of the ProPred, WinPred and H2H algorithms

For this short blog I've substituted, in all of the underlying algorithms, the Implicit Probabilities calculated using the Risk-Equalising Approach for those calculated using the Overround-Equalising Approach, and then compared the resulting MAPEs for seasons 2007 to 2012 for all the Margin Predictors.
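As a rough illustration of that comparison, here's a minimal Python sketch. It assumes that MAPE means the mean absolute prediction error in points per game, which is consistent with the units quoted in this blog. The function names and the toy predictor are hypothetical, invented for this example, and aren't part of any MAFL algorithm.

```python
from typing import Callable, Sequence


def mean_absolute_prediction_error(
    predicted: Sequence[float], actual: Sequence[float]
) -> float:
    """Average absolute gap, in points, between predicted and actual margins."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def mape_change_from_switch(
    predictor: Callable[[float], float],
    old_probs: Sequence[float],
    new_probs: Sequence[float],
    actual_margins: Sequence[float],
) -> float:
    """MAPE under the new probabilities minus MAPE under the old ones.

    The predictor itself is left untouched (no re-optimisation); only its
    probability inputs change. A negative result means the switch helped.
    """
    old_mape = mean_absolute_prediction_error(
        [predictor(p) for p in old_probs], actual_margins
    )
    new_mape = mean_absolute_prediction_error(
        [predictor(p) for p in new_probs], actual_margins
    )
    return new_mape - old_mape
```

Here `old_probs` would hold the Overround-Equalising home team probabilities and `new_probs` the Risk-Equalising ones for the same games, with `actual_margins` the home team margins actually recorded.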

Overall, all Margin Predictors except Bookie_3 benefit from the switch, however modestly. Bookie_9, which now will serve as a co-predictor in the MAFL Margin Fund, benefits most, knocking over one quarter of a point per game off its MAPE.

The uniformity of these improvements is made slightly more remarkable by the realisation that the Margin Predictors, built using Eureqa, were optimised for the probability outputs of the underlying algorithms when those algorithms were using Overround-Equalising Implicit Probabilities. So, for example, the equation for Bookie_9, which is:

Predicted Home Team Margin = 2.2205129 + 17.729506 * ln(Home Team Bookmaker Probability/(1-Home Team Bookmaker Probability)) + 2*Home Team Bookmaker Probability

was created by Eureqa to minimise the historical MAPE of this equation when the Home Team Bookmaker Probabilities being used were those calculated assuming Overround-Equalisation. The 0.26 points per game reduction in the MAPE is being achieved without re-optimising this equation but, instead, simply by replacing the Home Team Probabilities with those calculated using a Risk-Equalising Approach.
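The Bookie_9 equation quoted above can be transcribed directly into Python. This is only a sketch: the coefficients are copied from the equation as printed, while the function name and the input check are my own additions.

```python
import math


def bookie_9_margin(home_prob: float) -> float:
    """Predicted home team margin from the Bookie_9 equation as printed above."""
    if not 0.0 < home_prob < 1.0:
        raise ValueError("home_prob must lie strictly between 0 and 1")
    # Log-odds of a home team win implied by the Bookmaker probability
    log_odds = math.log(home_prob / (1.0 - home_prob))
    return 2.2205129 + 17.729506 * log_odds + 2.0 * home_prob
```

Switching from Overround-Equalising to Risk-Equalising probabilities changes only the value passed in as `home_prob`; the coefficients stay exactly as they are.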

Bookie_3 is the one Margin Predictor that responds poorly to the switch of probabilities without an accompanying re-optimisation in Eureqa. When I performed such a re-optimisation, Eureqa came up with this remarkably simple equation:

Predicted Home Team Margin = 21 * ln(Home Team Bookmaker Probability/(1-Home Team Bookmaker Probability))

This predictor has an MAPE of 29.22 points per game, which is extraordinarily low for such an easy-to-use predictor.
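To show just how easy it is to use, here's the re-optimised equation as a short Python sketch. The function names are hypothetical, and the inverse function is simply an algebraic rearrangement of the same equation rather than anything Eureqa produced.

```python
import math

LOG_ODDS_MULTIPLIER = 21.0  # coefficient from the re-optimised equation above


def simple_margin(home_prob: float) -> float:
    """Predicted home team margin: 21 times the log-odds of a home win."""
    if not 0.0 < home_prob < 1.0:
        raise ValueError("home_prob must lie strictly between 0 and 1")
    return LOG_ODDS_MULTIPLIER * math.log(home_prob / (1.0 - home_prob))


def implied_home_prob(margin: float) -> float:
    """Invert the equation: the home probability that yields a given margin."""
    return 1.0 / (1.0 + math.exp(-margin / LOG_ODDS_MULTIPLIER))
```

Two properties follow directly from the functional form: an equal-favourite home team (probability 0.5) is predicted to win by 0 points, and swapping the two teams' probabilities just flips the sign of the predicted margin.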

CONCLUSION

Virtually every algorithm used in MAFL has now been shown to benefit, however slightly, from using Implicit Probabilities calculated using the Risk-Equalising instead of the Overround-Equalising Approach. Naturally, this makes me wonder if there's an even better way ...

Maybe next year I'll look for it.

December 28, 2012

Does an Extra Day's Rest Matter in the Home and Away Season?

December 28, 2012/ Tony Corke

Whenever the draw for a new season is revealed there's much discussion about the teams that face one another only once, about which teams need to travel interstate more than others, and about which teams are asked to play successive games with fewer days' rest. There is in the discussion an implicit assumption that more days' rest is better than fewer days' rest but, to my knowledge, this is never supported by empirical analysis. It is, like much of the discussion about football, considered axiomatic. In this blog we'll assess how reasonable that assumption is.

September 20, 2012

Does An Extra Day's Rest Matter in the Finals?

September 20, 2012/ Tony Corke

This week Collingwood faces Sydney having played its Semi-Final only 6 days previously, while Adelaide takes on Hawthorn a more luxurious 8 days after its Semi-Final encounter. The gap for Sydney has been 13 days while that for the Hawks has been 15 days. In this blog we'll assess what, if any, effect these differential gaps between games for competing finalists might have on game outcome.

July 01, 2012

Team Scores - Statistical Distribution and Dependence

July 01, 2012/ Tony Corke

In the most recent post on the Simulations blog I assumed that Home Team and Away Team scores were independently and Normally distributed (about their conditional means). I'll investigate both these assumptions in this blog.

June 09, 2012

Estimating Fair Head-to-Head Prices : Part II

June 09, 2012/ Tony Corke

In the previous blog on this topic I described a way to estimate the vig embedded in the head-to-head prices of both teams.

April 12, 2012

The Increased Importance of Predicting Away Team Scores

April 12, 2012/ Tony Corke

In an earlier blog we found that the score of the Home team carried more information about the final game margin than did the score of the Away team. One way of interpreting this fact is that, given the choice between improving your prediction of the Home team score or your prediction of the Away team score, you should opt for the former if your goal is to predict the final game margin. While that's true, it turns out that it's less true now than it once was.

April 08, 2012

Finding Non-Linear Relationships Between AFL Variables : Alternative Measures to MIC

April 08, 2012/ Tony Corke

The Maximal Information Coefficient (MIC) that we explored in the previous blog is not the only non-linear measure of the pairwise relationship between any two continuous variables.

April 02, 2012

Finding Non-Linear Relationships Between AFL Variables : The MINER Package

April 02, 2012/ Tony Corke

It's easy enough to determine whether or not one continuous variable has a linear relationship with another, and how strong that relationship is, by calculating the Pearson product-moment correlation coefficient for the two variables. A value near +1 for this coefficient indicates a strong, positive linear relationship between the variables in question, so that high values of one tend to coincide with high values of the other, and vice versa for low values; a value near -1 indicates a strong, negative linear relationship; and a value of 0 indicates a lack of any linear relationship at all. But what if we want to assess more generally if there's a relationship between two variables, linear or otherwise, and we don't know the exact form that this relationship takes? That's the purpose for which the Maximal Information Coefficient (MIC) was created, and recently made available in an R package called MINER.
