Attaching Probabilities to Game Margins: An Application of Quantile Regression

I first heard about quantile regression, I think, over a decade ago and, for whatever reason, could never quite understand it or see a useful application for it here on MatterOfStats.

Recently, while building a suite of models to predict game margins for a presentation I'm giving to the Sydney Users of R Forum (SURF) later this month, the leading model amongst about 70 that I tried turned out to be a quantile regression neural network. That result was enough to encourage me to take another look at quantile regression and to see if there might be a MatterOfStats-related use for the technique.

THE TECHNIQUE 

For me, the simplest way to think about quantile regression is as a generalisation of Ordinary Least Squares (OLS). Where OLS seeks to model the conditional mean of the target variable, conditioned on whatever regressors you've chosen, quantile regression seeks to model the pth percentile of the target variable - that is, the value that you can expect the target variable to fall below p percent of the time - also conditioned on the selected regressors. For a more complete introduction to the technique, this paper by Koenker and Hallock is a good place to start.
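To make the link with OLS a little more concrete, here is a minimal sketch, in plain Python rather than the R I used for the modelling, of the "check" (or pinball) loss that quantile regression minimises in place of squared error. The margins are made up purely for illustration, and the function names are my own. With no regressors at all, the constant that minimises the total loss for a given quantile level is simply a sample quantile of the data.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss for one residual at quantile level tau in (0, 1)."""
    # Under-predictions (residual >= 0) cost tau per point,
    # over-predictions cost (1 - tau) per point.
    return residual * tau if residual >= 0 else residual * (tau - 1)


def total_loss(candidate, values, tau):
    """Total pinball loss if `candidate` is the prediction for every value."""
    return sum(pinball_loss(v - candidate, tau) for v in values)


def best_constant(values, tau):
    """Search the observed values for the one with the smallest total loss."""
    return min(values, key=lambda c: total_loss(c, values, tau))


# Made-up game margins, for illustration only (not MatterOfStats data)
margins = [-40, -22, -9, -3, 4, 11, 18, 27, 35, 48, 61]

for tau in (0.1, 0.5, 0.9):
    print(tau, best_constant(margins, tau))
```

Setting tau to 0.5 recovers the sample median, just as minimising squared error recovers the mean. A full quantile regression does the same thing but searches over regression coefficients rather than a single constant.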

What's powerful about this approach is that it provides you not just with a point prediction of the expected target value given the regressors, but with estimates of the entire distribution of the target variable.

METHODOLOGY AND PERFORMANCE

To create my quantile regression I've used the quantreg package of R and applied it to the data for all games from 2006 to 2013. As regressors I've used the home team victory probability derived by applying the Overround-Equalising approach to the TAB Bookmaker's head-to-head prices, the two teams' pre-game MARS Ratings, the Interstate Status of the clash, and the two teams' recent form as measured by the change in their MARS Ratings across their two most-recent games in the current season.

(If you're new to MatterOfStats, you can read about the Overround-Equalising approach to creating probability estimates on this blog, about MARS Ratings, which are my attempt at a team rating system, on this blog, and about the definition of Interstate Status here. On-site searches will also turn up a large number of references to and uses of these variables.)

As a first step, I split the 1,536 available games into a training and holdout set, 50% in each, fitted the quantile regression to the training set (for all quantile values from 0% to 100% in 1% increments) and then used the model created in this process to estimate quantiles for the holdout games. From the fitted values for the training set and the predictions for the holdout set it's possible to calculate a variety of performance metrics, which I've summarised in the table at right.
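As a rough sketch of that set-up, the Python below splits 1,536 game indices into two equal halves and builds the grid of quantile levels. The random allocation, the seed, and the function name are assumptions made for illustration; the actual fitting was done with quantreg in R and isn't reproduced here.

```python
import random


def split_games(game_ids, holdout_fraction=0.5, seed=2014):
    """Randomly allocate game ids to training and holdout sets.

    The seed is arbitrary and only there to make the split repeatable.
    """
    rng = random.Random(seed)
    shuffled = list(game_ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * (1 - holdout_fraction))
    return shuffled[:cut], shuffled[cut:]


train_ids, holdout_ids = split_games(range(1536))

# Quantile levels from 0% to 100% in 1% steps, expressed as proportions
quantile_grid = [i / 100 for i in range(101)]

print(len(train_ids), len(holdout_ids), len(quantile_grid))
```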

The top section measures the calibration of the model outputs. It takes, for each game, the final margin and determines how likely (in 10% increments) that margin was according to the model. So, for example, if the final result in a game was a home team win by 52 points and this was rated as having a 27% probability by the model (determined on the basis of the percentile associated with a game margin nearest the actual margin), that result would appear in the "20-30% likely to happen" bucket.

Ideally, if the model were well-calibrated, about 10% of results would finish in each bucket so that, for example, outcomes assessed as having an 80-90% probability of occurrence would have transpired about 80-90% of the time. Clearly, the quantile regression is very well-calibrated in relation to the training data, since the proportion of games in each bucket is very close to 10%.
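The bucketing just described can be sketched in a few lines of Python. This is an illustration under assumptions rather than the code behind the table: the predicted margins come from a made-up, perfectly linear quantile curve, percentiles are held as whole numbers from 0 to 100, and all of the names are hypothetical.

```python
from collections import Counter

PERCENTS = list(range(101))  # 0%, 1%, ..., 100%


def nearest_percentile(percents, predicted_margins, target):
    """Percentile whose predicted margin is closest to `target`."""
    pairs = zip(percents, predicted_margins)
    return min(pairs, key=lambda pair: abs(pair[1] - target))[0]


def decile_bucket(percent):
    """Map a percentile (0..100) to a bucket 0..9; bucket 0 is 'Less than 10%'."""
    return min(percent // 10, 9)


def calibration_table(games):
    """games: list of (predicted_margins, actual_margin) pairs.

    Returns the proportion of games falling in each of the 10 buckets.
    """
    counts = Counter(
        decile_bucket(nearest_percentile(PERCENTS, preds, actual))
        for preds, actual in games
    )
    total = sum(counts.values())
    return [counts[bucket] / total for bucket in range(10)]


# Made-up quantile curve: the predicted margin rises 1.2 points per percentile
toy_curve = [-60 + 1.2 * p for p in PERCENTS]

# Three hypothetical results for games sharing that curve
toy_games = [(toy_curve, -28), (toy_curve, 0), (toy_curve, 52)]
print(calibration_table(toy_games))
```

For a well-calibrated model and a large sample of games, each entry in the returned list would be close to 0.10.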

The model's calibration on the holdout sample is, as you'd expect, slightly poorer, most notably for the first and last deciles. The "Less than 10%" bucket has about 25% fewer results than would be ideal and the "90% or more" bucket has about 25% more. There are also some offsetting excesses and deficiencies in the two buckets in the 40-60% range.

The lower section of the table provides other performance data for the model on the training and holdout data based on converting its output into home team victory and line market success probabilities.

Home team victory probabilities were estimated by determining the percentile with an associated game margin nearest to 0 and then using 1 minus this percentile as the estimated home team probability. Armed with these probabilities, Brier and Log Probability Scores were calculated. Somewhat unusually, both probability scores are actually better in the holdout sample than in the training sample - and acceptably low (Brier) and high (Log Probability) in both.
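Here is a small Python sketch of that conversion and of the two scores. The quantile curve is made up, the names are hypothetical, and the Log Probability Score is written in the form I'm assuming here, 1 plus the base-2 log of the probability attached to the actual winner. Draws are ignored for simplicity.

```python
import math

PERCENTS = list(range(101))  # 0%, 1%, ..., 100%


def home_win_probability(percents, predicted_margins):
    """1 minus the percentile whose predicted margin is nearest to zero."""
    pairs = zip(percents, predicted_margins)
    nearest = min(pairs, key=lambda pair: abs(pair[1]))[0]
    return 1 - nearest / 100


def brier_score(prob_home, home_won):
    """Squared gap between the probability and the 0/1 outcome (lower is better)."""
    return (prob_home - (1.0 if home_won else 0.0)) ** 2


def log_probability_score(prob_home, home_won):
    """Assumed form: 1 + log2(probability given to the winner); higher is better.

    Raises ValueError if the winner was given a probability of exactly zero.
    """
    prob_winner = prob_home if home_won else 1 - prob_home
    return 1 + math.log2(prob_winner)


# Made-up quantile curve that crosses a zero margin at the 40th percentile
toy_curve = [-48 + 1.2 * p for p in PERCENTS]

prob = home_win_probability(PERCENTS, toy_curve)
print(prob, brier_score(prob, True), log_probability_score(prob, True))
```

In this toy case the home team is given a 60% chance, so a home win scores a Brier of 0.16 and a Log Probability Score of about 0.26.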

In a similar fashion, the home team's line market success probabilities were determined on the basis of the nearest percentile for which the associated margin exceeded the TAB Bookmaker's handicap for that game. One minus this percentile is the probability that the home team prevails in the line market. If this value is greater than 50% we categorise the game as a predicted home team line market victory; if less than 50% we categorise it as an away team line market victory. The Line Market performance using this methodology, for which the expected result from random guessing would be 50% accuracy, is well above chance on the training set and on the holdout sample.
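The line market calculation might be sketched as follows. This is illustrative only: the curve is made up, the names are hypothetical, and I've expressed the handicap as the margin the home team needs to exceed (so a home team giving 12.5 points start has a margin to beat of 12.5). It also assumes the predicted margins never decrease as the percentile rises, which fitted quantiles don't always guarantee.

```python
PERCENTS = list(range(101))  # 0%, 1%, ..., 100%


def home_line_probability(percents, predicted_margins, margin_to_beat):
    """1 minus the first percentile whose predicted margin exceeds the line."""
    for pct, margin in zip(percents, predicted_margins):
        if margin > margin_to_beat:
            return 1 - pct / 100
    return 0.0  # no fitted quantile clears the line


def predict_line_winner(prob_home):
    """'home' above 50%, 'away' below 50%, None when exactly 50%."""
    if prob_home > 0.5:
        return "home"
    if prob_home < 0.5:
        return "away"
    return None


# Made-up quantile curve, for illustration only
toy_curve = [-48 + 1.2 * p for p in PERCENTS]

giving_start = home_line_probability(PERCENTS, toy_curve, 12.5)
receiving_start = home_line_probability(PERCENTS, toy_curve, -12.5)
print(giving_start, predict_line_winner(giving_start))
print(receiving_start, predict_line_winner(receiving_start))
```

Accuracy is then just the proportion of games in which the predicted line winner matches the actual one.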

EXAMPLE OUTPUTS

When fitting a quantile regression, you choose the quantiles for which you want the regression to be estimated, and I chose to estimate mine at 2.5% increments from 0% to 100%. The table below shows the results of that estimation, fitted to all games from 2006 to 2013, for 9 of the 41 quantiles.

The coefficients shown here allow you to estimate the xth percentile for the target variable given the regressor values for a particular game. Here, for example, on the right of the table below, is what you get if you apply these results to the 2013 Grand Final, the regressor values for which appear on the left.

So, for example, these results suggest that there was only a 30% probability that the Hawks would lose by about 1 point or more. This is in interesting contrast to the TAB Bookmaker's assessment that the Dockers' victory probability was as high as 40%.

Estimating the model at all 41 quantiles and charting the result yields the cumulative distribution function (CDF) shown at left for the margin for this game.

One of the especially interesting features of this CDF is its asymmetric character around its median. For example, the difference between the 50th and 60th percentiles is 11.4 points, but between the 40th and 50th percentiles is only 7.0 points. This asymmetry is incompatible with a view that the final game margin, conditioned on the regressors, is distributed as a Normal random variable. I'll be looking particularly closely at the TAB Bookmaker's margin markets this year to observe whether or not his market prices imply a similar asymmetry.
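One way to see how far this is from Normality is to ask what standard deviation each gap would imply if the margin really were Normal. For a Normal variable the 40th and 60th percentiles sit the same distance either side of the median, about 0.25 standard deviations, so the two gaps should imply the same value. A quick check in Python, using only the two gaps quoted above (the variable names are my own):

```python
from statistics import NormalDist

# Distance, in standard deviations, from the median to the 60th percentile
# of any Normal distribution (and, by symmetry, to the 40th percentile)
z60 = NormalDist().inv_cdf(0.60)

gap_above_median = 11.4  # 60th minus 50th percentile, in points
gap_below_median = 7.0   # 50th minus 40th percentile, in points

sigma_from_upper_gap = gap_above_median / z60
sigma_from_lower_gap = gap_below_median / z60

print(round(z60, 4), round(sigma_from_upper_gap, 1), round(sigma_from_lower_gap, 1))
```

The upper gap implies a standard deviation of roughly 45 points and the lower gap one of roughly 28 points, a discrepancy that no single Normal distribution can accommodate.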

It's easy to imagine using a CDF like this one created for a specific game to inform a variety of traditional and exotic assessments about the likely game outcome - for example, the probability that the home team should win by between 10 and 19 points, by 25 points or more, or by 17 to 26 points.
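As a sketch of how such probabilities could be read off a fitted CDF, the Python below linearly interpolates between quantiles and differences the result. Everything here is illustrative: the 41 quantile margins are made up (evenly spaced from -60 to +60), the names are hypothetical, and the half-point adjustments are my own way of handling whole-number margins.

```python
from bisect import bisect_right


def cdf_at(margin, quantile_margins, quantile_levels):
    """Linearly interpolated P(final margin <= margin).

    Assumes quantile_margins is sorted in non-decreasing order.
    """
    if margin <= quantile_margins[0]:
        return quantile_levels[0]
    if margin >= quantile_margins[-1]:
        return quantile_levels[-1]
    hi = bisect_right(quantile_margins, margin)
    lo = hi - 1
    x0, x1 = quantile_margins[lo], quantile_margins[hi]
    y0, y1 = quantile_levels[lo], quantile_levels[hi]
    return y0 + (y1 - y0) * (margin - x0) / (x1 - x0)


def prob_between(low, high, quantile_margins, quantile_levels):
    """P(low <= margin <= high) for whole-number margins, using half-point edges."""
    upper = cdf_at(high + 0.5, quantile_margins, quantile_levels)
    lower = cdf_at(low - 0.5, quantile_margins, quantile_levels)
    return upper - lower


# 41 quantile levels at 2.5% steps, with made-up margins from -60 to +60
levels = [i / 40 for i in range(41)]
toy_margins = [-60 + 3.0 * i for i in range(41)]

win_by_10_to_19 = prob_between(10, 19, toy_margins, levels)
win_by_25_plus = 1 - cdf_at(24.5, toy_margins, levels)
print(round(win_by_10_to_19, 4), round(win_by_25_plus, 4))
```

With a real game's fitted quantiles in place of the toy margins, the same two functions cover any margin range of interest.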

COMPLETE MODEL

To give you a little more idea of the quantile regression model fitted to the entirety of seasons 2006 to 2013, here, firstly, is a chart of the regression coefficients across all fitted quantiles. The quantiles run from left to right in each chart and the vertical height reflects the coefficient value for a given quantile.

With the exceptions of the extreme quantiles around 0 and 100, most coefficients tend to be constrained to a reasonably narrow range. Interestingly, the impact of the Bookmaker's implicit home team probability assessment diminishes as we consider larger quantiles while the impact of the home team's recent form increases. The implication of this is that the likelihood of a home team significantly exceeding expectations is less about the extent of its pre-game relative favouritism and more about its recent form. In short, home teams that have been playing well recently are more likely to do much better than expected.

Lastly, here are the cumulative distribution functions plotted for five different scenarios, with the home team of varying levels of inferiority or superiority relative to the away team, with each game assumed to be an Interstate Clash from the home team's viewpoint, and with both teams having had stable MARS Ratings over the two most-recent games.

For these lines the legend describes the assumed parameter values, ordered as follows: Home team implicit victory probability, Home team MARS Rating, Away team MARS Rating, Interstate Status, Home team Recent Form, and Away team Recent Form.

It's interesting to note that the general shape of these lines is the same for all scenarios; they're just shifted to the left or the right depending on the relative abilities of the two teams.

CONCLUSION

From a practical viewpoint, the quantile regression approach looks promising as a means of estimating the likelihood of different margin-related game outcomes such as the probability that the home team will win by between 30 and 39 points. More generally, I think it might also shed light on the broader distributional properties of game margins.

I plan to explore these and other topics using the quantile regression technique further in future posts.