Estimating Team-and-Venue Specific Home Ground Advantage Using the VSRS

In the Very Simple Rating System as I've described it so far, a single parameter, HGA, is used to adjust the expected game margin to account for the well-documented advantages of playing at home. We found that, depending on the timeframe we consider and the performance metric that we chose to optimise, the estimated size of this advantage varied generally in the 6 to 8-point range.

That value for HGA is an average though, across all teams and all venues. Treating HGA as if it were the same for all teams and all venues is, as regular AFL followers will attest, a vast oversimplification. The Cats playing at Kardinia Park are a far more daunting prospect than the Dogs playing at Docklands.

It'd be nice then to account for the variability of HGA across venues and across teams - best of all, across both considered together. Earlier blogs have attempted this to some extent. For example, in two recent blogs we've looked at:

In both of these blogs, HGA (or excess scoring) estimates took into account the quality, form and Venue Experience of the opposition faced. As such, those estimates isolated the unique contribution of the venue itself. But they were relative rather than absolute estimates, which makes their interpretation less intuitive than is ideal - we can say, for example, that the Pies are a Y-point better team when playing at the MCG than at Docklands, but we can't say how many extra points they'd score at either venue adjusting solely for the quality of the opposition. In short, we can't say what the HGA is for any given combination of team and venue.

Using the VSRS we can, instead, derive absolute HGA estimates for every team and home ground venue combination, while also adjusting for the quality of the opposition faced.

THE METHODOLOGY

For this blog I'm going to focus on the period 1999 to 2013, setting every team's initial Rating to 1,000 at the start of the 1999 season (or, for GWS and Gold Coast, at the time of their first game). I'm also going to adopt the convention of assuming that the higher-rated team is the home team in any Final. In all other games I'll use the AFL designation of home team status.

Adapting the VSRS to estimate team-and-venue specific HGAs is very simple. The equation for calculating the expected game margin, you'll recall, is:

Expected Game Margin = Home Team Rating - Away Team Rating + HGA

All we need do is replace the single HGA parameter with a team-and-venue specific HGA (of which there'll be as many as there are combinations of team and home venue that we want to recognise). Then, when we're calculating the expected game margin for the home team, we add the relevant team-and-venue specific HGA rather than the single all-team, all-venue HGA.
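
To make that concrete, here's a minimal sketch of the calculation in Python (not the spreadsheet actually used for this blog); the team names, ratings and HGA value are illustrative only.

```python
# A minimal sketch of the team-and-venue VSRS expected margin. The ratings
# dictionary and the (team, venue) HGA lookup are illustrative assumptions,
# not the actual data structures used for this blog.

def expected_margin(ratings, hga, home_team, away_team, venue):
    """Expected home-team game margin = home rating - away rating + team-and-venue HGA."""
    return ratings[home_team] - ratings[away_team] + hga[(home_team, venue)]


# Illustrative values only.
ratings = {"Geelong": 1025.0, "Richmond": 995.0}
hga = {("Geelong", "Kardinia Park"): 24.0}
print(expected_margin(ratings, hga, "Geelong", "Richmond", "Kardinia Park"))  # 54.0
```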

If we consider only those venues for each team at which 10 or more games have been played as home fixtures, grouping home games for that team played at any other venue under an "Other Venues" category, we wind up adding about 60 new parameters to the VSRS. Optimal values for these parameters can be estimated in the same way as was employed in the previous blog: choosing a performance metric and then using Excel Solver and a lot of hand-optimisation to find the "best" values of all the tuning parameters considered simultaneously.
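
The venue-grouping step just described might be coded along the following lines. The results table and its home_team and venue column names are hypothetical; only the 10-game threshold and the "Other Venues" bucket come from the text.

```python
import pandas as pd

# A sketch of the venue-grouping step: keep (team, venue) combinations with
# at least min_games home fixtures, and pool the rest as "Other Venues".
def assign_hga_groups(results: pd.DataFrame, min_games: int = 10) -> pd.DataFrame:
    counts = results.groupby(["home_team", "venue"]).size()
    keep = set(counts[counts >= min_games].index)
    out = results.copy()
    out["hga_group"] = [
        (team, venue) if (team, venue) in keep else (team, "Other Venues")
        for team, venue in zip(out["home_team"], out["venue"])
    ]
    return out
```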

(As a side note, the process of determining the optimal parameters as recorded in this blog has taken a couple of weeks ... and there is no guarantee at all that they're globally optimal.)

THE RESULTS

The table at right records the estimated HGA for all team and venue combinations appearing at least 25 times during the 1999 to 2013 period. These HGA estimates were determined by optimising the VSRS using the Mean Absolute Error metric and have been sorted from highest to lowest.

So, games involving the Cats playing at home at Docklands (which will include any Finals they've played there as the higher-rated team) have seen the largest average discrepancy between the pre-game difference in the competing teams' ratings and the post-game difference in their scores. On average, the Cats have enjoyed about a 4-goal HGA at Docklands, which we can think of as the margin of victory we'd expect them to record at that venue against a team of equal rating at the time.

Five of the next six highest estimated HGAs are for interstate teams playing at home: the Eagles and Freo at Subiaco, the Crows at Football Park, the Lions at the Gabba, and the Suns at Gold Coast Stadium. Each of these teams enjoys about a 2 to 3-goal HGA at these venues.

In the middle of that pile are the Saints who, at Docklands, have played with about a 17-point HGA.

As we've noted in previous blogs on this same topic, Victorian teams with shared home grounds have all but taken the A out of HGA. Melbourne has fared best, enjoying about a 12-point HGA at the MCG, while Carlton has managed about an 8-point advantage at this same venue, though playing there as the home team only about four times per season on average across the period.

Essendon has enjoyed 5-point HGAs at the MCG and at Docklands, while Richmond has barely eked out a 2-point HGA at Docklands and a 1-point HGA at the MCG. The Roos have fared even worse, with HGAs of only about 1 point at the MCG - where they've played only a couple of times per season as the home team anyway - and at Docklands.

At least those HGAs are positive. Hawthorn and Collingwood are the poster children of misnamed HGA. They both have negative estimated HGAs at all of their Victorian home grounds (the MCG in the Hawks' case, and the MCG and Docklands in the Pies'). Arguably, this is partly attributable to the effects of playing Finals at these grounds, on which occasions the concept of "home" might be more notional, though this would surely only diminish the size of the HGA, not extinguish it completely, let alone reverse it. Maybe it's simply the case that the Hawks and the Pies are better teams on the road than they are at their communal home grounds.

OPTIMISED PARAMETERS

The table that follows provides the full set of optimised parameters for three metrics: Root Mean Squared Error, Mean Absolute Error and the Brier Score (for details on these metrics and their use in the VSRS see this previous blog). I excluded the Mean Absolute Percentage Error because, in the context of recent history where average scoring per game has been relatively constant, optimising Mean Absolute Error is roughly equivalent to optimising Mean Absolute Percentage Error.
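
For reference, the three metrics themselves are straightforward to compute. The sketch below gives the scoring rules only, assuming arrays of predicted and actual home-team margins and of predicted home-win probabilities and outcomes; it is not the optimisation code used for this blog.

```python
import numpy as np

# Scoring rules for the three metrics used to optimise the VSRS.
def rmse(predicted_margins, actual_margins):
    diffs = np.asarray(predicted_margins) - np.asarray(actual_margins)
    return np.sqrt(np.mean(diffs ** 2))

def mae(predicted_margins, actual_margins):
    diffs = np.asarray(predicted_margins) - np.asarray(actual_margins)
    return np.mean(np.abs(diffs))

def brier(predicted_home_probs, home_results):
    # home_results: 1 for a home win, 0 for a loss (scoring draws as 0.5 is
    # one common convention, assumed here rather than taken from the blog).
    diffs = np.asarray(predicted_home_probs) - np.asarray(home_results)
    return np.mean(diffs ** 2)
```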

Generally, the estimates of HGA for all combinations of team and venue are similar regardless of the metric we choose to optimise, especially for the most common combinations.

Nonetheless it's interesting to find once again that the answer to a very straightforward question (viz, what is the size of the HGA for a particular team and venue) can differ, quite legitimately, depending on the mathematical context in which you're planning to use that answer. The numbers that I'm showing here are also different from those provided in the blogs linked to earlier, which were based on slightly different assumptions and useful in a slightly different context. If you want to exist peacefully as a statistical modeller you need to be precise about the question you're asking and the intended use for the answer - and you need to be comfortable with ambiguity and uncertainty.

Of the three sets of results tabulated here, those for optimising the Brier Score tend to be the most extreme. This version of the VSRS, you'll recall, is used for estimating the victory probability of the home team rather than its victory margin and is, in that sense, qualitatively different from the VSRSs created by optimising Mean Absolute Error or Root Mean Squared Error.

One other interesting feature of the Brier Score-optimised VSRS is its rating-stickiness (ie its smaller value of k) relative to the MAE- and RMSE-optimised VSRSs. Because it adopts a smaller value of k it adjusts the ratings of teams by a smaller proportion of the difference between the actual and expected game margin, punishing unexpectedly large negative differences less, but also rewarding unexpectedly large positive differences less.

The Brier Score-optimised VSRS also, however, cares less about a team's rating at the end of the previous season when assessing its chances in a new season. It carries forward only about 40% of a team's end-of-season rating into the next season, whereas the MAE- and RMSE-optimised VSRSs carry forward 50 to 55%.

In summary, the MAE- and RMSE-optimised VSRSs remember more about previous season results at the start of a new season (higher Carryover) but discount this knowledge more rapidly (higher k) as results in the current season become known. 
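
In code, the roles of k and the Carryover parameter might look something like the sketch below. The in-season update follows the description above; the form of the carryover step - regressing the remainder of a team's rating toward the 1,000 starting value - is my assumption about what "carrying forward" a proportion of the rating means, and the parameter values shown are purely illustrative.

```python
# Illustrative only: the k and carryover values here are not the optimised ones.

def update_rating(rating, actual_margin, expected_margin, k):
    # Move the team's rating by a fraction k of the surprise in the result.
    return rating + k * (actual_margin - expected_margin)

def new_season_rating(final_rating, carryover, mean_rating=1000.0):
    # Carry forward a proportion of last season's rating and regress the
    # rest toward the all-team starting value of 1,000 (assumed form).
    return carryover * final_rating + (1.0 - carryover) * mean_rating

# A Brier Score-optimised VSRS (smaller k, ~40% carryover) moves ratings more
# slowly within a season and regresses them harder between seasons than the
# MAE- or RMSE-optimised versions (larger k, 50 to 55% carryover).
print(update_rating(1020.0, actual_margin=30, expected_margin=12, k=0.10))  # 1021.8
print(new_season_rating(1040.0, carryover=0.40))                            # 1016.0
```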

VSRS PERFORMANCE

When we apply these three VSRSs to the actual results for the 1999 to 2013 period we obtain the results shown in the following table.

Recall that, for all metrics, smaller is better. In the table I've colour-coded the results so that, for any given metric, particularly good seasons are coloured green and particularly bad ones red. Season 2013 was, therefore, clearly a good one for all three Systems in an historical context, which confirms what I'd reported in an earlier blog about the relative predictability of results in 2013. 

In fact, so good is that MAE of 26.87 for 2013 that it would have been enough to place this VSRS third amongst the MAFL Margin Predictors at the end of the season. Obviously, the MAE-optimised VSRS has the clear advantage of having been optimised, in part, on the actual results from 2013, but that benefit should be put in context by recognising that this VSRS has also been optimised - with equal weighting - on games from over a decade ago.

The overall performances of all three VSRSs are very creditable, beating the empirically-determined hurdles of 37 points per game for RMSE, 30 points per game for MAE and 0.200 for Average Brier Score. A naive tipster who predicted a draw in every game across the period 1999 to 2013 would have recorded an RMSE of 44.8 points per game, an MAE of 35.5 points per game, and an Average Brier Score of 0.248.
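
As a sanity check on those hurdle figures, the naive tipster's scores can be computed directly from the actual results. The sketch below assumes arrays of actual home-team margins and of home-win outcomes (with draws scored as 0.5, an assumed convention), and represents "predicting a draw" as a zero margin and a 50% home-win probability.

```python
import numpy as np

def naive_tipster_scores(actual_margins, home_results):
    # The naive tipster predicts a margin of 0 and a home-win probability of
    # 0.5 in every game (the 0.5 probability is an assumed reading of
    # "predicting a draw"; draws in home_results are scored as 0.5).
    actual_margins = np.asarray(actual_margins, dtype=float)
    home_results = np.asarray(home_results, dtype=float)
    rmse = np.sqrt(np.mean(actual_margins ** 2))
    mae = np.mean(np.abs(actual_margins))
    brier = np.mean((0.5 - home_results) ** 2)
    return rmse, mae, brier
```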