An Improved VFL/AFL Team Rating System: MoSSBODS 2.0
Earlier this year in this blog, I introduced the MoSSBODS Team Rating System, an ELO-style system that provides separate estimates of each team's offensive and defensive abilities, as well as a combined estimate formed from their sum. That blog post describes the main motivations for a MoSSBODS-like approach, which I'll not repeat here.
What I will do here though is reiterate the key components of the System and explain how they've been altered in MoSSBODS 2.0. One important difference is that MoSSBODS 2.0 can be applied to any period of VFL/AFL history, which facilitates - to the extent that it ever really makes sense to do so - comparisons of teams across eras.
Version 2.0 of MoSSBODS almost certainly won't be the final version, but I think it's at a point now where it deserves to enter the canon and where it's worth exploring what it suggests.
(The name MoSSBODS, by the way, stands for Matter of Stats' Scoring Shot Based Offence-Defence System. Quite apparently, I still need a Marketing arm ...)
MOSSBODS MECHANICS
In MoSSBODS 2.0, as in the original MoSSBODS, each team has two Ratings defined as follows:
- Defensive Rating, which measures how many Scoring Shots the rated team would be expected to concede to an average team at a neutral venue relative to the all-team Scoring Shot average determined for that season.
In the original MoSSBODS I used 25.3 Scoring Shots per game per team, which was the average for all games from the period 2005 to 2014. That average can't sensibly be applied to the entire span of VFL/AFL history because the per-season average has shown so much variability (see chart at right). Instead, for a given season I use the average number of Scoring Shots per game per team from the previous season.
So, under MoSSBODS 2.0, a team Rated +3, for example, would be expected to concede, to an average team at a neutral venue, 3 Scoring Shots fewer than the average number conceded in all games from the preceding season. In 2015, the average was 23.6 Scoring Shots per team per game, so if that team were playing in 2016 it would be expected to concede only 20.6 Scoring Shots.
- Offensive Rating, which measures how many Scoring Shots the rated team would be expected to score against an average team at a neutral venue relative to the all-team Scoring Shot average determined for that season in the same way as described above.
Under MoSSBODS 2.0 then, a team Rated -2 would be expected to score, against an average team at a neutral venue, 2 Scoring Shots fewer than the average number scored in all games from the preceding season.
Note that, as for the original MoSSBODS, higher Offensive and Defensive Ratings are always better than lower Ratings, and 1 Rating Point is equivalent to 1 Scoring Shot.
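To make those conventions concrete, here's a minimal Python sketch (my own illustration, not part of MoSSBODS itself) that turns a Rating into an expected number of Scoring Shots against an average opponent at a neutral venue, using the 2015 all-team average quoted above.

```python
LEAGUE_AVG_SS = 23.6  # all-team Scoring Shots per team per game in 2015

def expected_shots_conceded(defensive_rating, league_avg=LEAGUE_AVG_SS):
    # Higher Defensive Ratings mean fewer Scoring Shots conceded
    return league_avg - defensive_rating

def expected_shots_scored(offensive_rating, league_avg=LEAGUE_AVG_SS):
    # Higher Offensive Ratings mean more Scoring Shots scored
    return league_avg + offensive_rating

print(expected_shots_conceded(+3))  # 20.6, as in the example above
print(expected_shots_scored(-2))    # 21.6
```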
Defensive and Offensive Ratings are updated after each game using the following equations:
- (1) New Defensive Rating = Old Defensive Rating + k x (Actual Defensive Performance - Expected Defensive Performance)
- (2) Actual Defensive Performance = All-Team Scoring Shot Average - Adjusted Opponent's Scoring Shots
- (3) Expected Defensive Performance = Own Defensive Rating - Opponent's Offensive Rating + Venue Adjustment / 2
- (4) Adjusted Opponent's Scoring Shots = min(Scoring Shot Cap, Actual Opponent's Scoring Shots)
- (5) New Offensive Rating = Old Offensive Rating + k x (Actual Offensive Performance - Expected Offensive Performance)
- (6) Actual Offensive Performance = Adjusted Own Scoring Shots - All-Team Scoring Shot Average
- (7) Expected Offensive Performance = Own Offensive Rating - Opponent's Defensive Rating + Venue Adjustment / 2
- (8) Adjusted Own Scoring Shots = min(Scoring Shot Cap, Actual Own Scoring Shots)
- (9) Venue Adjustment = Nett Venue Performance + Nett Travel Penalty
Parameter values aside, the only difference in MoSSBODS 2.0 is in that final equation, which I'll return to in a moment.
But, before that, a couple of other things to note:
- Teams are given initial Offensive and Defensive Ratings of 0 prior to their first game.
- Empirical "optimisation" suggests that no Cap is required on teams’ actual Scoring Shot data. Consequently, in (4) and (8) above, no Cap is imposed.
- Teams carry 70% of their Rating from the end of one season to the start of their next season so, for example, a team finishing a season with a +3 Defensive Rating will start their next season with a +2.1 Defensive Rating.
- Sydney, in its first year, is considered to be a continuation of South Melbourne, the Western Bulldogs a continuation of Footscray, and the Brisbane Lions a continuation of Fitzroy (the Brisbane Bears therefore being assumed to disappear for Ratings purposes at the end of 1996). The Kangaroos and North Melbourne are also treated as the same team regardless of the name used in any particular season.
In equations (2) and (6), the All-Team Scoring Shot Average parameter has been set on the basis described above (ie as the average for the preceding season). In the other equations, acceptable parameter values have been determined using the entire history of VFL/AFL football, broadly seeking to minimise the Mean Absolute Error (MAE) in predicted game margins relative to actual game margins. This is how, for example, the 70% Carryover proportion and the decision to apply no Cap, as described above, were determined.
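For those who find code easier to follow than equations, here's a minimal Python sketch of the post-game update described by equations (1) to (9). It's an illustration under my own naming rather than the actual MoSSBODS implementation: the Venue Adjustment is taken as given (it's covered in the next section), and a Cap argument is included even though, as noted, MoSSBODS 2.0 applies none.

```python
def update_ratings(team, opponent, own_shots, opp_shots, league_avg_ss,
                   venue_adjustment, k, cap=None):
    """One post-game update for the rated team.

    team and opponent are dicts holding 'off' and 'def' Ratings (in Scoring
    Shots); venue_adjustment is the rated team's Nett Venue Performance plus
    Nett Travel Penalty for the game; league_avg_ss is the previous season's
    all-team Scoring Shot average per team per game.
    """
    # (4) and (8): optionally cap actual Scoring Shots (no Cap in MoSSBODS 2.0)
    adj_opp_shots = min(cap, opp_shots) if cap is not None else opp_shots
    adj_own_shots = min(cap, own_shots) if cap is not None else own_shots

    # (2) and (3): actual and expected defensive performance
    actual_def = league_avg_ss - adj_opp_shots
    expected_def = team['def'] - opponent['off'] + venue_adjustment / 2

    # (6) and (7): actual and expected offensive performance
    actual_off = adj_own_shots - league_avg_ss
    expected_off = team['off'] - opponent['def'] + venue_adjustment / 2

    # (1) and (5): nudge Ratings towards the observed performance
    return {'def': team['def'] + k * (actual_def - expected_def),
            'off': team['off'] + k * (actual_off - expected_off)}

def carry_over(rating, proportion=0.70):
    # Teams carry 70% of an end-of-season Rating into the next season
    return proportion * rating
```

The opponent's update uses the same function from its own perspective with, on my reading of the Nett definitions, the Venue Adjustment negated, since the Nett figures are defined as differences between the two teams.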
MAE minimisation was pursued under some informal constraints:
- Optimise to the nearest round number. So, prefer a Carryover of 70% to, say, 68.2%
- Prefer better fits to later eras (say the last 25 years) than better fits to earlier eras, provided this can be achieved without damaging the all-time fit "too much"
- Prefer solutions with fewer parameters and fewer differently-valued parameters for a similar parameter type. So, for example, as we'll see in a moment, some of the k's have the same value.
One particularly important parameter (or set of parameters, as it turns out) in the equations is the k used in equations (1) and (5). It determines the extent to which unexpectedly good or bad performances alter Ratings, with larger values of k leading to larger Rating changes for a given performance.
MoSSBODS 2.0, like the original MoSSBODS, varies k according to the round in which the game is played, as follows:
- k = 0.15 for Rounds in approximately the first one-third of the Home and Away Season
- k = 0.10 for the remainder of rounds in the Home and Away Season, excepting the last 3 rounds
- k = 0.075 for the last 3 rounds of the Home and Away season
- k = 0.10 for Finals
A larger k in early rounds helps Ratings adjust more rapidly to team abilities in a new season, while the smaller k for the final few rounds of the Home and Away season prevents these games from having an unduly large effect on Ratings when their outcomes might not reflect the true underlying abilities of teams already assured of a Finals berth, or already thinking about the following season (or, dare I say it, in more recent seasons, teams "optimising" their draft picks).
Note that, originally, I had planned to use 5 k's, as in the original MoSSBODS, rather than 4, but ended up setting the second and third k's for the Home and Away season to the same value.
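As a quick sketch of the schedule above, here's how the round-dependent k might be coded; the one-third cut-off is an approximation of "approximately the first one-third", and would depend on the length of the Home and Away season in question.

```python
def k_for_round(round_number, home_away_rounds, is_final=False):
    # Round-dependent k, following the schedule above
    if is_final:
        return 0.10
    if round_number <= round(home_away_rounds / 3):
        return 0.15    # early rounds: adjust quickly to a new season's form
    if round_number > home_away_rounds - 3:
        return 0.075   # last 3 Home and Away rounds: dampen end-of-season noise
    return 0.10        # the remainder of the Home and Away season

# For example, a 23-round Home and Away season
print([k_for_round(r, 23) for r in (1, 8, 9, 20, 21, 23)])
# [0.15, 0.15, 0.1, 0.1, 0.075, 0.075]
```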
VENUE ADJUSTMENT
Venue Performance
Probably the biggest difference in MoSSBODS 2.0 is its treatment of Game Venue. For a while now I've grappled with how best to handle home team status, especially in games where the designated "home" team seems odd - such as the time when Geelong played a "home" game against the Brisbane Lions at the Gabba, and when Richmond played a "home" game against the Gold Coast at Gold Coast Stadium.
The problem mostly arises because of the tradition of treating Home Ground Advantage as a binary thing, enjoyed by - or, more correctly, assessed for - only one of the two teams in the contest. MoSSBODS 2.0 ignores this tradition and instead bases its assessment of how beneficial or otherwise a particular venue is to a given team's chances by looking at its historical performance at that venue, regardless of whether it was playing as the designated "home" or "away" team. This treatment does, I know, mean that I'm ignoring the fact that the members of the AFL-designated "home" team receive preferential ticketing treatment and so might attend in larger numbers than the members of the designated "away" team. Implicitly, I'm assuming that the effect of this is small enough not to matter a lot.
Specifically, MoSSBODS proceeds as follows:
- Until a team has played some minimal number of games at a venue, set its Venue Performance value to zero
- Once a team has played the minimal number of games at a venue, calculate for those games the average difference between its Scoring Shot performance (Own Scoring Shots less Opponent Scoring Shots) and that predicted based on its own and its opponents' Ratings at the time (ie Own Expected Offensive Performance + Own Expected Defensive Performance - Opponent Expected Offensive Performance - Opponent Expected Defensive Performance). Set the Venue Performance equal to this average.
- As the team plays subsequent games at that same venue, update its Venue Performance using an exponential smoothing method, that is:
New Venue Performance = Alpha x [(Own Scoring Shots - Opponent Scoring Shots) - (Own Expected Offensive Performance + Own Expected Defensive Performance - Opponent Expected Offensive Performance - Opponent Expected Defensive Performance)] + (1 - Alpha) x Old Venue Performance.
Small values of Alpha prevent a team's Venue Performance figure from reacting too much to the outcome of a single game, while larger values have the opposite effect. It turns out that dampening the effects of recent games quite a bit is a good strategy - in fact, a value of 1% works well.
We also need to set a threshold for the minimum number of games required before we establish an initial Venue Performance value. MoSSBODS 2.0 uses 30 games for this purpose. All teams have played fewer than 30 games at most venues and so still have Venue Performance values of zero for these grounds.
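Here is a small sketch of that bookkeeping, assuming (as per the second bullet point and the smoothing equation above) that the quantity being tracked is the gap between the actual and the Ratings-based expected Scoring Shot difference; the class and names are my own rather than anything from the MoSSBODS code.

```python
MIN_GAMES = 30   # games at a venue before a Venue Performance is first assessed
ALPHA = 0.01     # smoothing parameter; small, so a single game moves it little

class VenuePerformanceTracker:
    """One team's Venue Performance at one venue."""

    def __init__(self):
        self.surprises = []   # per-game actual-less-expected Scoring Shot differences
        self.value = 0.0      # the Venue Performance used in the Venue Adjustment

    def record_game(self, actual_ss_diff, expected_ss_diff):
        surprise = actual_ss_diff - expected_ss_diff
        if len(self.surprises) < MIN_GAMES:
            self.surprises.append(surprise)
            if len(self.surprises) == MIN_GAMES:
                # initial value: the average surprise over the first 30 games
                self.value = sum(self.surprises) / MIN_GAMES
        else:
            # subsequent games: exponential smoothing of the per-game surprise
            self.value = ALPHA * surprise + (1 - ALPHA) * self.value
        return self.value
```

With Alpha at 1%, an established Venue Performance moves by only 1% of any single game's surprise, which is the heavy dampening described above.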
One venue though where almost all teams have played at least 30 games is the MCG, so the chart of Venue Performance against time for this ground provides a useful illustration of how Venue Performance figures change over time.
For each team you can see that Venue Performance is zero for the first 30 games before moving (leaping in some cases) to an assessment based on the relative performances in those first 30 games and then smoothly tracking up and down on the basis of subsequent performances. At the end of the 2015 season, most teams have MCG Venue Performance values in the -0.5 to +0.5 range (only Gold Coast and GWS have yet to play 30 games at the MCG and so still have their initial zero value).
The only teams with MCG Venue Performance Values outside that range are:
- Adelaide +1.4
- Geelong +0.7
- Port Adelaide -0.6
- Western Bulldogs -0.8
- St Kilda -1.2
- Fremantle -2.1
(Note that these figures exclude the Travel Penalty, which I'll describe in a moment, and which ultimately reduces the advantage that Adelaide enjoys, and exacerbates the detriment that Port Adelaide and Fremantle suffer, by 3 Scoring Shots - except where they are playing at the MCG against another interstate team, say in a Final.)
For any game, we can determine a Nett Venue Performance for either of the teams as the difference between its own Venue Performance value for that Venue and the Venue Performance value of its opponent for that same Venue.
Travel Penalty
Almost every analysis of game margins that I've ever performed has identified another aspect of a game's venue that has been predictive of that game's outcome: whether or not the teams have travelled significantly different distances to play at the venue.
MoSSBODS 2.0 also incorporates this dimension and imposes a flat 3 Scoring Shot penalty on any team that is deemed to have travelled out of its "region". MoSSBODS 2.0 allocates venues to one of ten regions:
- Sydney
- Canberra
- Greater Melbourne (which includes Geelong)
- Regional NSW and Victoria (required for some of the games from the early parts of football history played in Albury, Euroa and Yallourn)
- Tasmania
- Brisbane and the Gold Coast
- Adelaide
- West Australia
- Northern Territory
- New Zealand
All teams are then mapped to one of these regions (though, in the case of Sydney and the Brisbane Lions this mapping changes with the switch from South Melbourne and Fitzroy, respectively). The Travel Penalty is imposed whenever a team plays a game at a venue from outside its own region.
The Nett Travel Penalty for either team is calculated as the difference between its own Travel Penalty and that of its opponent. Note that this nett figure can be zero because neither team is required to travel out-of-region, or because both are (eg for a game played in Tasmania).
I was originally sceptical that a flat Travel Penalty was a reasonable way to handle this dimension, but an analysis of the MAE by Venue for the period 1982 to 2015, comparing Victorian and non-Victorian venues where at least 30 games had been played in that time, suggests that, as a first-order approximation at least, a flat 3 Scoring Shot penalty performs adequately. See the chart at right for the evidence.
Still, if there ever is a MoSSBODS 2.1, I expect that this parameter will be one that is subject to review.
Returning now to the originally stated equations, the total Venue Adjustment for a game in equation (9) is simply the sum of the Nett Venue Performance figures and the Nett Travel Penalty figures as just defined.
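As a sketch of how those pieces combine: the region tables below are deliberately truncated, the sign convention (a Travel Penalty expressed as a negative number of Scoring Shots) is my reading of the definitions, and Brisbane's +0.1 Gabba Venue Performance is assumed purely so the output matches the example in the next paragraph.

```python
TRAVEL_PENALTY = -3.0  # effect, in Scoring Shots, of playing out-of-region

# Deliberately truncated mappings; the full versions cover every team and venue
VENUE_REGION = {'MCG': 'Greater Melbourne', 'Gabba': 'Brisbane and the Gold Coast',
                'Adelaide Oval': 'Adelaide', 'Bellerive Oval': 'Tasmania'}
TEAM_REGION = {'Adelaide': 'Adelaide', 'Brisbane Lions': 'Brisbane and the Gold Coast',
               'Hawthorn': 'Greater Melbourne', 'Fremantle': 'West Australia'}

def travel_penalty(team, venue):
    # A team is penalised whenever the venue lies outside its own region
    return TRAVEL_PENALTY if TEAM_REGION[team] != VENUE_REGION[venue] else 0.0

def venue_adjustment(team, opponent, venue, venue_perf):
    """Nett Venue Performance plus Nett Travel Penalty for team (equation (9)).

    venue_perf maps (team, venue) pairs to Venue Performance values; anything
    missing is treated as zero (fewer than 30 games played at that venue).
    """
    nett_vp = (venue_perf.get((team, venue), 0.0)
               - venue_perf.get((opponent, venue), 0.0))
    nett_travel = travel_penalty(team, venue) - travel_penalty(opponent, venue)
    return nett_vp + nett_travel

# Adelaide visiting the Lions at the Gabba, with an assumed +0.1 for Brisbane
vp = {('Brisbane Lions', 'Gabba'): 0.1}
print(venue_adjustment('Brisbane Lions', 'Adelaide', 'Gabba', vp))  # 3.1
```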
Combining the Venue Performance and Travel Penalty values for relevant combinations of current teams and Venues, and expressing the results relative to a reference team (since it's relative rather than absolute values that ultimately affect team Ratings), we come up with the following table, which we can think of as a quasi-HGA table where the reference team is considered to be the "home" team.
So, for example, if Adelaide faced the Lions at the Gabba, Brisbane would enjoy a nett Venue Adjustment of +3.1 Scoring Shots. The same figure would apply to any other team except the Gold Coast, no visiting team having yet played the minimum 30 games at the Gabba to have had a Venue Performance figure assessed there, and all but the Gold Coast being levied with the 3 Scoring Shot Travel Penalty.
Since the reference team has been chosen as the most-likely genuine home team for those venues where this is possible, most of the values shown in the table are negative, reflecting a nett positive quasi-HGA as we'd expect.
There are though a few interesting exceptions, most notably Carlton and Collingwood at Kardinia Park where they enjoy a small advantage over the Cats.
Fremantle's significantly negative value for the MCG is also interesting, its 5.8 Scoring Shot penalty relative to the reference team Cats comprising a negative 2.8 Scoring Shot Venue Performance figure and a negative 3 Scoring Shot Travel Penalty.
PERFORMANCE METRICS
The fundamental unit of MoSSBODS 2.0 is the Scoring Shot, so it seems logical to firstly assess the System's performance by calculating its Mean Absolute Error (MAE) by season (ie the average difference between its predicted game margin in Scoring Shots and the actual game margin in Scoring Shots).
What we find is that MoSSBODS 2.0 consistently produces season MAEs in the 6 to 8 Scoring Shot range, with the smallest MAEs in the earlier parts of the competition's history, and the largest MAEs in the mid to late 1980s. The largest single season MAE was 8.78 Scoring Shots per game in 1987, and the smallest MAE 5.15 Scoring Shots per game in 1913.
The trend in Scoring Shot MAEs since the late 1980s is encouragingly downwards. This is shortly after the competition's expansion outside Victoria, which commenced in 1982 with the Swans' move to Sydney and continued in 1987 with the addition of West Coast and the Brisbane Bears. It was with the introduction of these teams that the Travel Penalty first became relevant, so the downward trend in MAEs since around that time provides further evidence for the efficacy of the flat 3 Scoring Shot Travel Penalty.
A more familiar metric than Scoring Shot MAE is Margin MAE, but to use this we need to convert Scoring Shots to Points. In the original MoSSBODS we simply assumed a 53% Conversion Rate for this purpose, which meant that 1 Scoring Shot was equivalent to 3.65 points. It's not appropriate to use this same estimated Conversion Rate for the entirety of VFL/AFL history, however, because, like Scoring Shots per team per game, actual Conversion Rates have varied quite significantly across time.
In the earliest years, Conversion Rates of around 40% were typical, whereas in more recent seasons Rates of around 50 to 55% have been more typical. To cater for this temporal variability, we use an approach similar to the one used for estimating the expected Scoring Shots per team per game in a season, converting Scoring Shots to Points using the average Conversion Rate for all games in the immediately preceding season.
That means that MoSSBODS 2.0 Ratings in 2016 will be converted to Points at a rate of 1 Rating Point (which is 1 Scoring Shot) equals 3.7 Points. This is very similar to the figure used in 2015 (3.6 Points) and in other recent seasons, because average Conversion Rates have been in the 52.5% to 54% range for all of the last 10 seasons.
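Because a converted Scoring Shot is worth 6 points and an unconverted one 1 point, the Points value of a Rating Point follows directly from the Conversion Rate. The small calculation below reproduces the figures quoted above (the 54% used for 2016 is implied by the 3.7 Points figure rather than quoted directly).

```python
def points_per_scoring_shot(conversion_rate):
    # A goal is worth 6 points, a behind 1 point
    return 6 * conversion_rate + 1 * (1 - conversion_rate)

print(round(points_per_scoring_shot(0.53), 2))  # 3.65 - the original MoSSBODS figure
print(round(points_per_scoring_shot(0.54), 2))  # 3.7  - the rate implied for 2016
```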
Adopting this methodology for converting Scoring Shots into Points allows us to calculate points-based MAEs for every season, the results of which are charted at left. (BTW for the 1897 season I simply assumed 13 Scoring Shots per team per game for Ratings purposes and a 39% Conversion Rate).
Not surprisingly, the general shape of this chart matches that of the chart for Scoring Shot based MAEs. The correlation between the two time series is +0.84. The lowest MAE comes in 1922 (17.5 points per game), and the highest in 1986 (35.4 points per game).
Another, cruder measure of performance is model accuracy - whether the model does or does not select the winning team - which we can calculate for MoSSBODS 2.0 by determining whether the team it selected to win either:
- Registered more Scoring Shots, or
- Scored more Points
The chart at right shows MoSSBODS' accuracy in every season, treating draws as half wins. The model's accuracy is slightly better when assessed using the final scores (all-time average 69%) than when assessed using Scoring Shots (67%). Score-based accuracy exceeds Scoring-Shot based accuracy in about 70% of seasons.
Looking solely at the 10 most recent seasons, Score-based model accuracy comes in at 69% and Scoring-Shot based accuracy at 68%.
Across time, Score-based accuracy has been more variable than Scoring-Shot based accuracy, the standard deviation of Score-based accuracy levels being about 5.5% points and of Scoring-Shot based accuracy about 5.2% points.
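For completeness, here's a minimal sketch of the accuracy calculation just described, with draws counted as half wins; the (predicted, actual) margin pairs are an illustrative data structure of my own.

```python
def season_accuracy(games):
    """games: a list of (predicted_margin, actual_margin) pairs, both expressed
    from the same team's perspective, in either Points or Scoring Shots."""
    score = 0.0
    for predicted, actual in games:
        if actual == 0:                        # a draw counts as half a win
            score += 0.5
        elif (predicted > 0) == (actual > 0):  # the tipped team won
            score += 1.0
    return score / len(games)

# Three games: a correct tip, a draw, and an incorrect tip
print(season_accuracy([(10, 24), (5, 0), (-3, 8)]))  # 0.5
```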
TEAM RATING HISTORY
I'll finish today's blog by charting all teams' MoSSBODS 2.0 Ratings for two different time spans. Note that, as now defined, MoSSBODS 2.0 Defensive and Offensive Ratings, combined, always sum to zero across the teams currently active.
The first chart tracks every team's Offensive and Defensive Ratings across the entire expanse of VFL/AFL history, the jagged lines tracking the game-by-game Ratings and the smoothed lines showing loess fits to the underlying Ratings (with a span of 0.5 and degree of 2, for those of a technical bent).
MoSSBODS 2.0, like the original MoSSBODS, weights equally a team's Offensive and Defensive Ratings, so an Overall Rating would simply sum the red and blue lines in the chart.
One macro-feature that can be gleaned from this chart is the overall Defensive/Offensive balance of teams across expanses of time. The Lions, for example, have been generally stronger in Offence than in Defence for their entire history, barring only the last few years. Hawthorn, throughout their history, have tended to display long periods of superior Offence followed by long periods of superior Defence. The current team is in a period of superior Offence.
The chart also depicts the Swans' Defensive dominance since about the turn of the century, the consistency of the Pies' abilities across the entirety of football history, and the lock-step nature of the rise and fall of the Saints' Offensive and Defensive abilities - until the most-recent decade.
Finally, let's drill down on the period from 2005 to the present.
Some of the interesting features of this chart are that, overall:
- Adelaide has been about or above-average since about the start of 2012
- The Brisbane Lions have been below-average since about the start of 2010
- Geelong has been above-average for virtually the entire period from 2005 until early in 2015
- The Hawks have been, overall, an above-average team since about the middle of 2010
- The Kangaroos and Fremantle have been about-average for the entire period
- Sydney has been above-average for almost the entirety of the period barring a longish stretch in 2009 and a shorter one in early 2014.
What's not readily discernible from the MoSSBODS 2.0 data presented in this way is the fact that West Coast are assessed by MoSSBODS 2.0 (and, indeed, by the original MoSSBODS) as having finished the 2015 season Rated higher overall than Hawthorn. The gap between West Coast and Hawthorn is much smaller under MoSSBODS 2.0 than under the original MoSSBODS, however. As well, at their respective 2015 MoSSBODS 2.0 peak Ratings, the Hawks were almost a full Scoring Shot better team than the Eagles.
Even more controversially perhaps, MoSSBODS 2.0 has Richmond as the third-highest Rated team, solely on the basis of its assessment of their Defensive abilities. The original MoSSBODS had the Tigers ranked only 8th. It's important to remember though that these are all point-in-time estimates, and to acknowledge that a better measure of a team's true ability might best be determined by reviewing Ratings across some span of time.
Overall though, MoSSBODS 2.0 and the original MoSSBODS are in broad agreement at the end of the 2015 season anyway, with the correlations between their various Ratings and Rankings all above +0.96.
There's obviously a lot more we can do with MoSSBODS 2.0, for example looking at:
- the highest- and lowest-Rated teams of all time (and how this assessment compares to a similar one we did using MARS Ratings a few years ago)
- the relative importance of Offence versus Defence across time, especially in terms of Grand Final success and appearances
- the biggest upsets of all time (and how this compares to another analysis we did for Finals using MARS, also a few years ago)
As well, if there are any analyses you'd be keen to see run, or if there are any comments about MoSSBODS 2.0 that you'd like to pass on, please leave a comment on this blog or e-mail me on the address shown in the navigation bar.