Bringing the MoS Twins into the 2020s

February 11, 2020 Tony Corke

The MoS twins - MoSSBODS and MoSHBODS - have both performed well these past few seasons, but there are some aspects of them that I’ve wanted to improve for a while, and I’ve decided that 2020 is the time to do that.

Rather than simply link you to a number of historical posts and describe only the new elements, I thought it would be more useful to recap the philosophies and fundamentals of MoSSBODS and MoSHBODS here, and to weave discussion of the changes through the exposition.

THE MOSSBODS ALGORITHM

MoSSBODS is an Elo-style team rating system that uses teams’ Scoring Shots. It is designed to produce an offensive and a defensive rating for each team, where a rating of zero means that a team is expected to generate (if we’re considering offensive rating) or concede (if we’re considering defensive rating) an “average” number of Scoring Shots when facing a team rated 0 on both offence and defence at a neutral venue.

In other words, a team rated 0 on offence and on defence is an average team when playing on a neutral venue in the context of the current season and the mix of abilities of the teams.

Underpinning MoSSBODS is a set of equations that are used to update teams’ offensive and defensive ratings based on the most recent results:

(1) New Defensive Rating = Old Defensive Rating + k x (Actual Defensive Performance – Expected Defensive Performance)
(2) Actual Defensive Performance = Expected Scoring Shots for Average Team – Adjusted Opponent’s Scoring Shots
(3) Expected Defensive Performance = Own Defensive Rating – Opponent’s Offensive Rating + Venue Adjustment / 2
(4) Adjusted Opponent’s Scoring Shots = min(Scoring Shot Cap, Actual Opponent’s Scoring Shots)
(5) New Offensive Rating = Old Offensive Rating + k x (Actual Offensive Performance – Expected Offensive Performance)
(6) Actual Offensive Performance = Adjusted Own Scoring Shots – Expected Scoring Shots for Average Team
(7) Expected Offensive Performance = Own Offensive Rating – Opponent’s Defensive Rating + Venue Adjustment / 2
(8) Adjusted Own Scoring Shots = min(Scoring Shot Cap, Actual Own Scoring Shots)
(9) Venue Adjustment = Nett VPV Difference after adjustments
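
To make the update equations above a little more concrete, here is a minimal Python sketch of a single-game update for one team. It is an illustration only, not a port of the R scripts: the names TeamRating and update_rating are hypothetical, the example numbers are invented, and it assumes that the Venue Adjustment is the own team's nett VPV advantage, so that the opponent's update uses the same value with the sign reversed.

```python
from dataclasses import dataclass


@dataclass
class TeamRating:
    # Both ratings are in Scoring Shots relative to an average team.
    offence: float = 0.0
    defence: float = 0.0


def update_rating(own, opp, own_shots, opp_shots, expected_shots, k,
                  venue_adjustment=0.0, shot_cap=None):
    """Return the own team's updated rating after one game.

    venue_adjustment is assumed to be the own team's nett VPV advantage;
    the opponent's update would use the negative of this value.
    """
    def capped(shots):
        # With shot_cap=None no Cap is imposed.
        return shots if shot_cap is None else min(shot_cap, shots)

    actual_off = capped(own_shots) - expected_shots
    expected_off = own.offence - opp.defence + venue_adjustment / 2
    actual_def = expected_shots - capped(opp_shots)
    expected_def = own.defence - opp.offence + venue_adjustment / 2
    return TeamRating(
        offence=own.offence + k * (actual_off - expected_off),
        defence=own.defence + k * (actual_def - expected_def),
    )


# Two average teams at a neutral venue; invented numbers for illustration.
home, away = TeamRating(), TeamRating()
new_home = update_rating(home, away, own_shots=30, opp_shots=20,
                         expected_shots=25, k=0.1)
new_away = update_rating(away, home, own_shots=20, opp_shots=30,
                         expected_shots=25, k=0.1)
```

In this example the team that generated 30 Scoring Shots gains half a Scoring Shot on both offence and defence, and its opponent loses the same amounts, so the ratings across the two teams still sum to zero.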

There are a number of parameters in those equations, the optimal values for which I've estimated differently this year. Specifically, I’ve chosen parameter values that minimise the Mean Absolute Error (MAE) in the predicted game margins for 50% of randomly selected games from the 1990 to 2019 period, where games from 1990 to 2009 were only half as likely to be chosen in the random sample as games from 2010 to 2014, and only one-quarter as likely to be chosen as games from 2015 to 2019. This approach is designed to favour predictive ability in recent seasons, but ensure that a number of games are held out from each season for performance assessment purposes, if required.
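
The post does not say exactly how the weighted 50% sample was drawn, so the following Python sketch shows just one way it could be done. It assumes relative weights of 1, 2 and 4 for the three eras and uses weighted sampling without replacement via random keys (the Efraimidis-Spirakis method), which is a swapped-in technique rather than the author's stated one. The names era_weight and sample_training_games, and the shape of the game records, are hypothetical.

```python
import random


def era_weight(season):
    # Relative selection weights implied by the text: 1990-2009 games are
    # half as likely as 2010-2014 games and a quarter as likely as 2015-2019.
    if season >= 2015:
        return 4.0
    if season >= 2010:
        return 2.0
    return 1.0


def sample_training_games(games, fraction=0.5, seed=0):
    """Pick a weighted random subset of games, without replacement."""
    rng = random.Random(seed)
    n_keep = round(len(games) * fraction)
    # Each game gets the key u ** (1 / weight); the largest keys are kept,
    # so games with larger weights are more likely to be selected.
    keyed = [(rng.random() ** (1.0 / era_weight(g["season"])), i)
             for i, g in enumerate(games)]
    keyed.sort(reverse=True)
    keep = sorted(i for _, i in keyed[:n_keep])
    return [games[i] for i in keep]


# Invented data: 100 placeholder games per season from 1990 to 2019.
games = [{"season": s, "game": g} for s in range(1990, 2020) for g in range(100)]
training = sample_training_games(games)
```

The games not selected form the held-out set. Note that with half of all games being chosen, the realised selection rates will not be in an exact 1:2:4 ratio; the weights only tilt the sample towards recent seasons.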

OPTIMISED MoSSBODS PARAMETERS

All teams start in Round 1 of season 1897 with Offensive and Defensive Ratings of 0. Teams that started in later seasons start with offensive and defensive ratings of 0 for their first game.
The value of k in (1) and (5) above varies according to the round in which the game is played, as follows:
- k = 0.121 for Rounds 1 to 6 (in 2020 - in general, this first split is meant to take in about 30% of the home and away rounds, the second and third splits another 30% each, and the fourth split the remaining 10%)
- k = 0.090 for Rounds 7 to 13
- k = 0.096 for Rounds 14 to 20
- k = 0.073 for Rounds 21 to the end of the home and away season
- k = 0.079 for Finals
Having relatively larger k values in early rounds allows the ratings to adjust more rapidly to the true abilities of teams in the new season.
It’s also interesting to note that k is smallest for the last few rounds of the home and away season, which hints at the fact that teams’ real abilities might not as often be on show in these rounds. Make of that what you will.
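
The round-based choice of k can be sketched as a simple lookup. The version below is hypothetical: it assumes that the 30/30/30/10 split is found by rounding down 30%, 60% and 90% of the number of home and away rounds, which reproduces the 2020 splits quoted above if the season has 23 rounds, but the post does not state the rounding rule or the round count. The k values are the MoSS2020 ones listed above.

```python
MOSS2020_K = (0.121, 0.090, 0.096, 0.073)  # one value per split
MOSS2020_K_FINALS = 0.079


def split_boundaries(total_rounds):
    """Last round of the first three splits (assumed: round down)."""
    return ((3 * total_rounds) // 10,
            (6 * total_rounds) // 10,
            (9 * total_rounds) // 10)


def k_for_game(round_number, total_rounds, is_final=False):
    """Return the k to use for a game in the given round."""
    if is_final:
        return MOSS2020_K_FINALS
    # zip stops after the three boundaries; later rounds fall through
    # to the fourth split.
    for k, last_round in zip(MOSS2020_K, split_boundaries(total_rounds)):
        if round_number <= last_round:
            return k
    return MOSS2020_K[-1]


print(split_boundaries(23))
print(k_for_game(14, 23))
```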
For MoSS2020 we include a new parameter, called Days to Average, which determines the window across which a number of key values are calculated. One of those is the value for Expected Scoring Shots, which is used in (2) and (6) above and is calculated using the actual average Scoring Shots per team per game across the previous 6 years. The previous MoSSBODS - let’s call it MoSSBODS 2019 - used the average only from the previous season, and made no updates during the course of the season. It’s hoped that this new approach will respond, somewhat, to any significant short-term changes in Scoring Shot production, without being overly sensitive to them.
Venue Performance Values (VPVs) are calculated for each team pre-game based on their performance relative to expectations (ignoring venue effects) at that same venue across the previous 6 years, but only once a minimum of 11 games have been played at that venue by that team in that window. The average under- or over-performance at the venue is regularised (ie dragged closer to zero) by the use of a multiplier with a value in the (0,1) range called Mean Regularisation. The optimal value has been calculated to be 0.33. In other words, if a team’s average final margin at the venue and ignoring venue effects has been 9 Scoring Shots above expectation, its VPV for that venue will be about +3 Scoring Shots (9 x 0.33).
If a team has not played 11 or more games at a venue in the 6-year window, then, if it is playing in its home state, it will have an assumed VPV of 0, but, if it is playing out of state, it will have an assumed VPV of -3.5 Scoring Shots.
These VPVs are subject to a couple of adjustments.
- If a team is playing a home and away game as the designated home team, it gets a Home Status Bonus of 0.75 Scoring Shots
- During Finals, VPVs are multiplied by a Final VPV Fraction, the optimal value of which turns out to be 1.5 (which implies that venue effects are exacerbated in Finals)
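
The VPV rules above can be summarised in a short Python sketch using the MoSS2020 values. The function names are hypothetical, the caller is assumed to pass only the results from the 6-year window, and treating the Home Status Bonus as something added directly to the team's VPV is an assumption about how the adjustment is applied.

```python
def venue_performance_value(excess_margins, in_home_state, min_games=11,
                            mean_regularisation=0.33,
                            out_of_state_default=-3.5):
    """VPV in Scoring Shots for one team at one venue.

    excess_margins holds the team's margins relative to expectation
    (ignoring venue effects) at this venue within the 6-year window.
    """
    if len(excess_margins) >= min_games:
        average = sum(excess_margins) / len(excess_margins)
        return mean_regularisation * average  # dragged towards zero
    # Too few games at the venue: fall back on the assumed values.
    return 0.0 if in_home_state else out_of_state_default


def adjusted_vpv(vpv, is_final, is_designated_home,
                 home_status_bonus=0.75, final_vpv_fraction=1.5):
    """Apply the Finals multiplier or the home and away Home Status Bonus."""
    if is_final:
        return vpv * final_vpv_fraction
    return vpv + (home_status_bonus if is_designated_home else 0.0)


# A team averaging 9 Scoring Shots above expectation over 11 games.
print(venue_performance_value([9.0] * 11, in_home_state=True))
```

The Venue Adjustment used in the rating updates would then be the difference between the two teams' adjusted VPVs.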

Optimisation suggests that no Cap is required on teams’ actual Scoring Shot data. Consequently, in (4) and (8) above, no Cap is imposed.
Teams carry 66% of their Rating from the end of one season to the start of their next season. So, for example, a team finishing a season with a +3 Defensive Rating will start their next season with a Defensive Rating of about +2.
Sydney, in its first year, is considered to be a continuation of South Melbourne, the Western Bulldogs a continuation of Footscray, and the Brisbane Lions, Fitzroy and the Brisbane Bears are treated as three separate teams (this is also different from MoSSBODS’ previous approach, which was to treat Brisbane Lions as a continuation of Fitzroy). The Kangaroos and North Melbourne are also treated as the same team regardless of the name used in any particular season.
The sum of all active teams’ offensive and defensive ratings during the course of a season will be zero. Where a team drops out of the competition, temporarily or permanently, a fixed adjustment is made to all of the offensive and defensive ratings of the remaining teams at the start of the season to ensure that the sum again becomes zero.
For those teams that missed entire seasons - for example, Geelong in 1916, 1942 and 1943 - they re-enter the competition with the same ratings as they had when they exited (adjusted for the season-to-season carryover).
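
The season-to-season carryover and the re-zeroing of ratings might look something like the Python sketch below. It is hypothetical in several respects: the function name is invented, the "fixed adjustment" is assumed to be a single constant subtracted from every offensive and defensive rating so that the combined total is zero, and teams sitting out a season are simply left out of the calculation.

```python
def start_of_season_ratings(end_ratings, active_teams, carryover=0.66):
    """Map team -> (offence, defence) for the start of the new season.

    end_ratings maps team -> (offence, defence) at the end of the
    previous season; only teams in active_teams are carried forward.
    """
    carried = {team: (carryover * off, carryover * dfc)
               for team, (off, dfc) in end_ratings.items()
               if team in active_teams}
    if not carried:
        return {}
    # Assumed form of the fixed adjustment: one constant applied to
    # every rating so that all ratings together sum to zero.
    total = sum(off + dfc for off, dfc in carried.values())
    shift = total / (2 * len(carried))
    return {team: (off - shift, dfc - shift)
            for team, (off, dfc) in carried.items()}


# Invented ratings: team C drops out, so A and B are re-centred.
previous = {"A": (3.0, 3.0), "B": (-1.0, -1.0), "C": (-2.0, -2.0)}
print(start_of_season_ratings(previous, active_teams={"A", "B"}))
```

When every team continues and the ratings already sum to zero, the shift is zero and each rating is simply multiplied by the carryover.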

EXPECTED TEAM SCORES AND BIAS CORRECTION

Once we have the pre-game offensive and defensive ratings for both teams, and their VPVs for the venue, we can calculate the expected number of scoring shots generated and conceded by a team by calculating:

Expected Scoring Shots Generated = Expected Scoring Shots for Average Team + Own Offensive Rating - Opponent Defensive Rating + (Own VPV - Opponent VPV)/2

Expected Scoring Shots Conceded = Expected Scoring Shots for Average Team + Opponent Offensive Rating - Own Defensive Rating + (Opponent VPV - Own VPV)/2

(note that we split the VPV difference 50:50 across offence and defence)

These Scoring Shot calculations are converted to Scores by multiplying them by the average Score per Scoring Shot across the past 6 years.

The new MoSSBODS makes one final adjustment in coming up with estimated team scores, and that is to make a bias adjustment. For this purpose, an average bias over the past six years is calculated (ie an average of actual less expected scores), separately for all designated home teams and for all designated away teams, and including only home and away games.

That average, all-team bias is then added to the expected scores generated earlier if the game is a home and away contest. No adjustment is made if it is a Final.

So, in summary,

Adjusted Expected Score = (Expected Scoring Shots x 6-year Average All-Team Score per Scoring Shot) + 6-year Average All-Team Bias

The average adjustment to expected scores across the period from 1990 to 2019 is about +0.08 points per game (ppg), with Home teams, on average, getting about a 0.06 ppg boost, and Away teams about a 0.11 ppg boost. The net effect on predicted game margins is therefore only about 0.05 ppg, in favour of Away teams.

The expected total score for a game is now just the sum of the Adjusted Expected Scores for the two teams.
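
As a worked illustration of the steps above, the following Python sketch turns ratings and VPVs into an Adjusted Expected Score for one team. The function names and all of the input numbers are invented for the example; only the formulas come from the text.

```python
def expected_scoring_shots(avg_shots, own_off, opp_def, own_vpv, opp_vpv):
    """Expected Scoring Shots Generated, with the VPV difference halved."""
    return avg_shots + own_off - opp_def + (own_vpv - opp_vpv) / 2


def adjusted_expected_score(expected_shots, avg_score_per_shot, avg_bias,
                            is_final=False):
    """Convert Scoring Shots to a score; add the bias except in Finals."""
    score = expected_shots * avg_score_per_shot
    return score if is_final else score + avg_bias


# Invented inputs: a slightly above-average attack at a favourable venue.
shots = expected_scoring_shots(avg_shots=25.0, own_off=2.0, opp_def=1.0,
                               own_vpv=1.5, opp_vpv=-0.5)
score = adjusted_expected_score(shots, avg_score_per_shot=3.6, avg_bias=0.06)
```

Here the team is expected to generate 27 Scoring Shots, which converts to an Adjusted Expected Score of about 97.3 points for a home and away game.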

MAJOR MoSSBODS DIFFERENCES

The major differences in philosophy between MoSS2020 and MoSSBODS 2019 are:

The use of a six year window to calculate Expected Scoring Shots, VPVs, and mean bias
Optimising over the 1990 to 2019 period, with an emphasis on the most recent five seasons
Including a VPV adjustment for Finals
Resetting the all-team aggregate rating to zero at the start of every season
Bias-adjusting predicted team scores

MoSS2020 VERSUS MOSSBODS 2019

All-Time - Home and Away

MoSS2020 has a lower MAE for Game Margins in 75 of 123 seasons (61%)
MoSS2020 has a lower MAE for Game Totals in 76 of 123 seasons (62%)
MoSS2020 has a lower MAE for Home Team Scores in 50 of 123 seasons (41%)
MoSS2020 has a lower MAE for Away Team Scores in 81 of 123 seasons (66%)

All-Time - Finals

MoSS2020 has a lower MAE for Game Margins in 58 of 123 seasons (47%)
MoSS2020 has a lower MAE for Game Totals in 64 of 123 seasons (52%)
MoSS2020 has a lower MAE for Home Team Scores in 55 of 123 seasons (45%)
MoSS2020 has a lower MAE for Away Team Scores in 52 of 123 seasons (42%)

1990 to 2019 - Home and Away

MoSS2020 has a lower MAE for Game Margins in 16 of 30 seasons (53%)
MoSS2020 has a lower MAE for Game Totals in 20 of 30 seasons (67%)
MoSS2020 has a lower MAE for Home Team Scores in 11 of 30 seasons (37%)
MoSS2020 has a lower MAE for Away Team Scores in 18 of 30 seasons (60%)

1990 to 2019 - Finals

MoSS2020 has a lower MAE for Game Margins in 11 of 30 seasons (37%)
MoSS2020 has a lower MAE for Game Totals in 15 of 30 seasons (50%)
MoSS2020 has a lower MAE for Home Team Scores in 9 of 30 seasons (30%)
MoSS2020 has a lower MAE for Away Team Scores in 12 of 30 seasons (40%)

In summary:

MoSS2020 is generally superior to MoSSBODS 2019 in the home and away season (except in forecasting Home Team scores)
MoSS2020 is mostly weaker than MoSSBODS 2019 in Finals. MoSS2020’s MAE for Finals from 1990 to 2019 is 30.2 ppg, while MoSSBODS 2019’s is 29.6 ppg.

If we convert margin forecasts to probability predictions for each of the last 10 seasons (more about this in a future post), MoSS2020 is superior in terms of log probability score in 6 of the 10 home and away seasons, and in 6 of the 10 Finals series.

MoSS2020’s MAE for the 2019 home and away season would have been 27.15 ppg, and for the Finals 31.89 ppg.
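
For anyone wanting to reproduce season-by-season comparisons of this kind, the tallies can be built along the lines of the Python sketch below. It is a generic illustration with hypothetical names and invented data, it assumes each record is a (season, predicted, actual) triple, and it counts a tie as not being a lower MAE.

```python
from collections import defaultdict


def mae_by_season(records):
    """Mean Absolute Error per season from (season, predicted, actual)."""
    errors = defaultdict(list)
    for season, predicted, actual in records:
        errors[season].append(abs(predicted - actual))
    return {season: sum(errs) / len(errs) for season, errs in errors.items()}


def seasons_with_lower_mae(records_a, records_b):
    """Count the seasons in which model A has a strictly lower MAE than B."""
    mae_a, mae_b = mae_by_season(records_a), mae_by_season(records_b)
    common = sorted(set(mae_a) & set(mae_b))
    wins = sum(1 for season in common if mae_a[season] < mae_b[season])
    return wins, len(common)


# Invented margin forecasts for two models over two seasons.
model_a = [(2018, 10, 12), (2018, -5, 0), (2019, 20, 0)]
model_b = [(2018, 20, 12), (2018, -5, 0), (2019, 10, 0)]
wins, seasons = seasons_with_lower_mae(model_a, model_b)
print(f"Model A has a lower MAE in {wins} of {seasons} seasons")
```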

THE MOSHBODS ALGORITHM

MoSHBODS is also an Elo-style team rating system that produces an offensive and a defensive rating for each team, but it has been designed so that these ratings are measured in Points rather than in Scoring Shots. So, a rating of zero means that a team is expected to generate (if we’re considering offensive rating) or concede (if we’re considering defensive rating) an “average” number of points when facing a team rated 0 on both offence and defence at a neutral venue.

In other words, like MoSSBODS, a team rated 0 on offence and on defence is an average team when playing on a neutral venue in the context of the current season and the mix of abilities of the teams.

Now the rationale for MoSSBODS’ using a team's scoring shots rather than its score in determining ratings is the fact that a team's accuracy or conversion rate - the proportion of its scoring shots that it converts into goals - appears to be largely random, in which case rewarding above-average conversion or punishing below-average conversion would be problematic. Conversion is not, however, completely random, since, as an earlier blog post reveals, teams with higher offensive ratings, and teams facing opponents with lower defensive ratings, tend to be marginally more accurate than the average team.

So, if better teams tend to be even slightly more accurate, maybe higher accuracy should be given some weighting in the estimation of team ratings. That was the original motivation for MoSHBODS, which uses a combination of Scoring Shots and Points in its underlying equations.

(1) New Defensive Rating = Old Defensive Rating + k x (Actual Defensive Performance – Expected Defensive Performance)
(2) Actual Defensive Performance = Expected Score for Average Team – Adjusted Opponent’s Score
(3) Expected Defensive Performance = Own Defensive Rating – Opponent’s Offensive Rating + Venue Adjustment / 2
(4) Adjusted Opponent’s Score = f x Opponent's Score if Converted at All-Team Average + (1-f) x Actual Opponent's Score
(5) New Offensive Rating = Old Offensive Rating + k x (Actual Offensive Performance – Expected Offensive Performance)
(6) Actual Offensive Performance = Adjusted Own Score – Expected Score for Average Team
(7) Expected Offensive Performance = Own Offensive Rating – Opponent’s Defensive Rating + Venue Adjustment / 2
(8) Adjusted Own Score = f x Own Score if Converted at All-Team Average + (1-f) x Actual Own Score
(9) Venue Adjustment = Nett VPV Difference after adjustments
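
The only new ingredient relative to MoSSBODS is the Adjusted Score, which might be sketched in Python as follows. The function name is hypothetical, f = 0.75 reflects the 75%/25% mixture quoted later in this post, and the sketch assumes that a "score if converted at the all-team average" is the number of Scoring Shots multiplied by the average points per Scoring Shot, which is 1 + 5c for a conversion rate c because a goal is worth 6 points and a behind 1.

```python
def adjusted_score(goals, behinds, avg_conversion, f=0.75):
    """Blend a team's actual score with its score at average conversion.

    avg_conversion is the all-team proportion of Scoring Shots that
    are goals over the averaging window.
    """
    scoring_shots = goals + behinds
    actual_score = 6 * goals + behinds
    # Each Scoring Shot is worth 6c + (1 - c) = 1 + 5c points on average.
    score_at_average = scoring_shots * (1 + 5 * avg_conversion)
    return f * score_at_average + (1 - f) * actual_score


# Invented example: 15.5 (95) from 20 Scoring Shots, average conversion 50%.
print(adjusted_score(goals=15, behinds=5, avg_conversion=0.5))
```

In this example the accurate team's 95 points is pulled most of the way back towards the 70 points that an average-converting team would have scored from the same 20 Scoring Shots, giving an Adjusted Score of 76.25.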

The parameters for MoSHBODS have been optimised across the same subset of games from the 1990 to 2019 period.

OPTIMISED MoSHBODS PARAMETERS

All teams started in Round 1 of season 1897 with Offensive and Defensive Ratings of 0. Teams that started later in the period start with a Rating of 0 for their first game.
The value of k in (1) and (5) above varies according to the round in which the game is played, as follows:
- k = 0.117 for Rounds 1 to 6
- k = 0.084 for Rounds 7 to 13
- k = 0.093 for Rounds 14 to 20
- k = 0.081 for Rounds 21 to the end of the home and away season
- k = 0.055 for Finals
MoSHBODS and MoSSBODS use identical splits for all seasons, and the values for k are broadly similar.
It’s interesting to note that k is again smallest within the home and away portion of the season for the last few rounds.
MoSH2020 also now uses the Days to Average parameter to determine the window across which a number of key values are calculated and, like MoSS2020, the optimal value turns out to be 6 years. One of those key values is Expected Score, which is used in (2) and (6) above and is calculated using the actual average Score per team per game across the previous 6 years. Like MoSSBODS, the previous MoSHBODS used the average only from the previous season, and made no updates during the course of the season.
To convert actual to adjusted scores, MoSH2020 uses a mixture of a team’s actual score, and what it would have scored had it converted at the same average rate as teams from the past 6 years. It takes 25% of the actual score and 75% of the score that would have been achieved if all Scoring Shots were converted at that long-term average (that is, f = 0.75 in (4) and (8) above).
Venue Performance Values (VPVs) are, analogously to MoSSBODS, calculated for each team pre-game based on their performance relative to expectations (ignoring venue effects) at that same venue across the previous 6 years, but only once a minimum of 8 games have been played at that venue by that team in that window. The average under- or over-performance is regularised (ie dragged closer to zero) by the use of a multiplier with a value in the (0,1) range called Mean Regularisation. The optimal value has been calculated to be 0.32. In other words, if a team’s average final margin at the venue and ignoring venue effects has been 10 Points above expectation, its VPV for that venue will be +3.2 Points.
If a team has not played 8 or more games at a venue in the 6-year window, then, if it is playing in its home state, it will have an assumed VPV of 0, but, if it is playing out of state, it will have an assumed VPV of -9 Points.
These VPVs are subject to one adjustment.
- During Finals, VPVs are multiplied by a Final VPV Fraction, the optimal value of which turns out to be 1.5 (which implies that venue effects are exacerbated in Finals)

Optimisation suggests that no Cap is required on teams’ actual Scoring Shot data. Consequently, in (4) and (8) above, no Cap is imposed.
(NB This is an update to the original version of this post after a bug was spotted in the code by Dan B. The code linked below is the corrected version)
Teams carry 65% of their Rating from the end of one season to the start of their next season so, for example, a team finishing a season with a +10 Defensive Rating will start their next season with a +6.5 Defensive Rating.
MoSHBODS, like MoSSBODS, treats Sydney, in its first year, as a continuation of South Melbourne, the Western Bulldogs as a continuation of Footscray, and treats the Brisbane Lions, Fitzroy and the Brisbane Bears as three separate teams (this is also different from MoSHBODS’ previous approach, which was to treat Brisbane Lions as a continuation of Fitzroy). The Kangaroos and North Melbourne are also treated as the same team regardless of the name used in any particular season.
The sum of all active teams’ offensive and defensive MoSHBODS ratings during the course of a season is, as with MoSSBODS, also zero. Similar adjustments are made at the start of each season to ensure this is the case for all active teams.
MoSHBODS also treats the entrance and exit of teams in the same way as MoSSBODS.

EXPECTED TEAM SCORES AND BIAS CORRECTION

Once we have the pre-game offensive and defensive ratings for both teams, and their VPVs for the venue, we can calculate a team’s expected score for and against as follows:

Expected Score For = Expected Score for Average Team + Own Offensive Rating - Opponent Defensive Rating + (Own VPV - Opponent VPV)/2

Expected Score Against = Expected Score for Average Team + Opponent Offensive Rating - Own Defensive Rating + (Opponent VPV - Own VPV)/2

(note that we here too split the VPV difference 50:50 across offence and defence)

The new MoSHBODS, like MoSSBODS, makes one final adjustment in coming up with estimated team scores, and that is to make a bias adjustment. For this purpose, an average bias over the past six years is calculated (an average of actual less expected scores), separately for all designated home teams and for all designated away teams, and including only home and away games.

That average, all-team bias is then added to the expected scores generated earlier if the game is a home and away contest. No adjustment is made if it is a Final.

So, in summary,

Adjusted Expected Score = Expected Score + 6-year Average All-Team Bias

The average adjustment to expected scores across the period from 1990 to 2019 is about -0.1 ppg, with Home teams, on average, getting about a 1.2 ppg boost, and Away teams a reduction of about 1.4 ppg. The net effect on predicted game margins is therefore about +2.6 ppg in favour of Home teams.

The expected total score for a game is, as per MoSS2020, just the sum of the Adjusted Expected Scores for the two teams.
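
Putting the pieces together for a single game, the Python sketch below produces expected scores, a margin and a total in Points. The names and ratings are invented for illustration; the biases of +1.2 and -1.4 are the approximate 1990 to 2019 averages quoted above, whereas the actual adjustment for any game would use the 6-year averages at that time.

```python
def expected_score(avg_score, own_off, opp_def, own_vpv, opp_vpv):
    """Expected Score For, with the VPV difference split 50:50."""
    return avg_score + own_off - opp_def + (own_vpv - opp_vpv) / 2


def game_forecast(avg_score, home, away, home_vpv, away_vpv,
                  home_bias, away_bias, is_final=False):
    """Return bias-adjusted expected scores, margin and total for a game."""
    home_score = expected_score(avg_score, home["off"], away["def"],
                                home_vpv, away_vpv)
    away_score = expected_score(avg_score, away["off"], home["def"],
                                away_vpv, home_vpv)
    if not is_final:  # no bias adjustment is made in Finals
        home_score += home_bias
        away_score += away_bias
    return {"home": home_score, "away": away_score,
            "margin": home_score - away_score,
            "total": home_score + away_score}


# Invented ratings and VPVs, all in Points.
forecast = game_forecast(
    avg_score=85.0,
    home={"off": 5.0, "def": 3.0}, away={"off": -2.0, "def": 1.0},
    home_vpv=4.0, away_vpv=-2.0, home_bias=1.2, away_bias=-1.4)
```

Before the bias adjustment the expected margin here is 15 points, which is the difference in combined ratings (8 versus -1) plus the full VPV difference of 6. The bias adjustment then adds 2.6 points to the home team's margin, giving 17.6.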

MAJOR MoSHBODS DIFFERENCES

The major differences in philosophy between MoSH2020 and the MoSHBODS used in 2019 are:

The use of a six year window to calculate Expected Scores, VPVs, and mean bias
Optimising over the 1990 to 2019 period, with an emphasis on the most recent five seasons
Including a VPV adjustment for Finals
Resetting the all-team aggregate rating to zero at the start of every season
Bias-adjusting predicted team scores

MoSH2020 VERSUS MOSHBODS 2019

All-Time - Home and Away

MoSH2020 has a lower MAE for Game Margins in 64 of 123 seasons (52%)
MoSH2020 has a lower MAE for Game Totals in 74 of 123 seasons (60%)
MoSH2020 has a lower MAE for Home Team Scores in 44 of 123 seasons (36%)
MoSH2020 has a lower MAE for Away Team Scores in 67 of 123 seasons (54%)

All-Time - Finals

MoSH2020 has a lower MAE for Game Margins in 46 of 123 seasons (37%)
MoSH2020 has a lower MAE for Game Totals in 59 of 123 seasons (48%)
MoSH2020 has a lower MAE for Home Team Scores in 45 of 123 seasons (37%)
MoSH2020 has a lower MAE for Away Team Scores in 53 of 123 seasons (43%)

1990 to 2019 - Home and Away

MoSH2020 has a lower MAE for Game Margins in 21 of 30 seasons (70%)
MoSH2020 has a lower MAE for Game Totals in 22 of 30 seasons (73%)
MoSH2020 has a lower MAE for Home Team Scores in 16 of 30 seasons (53%)
MoSH2020 has a lower MAE for Away Team Scores in 16 of 30 seasons (53%)

1990 to 2019 - Finals

MoSH2020 has a lower MAE for Game Margins in 17 of 30 seasons (57%)
MoSH2020 has a lower MAE for Game Totals in 13 of 30 seasons (43%)
MoSH2020 has a lower MAE for Home Team Scores in 14 of 30 seasons (47%)
MoSH2020 has a lower MAE for Away Team Scores in 14 of 30 seasons (47%)

In summary:

MoSH2020 is generally superior to MoSHBODS 2019 in the home and away season (except in forecasting Home Team scores)
MoSH2020 is mostly weaker than MoSHBODS 2019 in Finals. That said, MoSH2020’s MAE for Finals from 1990 to 2019 is 30.0 ppg, while MoSHBODS 2019’s is 30.4 ppg.

If we convert margin forecasts to probability predictions for each of the last 10 seasons (more about this too in a future post), MoSH2020 is superior in terms of log probability score in 6 of the 10 home and away seasons, and in 6 of the 10 Finals series.

MoSH2020’s MAE for the 2019 home and away season would have been 27.06 ppg, and for the Finals 31.07 ppg.

COMPARING MOSS2020 AND MOSH2020

Lastly, let’s look at how the two new algorithms compare.

All-Time - Home and Away

MoSH2020 has a lower MAE for Game Margins in 75 of 123 seasons (61%)
MoSH2020 has a lower MAE for Game Totals in 81 of 123 seasons (66%)
MoSH2020 has a lower MAE for Home Team Scores in 81 of 123 seasons (66%)
MoSH2020 has a lower MAE for Away Team Scores in 69 of 123 seasons (56%)

All-Time - Finals

MoSH2020 has a lower MAE for Game Margins in 62 of 123 seasons (50%)
MoSH2020 has a lower MAE for Game Totals in 69 of 123 seasons (56%)
MoSH2020 has a lower MAE for Home Team Scores in 72 of 123 seasons (59%)
MoSH2020 has a lower MAE for Away Team Scores in 60 of 123 seasons (49%)

1990 to 2019 - Home and Away

MoSH2020 has a lower MAE for Game Margins in 18 of 30 seasons (60%)
MoSH2020 has a lower MAE for Game Totals in 17 of 30 seasons (57%)
MoSH2020 has a lower MAE for Home Team Scores in 20 of 30 seasons (67%)
MoSH2020 has a lower MAE for Away Team Scores in 13 of 30 seasons (43%)

1990 to 2019 - Finals

MoSH2020 has a lower MAE for Game Margins in 16 of 30 seasons (53%)
MoSH2020 has a lower MAE for Game Totals in 16 of 30 seasons (53%)
MoSH2020 has a lower MAE for Home Team Scores in 18 of 30 seasons (60%)
MoSH2020 has a lower MAE for Away Team Scores in 13 of 30 seasons (43%)

If we convert margin forecasts to probability predictions for each of the last 10 seasons, MoSH2020 is superior in terms of log probability score in 6 of the 10 home and away seasons, and in 4 of the 10 Finals series.

Overall then, it seems fair to summarise that MoSH2020 is clearly superior to MoSS2020 in the home and away portion of the season, but the picture is a little more mixed for the Finals, although you’d prefer MoSHBODS if your main aim was to predict game margins or totals.

THE CODE

For the first time ever on MoS, I’ve decided to share the input data and R scripts I’ve used to create MoSH2020 and MoSS2020.

I do this, recognising that you can never truly control anything once you’ve made it publicly available, but nonetheless with the request that:

anyone who uses it - as is, or in derivative works - makes attribution
any improvements in the code be shared with me and others
any errors or improvements in the data be similarly shared

The code is offered as is, and I make no claims about its accuracy (and certainly none about its efficiency or beauty).

To use it you’ll need to download R and install the dplyr, mvtnorm, emdbook, and moments packages (the compiler package, which the scripts also use, ships with R).

POTENTIAL IMPROVEMENTS

The most obvious improvement that might be made to the data is the correct designation of home and away status in Finals. As it stands, since the base data comes from afltables, the winner of a Final is automatically designated as the Home Team. Fixing this - and documenting the approach used - would be invaluable.

In terms of the script, there are a number of areas that I think are worth exploring, including:

Different windows for estimating Expected Scoring Shots, Expected Scores, Expected Points per Score, and Average Bias
Applying different Scoring Shot conversion rates to Home versus Away teams, and even to different Venues, perhaps based on the weather or time of day
Exploring different approaches to modelling Finals, perhaps treating the GF differently to other Finals
Calculating Average Bias on a team-by-team basis, perhaps by venue
Experimenting with different initial ratings for teams

I look forward to hearing about your ideas and progress.