Matter of Stats


2022 : Simulating the Final Ladder Pre-Season

In this, the third-ever wildly speculative pre-season ladder simulation, I’ll again be using the two methodologies I call Standard and Heretical, which are described below.

METHODOLOGIES

The Standard methodology uses, as its foundation, the most recent team ratings and venue performance values from MoSHBODS, and holds these constant throughout the remainder of the simulated season. It introduces uncertainty by randomly perturbing each team’s offensive and defensive ratings for a given round by an amount proportional to the square root of the time between now and when that game will be played. These perturbed ratings are then used in a team scoring model similar to the one discussed in this blog post, which first generates the competing teams’ scoring shots as a bivariate Negative Binomial, and then uses two univariate Beta Binomials to simulate the conversion of those scoring shots. The means for the two Beta Binomials (i.e. the assumed conversion rates for the home and away teams) are held constant across the season.
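As a rough illustration of the two ingredients just described, the Python sketch below perturbs a rating with noise whose standard deviation grows with the square root of the time until the game, and then simulates one team's score from scoring shots and conversions. It is not the actual MoSHBODS code. The Gaussian form of the perturbation, every parameter value, and all function names are assumptions made for illustration. The sketch also draws a single team's scoring shots from a univariate Negative Binomial rather than the bivariate one used in the real model, and it leaves out the mapping from ratings to expected scoring shots.

```python
import math
import random


def perturb_rating(rating, days_ahead, sd_per_root_day=0.3, rng=random):
    """Add zero-mean noise whose SD is proportional to sqrt(days_ahead).

    The Gaussian form and the value of sd_per_root_day are assumptions.
    """
    return rating + rng.gauss(0.0, sd_per_root_day * math.sqrt(days_ahead))


def draw_poisson(mean, rng=random):
    """Knuth's multiplication method; adequate for the small means used here."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def draw_scoring_shots(expected_shots, size=20.0, rng=random):
    """Negative Binomial draw via a Gamma-Poisson mixture.

    Mean is expected_shots; variance is expected_shots + expected_shots**2 / size.
    The value of size is hypothetical.
    """
    rate = rng.gammavariate(size, expected_shots / size)
    return draw_poisson(rate, rng)


def draw_goals(shots, conversion_rate, theta=50.0, rng=random):
    """Beta Binomial draw: a Beta-distributed conversion rate, then Binomial goals.

    theta controls how tightly the rate clusters around conversion_rate (hypothetical value).
    """
    p = rng.betavariate(conversion_rate * theta, (1.0 - conversion_rate) * theta)
    return sum(rng.random() < p for _ in range(shots))


def simulate_score(expected_shots, conversion_rate, rng=random):
    """Return (goals, behinds, points); a goal is worth 6 points, a behind 1."""
    shots = draw_scoring_shots(expected_shots, rng=rng)
    goals = draw_goals(shots, conversion_rate, rng=rng)
    behinds = shots - goals
    return goals, behinds, 6 * goals + behinds


rng = random.Random(2022)
goals, behinds, points = simulate_score(expected_shots=25.0, conversion_rate=0.53, rng=rng)
print(f"{goals}.{behinds} ({points})")
```

The expected scoring shots and conversion rate passed in at the end are illustrative inputs only. Holding `conversion_rate` fixed across calls mirrors the constant Beta Binomial means described above.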

For the Heretical methodology, we proceed round by round, using exactly the same team scoring model but with team ratings and venue performance values updated based on the simulated results of previous rounds. (This year we’re fixing the expected score and the expected home and away team conversion rates across every simulation.)

Team ratings under the Heretical methodology therefore follow a trajectory across the season, so that the simulations for Round X within a replicate are necessarily correlated with those for Round X-1, Round X-2, and so on. This is not the case with the Standard methodology, for which the perturbations in any round are unrelated to those in any other round.
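The difference in round-to-round correlation can be seen in a toy example. The sketch below is a deliberately simplified stand-in for both methodologies: it uses a single rating per team rather than separate offensive and defensive ratings, and the linear update rule, the constants `k` and `margin_sd`, and all function names are hypothetical.

```python
import math
import random


def standard_path(base, n_rounds, sd_per_root_round, rng):
    """Standard-style ratings: a fresh, independent perturbation each round."""
    return [base + rng.gauss(0.0, sd_per_root_round * math.sqrt(r + 1))
            for r in range(n_rounds)]


def heretical_path(base, n_rounds, k, margin_sd, rng):
    """Heretical-style ratings: each round's rating carries forward the
    update made after the previous round's simulated result."""
    rating, path = base, []
    for _ in range(n_rounds):
        path.append(rating)
        surprise = rng.gauss(0.0, margin_sd)  # simulated minus expected margin
        rating += k * surprise                # hypothetical linear update rule
    return path


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def round_to_round_correlation(make_path, round_a, round_b, replicates=2000):
    """Correlation, across replicates, between the ratings in two rounds."""
    paths = [make_path() for _ in range(replicates)]
    return pearson([p[round_a] for p in paths], [p[round_b] for p in paths])


rng = random.Random(2022)
std_corr = round_to_round_correlation(
    lambda: standard_path(0.0, 23, 1.0, rng), 10, 11)
her_corr = round_to_round_correlation(
    lambda: heretical_path(0.0, 23, 0.1, 36.0, rng), 10, 11)
print(f"Standard: {std_corr:+.2f}  Heretical: {her_corr:+.2f}")
```

With independent perturbations the correlation between adjacent rounds should be near zero, while in this toy random-walk setup the theoretical value for the carried-forward ratings is the square root of 10/11, or about 0.95.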

That, to be fair, is a selling point for the Heretical methodology, but it comes at a price: it results in desperately slow code. (It’s also logically flawed as a methodology, but I’m not going to argue that case again here.)

Consequently, we have 50,000 simulation replicates for the Standard method, but only 2,500 for the Heretical method. The obvious impact is that the probability estimates from the Heretical method suffer from substantially more sampling variation: their standard errors are about 4.5 times as large, because 50,000/2,500 = 20 and the square root of 20 is roughly 4.5.
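The size of that effect follows from the usual standard error of a proportion estimated from independent replicates. A minimal sketch of the arithmetic, with a hypothetical function name and an arbitrary choice of underlying probability:

```python
import math


def proportion_standard_error(p, replicates):
    """Standard error of a simulated probability estimate from independent replicates."""
    return math.sqrt(p * (1.0 - p) / replicates)


p = 0.5  # the ratio below is the same for any underlying probability
se_standard = proportion_standard_error(p, 50_000)
se_heretical = proportion_standard_error(p, 2_500)
print(f"Standard:  {se_standard:.4f}")
print(f"Heretical: {se_heretical:.4f}")
print(f"Ratio:     {se_heretical / se_standard:.2f}")
```

For a probability near 0.5, that works out to a standard error of about 1 percentage point for the Heretical estimates and about 0.2 percentage points for the Standard estimates.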

LADDER FINISHES

Here, firstly, are the projections for teams’ ladder finishes. The results from the Standard method are on the left, and those from the Heretical method on the right.

At a macro level we see, as we have for the past two years, that the results are remarkably similar. There is some minor jiggling of the team orderings based on Expected Wins, but no team moves by more than a single place, and what jiggling we do see is probably an artefact of the different levels of sampling variation. We also, again, get a little more spread in the range of Expected Wins under the Heretical method, with the gap between Melbourne and North Melbourne being about 8.3 wins under this method compared to 6.9 wins under the Standard method.

The probability estimates for each team for making the Top 8 or Top 4, or for finishing as Minor Premiers, are also generally similar, with probabilities under the Standard methodology pushed closer to 0 and 1, and those under the Heretical methodology pushed nearer to 0.5. Under both methodologies, most teams are assessed as having reasonable chances of finishing in quite a wide range of ladder positions.

TEAM AND POSITION CONCENTRATION

There are a number of ways of measuring how much uncertainty there is in the final ladder, including the Gini measure of concentration, which we used for seasons prior to 2020.

As I’ve noted previously, one of the challenges with that measure in practice is in its interpretation. We know that a team with a Gini coefficient of 0.8 has a narrower set of likely final finishes than a team with a Gini coefficient of 0.7, but linking that to anything more interpretable is difficult.

So, I’ve switched to using the Herfindahl-Hirschman Index (HHI) because its inverse has a fairly straightforward interpretation: in the case of the index for a team it can be thought of as the number of ladder positions for which a team is effectively competing, and in the case of the index for a ladder position, it can be thought of as the number of teams effectively competing for that spot.
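To make that interpretation concrete, here is a small sketch of the calculation for a single team's distribution over ladder positions. The function names are hypothetical and the three distributions are constructed examples, not simulation output.

```python
def hhi(probabilities):
    """Herfindahl-Hirschman Index: the sum of squared shares."""
    return sum(p * p for p in probabilities)


def effective_number(probabilities):
    """Inverse HHI: the number of equally likely outcomes with the same concentration."""
    return 1.0 / hhi(probabilities)


n_positions = 18
uniform = [1.0 / n_positions] * n_positions       # equally likely to finish anywhere
certain = [1.0] + [0.0] * (n_positions - 1)       # certain to finish 1st
two_way = [0.5, 0.5] + [0.0] * (n_positions - 2)  # coin flip between 1st and 2nd

for name, dist in [("uniform", uniform), ("certain", certain), ("two-way", two_way)]:
    print(name, round(effective_number(dist), 2))
```

The three cases give effective numbers of 18, 1 and 2 respectively, which is what the "effectively competing for" reading suggests they should. The same function applied to a column of the ladder-position probabilities, rather than a row, gives the number of teams effectively competing for that position.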

The HHI figures for the most recent simulation replicates appear below, with those from the Standard methodology on the left, and those from the Heretical methodology on the right.

Standard Methodology - 50,000 Replicates

Heretical Methodology - 2,500 Replicates

Again the results are clearly very similar, both in terms of how many ladder positions each team is effectively competing for, and how many teams are effectively competing for each position. The differences are greatest for the strongest (Melbourne) and weakest (North Melbourne) teams, and for the highest and lowest ladder positions.

Those extremes aside, both methods suggest that most teams are effectively competing for between about 11 and 17 different ladder positions, and that most ladder positions have effectively between about 11 and 17 teams competing for them.

CONCLUSION

At a high level, both the Standard and Heretical methodologies produce quite similar results for the key team and game simulation metrics, with the methodological differences being largely swamped by the fact that both use the same underlying initial team ratings and venue performance values (and the same schedule).

In fact, the correlation between the teams’ Expected Wins under the Standard methodology and the teams’ Expected Wins under the Heretical methodology is +0.995, and the correlations between the various probabilities of finishing Top 8, Top 4, or Minor Premier under the two methodologies are also above +0.99.

The two methods do, however, produce some differences that might have practical significance, for example in the exact probabilities they attach to certain team ladder outcomes. If we assume that bookmakers are the best estimators of these probabilities, then it might be worth comparing the two methodologies’ estimates with those of the TAB bookmaker. Such a comparison reveals that:

  • The Top 8 probabilities from the Standard methodology are, in average absolute percentage point difference terms, closer to the implicit Top 8 probabilities of the TAB bookmaker (9.8 percentage points for Standard and 10.6 percentage points for Heretical).

  • The Top 4 probabilities from the Standard methodology are, in average absolute percentage point difference terms, about as far from the implicit Top 4 probabilities of the TAB bookmaker as those from the Heretical methodology (8.0 percentage points for Standard and 8.2 percentage points for Heretical).

  • The Minor Premiership probabilities from the Heretical methodology are, in average absolute percentage point difference terms, closer to the implicit probabilities of the TAB bookmaker for that market (3.8 percentage points for Standard and 2.9 percentage points for Heretical).
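For anyone wanting to reproduce this kind of comparison, the metric used in the list above can be sketched as follows. The post does not say how the bookmaker's margin was removed, so the proportional scaling here is an assumption, as are the function names. The prices and model probabilities are made up for a three-team market and are not actual TAB prices.

```python
def implied_probabilities(prices, places):
    """Convert decimal prices to probabilities that sum to `places`.

    Proportional scaling is an assumed way of removing the bookmaker's margin;
    it can push a very short-priced favourite above 1 in multi-place markets.
    """
    raw = [1.0 / price for price in prices]
    scale = places / sum(raw)
    return [r * scale for r in raw]


def mean_abs_pp_difference(model_probs, bookmaker_probs):
    """Average absolute difference between two sets of probabilities, in percentage points."""
    diffs = [abs(m - b) for m, b in zip(model_probs, bookmaker_probs)]
    return 100.0 * sum(diffs) / len(diffs)


# Made-up prices and model probabilities for a three-team, one-place market.
prices = [1.60, 3.20, 6.40]
model = [0.50, 0.30, 0.20]
bookmaker = implied_probabilities(prices, places=1)
print(round(mean_abs_pp_difference(model, bookmaker), 1))
```

For a Top 8 or Top 4 market, `places` would be 8 or 4, since that is what the true probabilities across all teams must sum to.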