Building Your Own Men's AFL Game Score Progression Simulator

In this (long) blog I’ll walk you through the concepts and R code behind the creation of a fairly simple score progression simulator.

(There’s a link for you to download the entire code yourself at the end of the blog.)

All we’ll be interested in are “events” - period starts, period ends, goals and behinds - and the algorithm will determine for us, given the event that’s just occurred, what the next event is, and how far away in time it will take place.

To be able to do that, the first thing we’re going to need is some data about the typical time between events based on historical games, which we can obtain using the Swiss Army knife of footy data, the fitzRoy R package.
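
As a rough illustration of the event-driven idea described above, here is a minimal sketch in Python (the blog's own code is in R and is not reproduced here). Everything in it is an assumption made for illustration: the exponential waiting times, the 100-second mean gap, the 53% chance that a scoring event is a goal, the 1,800-second period, and the equal chance of either team scoring are placeholders, not values estimated from historical games.

```python
import random

# All constants below are illustrative placeholders, not estimates from real data.
MEAN_GAP_SECONDS = 100.0   # assumed average time between scoring events
P_GOAL = 0.53              # assumed chance that a scoring event is a goal
PERIOD_SECONDS = 1800.0    # assumed length of one period, including time-on
POINTS = {"goal": 6, "behind": 1}

def simulate_period(rng):
    """Return a list of (time, team, event) tuples for one period."""
    events = [(0.0, None, "period start")]
    clock = 0.0
    while True:
        # Draw the waiting time until the next scoring event.
        clock += rng.expovariate(1.0 / MEAN_GAP_SECONDS)
        if clock >= PERIOD_SECONDS:
            # The period ends before the next scoring event would occur.
            events.append((PERIOD_SECONDS, None, "period end"))
            return events
        team = rng.choice(["home", "away"])
        event = "goal" if rng.random() < P_GOAL else "behind"
        events.append((clock, team, event))

def score(events, team):
    """Total points for one team from a list of events."""
    return sum(POINTS[e] for _, t, e in events if t == team)

rng = random.Random(2025)
period = simulate_period(rng)
print(score(period, "home"), score(period, "away"))
```

A fuller simulator would replace the fixed constants with distributions conditioned on the event that has just occurred, which is where the historical data comes in.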

How Many Disposals Do You Need to Get the Coaches' Attention?

In the previous blog we investigated the differences between coaches and umpires in the player statistics they appear to take most notice of when casting their respective player-related votes.

We found some similarities (both are very influenced by disposal counts), and some differences (coaches are more influenced by whether the player is on the winning or losing team), but one thing we didn’t investigate was the specific nature of the relationships between individual player metrics and voting behaviour. For example, we know that disposals are an important metric in determining Brownlow and Coaches’ votes, but we don’t know exactly how the number of votes that a player receives varies as the disposal count changes.

Do Umpires and Coaches Notice Different Things In Assigning Player Votes?

At the conclusion of each game in the men’s AFL home and away season, umpires and coaches are asked to vote on who they saw as the best players in the game. Umpires assign 3 votes to the player they rate as best, 2 votes to the next best, 1 vote to the third best, and (implicitly) 0 votes to every other player. It is these votes that are used to award the Brownlow Medal at the end of the season.

Similarly, the coaches of both teams are asked to independently cast 5-4-3-2-1 votes for the players they see as the five best, meaning that each player can end up with anywhere between 0 and 10 Coaches’ votes.

The question for today is: to what extent can available player game statistics data tell us whether and how coaches and umpires differ in how they arrive at their votes?

(Note that we’ll not be getting into the issue of individual umpire or coach quirks, snubs, or biases, and instead be looking at the data across all voting umpires and coaches.)

Measuring Strength of Schedule in Terms of Expected Wins

A few weeks back I analysed the men’s 2025 AFL schedule with a view to determining which teams had secured relatively easier overall fixtures, and which had secured relatively more difficult overall fixtures.

We investigated various approaches there and reached some conclusions about relative team fixture difficulty, but none of the methods provided an intuitive way to interpret the outputs.

On a related note, this week I had a kind email from a reader who suggested that there might be an opportunity to continuously update teams’ ‘fixture difficulty rating’ (which is just another term for strength of schedule) during the season, as this service was frequently provided by various fantasy leagues for English Football and other sports.

All of which got me to revisiting my strength of schedule methodology.

Simulation Replicates and Returns to a Perfect Model

The Situation

We’ve built a model designed to estimate the probability of a binary event (say, for example, the probability that the home team wins on the line market in the AFL).

It’s a good model - very good, in fact, because it is perfectly calibrated. In other words, when the true probability of an event is X%, its average estimate of the probability of that event is X%.

Those probability estimates, however, are the result of running some simulation replicates with a stochastic element, which means that those estimates will diverge from X% to an extent determined by how many replicates we run.
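
To make the dependence on the replicate count concrete: if each replicate is treated as an independent draw in which the event occurs with true probability p, then an estimate based on n replicates has a standard error of sqrt(p(1 - p) / n). The Python sketch below assumes independent replicates and uses an illustrative p of 0.55; neither assumption comes from the blog.

```python
import math

def replicate_standard_error(p, n_replicates):
    """Standard error of a probability estimated from independent replicates."""
    return math.sqrt(p * (1.0 - p) / n_replicates)

# Illustrative true probability of 0.55 at three replicate counts.
for n in (100, 1_000, 10_000):
    print(n, round(replicate_standard_error(0.55, n), 4))
```

Under these assumptions, quadrupling the number of replicates halves the standard error.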

We Need to Talk About MoSHBODS ...

Last year’s men’s season results for MoSHBODS and MoSSBODS - as forecasters and as opinion-sources for wagering - were at odds with what had gone before.

Other analyses have suggested that the MoS twins might have been a bit unlucky in the extent to which 2024 was different from bookmaker expectations, and I’ve never been one for knee-jerk reactions to single events. The performance has nonetheless made me think more deeply about the algorithms underpinning the two Rating Systems, more details on which were provided in this blog from 2020, and in the blogs to which it links.

What if Squiggle Used xScore?

Over the past few blogs (here and here) I’ve been investigating different methods for untangling skill from luck in forecasting game margins and, in this blog, we’ll try another approach, this time using what are called xScores.

One source of randomness in the AFL is how well a team converts the scoring opportunities it creates into goals versus behinds. Given enough data, analysts far cleverer than I can estimate how often a shot of a particular type taken from a particular point of the field under particular conditions should result in a goal, a behind, or no score at all.

So, we can adjust for that randomness in conversion by replacing the result of every scoring opportunity by the average score that we would expect an average player to generate from that opportunity given its specific characteristics. By summing the expected score associated with every scoring opportunity for a team in a given game we can come up with an expected score, or xScore, for that team.
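
The summation described above can be sketched in a few lines of Python. The shot probabilities here are invented for illustration, and the function name `x_score` is hypothetical; the actual models behind the published xScores will estimate these probabilities in their own way.

```python
def x_score(shots):
    """Expected score for a team: sum over shots of 6*P(goal) + 1*P(behind)."""
    return sum(6 * p_goal + 1 * p_behind for p_goal, p_behind in shots)

# Hypothetical shots as (P(goal), P(behind)); the remainder is P(no score).
shots = [(0.70, 0.25), (0.40, 0.40), (0.20, 0.45)]
print(round(x_score(shots), 2))
```

The xScore margin for a game is then simply one team's xScore minus the other's.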

For this blog, I’ll be using the xScores created by Twitter’s @AFLxScore for the years 2017 to 2020, and those created by Twitter’s @WheeloRatings for the years 2021 to 2024.

Let’s look firstly at the season-by-season Squiggle results of using, as a game’s margin, the xScore margin instead of the actual margin.

Squiggle Performances Revisited: Alternative Sources of Truth

In the previous blog, I compared Squiggle forecasters’ actual margin prediction MAE results with a distribution of potential MAE outcomes from the same forecasts across 10,000 simulated 2024 seasons as one way of untangling the skill and luck portions of those actual results.

Those simulations require us to select “ground truth” for the underlying expected margin in each game. In the previous blog we used bookmaker data with an added random component of a Normal variable with mean 0 and standard deviation 8 as that ground truth.
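
A minimal Python sketch of that kind of simulation follows. Only the Normal(0, 8) noise comes from the text; the bookmaker margins and forecasts are hypothetical, the standard deviation of 36 points for actual margins around the truth is an assumption, and redrawing the noise in every simulated season is also an assumption about how the simulation is set up.

```python
import random
import statistics

TRUTH_NOISE_SD = 8.0    # from the text: SD of the noise added to bookmaker margins
RESULT_SD = 36.0        # assumption: spread of actual margins around the truth

def simulate_season_mae(bookmaker_margins, forecasts, rng):
    """MAE of a fixed set of forecasts over one simulated season."""
    errors = []
    for bookie, forecast in zip(bookmaker_margins, forecasts):
        truth = bookie + rng.gauss(0.0, TRUTH_NOISE_SD)   # simulated "ground truth"
        actual = rng.gauss(truth, RESULT_SD)              # simulated game margin
        errors.append(abs(actual - forecast))
    return statistics.mean(errors)

rng = random.Random(1)
bookmaker_margins = [12.5, -6.5, 24.5, 0.5, -18.5]   # hypothetical games
forecasts = [10.0, -3.0, 20.0, 4.0, -15.0]           # hypothetical forecaster
maes = [simulate_season_mae(bookmaker_margins, forecasts, rng) for _ in range(10_000)]
print(round(statistics.mean(maes), 1))
```

Comparing a forecaster's actual MAE with the spread of values in `maes` gives a sense of how much of the result could plausibly be luck.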

Eight Years of Squiggle Performance

The Squiggle website is a place where forecasters can post their forecasts of the winning team and winning margin, along with probability estimates, for upcoming games of men’s AFL football, and see how well or otherwise they perform relative to other forecasters. The only criteria for posting there are that the forecasters must have a history of performing “reasonably” well, and must not include any human-related inputs such as bookmaker prices in their models.

It’s been running since 2017 and, since 2018, has included a derived forecaster, named s10, which is a weighted average of the 10 best Squiggle models, based on mean absolute margin error, from the previous season. The MoS model had been included in s10 in every year from 2018 to 2024, but will be absent in 2025 due to a relatively lowly 22nd place finish.

In this blog, among other things, I want to get a sense of the extent to which that apparently below-average performance might be attributed to skill versus luck.

Is Favourite-Longshot Bias Evident in Bookmaker Data for the AFL?

More than once here on the MoS website we’ve looked at the topic of favourite-longshot bias (FLB), which asserts that bookmakers apply a higher profit margin to the prices of underdogs than they do to favourites. In one MoS piece (15 years ago!) I had more of a cursory look and found some evidence for FLB using 2006 to 2008 data and, in another piece a few years later, I had a more detailed look and found only weak to moderate evidence using opening TAB data from 2006 to 2010.

At this point I think it’s fair to say that the jury is still out on FLB’s existence, and waiting for more convincing evidence either way (and very unhappy at having been sequestered for 13 years in the meantime).
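
For readers unfamiliar with how a bookmaker's margin shows up in prices, the Python sketch below converts a pair of head-to-head prices into implied probabilities and a total overround. The prices are hypothetical, and the proportional adjustment shown is just one simple baseline in which the margin is levied evenly; FLB is the claim that, in practice, more of the margin is loaded onto the underdog than this baseline implies.

```python
def implied_probabilities(prices):
    """Raw implied probabilities: the reciprocal of each decimal price."""
    return [1.0 / price for price in prices]

def overround(prices):
    """Total bookmaker margin: how far the implied probabilities exceed 1."""
    return sum(implied_probabilities(prices)) - 1.0

def proportional_probabilities(prices):
    """Baseline with no FLB: scale implied probabilities so they sum to 1."""
    raw = implied_probabilities(prices)
    total = sum(raw)
    return [p / total for p in raw]

prices = [1.40, 2.90]   # hypothetical favourite and underdog prices
print(round(overround(prices), 4))
print([round(p, 3) for p in proportional_probabilities(prices)])
```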

The Relationship Between Expected Victory Margins and Estimated Win Probabilities

There are no doubt a number of viable ways of doing this, but one obvious approach is to fit a logistic equation of the form shown at right.

This provides an S-shaped mapping where estimated win probabilities respond most to changes in expected margins when those margins are near zero. It also ensures that all estimated probabilities lie between 0 and 1, which they must.

I’ve used this form of mapping for many years with values of k in the 0.04 to 0.05 range, and have found it to be very serviceable. I’ve also previously fitted it to bookmaker data and found that it generally provides an excellent fit.
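
The equation itself is not reproduced in this excerpt, so the Python sketch below assumes the standard logistic form, probability = 1 / (1 + exp(-k x expected margin)), with a k of 0.045 chosen from the middle of the range quoted above. The function name is hypothetical.

```python
import math

def win_probability(expected_margin, k=0.045):
    """Assumed logistic mapping from expected margin (points) to win probability."""
    return 1.0 / (1.0 + math.exp(-k * expected_margin))

for margin in (-30, 0, 6, 30):
    print(margin, round(win_probability(margin), 3))
```

With this form, an expected margin of zero maps to a probability of exactly 0.5, and the probabilities for margins of +m and -m sum to 1.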

Are V/AFL Scores (Still) Like Snowflakes?

Almost 10 years ago I wrote a blog that, among other things, noted that the score progressions - the goals.behinds numbers at the end of each quarter for both teams - were unique for every game ever played, regardless of the order in which you considered the two teams’ score progressions, home first then away, or away first and then home, choosing at random for every game. At that point, the statement was true for 14,490 games.

It seemed pretty startling then but, as of the end of 2024’s Round 9, the statement is STILL true, and that’s now for 16,487 games. V/AFL games remain as snowflake-like as ever.
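
One way to check a claim like this is sketched below in Python. The representation (each team's progression as four goals-behinds pairs) and the three example games are hypothetical; sorting the two teams' progressions gives a key that ignores which team is listed first, so uniqueness of the keys implies uniqueness under any choice of ordering.

```python
def canonical(game):
    """Order-independent key: the two teams' progressions, sorted."""
    home, away = game
    return tuple(sorted((home, away)))

def all_unique(games):
    """True if no two games share the same pair of score progressions."""
    keys = [canonical(g) for g in games]
    return len(keys) == len(set(keys))

# Hypothetical games: each team has (goals, behinds) at the end of each quarter.
game_a = (((3, 2), (6, 5), (9, 8), (12, 10)), ((2, 3), (5, 4), (7, 7), (10, 9)))
game_b = (game_a[1], game_a[0])          # same progressions, teams swapped
game_c = (((4, 1), (6, 5), (9, 8), (12, 10)), ((2, 3), (5, 4), (7, 7), (10, 9)))

print(all_unique([game_a, game_c]))   # differ in the first quarter
print(all_unique([game_a, game_b]))   # duplicates once order is ignored
```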

An Extra Slice of An Analysis of Strength of Schedule for the Men's 2024 AFL Season

I was thinking about the Strength of Schedule metric used in this blog from yesterday, and it struck me that, rather than using the raw values of the opponent team’s MoSHBODS rating and (for some metrics) the net Venue Performance Values (VPVs) for a game, we could, instead, convert these numbers into a win probability, which might make the resulting aggregate Strength of Schedule value more readily interpretable.
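
A minimal Python sketch of that conversion follows. It rests on several assumptions that are not taken from the blog: ratings and VPVs are treated as being in points, the win probability is calculated for a reference team rated zero, the mapping is a logistic curve with k = 0.045, and the fixture values are invented.

```python
import math

def win_probability(opponent_rating, net_vpv=0.0, k=0.045):
    """Assumed chance that a zero-rated reference team beats this opponent."""
    expected_margin = -opponent_rating + net_vpv
    return 1.0 / (1.0 + math.exp(-k * expected_margin))

def expected_wins(fixtures):
    """Sum of win probabilities across a schedule of (opponent_rating, net_vpv)."""
    return sum(win_probability(rating, vpv) for rating, vpv in fixtures)

# Hypothetical schedules: (opponent rating, net VPV from the team's viewpoint).
easy_draw = [(-10.0, 3.0), (-5.0, 0.0), (2.0, 4.0)]
hard_draw = [(12.0, -3.0), (8.0, 0.0), (15.0, -5.0)]
print(round(expected_wins(easy_draw), 2), round(expected_wins(hard_draw), 2))
```

The result is expressed in wins, so the gap between two schedules can be read directly as the number of extra games the reference team would be expected to win.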