Matter of Stats

On The Randomness of Final AFL Scores

You'd be forgiven for assuming that the last digit of a team's final score in an AFL contest is essentially random - in other words, that it is as likely to be a 0 as a 1, a 4 or a 9.

And you'd be right, or at least not statistically significantly wrong, if you asserted that in relation to the score of a designated Home team or a designated Away team (or, indeed, of the total score of the two teams). If you were to look at the distribution of the last digit of the scores of Home teams or Away teams over the entire history of the VFL/AFL, or only over the seasons since 1980, a chi-squared test would reveal that the distribution is not statistically significantly different from one attaching a 10% probability to each of the digits 0 to 9. In practical terms this means that if I asked you to guess the last digit of the final score of the Home team or the Away team in a particular game, you couldn't reliably do any better than to pick any one of the digits 0 to 9.
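As a rough illustration of the kind of test described above, here is a minimal sketch that computes a chi-squared goodness-of-fit statistic for last digits against a uniform 10% split. The function name `last_digit_chi2` and the example scores are hypothetical and are not the data behind this post; with only 20 scores the test is far too small to be meaningful, so treat it purely as a demonstration of the mechanics. The constant 16.919 is the standard 5% critical value for 9 degrees of freedom.

```python
from collections import Counter

# Upper 5% critical value of the chi-squared distribution with 9 degrees
# of freedom (10 digit categories minus 1).
CHI2_CRIT_DF9_5PCT = 16.919


def last_digit_chi2(scores):
    """Chi-squared statistic for the last digits of `scores` versus a
    uniform distribution that puts 10% on each digit 0 to 9."""
    counts = Counter(score % 10 for score in scores)
    expected = len(scores) / 10  # same expected count for every digit
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))


# Hypothetical scores for illustration only (NOT real AFL results).
example_scores = [87, 102, 95, 76, 110, 64, 123, 99, 81, 70,
                  58, 93, 105, 116, 72, 84, 67, 129, 91, 100]

stat = last_digit_chi2(example_scores)
# If the statistic does not exceed the critical value, we cannot reject
# the hypothesis that the last digits are uniformly distributed.
consistent_with_uniform = stat <= CHI2_CRIT_DF9_5PCT
```

In practice you would feed in every Home (or Away) score for the period of interest rather than a handful of made-up values.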

But - and I'm guessing you knew there'd be a but - the same couldn't be said if you instead made the claim in relation to the score of the winning or the losing team. If I asked you to guess the last digit of the score of the winning team, you'd do a little better than chance if you guessed that it ended in a 4, 5, 6 or a 9, and if I asked the same question of you in relation to the last digit of the score of the losing team, you'd do better than chance if you guessed a 0 or a 3.

(As at the end of Round 5 in the current season, winning teams' scores have ended in a 4, 5, 6 or 9 almost 47% of the time, and losing teams' scores have ended in a 0 or 3 over 24% of the time, both of which exceed the respective expectations of 40% and 20% that we'd assume if the last digits of scores were random.)

So, the last digits of the scores of winning teams and of losing teams appear to be other than random. The same can be said, it turns out, of the joint distribution of the final digits of the winning and losing teams' scores, a fact we can use as the basis for a proposition bet.

YET ANOTHER PROPOSITION BET

Here's the setup: I want you to choose a game at random where the last digit of the winning team's score was different from the last digit of the losing team's score. Then I want you to reveal to me the last digit of the score of one of the teams, chosen at random.

I will then tell you whether the digit of the score that you revealed to me belonged to the winning or losing team and we'll bet at even money that I've selected the correct team.

Amazingly, this is a bad bet for you. The following table, which is based on the scores from all games between 1980 and 2012 (but which isn't that different from the table you get if you include all seasons going back to 1897) shows why.

You read this table as follows. 

Consider the first row. It reveals that, if you chose the winning or the losing team at random, 10.3% of the time you'd read out to me the digit 0 as being the last digit of the score of the team you selected. Across the period 1980 to 2012, in 494 games that last digit of 0 belonged to the winning team, and in 565 games it belonged to the losing team. So if, whenever you read out a 0, I guessed that it related to the losing team, I'd have been right on 53.4% of those occasions.
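The arithmetic behind that 53.4% figure can be checked directly from the two counts quoted for the digit 0. The variable names here are just illustrative labels.

```python
winner_games = 494  # games where the winning team's score ended in 0
loser_games = 565   # games where the losing team's score ended in 0

# Share of "0" reveals that belonged to the losing team
p_loser = loser_games / (winner_games + loser_games)
print(f"{p_loser:.1%}")  # 53.4%
```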

The other rows in the table can be interpreted in the same way.

If you read out a 0, 3, 7 or 8, then I'd guess that the team whose score ended in that digit was the losing team, while if you read out a 1, 2, 4, 5, 6 or 9, I'd guess that the team whose score ended in that digit was the winning team.
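Expressed as code, the guessing rule is just a lookup on the revealed digit. This is a minimal sketch; the names `LOSER_DIGITS` and `guess_team` are hypothetical labels chosen for the example.

```python
# Digits for which the rule guesses "losing team"; every other digit
# (1, 2, 4, 5, 6, 9) is guessed to belong to the winning team.
LOSER_DIGITS = {0, 3, 7, 8}


def guess_team(last_digit):
    """Return 'loser' or 'winner' for a revealed last digit (0 to 9)."""
    if last_digit not in range(10):
        raise ValueError("last_digit must be an integer from 0 to 9")
    return "loser" if last_digit in LOSER_DIGITS else "winner"


# Example: a revealed 3 is guessed to be the loser's, a 4 the winner's.
guesses = {digit: guess_team(digit) for digit in range(10)}
```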

Following this strategy for the period 1980 to 2012 I'd have been correct 52.3% of the time, comfortably (and statistically significantly) more often than chance. At even-money odds that represents a 4.6% ROI in my favour. I'd also have guessed correctly more than 50% of the time (i.e. better than chance) in 73% of those seasons.
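The ROI figure follows from the win rate alone: at even money each correct guess returns one unit of profit and each wrong guess loses one unit, so the return per unit staked is the win rate minus the loss rate. A small sketch, with `even_money_roi` as a hypothetical helper name:

```python
def even_money_roi(win_rate):
    """Expected profit per unit staked on an even-money bet.

    A win pays +1 unit and a loss costs -1 unit, so the expected return
    is win_rate * 1 - (1 - win_rate) * 1 = 2 * win_rate - 1.
    """
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must be between 0 and 1")
    return 2 * win_rate - 1


roi = even_money_roi(0.523)
print(f"{roi:.1%}")  # 4.6%
```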

TESTING THE WAGER OUT-OF-SAMPLE

I developed this rule using data up to but not including the current season. What then if I applied it to the 45 games from the first 5 rounds of the current season?

My expected winning rate for the season so far is a healthy 56%, well ahead of the historical average of 52.3%. In reality, my actual winning rate would vary around this figure depending on how often you'd (randomly) chosen to reveal the last digit of the score of the winning or of the losing team (or of the Home team or the Away team), but you'd have needed to be very lucky indeed to have produced a mix, by chance, that lowered my expected winning rate below 50%.
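One way to compute an expected winning rate of this kind is to average, for each eligible game, over the two equally likely reveals (the winner's digit or the loser's digit). The sketch below assumes exactly that setup; the function name `expected_win_rate` and the sample games are hypothetical and are not the 45 games referred to above.

```python
# Digits for which the rule guesses "losing team".
LOSER_DIGITS = {0, 3, 7, 8}


def expected_win_rate(games):
    """Expected success rate of the digit rule over `games`.

    `games` is an iterable of (winner_score, loser_score) pairs. Games
    where both scores share a last digit are skipped, as in the bet's
    setup. Each remaining game contributes the average of the two
    possible reveals, each assumed to occur with probability 1/2.
    """
    total = 0.0
    eligible = 0
    for winner_score, loser_score in games:
        w_digit, l_digit = winner_score % 10, loser_score % 10
        if w_digit == l_digit:
            continue  # not part of the wager
        # Winner's digit revealed: the rule is right if it says "winner".
        right_on_winner = w_digit not in LOSER_DIGITS
        # Loser's digit revealed: the rule is right if it says "loser".
        right_on_loser = l_digit in LOSER_DIGITS
        total += 0.5 * right_on_winner + 0.5 * right_on_loser
        eligible += 1
    if eligible == 0:
        raise ValueError("no eligible games")
    return total / eligible


# Hypothetical results for illustration only (NOT real AFL scores).
sample_games = [(104, 80), (96, 73), (85, 62), (90, 74), (111, 101)]
rate = expected_win_rate(sample_games)
```

The actual rate realised in any one run of the bet would scatter around this expectation, depending on which digit happened to be revealed in each game.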