Right now I'm reading Wayne L Winston's Mathletics, a book about the use of fairly simple mathematics and sports statistics to gain insights into the results of American sports. Inspired by this book, in particular by a piece on Pythagorean Expectation which relates the season-long winning percentage of a baseball team to the total runs that it's scored and allowed, I wondered if an AFL team's win percentage could be similarly predicted by a handful of summary statistics about its own and its opponents' scoring.
Initially I imagined that the game would have changed sufficiently over its 100-plus year history to preclude the existence of a simple relationship that would apply to the entirety of its history; I was wrong.
The following equation explains almost 90% of the variability of teams' end-of-season winning percentages across the period 1897-2010:
Predicted Winning Percentage = logistic(0.164 x (Own Scoring Shots per game - Opponent Scoring Shots per game) + 6.18 x (Own Conversion Rate - Opponent Conversion Rate))
(Recall that logistic(x) = exp(x)/(1+ exp(x)))
That's 114 years of AFL history summarised in one equation with 2 coefficients and 4 inputs.
So, for example, a team that, on average, produced 2 more scoring shots per game than its opponents and that converted them, on average, at the same rate as its opponents would be expected to win about exp(0.164 x 2 + 6.18 x 0)/(1 + exp(0.164 x 2 + 6.18 x 0)), or 58% of its games across the home-and-away season. That expectation holds whether the team was playing around 1900 or around 2000.
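For anyone who'd like to play with the numbers, here's a minimal Python sketch of the equation above. The coefficients are the ones quoted in the text; the function and variable names are my own choices for illustration, and conversion rates are assumed to be expressed as proportions (so a 2 percentage point edge is 0.02).

```python
import math

# Coefficients quoted in the text for the 1897-2010 fit.
SHOT_COEF = 0.164
CONVERSION_COEF = 6.18


def logistic(x):
    """logistic(x) = exp(x) / (1 + exp(x)), as defined in the text."""
    return math.exp(x) / (1 + math.exp(x))


def predicted_win_pct(shot_diff, conversion_diff):
    """Expected winning proportion for a team.

    shot_diff: own minus opponent scoring shots per game.
    conversion_diff: own minus opponent conversion rate, as a proportion.
    """
    return logistic(SHOT_COEF * shot_diff + CONVERSION_COEF * conversion_diff)


# The worked example: 2 extra scoring shots per game, equal conversion rates.
two_shot_edge = predicted_win_pct(2, 0.0)
print(f"Expected winning percentage: {two_shot_edge:.0%}")
```

A team with no edge on either measure comes out at exactly 50%, which is a handy sanity check on the formula.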
This equation allows you to ask and answer some interesting what-ifs. For example, if you're a team that currently generates the same number of scoring shots as your opponents and converts them at the same rate as your opponents - hence your expected winning percentage is 50% - would you rather swap in a player who generates 1 more scoring shot per game or one who increases your conversion rate by 2 percentage points?
An extra scoring shot per game lifts your team's winning percentage to exp(0.164)/(1+exp(0.164)) or 54.1%, while a 2 percentage point increase in your team's conversion rate lifts its winning percentage to exp(0.0618 x 2)/(1+exp(0.0618 x 2)) or 53.1%. So, you want the player who can generate 1 extra scoring shot per game.
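The same what-if can be sketched in a few lines of Python, along with the break-even point: the conversion-rate improvement that would be worth exactly one extra scoring shot per game. That break-even figure isn't stated in the text; it simply falls out of the ratio of the two quoted coefficients. The variable names are mine.

```python
import math

# Coefficients quoted in the text for the 1897-2010 fit.
SHOT_COEF = 0.164
CONVERSION_COEF = 6.18


def logistic(x):
    return 1 / (1 + math.exp(-x))


# Option 1: one extra scoring shot per game, conversion unchanged.
extra_shot = logistic(SHOT_COEF * 1)

# Option 2: conversion rate up by 2 percentage points (0.02 as a proportion).
extra_conversion = logistic(CONVERSION_COEF * 0.02)

# Conversion-rate gain (in percentage points) worth the same as one extra shot.
break_even_points = 100 * SHOT_COEF / CONVERSION_COEF

print(f"Extra scoring shot:   {extra_shot:.1%}")
print(f"Extra 2 pts conv.:    {extra_conversion:.1%}")
print(f"Break-even conv. gain: {break_even_points:.2f} percentage points")
```

On these coefficients, a player would need to lift the conversion rate by a bit over 2.6 percentage points to match one extra scoring shot per game.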
Pythagoras' Slant on AFL
Bill James' Pythagorean expectation equated a baseball team's expected winning percentage to the square of the runs it scored divided by the sum of the square of the runs it scored and the square of the runs its opponents scored (hence the Pythagorean nomenclature). I created an equivalent measure for AFL, which I'll call AFL_Pyth: the square of a team's points scored per game divided by the sum of that square and the square of its opponents' points scored per game. It too can be used in an equation that predicts a team's winning percentage almost as well as the earlier equation.
The new equation is:
Predicted Winning Percentage = logistic(8.24 x AFL_Pyth - 4.13)
This equation explains about 88% of the variability in team winning percentages across the period 1897 to 2010.
As an example of how it might be used, consider a team that, on average, scores 110 points per game and concedes 100. Its AFL_Pyth would be (110^2)/(110^2 + 100^2) or 0.548 and it would be expected to win logistic(8.24 x 0.548 - 4.13), or 59.4% of its games. Would such a team rather score one more goal per game on average or prevent the scoring of one goal per game on average by its opponents?
Well, if it scored one more goal per game, its AFL_Pyth would become 0.574 and it would be expected to win logistic(8.24 x 0.574 - 4.13), or 64.5% of its games. If, instead, it prevented one goal per game its expected winning percentage would become 65.3%. So, it would prefer better defence to better offence in this case. This result is true generally for all teams which, on average, score more points than their opponents. In fact, it's also true for any team that scores no more than 6 points fewer than its opponents, on average. Teams which, on average, score more than 6 points fewer than their opponents would, instead, prefer to score one more goal per game than to prevent one.
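Here's a short Python sketch of the AFL_Pyth calculation and the offence-versus-defence comparison. The coefficients (8.24 and -4.13) come from the equation above and a goal is worth 6 points; the function names are hypothetical ones I've chosen for illustration.

```python
import math

GOAL = 6  # points for a goal in AFL


def logistic(x):
    return 1 / (1 + math.exp(-x))


def afl_pyth(points_for, points_against):
    """Squared points for, over the sum of squared points for and against."""
    return points_for**2 / (points_for**2 + points_against**2)


def predicted_win_pct(points_for, points_against):
    """Expected winning proportion using the 1897-2010 AFL_Pyth fit."""
    return logistic(8.24 * afl_pyth(points_for, points_against) - 4.13)


def prefers_defence(points_for, points_against):
    """True if conceding one goal fewer beats scoring one goal more."""
    offence = predicted_win_pct(points_for + GOAL, points_against)
    defence = predicted_win_pct(points_for, points_against - GOAL)
    return defence > offence


base = predicted_win_pct(110, 100)
more_offence = predicted_win_pct(116, 100)
more_defence = predicted_win_pct(110, 94)
print(f"110-100: {base:.1%}")
print(f"116-100: {more_offence:.1%}")
print(f"110-94:  {more_defence:.1%}")

# Teams either side of the 6-point deficit threshold described above.
print(prefers_defence(95, 100))  # deficit of 5 points
print(prefers_defence(90, 100))  # deficit of 10 points
```

The threshold follows because AFL_Pyth depends only on the ratio of points for to points against: adding a goal gives a ratio of (F + 6)/A, removing one gives F/(A - 6), and the second is larger exactly when A - F is less than 6.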
One aspect of this 2nd equation that I think is interesting is how it implies that relatively minor differences in a team's offensive and defensive abilities translate into winning percentages sharply different from 50%. For example, a team that averages a score of just 103 to 100 and which therefore, on average, scores only 50.7% of points, has an expected winning percentage according to our equation of 52.8%.
The Last 30 Years (or so)
Both of the earlier equations were fitted to the entire period 1897 to 2010. For this next section I've restricted our attention to seasons 1980 to 2010 inclusive.
For this period, the initial equation becomes:
Predicted Winning Percentage = logistic(0.148 x (Own Scoring Shots per game - Opponent Scoring Shots per game) + 4.84 x (Own Conversion Rate - Opponent Conversion Rate))
Again the R-squared for the fit is just a shade under 90%.
Note that the coefficients on both terms in this new equation are smaller than they were for the earlier, equivalent equation. This means that a given level of superiority, either in terms of extra scoring shots or a higher rate of conversion of those scoring shots, translates into a smaller expected winning percentage, presumably because of the generally higher levels of scoring in recent history compared to the entire history of the competition.
So, for example, a team that on average generates 1 more scoring shot than its opponents and that converts them at a rate 1 percentage point higher than its opponents would be expected to win, using the original equation, 55.6% of its games. Using the new equation this percentage falls to 54.9%.
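To compare the two fits side by side, here's a small Python sketch that applies both sets of coefficients to the same team. The coefficient pairs are the ones quoted for each period; the dictionary layout and names are just my own way of organising them.

```python
import math

# (scoring shot coefficient, conversion rate coefficient) for each fitted period.
FITS = {
    "1897-2010": (0.164, 6.18),
    "1980-2010": (0.148, 4.84),
}


def logistic(x):
    return 1 / (1 + math.exp(-x))


def predicted_win_pct(shot_diff, conversion_diff, period):
    """Expected winning proportion under the fit for the given period."""
    shot_coef, conversion_coef = FITS[period]
    return logistic(shot_coef * shot_diff + conversion_coef * conversion_diff)


# One extra scoring shot per game and a 1 percentage point conversion edge.
results = {period: predicted_win_pct(1, 0.01, period) for period in FITS}
for period, win_pct in results.items():
    print(f"{period}: {win_pct:.1%}")
```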