Entropy in AFL Scoring (Revisited)

At the distinct risk of diving yet deeper into what was already a fairly esoteric topic, I'm going to return in this blog to the notion of entropy as it applies to VFL/AFL scoring, which I considered at some length in a previous blog. Consider yourself duly warned - this post is probably only for those of you who truly enjoyed that earlier blog.

I've been pondering the results from that earlier posting, which in essence were that:

  • in the modern era, it's harder to guess the number of goals that were scored by a team given the number of behinds that it scored than it is to guess the number of behinds that were scored by a team given the number of goals that it scored
  • it's always been harder to guess the winning score in a game given the losing score than it has been to guess the losing score given the winning score

I arrived at these conclusions by estimating the entropy (ie randomness) that remained in the relevant metric - goals scored, behinds scored, winning score or losing score - once we'd taken into account the information provided by the corresponding metric. 

What finally struck me while I was out walking was that one metric could remain harder to predict than its corresponding metric for two reasons:

  1. It was much harder to predict to begin with (ie in the absence of any information provided by knowledge of the corresponding metric)
  2. The corresponding metric didn't tell us much about the metric we were trying to predict

Now, in the first blog I used conditional entropy to tell me how much entropy remained once I'd accounted for what the corresponding metric told me, but I didn't estimate how much entropy there was to begin with in the metric I was predicting. The InfoTheo R package, which I used for the previous blog, has an entropy function that allows me to estimate exactly this.
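To make that concrete, here's a minimal sketch of the kind of calculation involved, using the entropy() and condentropy() functions from the infotheo package. The scores data frame below is synthetic, Poisson-generated data standing in for a single season of team-by-team goals and behinds; in practice the real scoring data would go in its place.

```r
library(infotheo)

# Synthetic stand-in for one season of team scores: one row per team per game
# (a real analysis would use the actual goals and behinds from match results)
set.seed(1)
scores <- data.frame(Goals   = rpois(400, lambda = 13),
                     Behinds = rpois(400, lambda = 12))

# Base entropy of each metric, in nats (infotheo uses natural logarithms)
H_goals   <- entropy(scores$Goals)
H_behinds <- entropy(scores$Behinds)

# Conditional entropy: the randomness remaining in one metric once we know
# the value of the other
H_goals_given_behinds <- condentropy(scores$Goals, scores$Behinds)
H_behinds_given_goals <- condentropy(scores$Behinds, scores$Goals)

# Sanity check via the chain rule: H(X|Y) = H(X,Y) - H(Y)
all.equal(H_goals_given_behinds,
          entropy(scores) - H_behinds)
```

Run season by season, calculations like these would produce series like the ones charted here.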

Here, firstly then, are the season-by-season entropy estimates for goals and behinds, along with the conditional entropy estimates from the earlier blog.

The two entropy lines - the red and the green - reveal that the number of goals scored by a team has higher entropy (ie is harder to predict) than the number of behinds scored by a team.

Whilst knowing the number of behinds scored by a team does reduce the entropy of the number of goals it scored, and knowing the number of goals scored likewise reduces the entropy in the number of behinds scored, the two reductions are insufficiently different in magnitude - or at least have been since the early 1980s - to make it easier to guess the number of goals that a team scored than the number of behinds that it scored, given knowledge of the corresponding metric.

Looking specifically at the reductions in entropy achieved by knowledge of the corresponding metric, we find that these have been steadily declining in percentage terms since the early 1980s, so much so that the link between goals scored and behinds scored seems to be disappearing. The existence of eras in this chart is striking but is perhaps a topic for another day. 
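As for what "percentage terms" might look like in code: one natural choice - and it is just one choice of normalisation, not necessarily the one behind the chart - is to take the mutual information between the two metrics as a proportion of the base entropy of the metric being predicted. This sketch reuses the synthetic scores data frame from above.

```r
library(infotheo)

# Mutual information I(Goals; Behinds): the reduction in entropy that
# knowledge of one metric provides about the other (it's symmetric)
MI <- mutinformation(scores$Goals, scores$Behinds)

# Proportional reduction in each metric's entropy, as a percentage
pct_reduction_goals   <- 100 * MI / entropy(scores$Goals)
pct_reduction_behinds <- 100 * MI / entropy(scores$Behinds)

# Equivalently, mutual information is base entropy less conditional entropy:
# I(X;Y) = H(X) - H(X|Y)
all.equal(MI, entropy(scores$Goals) - condentropy(scores$Goals, scores$Behinds))
```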

Performing similar analyses for the winning and losing score metrics, we arrive at the charts at right and below. In the first of them, where we've charted the base and conditional entropy metrics, we see the relatively high and increasing levels of entropy in winning and in losing scores across history, and the dramatic reduction in those entropies that occurs when we have knowledge of the corresponding metric. Knowing the losing score tells us a lot about the likely winning score (well, for one, it's larger than the losing score), and knowing the winning score tells us a lot about the likely losing score (it's smaller and non-negative).

In the end we wind up marginally better able to predict the losing score given the winning score than the winning score given the losing score, because the entropy in winning scores was higher to begin with and because, proportionately, the reduction in that entropy afforded by knowledge of the losing score - which is what is charted at left - is smaller.
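To make that two-part argument concrete: because H(X|Y) = H(X) - I(X;Y), each conditional entropy can be written as the base entropy multiplied by one minus the proportional reduction. The sketch below shows that decomposition on synthetic winning and losing scores (real game scores would go in their place).

```r
library(infotheo)

# Synthetic stand-in for game results: one winning and one losing score per game
set.seed(1)
winning <- sample(60:150, 400, replace = TRUE)
losing  <- winning - sample(1:59, 400, replace = TRUE)

# The (symmetric) mutual information between winning and losing scores
MI <- mutinformation(winning, losing)

# Conditional entropy = base entropy x (1 - proportional reduction)
H_win_given_lose <- entropy(winning) * (1 - MI / entropy(winning))
H_lose_given_win <- entropy(losing)  * (1 - MI / entropy(losing))

# These match the direct conditional entropy calculations
all.equal(H_win_given_lose, condentropy(winning, losing))
all.equal(H_lose_given_win, condentropy(losing, winning))
```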

(By the way, the entropy and conditional entropy figures reported in this and the earlier blog have been calculated in nats, that is, using base e. This is different from the bits unit that I've used for surprisals calculations in other blogs, which adopts a base of 2. You can convert from nats to bits by multiplying by log e to the base 2, which is about 1.443 - equivalently, by dividing by the natural log of 2, which is about 0.693.)
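In R, that conversion is a one-liner:

```r
# Convert an entropy in nats to bits: divide by ln(2), which is the same
# as multiplying by log2(e), roughly 1.443
nats_to_bits <- function(nats) nats / log(2)

nats_to_bits(1)  # about 1.443 bits per nat
```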