Matter of Stats

Does Crowd Size Affect Game Outcomes?

Based on empirical evidence we know that there is a home ground advantage in AFL which, in part, might be attributable to the pro-Home team leanings amongst the majority of the crowd. In this blog I want to explore a slightly different question about the effects of the crowd: specifically, does the size of the crowd matter too?

For all the regression models discussed in this blog I've used data for the full home-and-away seasons from 2006 to 2012. I've also used Entropy, as described in an earlier blog, as the measure of the relative strengths of the opposing teams where such a concept has seemed relevant.
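For readers who haven't seen the earlier blog, here is a minimal sketch of what the Entropy measure might look like. It assumes that Entropy is the standard two-outcome Shannon entropy, in bits, of the home team's implicit probability. That assumption is mine rather than a restatement of the earlier blog, though it is consistent with Entropy reaching its maximum of 1 at equal favouritism. The function name is purely illustrative.

```python
import math


def binary_entropy(p_home):
    """Shannon entropy, in bits, of a two-outcome contest.

    p_home is the (assumed) implicit probability of a home team win.
    """
    # A certain result carries no uncertainty, so entropy is zero.
    if p_home <= 0.0 or p_home >= 1.0:
        return 0.0
    p_away = 1.0 - p_home
    return -(p_home * math.log2(p_home) + p_away * math.log2(p_away))


# Entropy falls as the contest becomes more lopsided.
for p in (0.5, 0.7, 0.9):
    print(f"P(home win) = {p:.2f}   entropy = {binary_entropy(p):.3f}")
```

Under this definition a 50:50 contest has an Entropy of exactly 1, and the value shrinks towards 0 as one team becomes a heavier favourite.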

Total Points Scored

It's conceivable that the total points scored in a game could be influenced by the size of the crowd, though it's hard to present an a priori case for the sign of the effect. Larger crowds might elicit elevated performance levels, which could in turn lead to enhanced offensive effectiveness, heightened defensive effectiveness, or both. Alternatively, larger crowds might have the opposite effect on player performance levels and lead to weaker defence, sloppier offence, or both.

Factors other than crowd size are also likely to influence total scoring, namely: 

  • The identity of the home team
  • The identity of the away team
  • The game venue (which, because of its specific characteristics, might lend itself to higher or lower scoring levels)
  • The day of the week on which the game is played (because, for example, Friday games are generally played at night when the prevailing conditions might be less conducive to scoring)

Lastly, it's possible that the relative strengths of the competing teams could have an effect on the total scoring. It might be, for instance, that more evenly-matched teams tend to nullify each other's offences, leading to lower scoring games.

The results of regressing total points scored on all the regressors mentioned above appear as the leftmost columns in the table below. In interpreting the coefficients in this - and in all other regressions in this blog - bear in mind that: 

  • The "reference game" is a Friday night clash between the Pies and the Roos played at the MCG.
  • The team and venue coefficients should generally be considered together. For example, whenever the Designated Home Team is Adelaide, the Venue can only be Adelaide Oval or Football Park, and the Designated Away Team cannot be Adelaide. 
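The idea of a "reference game" can be illustrated with a small sketch of dummy coding. This is not the code used to fit the models in this blog: the level lists are deliberately truncated, and the names REFERENCE, LEVELS and dummy_code are hypothetical. The point is only that the reference game is the one for which every indicator variable is zero, so each coefficient measures a shift relative to it.

```python
# Reference level for each categorical regressor, as described in the text.
REFERENCE = {"home": "Pies", "away": "Roos", "venue": "MCG", "day": "Friday"}

# Truncated, illustrative level lists (the real models have many more levels).
LEVELS = {
    "home": ["Pies", "Adelaide", "Essendon"],
    "away": ["Roos", "Adelaide", "Essendon"],
    "venue": ["MCG", "Docklands", "Football Park"],
    "day": ["Friday", "Saturday", "Sunday"],
}


def dummy_code(game):
    """Return 0/1 indicators for every non-reference level of each factor."""
    row = {}
    for factor, levels in LEVELS.items():
        for level in levels:
            if level == REFERENCE[factor]:
                continue  # the reference level gets no column of its own
            row[f"{factor}={level}"] = 1 if game[factor] == level else 0
    return row


reference_game = dict(REFERENCE)
sunday_game = {"home": "Essendon", "away": "Roos", "venue": "Docklands", "day": "Sunday"}

print(dummy_code(reference_game))  # every indicator is 0
print(dummy_code(sunday_game))     # home, venue and day indicators switch on
```

For the Essendon example, the fitted value would be the intercept plus the coefficients on the three indicators that are switched on, which is why team and venue coefficients need to be read together.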

Some of the key conclusions for the total score model are that: 

  • Some teams are more likely to be involved in high scoring games than others, for example Essendon at home playing just about any other team at Docklands, especially if the game's on a Sunday.
  • Similarly, some venues are more likely to produce high-scoring contests (though heed should be taken here of my earlier comment about considering only observed home team / venue pairings). On average, for example, games at Docklands have generated 2 goals more than have games played at the MCG, once we control for the teams playing and the day of week for the fixture.
  • Whilst games with more closely-matched teams do generally produce fewer points, the maximum size of this effect (when Entropy is 1, signifying equal favouritism) is less than 5 points and is not statistically significantly different from zero.
  • The effect of crowd size is positive, but very small and also not statistically significant. An additional 10,000 fans would be expected to increase the total score by fewer than 2 points. 

Home Team and Away Team Scores

We've just seen that the size of the crowd has little effect on the aggregate points scored by the competing teams. Perhaps, instead of affecting the total, it differentially affects the home or the away team scores. The next two sets of coefficients in the table are for OLS regressions with home team score (and then with away team score) as the target variables.

For these regressions I've controlled for the relative strength of the home team compared to the away team by including as a regressor the TAB Bookmaker's pre-game (implicit) probability assessment of the home team. This variable is, as we'd expect, monumentally statistically significant (i.e. has a very small p-value) in both regressions.
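This blog doesn't spell out how the implicit probability is derived from the bookmaker's prices. One common approach, shown here purely as an assumption, is to invert the two head-to-head prices and rescale them so that they sum to 1, which removes the bookmaker's overround. The function name and the example prices are hypothetical.

```python
def implied_home_probability(home_price, away_price):
    """Normalise inverse head-to-head prices so the two probabilities sum to 1.

    This is one common convention, not necessarily the one used for the
    regressions in this blog.
    """
    inv_home = 1.0 / home_price
    inv_away = 1.0 / away_price
    return inv_home / (inv_home + inv_away)


# Hypothetical prices: home team at $1.60, away team at $2.35.
p_home = implied_home_probability(1.60, 2.35)
print(f"Implicit home team probability: {p_home:.3f}")
```

Other conventions allocate the overround differently between favourite and underdog, and would give slightly different probabilities for the same pair of prices.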

With these models too it's possible to work through various interesting combinations of teams, venue and day of week to come up with an estimate of the likely home and the likely away team scores, but I'll restrict my comments here to noting the positive, small and once again not statistically significant coefficients on the Crowd variable. One thing that is interesting to note is that the coefficient on the Crowd variable for the Away team score is larger than the coefficient for the Home team score, suggesting that larger crowds are more beneficial to the away team than they are to the home team - something we'll return to a little later in this blog.

Absolute Margin

If crowd size has little to no effect on the points scored by the home team, the points scored by the away team, or the sum of these scores, maybe it affects the difference between them. Perhaps, for example, larger crowds lead to closer games.

For this regression I've again used Entropy as a proxy for the bookmaker's pre-game assessment of the relative strengths of the competing teams, and I've also included a binary variable that is 1 if the home team is the clear favourite in the game.

Once again we find that the effect of crowd size is positive, numerically small, and not statistically significant.

(You could argue, I suppose, that the bookmaker has already factored the impact of the expected crowd size into his pre-game prices so that the true magnitude of the crowd size effect is not properly reflected in the coefficients shown here, but re-running the model excluding the Entropy variable swaps the sign on the Crowd variable and still doesn't make it statistically significant.

Additionally you might claim that including team, venue and day of week variables usurps some of the impact of crowd size. A model with Crowd as the only regressor returns a very small, negative, and statistically not significant coefficient on that variable.)

Game Result

For the final regression in this blog I'm going to switch from considering scores to considering only the game outcome from the viewpoint of the home team. I'll be fitting a binary logit to this outcome (and ignoring drawn games) and using for regressors: 

  • the home team, away team, venue and day of week, as before 
  • the implicit home team probability as reflected by the pre-game head-to-head prices
  • some interaction variables which are crowd size multiplied by a 7-level factor which expresses, in a discrete manner, the chances of the home team. 

For this 7-level factor we define: 

  • a Heavy Underdog as a team priced at over $6.25
  • a Moderate Underdog as a team priced over $3.00 and up to $6.25
  • a Narrow Underdog as a team priced over $2.10 and up to $3.00
  • a Near Equal Favourite as a team priced over $1.75 and up to $2.10
  • a Narrow Favourite as a team priced over $1.35 and up to $1.75
  • a Moderate Favourite as a team priced over $1.15 and up to $1.35
  • a Heavy Favourite as a team priced at $1.15 or less
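The price bands above translate directly into a lookup, sketched below together with one way of forming the crowd interaction terms. The function names and the dictionary keys are hypothetical, and the unit used for crowd size is left to the caller.

```python
# Exclusive lower price bound for each class, taken from the definitions above.
CLASSES = [
    (6.25, "Heavy Underdog"),
    (3.00, "Moderate Underdog"),
    (2.10, "Narrow Underdog"),
    (1.75, "Near Equal Favourite"),
    (1.35, "Narrow Favourite"),
    (1.15, "Moderate Favourite"),
]


def home_team_class(price):
    """Classify a home team by its pre-game head-to-head price."""
    for lower_bound, label in CLASSES:
        if price > lower_bound:
            return label
    return "Heavy Favourite"  # priced at $1.15 or less


def crowd_interactions(price, crowd):
    """Return seven interaction terms: crowd size for the home team's own
    class and zero for the other six."""
    labels = [label for _, label in CLASSES] + ["Heavy Favourite"]
    actual = home_team_class(price)
    return {f"Crowd x {label}": (crowd if label == actual else 0.0) for label in labels}


print(home_team_class(1.90))
print(crowd_interactions(1.90, 45.0))
```

Because exactly one of the seven terms is non-zero for any given game, each interaction coefficient captures the marginal effect of crowd size for home teams in that class only.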

This variable is, of course, correlated with the implicit home team probability, but this formulation allows us to explore the marginal impact of crowd size for different classes of home team underdogs and favourites, having already accounted for the larger, overall effect of the home team's relative strength.

What we find is that larger crowds are generally detrimental to a home team's victory chances, especially to home teams that are near equal favourites. (This result is consistent with the one we found earlier, where the coefficient on crowd size for away teams was larger than the coefficient on crowd size for home teams.)

For such teams the coefficient on the relevant Crowd interaction term is negative and statistically significant. To give some context to the coefficient of -0.017, consider a near equal favourite home team that would otherwise be a 50% proposition and then imagine that an extra 10,000 fans turned out for the game unexpectedly. This would lower that home team's victory probability by over 4 percentage points.
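The arithmetic behind that example can be sketched as follows. The calculation assumes that crowd size enters the model in thousands of fans, so that the coefficient of -0.017 applies per 1,000 extra attendees. The text doesn't state the unit, but this assumption does reproduce a fall of a little over 4 percentage points. Variable names are illustrative.

```python
import math


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def logistic(x):
    """Inverse of the logit: converts log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


CROWD_COEF = -0.017       # from the text; ASSUMED to be per 1,000 fans
base_probability = 0.50   # home team is otherwise a 50% proposition
extra_crowd = 10.0        # an extra 10,000 fans, expressed in thousands

# Shift the log-odds by the crowd effect, then convert back to a probability.
new_probability = logistic(logit(base_probability) + CROWD_COEF * extra_crowd)
drop = base_probability - new_probability

print(f"New home win probability: {new_probability:.3f}")
print(f"Fall in probability:      {drop:.3f}")
```

Because the logistic curve is steepest at 50%, the same 10,000 extra fans would move the probability by less for a home team starting further from even money.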

Conclusions

In summary it seems that crowd size has little if any effect on the total points scored in a game, the points scored by the home team, the points scored by the away team, or the difference between the points scored by the home and the away teams, once we control for the identities of the teams participating, the venue, the day of the week and the bookmaker-assessed strengths, relative or absolute, of the competing teams. If there is any effect of crowd size, it's positive in all four cases.

Crowd size does, however, appear to influence game outcomes for home teams of differing strengths, with home teams being generally less likely to win than their pre-game odds would suggest as crowds get larger, and with near equal, narrow and moderate favourites most affected of all (though only for near equal favourites can we claim that the effect is statistically significant).