Predicting the TAB Sportsbet Margin

We've shown previously that it's possible to predict the TAB Sportsbet Bookmaker's head-to-head prices to a high level of accuracy using only MARS Ratings, the Interstate Status of a game and information about the very recent form of the Home team.
Read More

Deconstructing The 2011 TAB Sportsbet Bookmaker

To what extent can the head-to-head prices set by the TAB Sportsbet Bookmaker in 2011 be modelled using only the competing teams' MAFL MARS Ratings, their respective Venue Experiences, and the Interstate Status of the fixture?
Read More

What's 1% of Overround Worth?

Over on the Simulations blog I've been investigating how the returns to Kelly-staking and Level-staking respond to different levels of variability and bias in the bookmaker's team probability assessments, and to different levels of overround in that bookmaker's market prices. In this blog I'll investigate, using a purely mathematical approach, how a punter's expected return varies as the overround varies, depending on the size of the bias in the bookmaker's probability assessment and in the true probability of the team being wagered on.
Read More

Divining the Bookie Mind: Singularly Difficult

It's fun this time of year to mine the posted TAB Sportsbet markets in an attempt to glean what their bookie is thinking about the relative chances of the teams in each of the four possible Grand Final pairings.

Three markets provide us with the relevant information: those for each of the two Preliminary Finals, and that for the Flag.

From these markets we can deduce the following about the TAB Sportsbet bookie's current beliefs (making my standard assumption that the overround on each competitor in a contest is the same, which should be fairly safe given the range of probabilities that we're facing with the possible exception of the Dogs in the Flag market):

  • The probability of Collingwood defeating Geelong this week is 52%
  • The probability of St Kilda defeating the Dogs this week is 75%
  • The probability of Collingwood winning the Flag is about 34%
  • The probability of Geelong winning the Flag is about 32%
  • The probability of St Kilda winning the Flag is about 27%
  • The probability of the Western Bulldogs winning the Flag is about 6%

(Strictly speaking, the last probability is redundant since it's implied by the three before it.)

What I'd like to know is what these explicit probabilities imply about the implicit probabilities that the TAB Sportsbet bookie holds for each of the four possible Grand Final matchups - that is for the probability that the Pies beat the Dogs if those two teams meet in the Grand Final; that the Pies beat the Saints if, instead, that pair meet; and so on for the two matchups involving the Cats and the Dogs, and the Cats and the Saints.

It turns out that the six probabilities listed above are insufficient to determine a unique solution for the four Grand Final probabilities I'm after - in mathematical terms, the relevant system that we need to solve is singular.

That system is (approximately) the following four equations, which we can construct on the basis of the six known probabilities and the mechanics of which team plays which other team this week and, depending on those results, in the Grand Final: 

  • 52% x Pr(Pies beat Dogs) + 48% x Pr(Cats beat Dogs) = 76%
  • 52% x Pr(Pies beat Saints) + 48% x Pr(Cats beat Saints) = 63.5%
  • 75% x Pr(Pies beat Saints) + 25% x Pr(Pies beat Dogs) = 66%
  • 75% x Pr(Cats beat Saints) + 25% x Pr(Cats beat Dogs) = 67.5%

(If you've a mathematical bent you'll readily spot the reason for the singularity in this system of equations: the coefficients in every equation sum to 1, as they must since they're complementary probabilities.)
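If you'd like to verify that singularity claim yourself, a short script (Python here, purely for illustration) confirms that the coefficient matrix has rank 3 rather than 4:

```python
# Checking that the 4-equation system above is singular (stdlib only).
# Unknowns, in order: Pr(Pies beat Dogs), Pr(Cats beat Dogs),
#                     Pr(Pies beat Saints), Pr(Cats beat Saints).

def matrix_rank(rows, tol=1e-9):
    """Rank of a matrix via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # choose the remaining row with the largest entry in this column
        pivot = max(range(rank, len(m)),
                    key=lambda r: abs(m[r][col]), default=None)
        if pivot is None or abs(m[pivot][col]) < tol:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            f = m[r][col] / m[rank][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

A = [
    [0.52, 0.48, 0.00, 0.00],  # 52% Pr(Pies bt Dogs) + 48% Pr(Cats bt Dogs)
    [0.00, 0.00, 0.52, 0.48],  # 52% Pr(Pies bt Saints) + 48% Pr(Cats bt Saints)
    [0.25, 0.00, 0.75, 0.00],  # 25% Pr(Pies bt Dogs) + 75% Pr(Pies bt Saints)
    [0.00, 0.25, 0.00, 0.75],  # 25% Pr(Cats bt Dogs) + 75% Pr(Cats bt Saints)
]

print(matrix_rank(A))  # 3, not 4: the system is singular
```

The rank deficiency is exactly the complementary-probabilities observation above: one row is a linear combination of the other three.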

Whilst there's no single solution to those four equations - actually there are an infinite number of them, so you'll be relieved to know that I won't be listing them all here - the fact that probabilities must lie between 0 and 1 constrains the set of feasible solutions and allows us to bound the four probabilities we're after.

So, I can assert that, as far as the TAB Sportsbet bookie is concerned:

  • The probability that Collingwood would beat St Kilda if that were the Grand Final matchup - Pr(Pies beat Saints) in the above - is between about 55% and 70%
  • The probability that Collingwood would beat the Dogs if that were the Grand Final matchup is higher than 54% and, of course, less than or equal to 100%.
  • The probability that Geelong would beat St Kilda if that were the Grand Final matchup is between 57% and 73%
  • The probability that Geelong would beat the Dogs if that were the Grand Final matchup is higher than 50.5% and less than or equal to 100%.

One straightforward implication of these assertions is that the TAB Sportsbet bookie currently believes the winner of the Pies v Cats game on Friday night will start as favourite for the Grand Final. That's an interesting conclusion when you recall that the Saints beat the Cats in week 1 of the Finals.

We can be far more definitive about the four probabilities if we're willing to set the value of any one of them, as this then uniquely defines the other three.

So, let's assume that the bookie thinks that the probability of Collingwood defeating the Dogs if those two make the Grand Final is 80%. Given that, we can say that the bookie must also believe that:

  • The probability that Collingwood would beat St Kilda if that were the Grand Final matchup is about 61%.
  • The probability that Geelong would beat St Kilda if that were the Grand Final matchup is about 66%.
  • The probability that Geelong would beat the Dogs if that were the Grand Final matchup is about 72%.

Together, that forms a plausible set of probabilities, I'd suggest, although the Geelong v St Kilda probability is higher than I'd have guessed. The only way to reduce that probability though is to also reduce the probability of the Pies beating the Dogs.
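The back-solving that produces those three numbers is mechanical once Pr(Pies beat Dogs) is fixed - here's a small Python sketch using the four (approximate) equations above:

```python
# Back-solving the other three matchup probabilities once Pr(Pies beat Dogs)
# is fixed, using the four approximate equations listed earlier.

def solve_given_pies_dogs(p_pies_dogs):
    """Return (Cats bt Dogs, Pies bt Saints, Cats bt Saints) implied
    by a chosen value of Pr(Pies beat Dogs)."""
    p_cats_dogs = (0.76 - 0.52 * p_pies_dogs) / 0.48       # equation 1
    p_pies_saints = (0.66 - 0.25 * p_pies_dogs) / 0.75     # equation 3
    p_cats_saints = (0.635 - 0.52 * p_pies_saints) / 0.48  # equation 2
    return p_cats_dogs, p_pies_saints, p_cats_saints

cats_dogs, pies_saints, cats_saints = solve_given_pies_dogs(0.80)
print(round(pies_saints, 2))  # ~0.61
print(round(cats_saints, 2))  # ~0.66
print(round(cats_dogs, 2))    # ~0.72
```

Equation 4 serves as a consistency check on the result; with the rounded inputs it holds only approximately, which is why the probabilities above carry an "about".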

If you want to come up with your own rough numbers, choose your own probability for the Pies v Dogs matchup and then adjust the other three probabilities using the four equations above or using the following approximation:

For every 5% that you add to the Pies v Dogs probability:

  • subtract 1.5% from the Pies v Saints probability
  • add 2% to the Cats v Saints probability, and
  • subtract 5.5% from the Cats v Dogs probability

If you decide to reduce rather than increase the probability for the Pies v Dogs game then move the other three probabilities in the direction opposite to that prescribed in the above. Also, remember that you can't drop the Pies v Dogs probability below 55% nor raise it above 100% (no matter how much better than the Dogs you think the Pies are, the laws of probability must still be obeyed.)

Alternatively, you can just use the table below if you're happy to deal only in 5% increments of the Pies v Dogs probability. Each row corresponds to a set of the four probabilities that is consistent with the TAB Sportsbet markets as they currently stand.

2010 - Grand Final Probabilities.png

I've highlighted the four rows in the table that I think are the ones most likely to match the actual beliefs of the TAB Sportsbet bookie. That narrows each of the four probabilities into a 5-15% range.

At the foot of the table I've then converted these probability ranges into equivalent fair-value price ranges. You should take about 5% off these prices if you want to obtain likely market prices.

Line Betting: A Codicil

While contemplating the result from an earlier blog, which was that home teams had higher handicap-adjusted margins and won at a rate significantly higher than 50% on line betting - virtually regardless of the start they were giving or receiving - I wondered if the source of this anomaly might be that the bookie gives home teams a slightly better deal in setting line margins.
Read More

A Line Betting Enigma

The TAB Sportsbet bookmaker is, as you know, a man to be revered and feared in equal measure. Historically, his head-to-head prices have been so exquisitely well-calibrated that I instinctively compare any model I construct with the forecasts he produces. Finding that a model of mine has historically outperformed him sends me scuttling off to determine what error I've made in constructing it, what piece of information I've used that, in truth, was only available with the benefit of hindsight.
Read More

Predicting Head-to-Head Market Prices

In earlier blogs I've claimed that bookie prices contain little information useful for predicting victory margins beyond what can be derived from a statistical analysis of recent results and an understanding of game venues.
Read More

The Relationship Between Head-to-Head Price and Points Start

I've found yet another MAFL-related use for the Eureqa tool, this time to determine the precise relationship between a team's head-to-head price and the start it's giving or receiving on line betting. A simple plot of the history of a team's head-to-head price (or the probability that can be inferred from it) versus its start on line betting makes it obvious that there's a relationship between the two and that it's a non-linear one, but in the past I've been constrained by my own (lack of) ingenuity and persistence in generating sufficient possibilities to find its exact nature.
Read More

What Do Bookies Know That We Don't?

Bookies, I think MAFL has comprehensively shown, know a lot about football, but just how much more do they know than what you or I might glean from a careful review of each team's recent results and some other fairly basic knowledge about the venues at which games are played?
Read More

What Price the Saints to Beat the Cats in the GF?

If the Grand Final were to be played this weekend, what prices would be on offer?

We can answer this question for the TAB Sportsbet bookie using his prices for this week's games, his prices for the Flag market and a little knowledge of probability.

Consider, for example, what must happen for the Saints to win the flag. They must beat the Dogs this weekend and then beat whichever of the Cats or the Pies wins the other Preliminary Final. So, there are two mutually exclusive ways for them to win the Flag.

In terms of probabilities, we can write this as:

Prob(St Kilda Wins Flag) =

Prob(St Kilda Beats Bulldogs) x Prob (Geelong Beats Collingwood) x Prob(St Kilda Beats Geelong) +

Prob(St Kilda Beats Bulldogs) x Prob (Collingwood Beats Geelong) x Prob(St Kilda Beats Collingwood)

We can write three more equations like this, one for each of the other three Preliminary Finalists.
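The bolded identity above is just the law of total probability, conditioning on who wins the other Preliminary Final. As a small Python function (the matchup probabilities in the example call are placeholders of my own, not the bookie's numbers):

```python
# Pr(St Kilda wins Flag), conditioning on the winner of the other
# Preliminary Final. The same function works for any of the four teams
# with the appropriate probabilities plugged in.

def flag_prob(p_beats_dogs, p_geel_beats_coll, p_beats_geel, p_beats_coll):
    """Pr(team wins Flag) = Pr(wins this week) x
    [Pr(Geel wins other PF) x Pr(beats Geel)
     + Pr(Coll wins other PF) x Pr(beats Coll)]."""
    return (p_beats_dogs * p_geel_beats_coll * p_beats_geel
            + p_beats_dogs * (1 - p_geel_beats_coll) * p_beats_coll)

# Illustrative only: the last two arguments are guesses, not market-implied
print(flag_prob(0.703, 0.678, 0.53, 0.60))
```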

Now if we assume that the bookie's overround has been applied to each team equally then we can, firstly, calculate the bookie's probability of each team winning the Flag based on the current Flag market prices which are St Kilda $2.40; Geelong $2.50; Collingwood $5.50; and Bulldogs $7.50.

If we do this, we obtain:

  • Prob(St Kilda Wins Flag) = 36.8%
  • Prob(Geelong Wins Flag) = 35.3%
  • Prob(Collingwood Wins Flag) = 16.1%
  • Prob(Bulldogs Win Flag) = 11.8%
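The arithmetic behind those four numbers is simply to invert each price and then rescale so the probabilities sum to one - a few lines of Python, assuming (as above) the overround is spread equally:

```python
# Removing the overround from the Flag market prices, assuming it is
# applied equally to all four teams.

flag_prices = {"St Kilda": 2.40, "Geelong": 2.50,
               "Collingwood": 5.50, "Bulldogs": 7.50}

raw = {team: 1 / price for team, price in flag_prices.items()}
total = sum(raw.values())  # exceeds 1; the excess is the overround
probs = {team: p / total for team, p in raw.items()}

for team, p in probs.items():
    print(f"{team}: {p:.1%}")  # St Kilda 36.8%, Geelong 35.3%, ...
```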

Next, from the current head-to-head prices for this week's games, again assuming equally applied overround, we can calculate the following probabilities:

  • Prob(St Kilda Beats Bulldogs) = 70.3%
  • Prob(Geelong Beats Collingwood) = 67.8%

Armed with those probabilities and the four equations of the form of the one above in bold we come up with a set of four equations in four unknowns, the unknowns being the implicit bookie probabilities for all the possible Grand Final matchups.

To lapse into the technical side of things for a second, we have a system of equations Ax = b that we want to solve for x. But, it turns out, the A matrix is rank-deficient. Mathematically this means that there are an infinite number of solutions for x; practically it means that we need to define one of the probabilities in x and we can then solve for the remainder.

Which probability should we choose?

I feel most confident about setting a probability - or a range of probabilities - for a St Kilda v Geelong Grand Final. St Kilda surely would be slight favourites, so let's solve the equations for Prob(St Kilda Beats Geelong) equal to 51% to 57%.

Each column of the table above provides a different solution and is obtained by setting the probability in the top row and then solving the equations to obtain the remaining probabilities.

The solutions in the first 5 columns all have the same characteristic, namely that the Saints are considered more likely to beat the Cats than they are to beat the Pies. To steal a line from Get Smart, I find that hard to believe, Max.

Inevitably then we're drawn to the last two columns of the table, which I've shaded in gray. Either of these solutions, I'd contend, is a valid possibility for the TAB Sportsbet bookie's true current Grand Final matchup probabilities.

If we turn these probabilities into prices, add a 6.5% overround to each, and then round up or down as appropriate, this gives us the following Grand Final matchup prices.

St Kilda v Geelong

  • $1.80/$1.95 or $1.85/$1.90

St Kilda v Collingwood

  • $1.75/$2.00 or $1.70/$2.10

Geelong v Bulldogs

  • $1.50/$2.45 or $1.60/$2.30

Collingwood v Bulldogs

  • $1.65/$2.20 or $1.50/$2.45
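The probability-to-price conversion used above is easy to reproduce - deflate the fair price by the overround, then round to the nearest 5 cents. A sketch (the example probabilities are illustrative, not the table's actual values):

```python
# Converting a matchup probability into a market price carrying a 6.5%
# overround, rounded to TAB-style 5-cent increments.

def market_price(prob, overround=0.065):
    """Fair price 1/prob, deflated by the overround, to the nearest 5c."""
    price = 1 / (prob * (1 + overround))
    return round(price * 20) / 20  # 20ths of a dollar = 5-cent steps

# e.g. a 52%/48% matchup prices up as $1.80/$1.95
print(market_price(0.52), market_price(0.48))
```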

And the Last Shall be First (At Least Occasionally)

So far we've learned that handicap-adjusted margins appear to be normally distributed with a mean of zero and a standard deviation of 37.7 points. That means that the unadjusted margin - from the favourite's viewpoint - will be normally distributed with a mean equal to minus the handicap and a standard deviation of 37.7 points. So, if we want to simulate the result of a single game we can generate a random Normal deviate (surely a statistical contradiction in terms) with this mean and standard deviation.

Alternatively, we can, if we want, work from the head-to-head prices if we're willing to assume that the overround attached to each team's price is the same. If we assume that, then the home team's probability of victory is the head-to-head price of the underdog divided by the sum of the favourite's head-to-head price and the underdog's head-to-head price.

So, for example, if the market was Carlton $3.00 / Geelong $1.36, then Carlton's probability of victory is 1.36 / (3.00 + 1.36) or about 31%. More generally let's call the probability we're considering P%.

Working backwards then we can ask: what value of x for a Normal distribution with mean 0 and standard deviation 37.7 puts P% of the distribution on the left? This value will be the appropriate handicap for this game.

Again an example might help, so let's return to the Carlton v Geelong game from earlier and ask what value of x for a Normal distribution with mean 0 and standard deviation 37.7 puts 31% of the distribution on the left? The answer is -18.5. This is the negative of the handicap that Carlton should receive, so Carlton should receive 18.5 points start. Put another way, the head-to-head prices imply that Geelong is expected to win by about 18.5 points.

With this result alone we can draw some fairly startling conclusions.

In a game with prices as per the Carlton v Geelong example above, we know that 69% of the time this match should result in a Geelong victory. But, given our empirically-based assumption about the inherent variability of a football contest, we also know that Carlton, as well as winning 31% of the time, will win by 6 goals or more about 1 time in 14, and will win by 10 goals or more a little less than 1 time in 50. All of which is ordained to be exactly what we should expect when the underlying stochastic framework is that Geelong's victory margin should follow a Normal distribution with a mean of 18.5 points and a standard deviation of 37.7 points.
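All of those figures fall out of a few lines of Python using the standard library's Normal distribution - here's the Carlton v Geelong example worked end to end:

```python
from statistics import NormalDist

# The worked Carlton v Geelong example: implied probability, the
# equivalent line-betting start, and the tail probabilities quoted above.

SD = 37.7                      # empirical SD of handicap-adjusted margins
carlton, geelong = 3.00, 1.36  # head-to-head prices

# Equal-overround assumption: underdog probability from the two prices
p_carlton = geelong / (carlton + geelong)   # ~0.31

# The start Carlton should receive is minus the 31st percentile of N(0, SD)
start = -NormalDist(0, SD).inv_cdf(p_carlton)  # ~18.5 points

# Carlton's margin is then N(-start, SD); tail probabilities follow
carlton_margin = NormalDist(-start, SD)
p_win_by_36 = 1 - carlton_margin.cdf(36)  # 6+ goals: about 1 in 14
p_win_by_60 = 1 - carlton_margin.cdf(60)  # 10+ goals: a bit under 1 in 50
```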

So, given only the head-to-head prices for each team, we could readily simulate the outcome of the same game as many times as we like and marvel at the frequency with which apparently extreme results occur. All this is largely because 37.7 points is a sizeable standard deviation.

Well if simulating one game is fun, imagine the joy there is to be had in simulating a whole season. And, following this logic, if simulating a season brings such bounteous enjoyment, simulating say 10,000 seasons must surely produce something close to ecstasy.

I'll let you be the judge of that.

Anyway, using the Wednesday noon (or nearest available) head-to-head TAB Sportsbet prices for each of Rounds 1 to 20, I've calculated the relevant team probabilities for each game using the method described above and then, in turn, used these probabilities to simulate the outcome of each game after first converting these probabilities into expected margins of victory.

(I could, of course, have just used the line betting handicaps but these are posted for some games on days other than Wednesday and I thought it'd be neater to use data that was all from the one day of the week. I'd also need to make an adjustment for those games where the start was 6.5 points as these are handled differently by TAB Sportsbet. In practice it probably wouldn't have made much difference.)

Next, armed with a simulation of the outcome of every game for the season, I've formed the competition ladder that these simulated results would have produced. Since my simulations are of the margins of victory and not of the actual game scores, I've needed to use points differential - that is, total points scored in all games less total points conceded - to separate teams with the same number of wins. As I've shown previously, this is almost always a distinction without a difference.

Lastly, I've repeated all this 10,000 times to generate a distribution of the ladder positions that might have eventuated for each team across an imaginary 10,000 seasons, each played under the same set of game probabilities, a summary of which I've depicted below. As you're reviewing these results keep in mind that every ladder has been produced using the same implicit probabilities derived from actual TAB Sportsbet prices for each game and so, in a sense, every ladder is completely consistent with what TAB Sportsbet 'expected'.
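The whole procedure can be sketched in miniature. The version below uses a hypothetical four-team fixture with made-up expected margins, since the actual price-implied margins aren't reproduced here, but the machinery - simulate each margin as Normal with SD 37.7, tally wins with points differential as tiebreaker, repeat 10,000 times - is the same:

```python
import random
from statistics import NormalDist  # not used directly; margins via gauss

# Miniature version of the season simulation described above.
# Teams and expected margins are illustrative placeholders.

SD = 37.7
random.seed(1)

teams = ["Geelong", "St Kilda", "Bulldogs", "Melbourne"]
# (home, away, expected home margin in points) - made-up numbers
fixture = [("Geelong", "St Kilda", 8), ("Geelong", "Bulldogs", 20),
           ("Geelong", "Melbourne", 35), ("St Kilda", "Bulldogs", 12),
           ("St Kilda", "Melbourne", 30), ("Bulldogs", "Melbourne", 18)]

def simulate_season():
    wins = {t: 0 for t in teams}
    diff = {t: 0.0 for t in teams}  # points differential as tiebreaker
    for home, away, expected in fixture:
        margin = random.gauss(expected, SD)
        wins[home if margin > 0 else away] += 1
        diff[home] += margin
        diff[away] -= margin
    # ladder: most wins first, points differential breaking ties
    return sorted(teams, key=lambda t: (wins[t], diff[t]), reverse=True)

# positions[team][k] counts finishes in ladder position k+1
positions = {t: [0] * len(teams) for t in teams}
for _ in range(10_000):
    for pos, team in enumerate(simulate_season()):
        positions[team][pos] += 1

for team in teams:
    print(team, positions[team])
```

Even with the margins heavily favouring the top team, the counts show every team finishing in a range of positions - the "repeatedly lucky weak team" effect discussed below.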

Simulated Seasons.png

The variability you're seeing in teams' final ladder positions is not due to my assuming, say, that Melbourne were a strong team in one season's simulation, an average team in another simulation, and a very weak team in another. Instead, it's because even weak teams occasionally get repeatedly lucky and finish much higher up the ladder than they might reasonably expect to. You know, the glorious uncertainty of sport and all that.

Consider the row for Geelong. It tells us that Geelong ranks 1st on average ladder position across the 10,000 simulations, with an average position of 1.5. The barchart in the 3rd column shows the aggregated results for all 10,000 simulations, the leftmost bar showing how often Geelong finished 1st, the next bar how often they finished 2nd, and so on.

The column headed 1st tells us in what proportion of the simulations the relevant team finished 1st, which, for Geelong, was 68%. In the next three columns we find how often the team finished in the Top 4, the Top 8, or Last. Finally we have the team's current ladder position and then, in the column headed Diff, a comparison of each team's current ladder position with its ranking based on average ladder position across the 10,000 simulations. This column provides a crude measure of how well or how poorly teams have fared relative to TAB Sportsbet's expectations, as reflected in their head-to-head prices.

Here are a few things that I find interesting about these results:

  • St Kilda miss the Top 4 about 1 season in 7.
  • Nine teams - Collingwood, the Dogs, Carlton, Adelaide, Brisbane, Essendon, Port Adelaide, Sydney and Hawthorn - all finish at least once in every position on the ladder. The Bulldogs, for example, top the ladder about 1 season in 25, miss the Top 8 about 1 season in 11, and finish 16th a little less often than 1 season in 1,650. Sydney, meanwhile, top the ladder about 1 season in 2,000, finish in the Top 4 about 1 season in 25, and finish last about 1 season in 46.
  • The ten most-highly ranked teams from the simulations all finished in 1st place at least once. Five of them did so about 1 season in 50 or more often than this.
  • Every team from ladder position 3 to 16 could, instead, have been in the Spoon position at this point in the season. Six of those teams had better than about a 1 in 20 chance of being there.
  • Every team - even Melbourne - made the Top 8 in at least 1 simulated season in 200. Indeed, every team except Melbourne made it into the Top 8 about 1 season in 12 or more often.
  • Hawthorn have either been significantly overestimated by the TAB Sportsbet bookie or deucedly unlucky, depending on your viewpoint. They are 5 spots lower on the ladder than the simulations suggest they should be.
  • In contrast, Adelaide, Essendon and West Coast are each 3 spots higher on the ladder than the simulations suggest they should be.

(In another blog I've used the same simulation methodology to simulate the last two rounds of the season and project where each team is likely to finish.)