The Relationship Between Expected Victory Margins and Estimated Win Probabilities

One of the challenges you face early on when you’re forecasting football results is how to model the relationship between your forecasted game margin and your estimate of team victory probabilities.

There are no doubt a number of viable ways of doing this, but one obvious approach is to fit a logistic equation of the form

    Probability of Victory = 1 / (1 + exp(-k × Expected Margin))

where k is a constant to be fitted.

This provides an S-shaped mapping where estimated win probabilities respond most to changes in expected margins when those margins are near zero. It also ensures that all estimated probabilities lie between 0 and 1, which they must.

I’ve used this form of mapping for many years with values of k in the 0.04 to 0.05 range, and have found it to be very serviceable. I’ve also previously fitted it to bookmaker data and found that it generally provides an excellent fit.
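As a minimal sketch of this mapping, assuming the logistic form is 1 / (1 + exp(-k × Expected Margin)), R's built-in plogis does the work. The function name margin_to_prob and the example margins are illustrative only and are not taken from the original analysis.

```r
# Logistic mapping from expected margin (in points) to win probability.
# ASSUMPTION: the form is 1 / (1 + exp(-k * margin)); margin_to_prob is a
# hypothetical helper name used only for illustration.
margin_to_prob <- function(margin, k = 0.045) {
  plogis(k * margin)  # plogis(x) = 1 / (1 + exp(-x))
}

margins <- seq(-60, 60, by = 20)

# Compare the two ends of the 0.04 to 0.05 range for k
probs <- data.frame(
  margin = margins,
  k_0.04 = round(margin_to_prob(margins, k = 0.04), 3),
  k_0.05 = round(margin_to_prob(margins, k = 0.05), 3)
)
print(probs)
```

An expected margin of zero always maps to 50%, and the curve is steepest there, which is the S-shape described above. A larger k simply makes the curve steeper.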

THE WRINKLE

It’s always intrigued me, however, that the same bookmaker can have markets up for two games of AFL, both having the same head-to-head prices for the home and away teams, but each having a different line handicap (and, by implication, a different expected game margin). Why is this?

Might it be that the head-to-head and line markets can safely be priced somewhat independently or, instead, might there be another variable that needs to be included in the mapping?

The equation I provided earlier doesn’t allow for such variation - it provides a single mapping between any given estimated victory probability and expected margin - so we’ll need to come up with some variation of it if we can think of another variable to introduce.

But, what variable?

EXPECTED SCORING OPPORTUNITIES

Consider an imaginary sport in which, in Game 1, it is expected that there will be 10 scoring events, 6 of which are expected to accrue to Team A and 4 to Team B. Here then, Team A has an expected victory margin of 2. Now imagine that there is some luck involved in every scoring event such that the probability of scoring is proportional to each team’s expected share of the overall scoring.

Assuming that every scoring event is independent, Team A has a 60% chance of scoring any given point, and Team B has a 40% chance.

Using the binomial distribution, we estimate (in R):

  • Prob(Team A wins) = sum(dbinom(6:10,10,0.6)) = 63%

  • Prob(Team A ties) = dbinom(5,10,0.6) = 20%

Now imagine that, in Game 2, played against a different opponent, it is expected that there will be 20 scoring events, 11 of which are expected to accrue to Team A and 9 to Team C. So, Team A once again has an expected victory margin of 2.

Using the binomial distribution again, we now have:

  • Prob(Team A wins) = sum(dbinom(11:20,20,0.55)) = 59%

  • Prob(Team A ties) = dbinom(10,20,0.55) = 16%

So, in a higher scoring game, a given expected victory margin maps to a lower probability for the favourite.

Part of the reason for this, as we can see here, is that having the same expected margin in a game with a higher total means that the teams’ relative scoring abilities are more similar. In Game 1 we had Team A as being 60% likely to score next, while in Game 2 they’re only 55% likely to score next.
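The two binomial calculations can be wrapped into one small function, sketched here under the same assumptions as the text (independent scoring events, each worth one point). The name binomial_outcomes is hypothetical and used only for illustration.

```r
# Win, tie and loss probabilities for Team A when each of n_events scoring
# events independently goes to Team A with probability exp_a / n_events.
# binomial_outcomes is a hypothetical helper name used for illustration.
binomial_outcomes <- function(n_events, exp_a) {
  p <- exp_a / n_events
  a_events <- 0:n_events
  pmf <- dbinom(a_events, n_events, p)
  c(
    win  = sum(pmf[a_events > n_events / 2]),
    tie  = sum(pmf[a_events == n_events / 2]),  # zero when n_events is odd
    loss = sum(pmf[a_events < n_events / 2])
  )
}

game1 <- binomial_outcomes(10, 6)   # Game 1: 10 events, 6 expected for Team A
game2 <- binomial_outcomes(20, 11)  # Game 2: 20 events, 11 expected for Team A
print(round(rbind(game1, game2), 2))
```

Both games have an expected margin of 2, so any difference between the two rows comes purely from the number of scoring events.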

Might this offer a clue to the missing variable?

THE DATA

For the analysis in this blog, we’re going to use the data for the AFL from AusSportsBetting, which includes, among other things, closing head-to-head, line, and totals prices for a large number of games.

We will:

  • Create estimated home team victory probabilities by calculating Away Team Closing H2H Price / ( Away Team Closing H2H Price + Home Team Closing H2H Price). This is by far the simplest way of removing the vig from head-to-head prices (to ensure that they sum to 1 across the two outcomes of home win and away win). Although other methods are possible (eg power, Shin, Weights Proportional to Odds, and so on, as provided for in the implied R package and discussed here), my analysis has shown that the basic method performs generally at least as well as these other methods in providing well-calibrated probability estimates.

  • Create expected home team victory margins by taking the negative of the closing line and, if necessary, adjusting it when the closing prices for the two teams in the line market are not equal. That adjustment is calculated by making the simplifying assumption that margins are Normally distributed around their expected value with a standard deviation of 34.5.
    So, for example, if the closing price in the line market for the home team is $1.80 and that for the away team $2.00, then we calculate the adjustment to the line (in R) using

    round(qnorm(Away Line Odds Close/(Away Line Odds Close + Home Line Odds Close), 0, 34.5), 0)
    (ie round(qnorm(2/(2 + 1.8), 0, 34.5), 0) = 2)

    We subtract this adjustment from the handicap (and then take the negative of that adjusted value as our expected home team victory margin). The logic here is that the $1.80 price for the home team is signalling that the handicap favours the home team and so should be increased (ie made more negative).

  • Create expected total scores by taking the closing total and, if necessary, adjusting it when the closing prices for under versus over that total are not equal. That adjustment is calculated by making the simplifying assumption that totals are Normally distributed around their expected value with a standard deviation of 28.
    So, for example, if the closing price in the totals line for the over is $2.00 and that for the under $1.80, then we calculate the adjustment to the total (in R) using

    round(qnorm(Under Odds Close/(Over Odds Close + Under Odds Close), 0, 28), 0)
    (ie round(qnorm(1.8/(1.8 + 2), 0, 28), 0) = -2)

    We add this adjustment to the original total. The logic here is that the $2 price for the over is signalling that the probability of the score exceeding the total is less than 50%, so we need to lower the total to achieve the 50:50 balance.

    We’ll use expected total score as the proxy for expected scoring opportunities.

At the end of this process, we have estimated probabilities, expected margins and expected totals for 2,167 home and away games and finals.
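The three steps above can be sketched as a single vectorised R function. This is an illustration under the stated assumptions (Normal margins with a standard deviation of 34.5, Normal totals with a standard deviation of 28), and every argument name is hypothetical rather than a column name from the AusSportsBetting file. The head-to-head prices, line and total in the example call are made up; only the $1.80 and $2.00 prices are reused from the text.

```r
# Convert closing prices into a vig-free home win probability, an expected
# home margin and an expected total, following the three steps described above.
# All argument names are hypothetical; they are not the data set's column names.
derive_expectations <- function(home_h2h, away_h2h,
                                line, home_line_odds, away_line_odds,
                                total, over_odds, under_odds,
                                margin_sd = 34.5, total_sd = 28) {
  # Basic (proportional) vig removal for the head-to-head market
  home_prob <- away_h2h / (away_h2h + home_h2h)

  # Shift the handicap when the two line prices differ, then negate it
  line_adj <- round(qnorm(away_line_odds / (away_line_odds + home_line_odds),
                          0, margin_sd), 0)
  exp_margin <- -(line - line_adj)

  # Shift the total when the over and under prices differ
  total_adj <- round(qnorm(under_odds / (over_odds + under_odds),
                           0, total_sd), 0)
  exp_total <- total + total_adj

  data.frame(home_prob, exp_margin, exp_total)
}

# Made-up prices for one game, reusing the $1.80 / $2.00 examples from the text
example <- derive_expectations(
  home_h2h = 1.50, away_h2h = 2.60,
  line = -12.5, home_line_odds = 1.80, away_line_odds = 2.00,
  total = 165.5, over_odds = 2.00, under_odds = 1.80
)
print(example)
```

Because every operation is vectorised, the same function can be applied to whole columns of prices at once.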

THE FIT

Many years ago I extolled the virtues of a program called Eureqa (and possibly even received a quasi-mention in the New Scientist as a result of using it and corresponding with its creator), which used a genetic algorithm and symbolic regression (IIRC) to fit arbitrary functional forms to data. In other words, you could say to it

“I think there’s some sort of relationship between the rating that a wine receives from experts and its alcohol content, level of residual sugar, acidity, level of sulfur dioxide, level of chlorides, its density, and its pH level, but I don’t know the exact nature of any of the relationships. Please find acceptably simple equations that will explain as large a proportion of variability in expert ratings as you can”

Eureqa would then create a number of equations of increasing complexity that would answer your question. It was fabulous.

It was also free, but was soon purchased by Nutonian Inc and, later, by DataRobot, where it is now apparently available and considerably not free.

I deeply lamented the loss of Eureqa for a number of years until, maybe 18 months ago, I discovered Turing Bot, which replicates much of the functionality of Eureqa but can be purchased for a not totally ridiculous price of US$149/yr or US$379 lifetime as at the time of writing.

It’s this application that I used for the analysis.

We’ll start by giving Turing Bot no steer at all about the functional form we expect, although we will stop it from using any of the trigonometric functions and some of the other, funkier functions such as erf and lgamma.

You can see the output it produced after about 5 minutes in the graphic below.

(Note that I used the Line here instead of the Expected Margin, so just replace LINE with -Expected Margin in your mind.)

You can see that the fit is pretty good for the chosen equation of size 8, so let’s chart this equation for different expected margin and total values.

A few things to note:

  • Given the chance, the fitting algorithm very quickly includes the total as a variable.

  • As we’d hope, estimated win probability increases with expected margin and estimated win probability is about 50% for an expected margin of zero.

  • The victory probability for any given positive expected margin falls as the expected total rises, which is what we were testing for.

  • The absolute impact on estimated probabilities for a given expected total varies significantly with the expected margin and is most pronounced for larger margins in absolute terms.

  • Estimated probabilities are not bounded by (0,1), as they should be, although this is only a problem for relatively large margins in absolute terms.

Now there is nothing preventing us from giving the Turing Bot some suggestions about functional forms to investigate, which we do using the “Advanced” option and then providing our suggestion.

In the image below you see that we’ve suggested a logistic form, but not constrained how line and total should enter the component inside the exponential.

Note that we have opted for a 70/30 train and test split, which allows us to compare the fit on the two samples as a check on over-fitting. In the above image we have the R-squared for the training sample at 0.996. Below we have the figure for the test sample and it is of a similar magnitude, which provides some comfort that the algorithm has not simply over-fitted the training sample.
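The train and test comparison can be illustrated in base R. This is only a sketch: the data are synthetic, nls() stands in for Turing Bot's own search, and the model is the simple one-parameter logistic rather than the equation that Turing Bot selected. The true constant of 0.046 and the noise level are arbitrary choices made for the simulation.

```r
# SYNTHETIC data for illustration only: none of these numbers come from the
# bookmaker data set, and nls() stands in for Turing Bot's own fitting routine.
set.seed(1)
n <- 2000
games <- data.frame(margin = rnorm(n, mean = 0, sd = 25))
games$prob <- plogis(0.046 * games$margin) + rnorm(n, sd = 0.01)

# 70/30 train and test split
train_idx <- sample(n, size = round(0.7 * n))
train <- games[train_idx, ]
test  <- games[-train_idx, ]

# Fit the single-parameter logistic on the training sample only
fit <- nls(prob ~ plogis(k * margin), data = train, start = list(k = 0.04))

# R-squared: share of variance in y explained by the predictions
r_squared <- function(y, pred) 1 - sum((y - pred)^2) / sum((y - mean(y))^2)

r2_train <- r_squared(train$prob, predict(fit, newdata = train))
r2_test  <- r_squared(test$prob,  predict(fit, newdata = test))
print(c(k = unname(coef(fit)), r2_train = r2_train, r2_test = r2_test))
```

If the test R-squared were much lower than the training figure, that would suggest the fitted form was tracking noise in the training sample rather than the underlying relationship.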

If we chart the highlighted equation for different expected margins and totals, we get the chart at right.

Here, we note:

  • Given the chance, the fitting algorithm again includes the total as a variable.

  • Estimated win probability increases with expected margin and estimated win probability is exactly 50% for an expected margin of zero.

  • The victory probability for any given positive expected margin falls as the expected total rises, which is what we were testing for.

  • The absolute impact on estimated probabilities for a given expected total varies with the expected margin and is most pronounced for larger margins in absolute terms, though never huge.

  • Estimated probabilities are bounded by (0,1).

SUMMARY AND CONCLUSION

We set out to see if the assumption of a one-to-one mapping between expected victory margin and estimated win probability was a reasonable one, prompted in part by the observation that bookmakers sometimes have markets for two games simultaneously with matching head-to-head prices but differing handicaps (or the opposite).

Next we observed that, if we assume scoring opportunities have a stochastic element, then games with the same expected margin but differing expected numbers of scoring opportunities yield differing theoretical win probability estimates. In particular, for a given expected margin, more scoring opportunities imply a lower win probability for the favourite.

Fitting models to empirical bookmaker data revealed the same relationship - that is, two games with the same expected margins will have different head-to-head prices (and hence implied win probabilities) if they have different expected totals. As well, games with higher expected totals will be associated with lower estimated win probabilities for a favourite expected to win by some fixed margin.

All that said, the relevant effect sizes are quite small. For example, a team expected to win by 40 points will have an estimated win probability of about 88% in a game expected to produce about 145 points (which is roughly the 10th percentile) and an estimated win probability of about 85.5% in a game expected to produce about 186 points (which is roughly the 90th percentile).

So, practically speaking, the maximum effect size is only around a couple of percentage points, which means that, for most applications, using a fitted version of the very first equation in this blog should suffice. If you look at the equation above the chosen one in the final diagram, you can see what that fit is for the bookmaker data (ie it uses a constant of about 0.046).
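As a rough check on that claim, and assuming the first equation has the form 1 / (1 + exp(-k × Expected Margin)), plugging in k = 0.046 for a 40-point favourite gives a probability that can be compared with the two total-dependent figures quoted above.

```r
# Single-parameter logistic with the constant of about 0.046 quoted above.
# ASSUMPTION: the first equation has the form 1 / (1 + exp(-k * margin)).
k <- 0.046
p_simple <- plogis(k * 40)   # 40-point favourite, total ignored

# Figures quoted in the text for low-total and high-total games
p_low_total  <- 0.88    # expected total of about 145 points
p_high_total <- 0.855   # expected total of about 186 points

print(round(c(simple = p_simple,
              gap_to_low = p_low_total - p_simple,
              gap_to_high = p_simple - p_high_total), 3))
```

The simple mapping gives roughly 86%, which sits between the two quoted figures and within about two percentage points of either.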

As a final thought, it will be interesting to pay more attention to the times when a bookmaker has matched head-to-head or matched lines across two games to see how significant are the actual differences in implied probabilities or expected victory margins.