Dixon-Coles Model for Football Betting: Building a Match Predictor

The Dixon-Coles model is one of the most widely cited academic frameworks for football match prediction. Published in 1997, it addresses a specific flaw in the independent Poisson model and remains a common benchmark against which newer models are measured.

The Problem with Basic Poisson

A basic Poisson model treats each team's goals as independent. This works reasonably well overall, but it systematically misprices the four lowest scorelines: it underestimates 0-0 and 1-1 draws and overestimates 1-0 and 0-1 results.

In reality, there is a slight positive dependence between low goal counts. When one team struggles to score, the match environment often suppresses the other team's attacking output too (defensive tactics, low tempo, weather). Basic Poisson misses this.
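To make the independence assumption concrete, here is a minimal sketch of how a basic Poisson model prices a scoreline: it simply multiplies the two teams' separate goal probabilities. The function names and the rates 1.5 and 1.1 are illustrative assumptions, not fitted values.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k goals when the expected count is `rate`."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def independent_score_prob(home_goals, away_goals, lam, mu):
    """Joint scoreline probability when the two goal counts are independent."""
    return poisson_pmf(home_goals, lam) * poisson_pmf(away_goals, mu)


# Illustrative rates only: 1.5 expected home goals, 1.1 expected away goals.
for score in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(score, round(independent_score_prob(*score, lam=1.5, mu=1.1), 4))
```

These four probabilities are the ones the Dixon-Coles correction later rescales.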

The Dixon-Coles Correction

Dixon and Coles introduced a dependence parameter rho that adjusts the joint probability of low-scoring results:

For 0-0: multiply the Poisson probability by (1 - lambda x mu x rho)
For 1-0: multiply by (1 + mu x rho)
For 0-1: multiply by (1 + lambda x rho)
For 1-1: multiply by (1 - rho)

Here lambda and mu are the expected goals for the home and away teams, and scorelines are written home-away. All other scorelines are left unchanged. The parameter rho is typically a small negative number (around -0.13 to -0.08), which increases the probability of 0-0 and 1-1 and decreases the probability of 1-0 and 0-1.
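The four multipliers above can be written as one small function. This is a sketch under the definitions given here; the name dc_tau and the sample values (lam 1.5, mu 1.1, rho -0.10) are assumptions chosen for illustration.

```python
def dc_tau(home_goals, away_goals, lam, mu, rho):
    """Dixon-Coles adjustment factor for one scoreline.

    lam, mu: expected goals for the home and away team.
    rho: dependence parameter (usually small and negative).
    """
    if home_goals == 0 and away_goals == 0:
        return 1.0 - lam * mu * rho
    if home_goals == 1 and away_goals == 0:
        return 1.0 + mu * rho
    if home_goals == 0 and away_goals == 1:
        return 1.0 + lam * rho
    if home_goals == 1 and away_goals == 1:
        return 1.0 - rho
    return 1.0  # every other scoreline keeps its plain Poisson probability


# Illustrative values only, not fitted parameters.
for score in [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]:
    print(score, round(dc_tau(*score, lam=1.5, mu=1.1, rho=-0.10), 3))
```

With a negative rho the factors for 0-0 and 1-1 come out above 1 and those for 1-0 and 0-1 below 1, which is the direction described in the text.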

Implementation Steps

1. Prepare Data

Load historical match data with columns: date, home team, away team, home goals, away goals.
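A minimal loading sketch using only Python's standard library might look like this. The column names (home_team, home_goals and so on) and the two sample rows are hypothetical placeholders; a real file would hold full seasons of results under whatever headers your data source uses.

```python
import csv
import io
from datetime import date

# Hypothetical sample rows standing in for a real results file.
SAMPLE_CSV = """date,home_team,away_team,home_goals,away_goals
2024-08-17,Team A,Team B,2,0
2024-08-24,Team B,Team A,1,1
"""


def load_matches(handle):
    """Parse match rows into dicts with typed fields."""
    matches = []
    for row in csv.DictReader(handle):
        matches.append({
            "date": date.fromisoformat(row["date"]),
            "home": row["home_team"].strip(),
            "away": row["away_team"].strip(),
            "home_goals": int(row["home_goals"]),
            "away_goals": int(row["away_goals"]),
        })
    return matches


matches = load_matches(io.StringIO(SAMPLE_CSV))
# The sorted team list fixes the order of the team parameters later on.
teams = sorted({m["home"] for m in matches} | {m["away"] for m in matches})
print(len(matches), teams)
```

Stripping whitespace from team names is a cheap guard against the same club appearing under two slightly different spellings.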

2. Define the Model Parameters

Each team has an attack parameter (alpha) and a defence parameter (beta). Two further parameters are shared across the league:

A home advantage parameter (gamma)
The dependence correction (rho)

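For a 20-team league, this means 41 team and home-advantage parameters (20 attack + 20 defence + 1 home advantage) plus rho, 42 in total. Because attack and defence strengths are only identified relative to each other, a constraint is also needed, for example forcing the attack parameters to average 1.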
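One way to organise these values for an optimiser is a single flat list. The sketch below assumes a layout of attack values, then defence values, then gamma, then rho; that ordering, the function names and the placeholder team names are all illustrative choices rather than part of the model itself.

```python
def unpack_params(params, teams):
    """Split a flat parameter list into named pieces.

    Assumed layout: n attack values, n defence values, gamma, rho.
    """
    n = len(teams)
    attack = dict(zip(teams, params[:n]))
    defence = dict(zip(teams, params[n:2 * n]))
    gamma, rho = params[2 * n], params[2 * n + 1]
    return attack, defence, gamma, rho


def expected_goals(home, away, attack, defence, gamma):
    """Home rate uses home attack, away defence and home advantage."""
    lam = attack[home] * defence[away] * gamma
    mu = attack[away] * defence[home]
    return lam, mu


teams = [f"Team {i}" for i in range(1, 21)]  # placeholder names
# 20 attack + 20 defence + home advantage + rho (starting values only)
params = [1.0] * 40 + [1.36, -0.10]
attack, defence, gamma, rho = unpack_params(params, teams)
print(len(params), expected_goals("Team 1", "Team 2", attack, defence, gamma))
```

Here a defence value above 1 means a weaker defence, since it multiplies the opponent's expected goals.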

3. Apply Time-Decay Weighting

Multiply each match's contribution to the log-likelihood by a decay factor:

Weight = exp(-xi x t)

Here t is the number of days since the match was played and xi controls the decay rate. With xi = 0.005, a match from six months ago (about 182 days) carries exp(-0.91), roughly 40%, of the weight of a match played today.
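The weighting formula is a one-liner in code. The sketch below assumes t is measured in days, as in the text, and prints the weight at a few ages so the effect of xi is easy to see; the function name is an illustrative choice.

```python
import math


def decay_weight(days_ago, xi=0.005):
    """Exponential time-decay weight for a match played `days_ago` days ago."""
    return math.exp(-xi * days_ago)


for days in (1, 30, 182, 365, 730):
    print(days, round(decay_weight(days), 3))
```

A larger xi makes the model react faster to recent form at the cost of using less data, so it is usually tuned by checking predictive accuracy on held-out matches.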

4. Optimise Using Maximum Likelihood

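Use numerical optimisation (Python's scipy.optimize.minimize or R's optim) to find the parameters that maximise the weighted likelihood of the historical results. In practice this means minimising the negative log-likelihood, since both routines minimise by default.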
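The objective handed to the optimiser could be sketched as follows. This is one possible formulation, not a reference implementation: it assumes attack, defence and home advantage are stored as logarithms so the goal rates stay positive, that each match is a tuple ending in its age in days, and that the parameter layout is attack, defence, log-gamma, rho. All names are hypothetical.

```python
import math


def poisson_log_pmf(k, rate):
    """Log of the Poisson probability of k events at the given rate."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def dc_tau(x, y, lam, mu, rho):
    """Dixon-Coles factor; 1.0 outside the four low scorelines."""
    if x == 0 and y == 0:
        return 1.0 - lam * mu * rho
    if x == 1 and y == 0:
        return 1.0 + mu * rho
    if x == 0 and y == 1:
        return 1.0 + lam * rho
    if x == 1 and y == 1:
        return 1.0 - rho
    return 1.0


def neg_log_likelihood(params, matches, teams, xi=0.005):
    """Time-weighted negative log-likelihood.

    params: n log-attack, n log-defence, log-gamma, rho (assumed layout).
    matches: tuples (home, away, home_goals, away_goals, days_ago).
    """
    n = len(teams)
    idx = {team: i for i, team in enumerate(teams)}
    log_gamma, rho = params[2 * n], params[2 * n + 1]
    total = 0.0
    for home, away, x, y, days_ago in matches:
        lam = math.exp(params[idx[home]] + params[n + idx[away]] + log_gamma)
        mu = math.exp(params[idx[away]] + params[n + idx[home]])
        tau = dc_tau(x, y, lam, mu, rho)
        if tau <= 0:
            return float("inf")  # rho is outside the range the model allows
        log_p = math.log(tau) + poisson_log_pmf(x, lam) + poisson_log_pmf(y, mu)
        total += math.exp(-xi * days_ago) * log_p
    return -total


# Hypothetical two-match data set, only to show the call.
teams = ["Team A", "Team B"]
matches = [("Team A", "Team B", 2, 0, 10), ("Team B", "Team A", 1, 1, 40)]
x0 = [0.0] * 4 + [0.0, 0.0]  # every rate starts at 1, rho at 0
print(round(neg_log_likelihood(x0, matches, teams), 4))
```

Assuming SciPy is installed, a call along the lines of scipy.optimize.minimize(neg_log_likelihood, x0, args=(matches, teams), method="L-BFGS-B") would then search for the fitted values. Bounds that keep rho small, and a constraint or pinned value on the attack parameters, are likely to be needed for a stable fit.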

5. Generate Predictions

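For a new match, multiply the home team's attack by the away team's defence and the home advantage to get the home side's expected goals, and the away team's attack by the home team's defence to get the away side's. Then apply the Poisson distribution with the Dixon-Coles correction to generate a probability for every scoreline, and sum the scorelines to get home win, draw and away win probabilities.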
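The prediction step can be sketched as a scoreline grid that is then summed into match outcomes. The expected goals (1.6 and 1.1), rho of -0.10 and the cut-off of 10 goals per side are illustrative assumptions, and the function names are hypothetical.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k goals at the given expected count."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def dc_tau(x, y, lam, mu, rho):
    """Dixon-Coles factor; 1.0 outside the four low scorelines."""
    if x == 0 and y == 0:
        return 1.0 - lam * mu * rho
    if x == 1 and y == 0:
        return 1.0 + mu * rho
    if x == 0 and y == 1:
        return 1.0 + lam * rho
    if x == 1 and y == 1:
        return 1.0 - rho
    return 1.0


def scoreline_matrix(lam, mu, rho, max_goals=10):
    """matrix[x][y] = P(home scores x, away scores y), truncated at max_goals."""
    return [
        [dc_tau(x, y, lam, mu, rho) * poisson_pmf(x, lam) * poisson_pmf(y, mu)
         for y in range(max_goals + 1)]
        for x in range(max_goals + 1)
    ]


def outcome_probs(matrix):
    """Collapse the grid into (home win, draw, away win), renormalised."""
    home = sum(p for x, row in enumerate(matrix) for y, p in enumerate(row) if x > y)
    draw = sum(row[x] for x, row in enumerate(matrix))
    away = sum(p for x, row in enumerate(matrix) for y, p in enumerate(row) if x < y)
    total = home + draw + away  # slightly below 1 because of truncation
    return home / total, draw / total, away / total


# Illustrative inputs only, not fitted values.
with_dc = outcome_probs(scoreline_matrix(1.6, 1.1, rho=-0.10))
plain = outcome_probs(scoreline_matrix(1.6, 1.1, rho=0.0))
print([round(p, 3) for p in with_dc])
print([round(p, 3) for p in plain])
```

Setting rho to 0 recovers the plain independent Poisson model, so comparing the two lines shows how much probability the correction moves into the draw.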

Example: Arsenal at home (attack 1.35, defence 0.82) vs Brighton (attack 1.05, defence 1.12). Home xG = 1.35 x 1.12 x 1.36 (home advantage) = 2.06. Away xG = 1.05 x 0.82 = 0.86. After the Dixon-Coles correction, taking rho = -0.10 for illustration: Home win about 64%, Draw about 22%, Away win about 14%.

A £30 bet on the draw at odds of 3.60 returns £108 (£78 profit plus the stake) if it wins. If your model gives a draw probability of about 22% and the bookmaker's implied probability is 27.8% (1/3.60), the odds on offer are shorter than your model's fair odds of roughly 4.5, so there is no value in that bet. If another market shows an edge, the model guides you to it.
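The value check in this paragraph comes down to comparing the model's probability with the probability implied by the odds. A minimal sketch, using the £30 stake and odds of 3.60 from the text and a few illustrative model probabilities (the function names are assumptions):

```python
def implied_probability(decimal_odds):
    """Probability implied by decimal odds, ignoring the bookmaker's margin."""
    return 1.0 / decimal_odds


def expected_profit(model_prob, decimal_odds, stake):
    """Average profit per bet if model_prob is the true probability."""
    win_profit = stake * (decimal_odds - 1.0)
    return model_prob * win_profit - (1.0 - model_prob) * stake


odds, stake = 3.60, 30.0
print("implied probability:", round(implied_probability(odds), 3))
# Illustrative model probabilities either side of the break-even point.
for p in (0.20, 0.25, 0.30):
    print(p, round(expected_profit(p, odds, stake), 2))
```

The expected profit only turns positive once the model's probability exceeds the implied probability, which is the sense in which a bet has value.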

Beyond Dixon-Coles

The model can be extended with additional features: incorporating expected goals (xG) data, adding team-specific home advantages, or using bivariate Poisson distributions instead of the correction factor. Each extension adds complexity but potentially improves accuracy.

Frequently Asked Questions

What is the Dixon-Coles model?

Published by Mark Dixon and Stuart Coles in 1997, the Dixon-Coles model is a modified Poisson regression specifically designed for football match prediction. It corrects a known flaw in standard Poisson models — the underestimation of low-scoring draws — by adding a dependence parameter (rho) that adjusts probabilities for scorelines like 0-0, 1-0, 0-1, and 1-1.

How does Dixon-Coles differ from basic Poisson regression?

Basic Poisson regression assumes home and away goals are independent. In reality, football goals show slight positive dependence at low scores: 0-0 and 1-1 draws occur more often than independence predicts. Dixon-Coles adds a dependence parameter rho that corrects this, improving accuracy on correct score and draw predictions.

What data do I need to implement Dixon-Coles?

You need historical match results with home goals and away goals for at least one full season. Better models use 2-3 seasons with time-decay weighting. Team names must be consistent across all records. No additional statistics beyond scores are required — the model derives everything from goal data.

Is Dixon-Coles still relevant in modern betting?

Yes. While more complex models incorporating expected goals (xG) and machine learning exist, Dixon-Coles remains a strong baseline. Many professional bettors and analysts still use it as a foundation, often augmenting it with additional data rather than replacing it entirely.

Can I implement Dixon-Coles in a spreadsheet?

Not practically. The model requires maximum likelihood estimation to solve for all parameters simultaneously, which needs iterative numerical optimisation. Python (with SciPy's minimize function) or R are the standard implementation tools. A working implementation requires approximately 100-150 lines of code.