What is a Poisson Distribution?
The Poisson distribution is a probability distribution that was introduced in 1837 by the French mathematician, Siméon Denis Poisson.
The Poisson model has 4 main assumptions.
In the context of predicting football scores, these assumptions are as follows:
1. The number of goals is countable.
2. The occurrence of goals is independent.
3. The average rate of goals occurring can be calculated.
4. Two goals cannot occur at exactly the same time.
The second assumption (independence) is questionable in relation to football matches. This will be discussed later in this post.
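For reference, the Poisson formula gives the probability of exactly k goals from the average rate alone. The following is a minimal Python sketch of that formula. The average of 1.5 goals per match is a made-up figure used only for illustration, and the function name is my own.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals when the average rate is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Hypothetical average of 1.5 goals per match for one team.
rate = 1.5
for goals in range(5):
    print(goals, "goals:", round(poisson_pmf(goals, rate), 3))
```

The only input is the average rate, which is why the choice of data behind that average matters so much.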
In this post, I will discuss the arguments for and against using Poisson Calculators for trading or betting on football.
Poisson Calculators for Calculating Football Scores
There is substantial literature in peer-reviewed journals supporting the view that the Poisson distribution can be used profitably to predict football scores (e.g. Dixon and Coles, 1997).
Most of these studies were conducted by statisticians.
I have taken the use of the Poisson distribution for analysing football matches to the next level.
A. The Advantages of Using the Poisson Distribution to Predict Football Results
I’ll first deal with the advantages of using the Poisson Distribution for predicting football matches.
After that, I will look at the objections to this method and my arguments against the objections.
1. The Calculator is Objective.
The formula is always the same and as such, we can be objective in our analysis of football matches. This means that we exclude our emotions and prejudices from our analysis.
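To illustrate how a fixed formula gives repeatable results, here is a minimal Python sketch that turns two expected-goals figures into home/draw/away probabilities and fair decimal odds. It assumes the two teams' goal counts are independent Poisson variables. The inputs 1.6 and 1.1 are hypothetical, and this is not necessarily how my calculator is built internally.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k goals when the average rate is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def match_probabilities(home_xg, away_xg, max_goals=10):
    """Sum scoreline probabilities into home win, draw and away win."""
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    # Renormalise to cover the tiny probability of scores above max_goals.
    total = home + draw + away
    return home / total, draw / total, away / total

# Hypothetical expected goals for the home and away team.
probs = match_probabilities(1.6, 1.1)
fair_odds = [round(1 / p, 2) for p in probs]
print("home/draw/away probabilities:", [round(p, 3) for p in probs])
print("fair decimal odds:", fair_odds)
```

Given the same two inputs, the output never changes, which is the sense in which the method is objective.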
2. The Poisson Model is Supported by Statisticians and Mathematicians.
I created my first Poisson Calculator several years ago. At first, I was put off because the odds didn’t always match the odds on Betfair. In such cases, I would think that the calculator was wrong and the Betfair odds must be right. In fact, these differences led me to shelve the idea for a few months.
Although I had shelved the idea, it bothered me that there was so much literature in peer-reviewed journals.
These articles claimed that the Poisson Distribution was accurate for predicting the odds and results of football matches, and that the authors were able to make a profit by using it.
For example, the following are quotes from statistics and mathematics journal articles:
Our betting strategy is equally simple: we bet on all outcomes for which the ratio of model to bookmakers’ probabilities exceeds a specified level. For sufficiently high levels, we have shown that this strategy yields a positive expected return, even allowing for the in-built bias in the bookmakers’ odds
Dixon and Coles, 1997
A value betting strategy was devised and was noted to be profitable in the longrun
Mwembe, 2015
We show that our statistical modeling framework can produce a significant positive return over the bookmaker’s odds
Koopman and Lit, 2012
As you will learn later, it is the differences between the Poisson Calculator results and the Betfair odds that might provide us with value bets and trades.
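The rule quoted from Dixon and Coles (bet when the ratio of model to bookmaker probability exceeds a chosen level) can be sketched in Python as follows. The probabilities, the odds and the threshold of 1.1 are all hypothetical. The bookmaker probability is taken simply as 1 divided by the decimal odds, which is my simplifying assumption rather than the exact treatment in the paper.

```python
def implied_probability(decimal_odds):
    """Bookmaker probability implied by decimal odds (margin included)."""
    return 1.0 / decimal_odds

def value_bets(model_probs, market_odds, threshold=1.1):
    """Return outcomes whose model/market probability ratio beats the threshold."""
    picks = []
    for outcome, p_model in model_probs.items():
        ratio = p_model / implied_probability(market_odds[outcome])
        if ratio > threshold:
            picks.append((outcome, round(ratio, 3)))
    return picks

# Hypothetical model probabilities and market odds for one match.
model_probs = {"home": 0.50, "draw": 0.26, "away": 0.24}
market_odds = {"home": 2.30, "draw": 3.40, "away": 3.60}

print(value_bets(model_probs, market_odds))
```

A higher threshold produces fewer bets but demands a bigger disagreement between the model and the market before you act.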
3. The Poisson Model May Help Us to Find Value
There are 2 aspects to finding value in football. Firstly, you need a baseline of odds to work from. Secondly, you need an opinion. The opinion shouldn’t be random. It should be based on several factors that I will explain in future posts.
B. The Arguments Against Using the Poisson Model and Counter-Arguments
There are several common arguments against the use of the Poisson Distribution for predicting football results.
In this section, I will address these.
1. The Poisson Model is Based on a Single Variable (Goals Scored in Previous Matches)
The model is only based on a single variable if you only use one variable. Goals are used to obtain statistics, such as attack strength, defence strength and expected goals. The argument is that all of these statistics are based completely on goals scored.
I don’t recommend using the Poisson Calculator without doing additional research. Instead, I recommend Poisson as a useful tool that forms only part of your research, which means that you end up using multiple variables.
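To show how attack strength, defence strength and expected goals can all be derived from goals alone, here is a minimal Python sketch of one commonly used approach. Each team's average is divided by the league average to get a strength, and the strengths are multiplied back by the league average. All figures and team names are hypothetical, and this may differ from the exact formula I explain in Article 2.

```python
# Hypothetical per-match league averages over the chosen data window.
league_avg_home_goals = 1.5
league_avg_away_goals = 1.2

# Hypothetical per-match team averages (home team at home, away team away).
teams = {
    "Team A": {"home_scored": 2.0, "home_conceded": 0.9},
    "Team B": {"away_scored": 1.0, "away_conceded": 1.8},
}

def expected_goals(home, away):
    """Expected goals for each side from attack and defence strengths."""
    home_attack = home["home_scored"] / league_avg_home_goals
    home_defence = home["home_conceded"] / league_avg_away_goals
    away_attack = away["away_scored"] / league_avg_away_goals
    away_defence = away["away_conceded"] / league_avg_home_goals
    home_xg = home_attack * away_defence * league_avg_home_goals
    away_xg = away_attack * home_defence * league_avg_away_goals
    return home_xg, away_xg

home_xg, away_xg = expected_goals(teams["Team A"], teams["Team B"])
print("expected goals:", round(home_xg, 2), round(away_xg, 2))
```

Every number in the sketch traces back to goals scored or conceded, which is exactly the point of the objection.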
2. The Poisson Model Doesn’t Take into Account the Team Sheets
As I’ve said, even with the Poisson Model on your side, you still need to do further research into a match. Therefore, you should look at the team sheets.
However, you also need to be aware that the odds generated by the Poisson Model are derived from the performances of teams that have an average number of injuries throughout the year. In other words, the Poisson Model odds aren’t based on performances of teams that are always at full strength.
Occasionally, the pre-match market may overreact to big-name players who are missing from the starting line-ups. This might create value.
An example of this is when Mo Salah isn’t in the starting line-up for Liverpool against a weaker team. How important is Mo Salah? Against a strong team, he is very important. However, against weaker teams, his absence is not likely to change the match result. It might make the difference between Liverpool winning 3-0 and 2-0.
Therefore, the absence of Mo Salah might affect the Over/Under Goal markets. However, it shouldn’t have a massive impact on the match odds.
3. The Model Yields Different Results, Depending on the Size of Your Data.
It’s important to understand that the Poisson Model should be used for long-term analysis only. It is not designed for small sample sizes.
As I mentioned, there is a substantial body of literature in peer-reviewed journals supporting the view that the Poisson distribution can be used profitably to predict football scores (e.g. Dixon and Coles, 1997).
In most of these studies, sample sizes of 3 years or more were used.
If you want to look at recent form, you will need to use a separate method. I have provided separate statistics for analysing the last 6 matches within the calculator.
Generally, I recommend using 5 years of data or more for analysis using the Poisson Model. However, you should be flexible and analytical.
I don’t necessarily stick to 5 years. For example, if a team has improved over the last couple of seasons, compared to previous years, I might use data from 2 seasons. This is because I don’t want to dilute the last 2 seasons’ statistics with data that might not be relevant.
Similarly, if a team has had a sudden injection of capital and has been buying expensive players, I might focus more on recent form.
4. Relegated and Promoted Teams will Not Be Represented Accurately
You don’t have to bet on every football match. I have listed the teams that were newly promoted or relegated into each league, as a warning that the data for these teams may be sparse.
Similarly, if a team performed unusually well or unusually poorly in the previous season, you could avoid that team. On the other hand, if you use a lot of data, the one season in which the team performed differently will be diluted.
5. The Model Breaks the Assumption that the Occurrence of Goals is Independent.
As mentioned in the introduction to this post, the Poisson model has 4 main assumptions. The requirement that the occurrence of goals is independent is not met when analysing football matches.
If a goal is scored in a football match, this may change the strategy of both teams. The losing team may open up its game to try and score a goal. This may increase the chances of another goal being scored by either side.
How important is this? There are very few real-life events that can be said to be truly independent.
Half-time statistics generally show that a goal is more likely to occur in the second half, if the half-time score is 1-0 or 0-1, compared to the score being 0-0.
In the English Premier League, if the half-time score is 0-0, the league average for at least one goal being scored in the second half is around 76%.
If the half-time score is 1-0 or 0-1, the league average for at least one goal being scored in the second half is just over 80%.
Therefore, there is only a difference of 4 to 5 percentage points in the likelihood of a second-half goal, depending on whether no goals or one goal has been scored by half-time. Although any difference matters from a betting point of view, this variation is not large enough to rule out using the Poisson Distribution for predicting football results.
In addition, as noted earlier, there is substantial literature in peer-reviewed journals supporting the view that the Poisson distribution can be used profitably to predict football scores (e.g. Dixon and Coles, 1997).
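To put the half-time figures into Poisson terms, the Python sketch below converts each "at least one goal" percentage into the second-half goal rate that a Poisson model would need in order to produce it, using P(at least one goal) = 1 - exp(-rate). The inputs 0.76 and 0.80 are rounded versions of the league averages quoted above, so treat the output as approximate.

```python
import math

def implied_rate(p_at_least_one):
    """Poisson rate that gives this probability of one or more goals."""
    return -math.log(1.0 - p_at_least_one)

# Rounded league averages quoted in the text (approximate).
rate_after_0_0 = implied_rate(0.76)   # half-time score 0-0
rate_after_1_0 = implied_rate(0.80)   # half-time score 1-0 or 0-1

print("implied second-half rate after 0-0:", round(rate_after_0_0, 2))
print("implied second-half rate after 1-0 or 0-1:", round(rate_after_1_0, 2))
print("relative increase:", round(rate_after_1_0 / rate_after_0_0 - 1, 3))
```

If goals were fully independent, the two rates would be the same, so the gap between them is a rough measure of how far the assumption is stretched.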
Conclusion
Whether you are a punter or a trader, it’s important to have an odds baseline that you can work from.
The Poisson model is useful for the long-term analysis of football teams. We also need to do some short-term analysis.
Once you have put the team names into the Poisson calculator, you will see some tabs appear at the top of the calculator.
The tabs include:
- Stats (1yr) – 1 year statistics
- Last 6 games (H vs A) – Last 6 Home games for the home team and last 6 away games for the away team
- Last 6 games (All) – The last 6 matches, regardless of whether the matches were home or away matches
- Stats (Last 6) – Statistics for the last 6 matches.
I will explain how I analyse the last 6 matches in Article 3 of this series.
In Article 2, I will explain how the traditional Poisson formula is calculated.
References
Çelik, Şenol. (2021). Predicting the Number of Goals in Football Matches with the Poisson distribution: Example of Spain La Liga. 8. 133-142. 10.36347/sjpms.2021.v08i08.002.
Dixon, M. J., & Coles, S. G. (1997). Modelling Association Football Scores and Inefficiencies in the Football Betting Market. Journal of the Royal Statistical Society. Series C (Applied Statistics), 46(2), 265–280. http://www.jstor.org/stable/2986290
Koopman, Siem Jan & Lit, Rutger. (2012). A Dynamic Bivariate Poisson Model for Analysing and Forecasting Match Results in the English Premier League. Journal of the Royal Statistical Society: Series A (Statistics in Society). 178. 10.2139/ssrn.2154792.
Mwembe, D. (2015). Application of a Bivariate Poisson Model in Devising a Profitable Betting Strategy of the Zimbabwe Premier Soccer League Match Results. American Journal of Theoretical and Applied Statistics. 4. 99. 10.11648/j.ajtas.20150403.15.