Issue #4


Descriptive vs. Predictive

We will talk about the differences between descriptive statistics and predictive modelling.


Descriptive statistics summarizes your data into a small number of key parameters such as percentages, frequencies, averages and standard deviations, along with charts and graphs. Descriptive statistics looks into the past and does exactly what its name implies: it describes your data. The following is an example of a descriptive statistic (remember it, as we will use it later):

After 20 matches involving team A, 15 matches ended with Over 2.5 goals = 75%
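To make the arithmetic concrete, here is a minimal sketch of how such a statistic could be computed. The list of goal totals is made up purely for illustration (chosen so that 15 of the 20 matches finish with 3 or more goals), and the function name `over_rate` is my own.

```python
# Hypothetical total goals in team A's last 20 matches (invented numbers,
# chosen so that 15 of the 20 matches finish with 3 or more goals).
total_goals = [3, 4, 1, 3, 5, 2, 3, 3, 4, 0, 3, 6, 3, 2, 4, 3, 5, 1, 3, 4]


def over_rate(goals, line=2.5):
    """Return (number of matches above the line, share of matches above the line)."""
    overs = sum(1 for g in goals if g > line)
    return overs, overs / len(goals)


overs, share = over_rate(total_goals)
print(f"After {len(total_goals)} matches, {overs} ended with Over 2.5 goals = {share:.0%}")
```

Notice that nothing here looks forward: the number only summarizes matches that have already been played.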

Box Plot


On the other hand, predictive modelling combines methods and statistics to peek into the future. While a prediction is generally a function of historical data, the methods applied to that data are designed to make the best possible prediction from it.

We may never perfectly model a football match due to the infinite number of variables and the complex interactions between them. What we can do is reduce these variables by making assumptions. Assumptions are like a recipe: too many and our model can become too complex to implement, too few and it becomes too simple to be useful. A powerful assumption made time and time again in academic papers that study football is that the number of goals in a match follows a Poisson distribution. I would even dare to say that researchers concluded this by looking at the descriptive statistics that summarized a dataset containing the number of goals scored!
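As a rough sketch of what the Poisson assumption buys us: once we have an expected number of goals, the probability of any goal count follows from a single formula. The mean of 2.7 goals used below is an assumed value for illustration only, and the function names are my own.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number of goals is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def prob_over(line, lam):
    """Probability that the number of goals is strictly above the line."""
    max_under = math.floor(line)  # for a 2.5 line, "under" means 0, 1 or 2 goals
    p_under = sum(poisson_pmf(k, lam) for k in range(max_under + 1))
    return 1.0 - p_under


# Assumed expected total goals for a match (illustrative, not from real data).
assumed_mean_goals = 2.7
print(f"P(Over 2.5) = {prob_over(2.5, assumed_mean_goals):.3f}")
```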

So we made 1 assumption...

Great! We have reduced football goals to a statistical problem that has been studied extensively, that we know a great deal about and that behaves well: the Poisson distribution. Now let's assume that football goals depend on the teams playing, which is not too crazy to assume.

Now we have 2 assumptions...

Even better, our second assumption means that goals scored is a function of the teams playing. We can now use powerful methods to obtain the best estimate of goals scored in a match as a function of the two teams playing. In fact, the two teams might never have played each other before, but we can still plug them into our function and obtain a result. What is more, this result will behave according to the Poisson distribution (due to our 1st assumption), so we can extract a probability and make statistical inferences about it (expanding on this concept requires an entirely new article).
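Here is one way the two assumptions could fit together in code. This is only a sketch and not necessarily the model used here: the attack and defence ratings, the league average and the multiplicative form are all assumptions of mine, and in practice the ratings would be estimated from historical data. It also assumes the two teams' goal counts are independent, so that the total is again Poisson.

```python
import math

# Hypothetical ratings, invented for illustration. A value above 1.0 means a
# stronger-than-average attack or a leakier-than-average defence.
RATINGS = {
    "A": {"attack": 1.20, "defence": 0.90},
    "B": {"attack": 0.85, "defence": 1.10},
}
LEAGUE_AVG_GOALS_PER_TEAM = 1.35  # assumed league average, not real data


def expected_goals(team, opponent):
    """Expected goals for `team` against `opponent` (assumed multiplicative form)."""
    return (LEAGUE_AVG_GOALS_PER_TEAM
            * RATINGS[team]["attack"]
            * RATINGS[opponent]["defence"])


def prob_over_25(team_1, team_2):
    """P(total goals >= 3), assuming independent Poisson goals for each team."""
    # The sum of two independent Poisson variables is Poisson with the summed mean.
    lam = expected_goals(team_1, team_2) + expected_goals(team_2, team_1)
    p_under = math.exp(-lam) * (1 + lam + lam**2 / 2)  # P(0, 1 or 2 goals)
    return 1 - p_under


print(f"P(Over 2.5) for A vs B = {prob_over_25('A', 'B'):.3f}")
```

The function only needs a rating for each team, so it works for any pairing, including teams that have never met.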

Linear Regression

Model: If team A and team B play, the probability of Over 2.5 goals = 50%

So is it 75% or 50%?!

The future is always uncertain. What we can say is that the descriptive statistic did not incorporate the information of team B. Even if we did try to incorporate it (by taking the frequency of matches with Over 2.5 goals involving team B for example) what would we do next? Would we simply take an average of both numbers? We would lose tons of information by doing this because we have reduced a large amount of data into two numbers. Using our recipe analogy, the assumptions of this model are too simple.

Predictive modelling does a tremendous job of incorporating all the data available to generate accurate forecasts. Every additional bit of relevant data can help to improve a prediction, within reason. In fact, we can tune these models so that more recent data is considered more important than older data. Another key aspect of predictive modelling that incorporates statistics is that our output is not just a single number but a range of numbers, reflecting the innate uncertainty of making predictions.
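Both ideas can be sketched in a few lines. The exponential half-life weighting and the 90% coverage level below are illustrative choices of mine, not a description of any particular model, and the sample numbers are made up.

```python
import math


def decayed_mean(goals_oldest_first, half_life=10.0):
    """Weighted mean where a match's weight halves every `half_life` matches of age."""
    n = len(goals_oldest_first)
    # The most recent match has age 0 and therefore weight 1.
    weights = [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]
    weighted_sum = sum(w * g for w, g in zip(weights, goals_oldest_first))
    return weighted_sum / sum(weights)


def poisson_range(lam, coverage=0.90):
    """Return (0, k) with k the smallest count such that P(goals <= k) >= coverage."""
    k = 0
    pmf = math.exp(-lam)  # P(exactly 0 goals)
    cdf = pmf
    while cdf < coverage:
        k += 1
        pmf *= lam / k    # P(exactly k goals) from P(exactly k - 1 goals)
        cdf += pmf
    return 0, k


# Made-up example: a team that has been scoring more in its recent matches.
goals = [0, 1, 0, 1, 3, 2, 4, 3]
print("plain mean:  ", sum(goals) / len(goals))
print("decayed mean:", round(decayed_mean(goals, half_life=3.0), 2))
print("90% range for an expected 2.7 goals:", poisson_range(2.7))
```

A shorter half-life makes the estimate react faster to recent form, at the cost of relying on fewer matches.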


There is no right or wrong here. These are all mathematical tools created for our quest to predict what the future holds. Descriptive statistics play an extremely important part in organizing and understanding data, but predictive modelling is specifically designed to make predictions.


The Business of Betting Podcast

First, if you have not heard of this sports betting podcast, I highly recommend you give it a listen. It has fantastic interviews with high-caliber betting experts like the Director of Pinnacle trading!

Second, I was featured in episode 121, so if you want to hear me ramble a bit, feel free to do so.

Third, give Jake a shout-out on Twitter at @bettingpod. He has a fantastic project going on and I can't wait to hear more interviews.

Updates! Search - Selections - Asian Handicap 0.25, 0.75 and more

I have added Quarter Asian Handicap Markets 0.25 and 0.75.

A new search feature has been added.

You can now save matches you have selected and keep track of them.

You can finally adjust your time zone settings so match times are shown in your local time.

There is more to come!

Like what you read? I'd like to hear your ideas and suggestions! Contributors are welcome!

Email me at [email protected]