Classifier bettor

This example illustrates how to use a classifier-based bettor and evaluate its performance on historical soccer data.

# Author: Georgios Douzas <gdouzas@icloud.com>
# Licence: MIT

from sportsbet.datasets import SoccerDataLoader
from sportsbet.evaluation import ClassifierBettor
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_score, TimeSeriesSplit

Extracting the training data

We extract the training data for the Spanish league. Setting drop_na_thres=1.0 removes columns that contain missing values, and odds_type='market_maximum' selects the market maximum odds.

dataloader = SoccerDataLoader(param_grid={'league': ['Spain']})
X_train, Y_train, O_train = dataloader.extract_train_data(
    drop_na_thres=1.0, odds_type='market_maximum'
)

Out:

Football-Data.co.uk: ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
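To make the odds_type='market_maximum' choice concrete, the following minimal sketch shows what a market maximum is: for each outcome, the highest decimal odds offered by any bookmaker. The bookmaker names and odds are hypothetical, and the helper market_maximum is an illustration only, not part of sportsbet.

```python
# Hypothetical decimal odds for one match, keyed by bookmaker and outcome.
bookmaker_odds = {
    'bookmaker_a': {'home_win': 1.90, 'draw': 3.40, 'away_win': 4.20},
    'bookmaker_b': {'home_win': 1.95, 'draw': 3.30, 'away_win': 4.00},
    'bookmaker_c': {'home_win': 1.88, 'draw': 3.50, 'away_win': 4.10},
}


def market_maximum(odds_by_bookmaker):
    """Return the best available odds per outcome across all bookmakers."""
    outcomes = next(iter(odds_by_bookmaker.values())).keys()
    return {
        outcome: max(odds[outcome] for odds in odds_by_bookmaker.values())
        for outcome in outcomes
    }


max_odds = market_maximum(bookmaker_odds)
print(max_odds)
```

Note that the best price for each outcome may come from a different bookmaker.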

The input data:

print(X_train)

Out:

           league  division  year   home_team    away_team  match_quality  ...  away_team_soccer_power_index  home_team_probability_win  away_team_probability_win  probability_draw  home_team_projected_score  away_team_projected_score
date                                                                       ...
2016-08-19  Spain         1  2017      Malaga      Osasuna      63.805561  ...                         56.93                     0.5475                     0.1897            0.2628                       1.56                       0.70
2016-08-19  Spain         1  2017   La Coruna        Eibar      64.335545  ...                         62.29                     0.5003                     0.2260            0.2738                       1.47                       0.79
2016-08-20  Spain         1  2017     Granada   Villarreal      64.559709  ...                         76.79                     0.3194                     0.3917            0.2889                       1.07                       1.19
2016-08-20  Spain         1  2017     Sevilla      Espanol      73.415362  ...                         68.75                     0.5952                     0.1760            0.2288                       1.89                       0.88
2016-08-20  Spain         1  2017   Barcelona        Betis      81.054510  ...                         69.95                     0.9591                     0.0071            0.0337                       3.40                       0.42
...           ...       ...   ...         ...          ...            ...  ...                           ...                        ...                        ...               ...                        ...                        ...
2022-04-17  Spain         1  2022     Sevilla  Real Madrid      78.514148  ...                         84.21                     0.3087                     0.4187            0.2727                       1.21                       1.46
2022-04-17  Spain         1  2022  Ath Madrid      Espanol      70.290915  ...                         62.90                     0.6979                     0.0987            0.2034                       2.03                       0.61
2022-04-17  Spain         1  2022     Granada      Levante      59.139742  ...                         62.59                     0.3985                     0.3456            0.2559                       1.57                       1.45
2022-04-17  Spain         1  2022  Ath Bilbao        Celta      75.637057  ...                         73.36                     0.4816                     0.2351            0.2833                       1.44                       0.92
2022-04-18  Spain         1  2022   Barcelona        Cadiz      70.151267  ...                         59.65                     0.7060                     0.1040            0.1899                       2.21                       0.71

[4026 rows x 13 columns]

The multi-output targets:

print(Y_train)

Out:

      output__home_win__full_time_goals  output__draw__full_time_goals  output__away_win__full_time_goals  output__over_2.5__full_time_goals  output__under_2.5__full_time_goals
0                                 False                           True                              False                              False                                True
1                                  True                          False                              False                               True                               False
2                                 False                           True                              False                              False                                True
3                                  True                          False                              False                               True                               False
4                                  True                          False                              False                               True                               False
...                                 ...                            ...                                ...                                ...                                 ...
4021                              False                          False                               True                               True                               False
4022                               True                          False                              False                               True                               False
4023                              False                          False                               True                               True                               False
4024                              False                          False                               True                              False                                True
4025                              False                          False                               True                              False                                True

[4026 rows x 5 columns]

The odds data:

print(O_train)

Out:

      odds__market_maximum__home_win__full_time_goals  odds__market_maximum__draw__full_time_goals  odds__market_maximum__away_win__full_time_goals  odds__market_maximum__over_2.5__full_time_goals  odds__market_maximum__under_2.5__full_time_goals
0                                                 NaN                                          NaN                                              NaN                                              NaN                                               NaN
1                                                 NaN                                          NaN                                              NaN                                              NaN                                               NaN
2                                                 NaN                                          NaN                                              NaN                                              NaN                                               NaN
3                                                 NaN                                          NaN                                              NaN                                              NaN                                               NaN
4                                                 NaN                                          NaN                                              NaN                                              NaN                                               NaN
...                                               ...                                          ...                                              ...                                              ...                                               ...
4021                                             3.19                                         3.50                                             2.58                                             2.14                                              1.90
4022                                             1.46                                         4.75                                             9.30                                             1.96                                              1.98
4023                                             2.45                                         3.67                                             3.06                                             1.98                                              2.03
4024                                             1.90                                         3.69                                             5.00                                             2.27                                              1.75
4025                                             1.24                                         7.00                                            16.00                                             1.65                                              2.50

[4026 rows x 5 columns]
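As a reading aid for the odds columns, decimal odds can be converted to implied probabilities by taking their reciprocal. The sketch below applies this standard conversion to the home win, draw and away win odds of row 4021; the helper implied_probabilities is hypothetical and not part of sportsbet.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds to implied probabilities (1 / odds)."""
    return [1.0 / o for o in decimal_odds]


# Home win, draw and away win odds taken from row 4021 of the odds data.
match_odds = [3.19, 3.50, 2.58]
probs = implied_probabilities(match_odds)
book_sum = sum(probs)

print([round(p, 4) for p in probs])
print(round(book_sum, 4))
```

For this row the implied probabilities sum to slightly less than 1, which can happen with market maximum odds because each price is the best one available across bookmakers.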

Classifier bettor

We can use the ClassifierBettor class to create a classifier-based bettor. A DummyClassifier is selected for convenience.

bettor = ClassifierBettor(DummyClassifier())

Any bettor is a classifier, so we can fit it on the training data.

bettor.fit(X_train, Y_train)

Out:

ClassifierBettor(classifier=DummyClassifier())

We can predict the probability of the positive class for each target.

bettor.predict_proba(X_train)

Out:

array([[0.44138102, 0.29334327, 0.26527571, 0.43070045, 0.56929955],
       [0.44138102, 0.29334327, 0.26527571, 0.43070045, 0.56929955],
       [0.44138102, 0.29334327, 0.26527571, 0.43070045, 0.56929955],
       ...,
       [0.44138102, 0.29334327, 0.26527571, 0.43070045, 0.56929955],
       [0.44138102, 0.29334327, 0.26527571, 0.43070045, 0.56929955],
       [0.44138102, 0.29334327, 0.26527571, 0.43070045, 0.56929955]])
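Every row of the output is identical because DummyClassifier, with its default 'prior' strategy, ignores the input features and returns the class frequencies seen during fitting. The sketch below reproduces that behaviour in plain Python for a single boolean target; the target values and the helper positive_class_prior are hypothetical.

```python
def positive_class_prior(targets):
    """Share of True values, i.e. the positive-class frequency."""
    return sum(targets) / len(targets)


# Hypothetical home-win outcomes for eight past matches.
home_win = [False, True, False, True, True, False, False, True]
prior = positive_class_prior(home_win)

# A prior-based classifier predicts the same probability for every new match.
predicted = [prior] * 3
print(predicted)
```

Under this reading, the first column of the output (0.4414) is simply the share of home wins in the training data.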

We can also predict the class labels.

bettor.predict(X_train)

Out:

array([[False, False, False, False,  True],
       [False, False, False, False,  True],
       [False, False, False, False,  True],
       ...,
       [False, False, False, False,  True],
       [False, False, False, False,  True],
       [False, False, False, False,  True]])

Finally, we can evaluate its cross-validation accuracy.

Out:

0.0
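An accuracy of 0.0 can look surprising. One plausible explanation, assuming the score is scikit-learn's accuracy for multi-label targets (subset accuracy, where a row counts as correct only if all five labels match): the predicted row [False, False, False, False, True] marks none of home win, draw and away win as true, while every real match has exactly one of them true, so no row can match completely. The sketch below reimplements subset accuracy in plain Python on rows copied from the targets shown earlier.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of rows whose labels all match (exact-match accuracy)."""
    matches = sum(tuple(t) == tuple(p) for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


# Columns: home_win, draw, away_win, over_2.5, under_2.5
y_true = [
    [False, True, False, False, True],   # row 0
    [True, False, False, True, False],   # row 1
    [False, False, True, True, False],   # row 4021
    [False, False, True, False, True],   # row 4024
]
# The constant prediction returned by the fitted DummyClassifier.
y_pred = [[False, False, False, False, True]] * len(y_true)

score = subset_accuracy(y_true, y_pred)
print(score)  # 0.0
```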

Backtesting the bettor

We can backtest the bettor using the historical data.

Out:

ClassifierBettor(classifier=DummyClassifier())
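In the backtesting results, the training start stays fixed while the training end moves forward, which suggests expanding-window splits in the style of scikit-learn's TimeSeriesSplit. As a sketch under that assumption, the hypothetical helper below generates such splits; it is not sportsbet's implementation.

```python
def expanding_window_splits(n_samples, n_splits=5):
    """Yield (train_indices, test_indices) with a growing training window."""
    test_size = n_samples // (n_splits + 1)
    for i in range(n_splits):
        test_start = n_samples - (n_splits - i) * test_size
        train = list(range(0, test_start))
        test = list(range(test_start, test_start + test_size))
        yield train, test


splits = list(expanding_window_splits(n_samples=12, n_splits=5))
for train, test in splits:
    # Training always starts at index 0 and ends right before the test block.
    print(len(train), test)
```

Each test block follows its training window in time, so the model is never evaluated on matches that precede its training data.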

Various backtesting statistics are calculated.

   Training Start  Training End  Training Period  Testing Start  Testing End  Testing Period  Start Value  End Value  Total Return [%]  Max Drawdown [%]  Max Drawdown Duration  Total Bets  Win Rate [%]  Best Bet [%]  Worst Bet [%]  Avg Winning Bet [%]  Avg Losing Bet [%]  Profit Factor  Sharpe Ratio  Avg Bet Yield [%]  Std Bet Yield [%]
0      2016-08-19    2018-01-28         527 days     2018-01-28   2019-06-09        498 days       1000.0    1000.00             0.000               NaN                    NaT         0.0           NaN           NaN            NaN                  NaN                 NaN            NaN           inf                NaN                NaN
1      2016-08-19    2018-10-27         799 days     2018-10-27   2020-07-19        632 days       1000.0    1007.35             0.735          6.086273      253 days 12:00:00       485.0     41.855670         778.0    -166.666667           149.136547         -107.730221       1.016547      0.125659           0.227334         152.179665
2      2016-08-19    2019-08-17        1093 days     2019-08-18   2021-02-08        541 days       1000.0    1054.13             5.413          4.406679      243 days 00:00:00       691.0     42.836469        1600.0    -171.428571           152.932786         -105.652345       1.084648      0.706009           5.116394         164.035557
3      2016-08-19    2021-02-08        1634 days     2020-09-12   2021-09-06        360 days       1000.0     938.24            -6.176          9.042500      277 days 12:00:00       633.0     39.652449        2053.0    -183.333333           150.042383         -101.531569       0.902835     -1.040598          -1.776337         166.121940
4      2016-08-19    2021-09-06        1844 days     2021-08-13   2022-04-18        249 days       1000.0     868.67           -13.133         17.216347      208 days 12:00:00       624.0     38.301282        1051.0    -171.428571           148.266129         -104.328580       0.811538     -2.491423          -7.414375         156.980380
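Two of these statistics are easy to check by hand. The sketch below uses the common definitions of total return and maximum drawdown. Treating them as the definitions behind the table columns is an assumption, although the total return formula does reproduce the Total Return [%] values from the Start Value and End Value columns. The portfolio value series used for the drawdown is hypothetical.

```python
def total_return_pct(start_value, end_value):
    """Percentage change from the starting to the final portfolio value."""
    return (end_value - start_value) / start_value * 100


def max_drawdown_pct(values):
    """Largest percentage drop from a running peak of the value series."""
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak * 100)
    return worst


# End values of testing periods 1 to 4 from the backtesting results.
end_values = [1007.35, 1054.13, 938.24, 868.67]
returns = [round(total_return_pct(1000.0, end), 3) for end in end_values]
print(returns)

# Hypothetical portfolio values over one testing period.
values = [1000.0, 1050.0, 980.0, 1020.0, 945.0, 1010.0]
drawdown = max_drawdown_pct(values)
print(round(drawdown, 3))
```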


We can also plot the portfolio value for any testing period from the above backtesting results.

testing_period = 2
bettor.backtest_plot_value_(testing_period)


Estimating the value bets

We extract the fixtures data to estimate the value bets.

We can estimate the value bets by using the fitted classifier.

odds__market_maximum__home_win__full_time_goals odds__market_maximum__draw__full_time_goals odds__market_maximum__away_win__full_time_goals odds__market_maximum__over_2.5__full_time_goals odds__market_maximum__under_2.5__full_time_goals
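As an illustration of the idea behind value bets, the following sketch applies the usual textbook rule: a bet has value when the estimated probability multiplied by the decimal odds exceeds 1, i.e. when the expected return per unit stake is positive. This rule is an assumption here, and the criterion used by ClassifierBettor may differ. The probabilities are the DummyClassifier predictions shown earlier and the odds are row 4021 of the training odds; the helper value_bets is hypothetical.

```python
MARKETS = ['home_win', 'draw', 'away_win', 'over_2.5', 'under_2.5']


def value_bets(probabilities, odds):
    """Flag bets whose expected return per unit stake, p * odds - 1, is positive."""
    return [p * o > 1 for p, o in zip(probabilities, odds)]


# Positive-class probabilities predicted by the fitted DummyClassifier.
probabilities = [0.44138102, 0.29334327, 0.26527571, 0.43070045, 0.56929955]
# Market maximum odds from row 4021 of the odds data.
odds = [3.19, 3.50, 2.58, 2.14, 1.90]

flags = value_bets(probabilities, odds)
selected = [market for market, flag in zip(MARKETS, flags) if flag]
print(selected)
```

Since the probabilities come from a classifier that ignores the features, the flagged bets only demonstrate the mechanics and carry no predictive meaning.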

