Classifier bettor
This example illustrates how to use a classifier-based bettor and evaluate its performance on historical soccer data.
# Author: Georgios Douzas <gdouzas@icloud.com>
# Licence: MIT
import numpy as np
from sportsbet.datasets import SoccerDataLoader
from sportsbet.evaluation import ClassifierBettor
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score, TimeSeriesSplit
Extracting the training data
We extract the training data for the Spanish league. We also remove columns that contain missing values and select the market maximum odds.
dataloader = SoccerDataLoader(param_grid={'league': ['Spain']})
X_train, Y_train, O_train = dataloader.extract_train_data(
    drop_na_thres=1.0, odds_type='market_maximum'
)
Out:
Football-Data.co.uk: ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
The input data:
The multi-output targets:
The odds:
To simplify the training process, we keep only the numerical features of the input data:
num_features = [
    col
    for col in X_train.columns
    if X_train[col].dtype in (np.dtype(int), np.dtype(float))
]
X_train = X_train[num_features]
Classifier bettor
We can use the ClassifierBettor class to create a classifier-based bettor. A KNeighborsClassifier is selected.
Every bettor is a classifier, so we can fit it on the training data.
Out:
ClassifierBettor(classifier=KNeighborsClassifier())
We can predict probabilities for the positive class.
Out:
array([[0.2, 0.4, 0.4, 0.4, 0.6],
[0. , 0.4, 0.6, 0.4, 0.6],
[0. , 0.2, 0.8, 0.6, 0.4],
...,
[0.4, 0.6, 0. , 0.2, 0.8],
[0.2, 0. , 0.8, 0.4, 0.6],
[0.8, 0.2, 0. , 0.6, 0.4]])
We can predict the class labels.
Out:
array([[False, False, False, False, True],
[False, False, True, False, True],
[False, False, True, True, False],
...,
[False, True, False, False, True],
[False, False, True, False, True],
[ True, False, False, True, False]])
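The two outputs above are consistent with each outcome being flagged when its predicted probability exceeds 0.5. The following is a minimal sketch of that relationship, reusing the first three rows of the probability output. The 0.5 threshold and the helper name `to_labels` are assumptions made for illustration and are not taken from the library.

```python
# First three rows copied from the probability output shown above.
probabilities = [
    [0.2, 0.4, 0.4, 0.4, 0.6],
    [0.0, 0.4, 0.6, 0.4, 0.6],
    [0.0, 0.2, 0.8, 0.6, 0.4],
]

# Assumed decision threshold; inferred from the outputs, not from the library.
THRESHOLD = 0.5


def to_labels(rows, threshold=THRESHOLD):
    """Flag an outcome when its probability is strictly above the threshold."""
    return [[p > threshold for p in row] for row in rows]


labels = to_labels(probabilities)
for row in labels:
    print(row)
```

Each printed row should match the corresponding row of the class-label output above.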
Finally, we can evaluate its cross-validation accuracy.
cross_val_score(bettor, X_train, Y_train, cv=TimeSeriesSplit()).mean()
Out:
0.17025316455696204
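TimeSeriesSplit keeps the temporal order of the matches: every fold trains on earlier samples and tests on the block that follows. The sketch below reimplements that expanding-window scheme in plain Python to make the folds visible. It is a simplified illustration that aims to mirror the default settings (five splits, no gap, no maximum training size); the function name `time_series_splits` is hypothetical.

```python
def time_series_splits(n_samples, n_splits=5):
    """Yield (train, test) index lists for an expanding-window split."""
    if n_splits >= n_samples:
        raise ValueError("n_splits must be smaller than n_samples")
    # Every test block has the same size; leftover samples go to the first train set.
    test_size = n_samples // (n_splits + 1)
    first_test_start = n_samples - n_splits * test_size
    for test_start in range(first_test_start, n_samples, test_size):
        train = list(range(test_start))
        test = list(range(test_start, test_start + test_size))
        yield train, test


splits = list(time_series_splits(12, n_splits=5))
for train, test in splits:
    print(f"train={train} test={test}")
```

Because the training indices always precede the test indices, no fold is evaluated on matches that were played before the ones it was trained on.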
Backtesting the bettor
We can backtest the bettor using the historical data.
Out:
ClassifierBettor(classifier=KNeighborsClassifier())
Various backtesting statistics are calculated.
We can also plot the portfolio value for any testing period from the above backtesting results.
testing_period = 2
bettor.backtest_plot_value_(testing_period)
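To give an intuition for what a portfolio value curve represents, here is a minimal sketch that tracks cash after a sequence of settled bets. The bets, the unit stake, the initial cash, and the names `bet_return` and `yield_` are all hypothetical; this is not how the library computes its backtesting results.

```python
from itertools import accumulate

# Hypothetical settled bets: (decimal odds, whether the bet won).
bets = [(2.2, True), (3.1, False), (1.9, True), (4.0, False)]

STAKE = 1.0          # assumed fixed stake per bet
INITIAL_CASH = 10.0  # assumed starting portfolio value


def bet_return(odds, won, stake=STAKE):
    """Net profit of one bet: stake * (odds - 1) on a win, minus the stake on a loss."""
    return stake * (odds - 1.0) if won else -stake


returns = [bet_return(odds, won) for odds, won in bets]

# Portfolio value after each bet, starting from the initial cash.
portfolio = list(accumulate(returns, initial=INITIAL_CASH))

# Yield: total net profit divided by the total amount staked.
yield_ = sum(returns) / (STAKE * len(bets))

print(portfolio)
print(yield_)
```

Plotting `portfolio` against the bet index gives a curve of the kind a portfolio value plot displays for a testing period.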
Estimating the value bets
We extract the fixtures data to estimate the value bets.
We can estimate the value bets by using the fitted classifier.
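A bet is commonly called a value bet when the predicted probability of the outcome multiplied by its decimal odds is greater than one, that is, when the expected return of a unit stake is positive. The sketch below applies that rule to made-up numbers. The outcome names, probabilities, odds, and the helper `value_bets` are hypothetical, and the rule is assumed here rather than taken from the library's implementation.

```python
# Hypothetical outcomes, predicted probabilities and decimal odds for two fixtures.
outcomes = ["home_win", "draw", "away_win"]
probabilities = [
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
odds = [
    [2.2, 3.1, 4.0],
    [5.5, 3.2, 1.9],
]


def value_bets(probabilities, odds):
    """Flag outcomes whose expected payout per unit stake exceeds 1."""
    return [
        [p * o > 1.0 for p, o in zip(prob_row, odds_row)]
        for prob_row, odds_row in zip(probabilities, odds)
    ]


flags = value_bets(probabilities, odds)
for row in flags:
    print(dict(zip(outcomes, row)))
```

In both fixtures only the home win is flagged, since 0.5 × 2.2 and 0.2 × 5.5 both exceed one while every other product is below it.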