FiveThirtyEight soccer data
This example illustrates the usage of the FiveThirtyEight soccer dataloader.
# Author: Georgios Douzas <gdouzas@icloud.com>
# Licence: MIT
from sportsbet.datasets import FTESoccerDataLoader
Getting the available parameters
We can list the available parameters, in order to select the training data to be extracted, using the get_all_params() class method.
FTESoccerDataLoader.get_all_params()
Out:
[{'division': 1, 'league': 'Argentina', 'year': 2018}, {'division': 1, 'league': 'Argentina', 'year': 2019}, {'division': 1, 'league': 'Argentina', 'year': 2020}, {'division': 1, 'league': 'Argentina', 'year': 2022}, {'division': 1, 'league': 'Australia', 'year': 2019}, {'division': 1, 'league': 'Australia', 'year': 2020}, {'division': 1, 'league': 'Australia', 'year': 2021}, {'division': 1, 'league': 'Australia', 'year': 2022}, {'division': 1, 'league': 'Austria', 'year': 2018}, {'division': 1, 'league': 'Austria', 'year': 2019}, {'division': 1, 'league': 'Austria', 'year': 2020}, {'division': 1, 'league': 'Austria', 'year': 2021}, {'division': 1, 'league': 'Austria', 'year': 2022}, {'division': 1, 'league': 'Belgium', 'year': 2019}, {'division': 1, 'league': 'Belgium', 'year': 2020}, {'division': 1, 'league': 'Belgium', 'year': 2021}, {'division': 1, 'league': 'Belgium', 'year': 2022}, {'division': 1, 'league': 'Brazil', 'year': 2018}, {'division': 1, 'league': 'Brazil', 'year': 2019}, {'division': 1, 'league': 'Brazil', 'year': 2020}, {'division': 1, 'league': 'Brazil', 'year': 2021}, {'division': 1, 'league': 'Brazil', 'year': 2022}, {'division': 1, 'league': 'Brazil', 'year': 2023}, {'division': 1, 'league': 'Champions-League', 'year': 2017}, {'division': 1, 'league': 'Champions-League', 'year': 2018}, {'division': 1, 'league': 'Champions-League', 'year': 2019}, {'division': 1, 'league': 'Champions-League', 'year': 2020}, {'division': 1, 'league': 'Champions-League', 'year': 2021}, {'division': 1, 'league': 'Champions-League', 'year': 2022}, {'division': 1, 'league': 'China', 'year': 2019}, {'division': 1, 'league': 'China', 'year': 2020}, {'division': 1, 'league': 'Denmark', 'year': 2019}, {'division': 1, 'league': 'Denmark', 'year': 2020}, {'division': 1, 'league': 'Denmark', 'year': 2021}, {'division': 1, 'league': 'Denmark', 'year': 2022}, {'division': 1, 'league': 'England', 'year': 2017}, {'division': 1, 'league': 'England', 'year': 2018}, {'division': 
1, 'league': 'England', 'year': 2019}, {'division': 1, 'league': 'England', 'year': 2020}, {'division': 1, 'league': 'England', 'year': 2021}, {'division': 1, 'league': 'England', 'year': 2022}, {'division': 1, 'league': 'Europa', 'year': 2022}, {'division': 1, 'league': 'Europa-League', 'year': 2018}, {'division': 1, 'league': 'Europa-League', 'year': 2019}, {'division': 1, 'league': 'Europa-League', 'year': 2020}, {'division': 1, 'league': 'Europa-League', 'year': 2021}, {'division': 1, 'league': 'Europa-League', 'year': 2022}, {'division': 1, 'league': 'FAWSL', 'year': 2017}, {'division': 1, 'league': 'FAWSL', 'year': 2018}, {'division': 1, 'league': 'FAWSL', 'year': 2019}, {'division': 1, 'league': 'FAWSL', 'year': 2020}, {'division': 1, 'league': 'FAWSL', 'year': 2021}, {'division': 1, 'league': 'FAWSL', 'year': 2022}, {'division': 1, 'league': 'France', 'year': 2017}, {'division': 1, 'league': 'France', 'year': 2018}, {'division': 1, 'league': 'France', 'year': 2019}, {'division': 1, 'league': 'France', 'year': 2020}, {'division': 1, 'league': 'France', 'year': 2021}, {'division': 1, 'league': 'France', 'year': 2022}, {'division': 1, 'league': 'Germany', 'year': 2017}, {'division': 1, 'league': 'Germany', 'year': 2018}, {'division': 1, 'league': 'Germany', 'year': 2019}, {'division': 1, 'league': 'Germany', 'year': 2020}, {'division': 1, 'league': 'Germany', 'year': 2021}, {'division': 1, 'league': 'Germany', 'year': 2022}, {'division': 1, 'league': 'Greece', 'year': 2019}, {'division': 1, 'league': 'Greece', 'year': 2020}, {'division': 1, 'league': 'Greece', 'year': 2022}, {'division': 1, 'league': 'Italy', 'year': 2017}, {'division': 1, 'league': 'Italy', 'year': 2018}, {'division': 1, 'league': 'Italy', 'year': 2019}, {'division': 1, 'league': 'Italy', 'year': 2020}, {'division': 1, 'league': 'Italy', 'year': 2021}, {'division': 1, 'league': 'Italy', 'year': 2022}, {'division': 1, 'league': 'Japan', 'year': 2019}, {'division': 1, 'league': 'Japan', 'year': 
2020}, {'division': 1, 'league': 'Japan', 'year': 2021}, {'division': 1, 'league': 'Japan', 'year': 2022}, {'division': 1, 'league': 'Japan', 'year': 2023}, {'division': 1, 'league': 'Mexico', 'year': 2017}, {'division': 1, 'league': 'Mexico', 'year': 2018}, {'division': 1, 'league': 'Mexico', 'year': 2019}, {'division': 1, 'league': 'Mexico', 'year': 2020}, {'division': 1, 'league': 'Mexico', 'year': 2021}, {'division': 1, 'league': 'Mexico', 'year': 2022}, {'division': 1, 'league': 'NWSL', 'year': 2018}, {'division': 1, 'league': 'NWSL', 'year': 2019}, {'division': 1, 'league': 'NWSL', 'year': 2020}, {'division': 1, 'league': 'NWSL', 'year': 2021}, {'division': 1, 'league': 'NWSL', 'year': 2022}, {'division': 1, 'league': 'NWSL', 'year': 2023}, {'division': 1, 'league': 'Netherlands', 'year': 2018}, {'division': 1, 'league': 'Netherlands', 'year': 2019}, {'division': 1, 'league': 'Netherlands', 'year': 2020}, {'division': 1, 'league': 'Netherlands', 'year': 2021}, {'division': 1, 'league': 'Netherlands', 'year': 2022}, {'division': 1, 'league': 'Norway', 'year': 2018}, {'division': 1, 'league': 'Norway', 'year': 2019}, {'division': 1, 'league': 'Norway', 'year': 2020}, {'division': 1, 'league': 'Norway', 'year': 2021}, {'division': 1, 'league': 'Norway', 'year': 2022}, {'division': 1, 'league': 'Norway', 'year': 2023}, {'division': 1, 'league': 'Portugal', 'year': 2018}, {'division': 1, 'league': 'Portugal', 'year': 2019}, {'division': 1, 'league': 'Portugal', 'year': 2020}, {'division': 1, 'league': 'Portugal', 'year': 2021}, {'division': 1, 'league': 'Portugal', 'year': 2022}, {'division': 1, 'league': 'Russia', 'year': 2018}, {'division': 1, 'league': 'Russia', 'year': 2019}, {'division': 1, 'league': 'Russia', 'year': 2020}, {'division': 1, 'league': 'Russia', 'year': 2021}, {'division': 1, 'league': 'Russia', 'year': 2022}, {'division': 1, 'league': 'Scotland', 'year': 2018}, {'division': 1, 'league': 'Scotland', 'year': 2019}, {'division': 1, 'league': 
'Scotland', 'year': 2020}, {'division': 1, 'league': 'Scotland', 'year': 2021}, {'division': 1, 'league': 'Scotland', 'year': 2022}, {'division': 1, 'league': 'South-Africa', 'year': 2019}, {'division': 1, 'league': 'South-Africa', 'year': 2020}, {'division': 1, 'league': 'South-Africa', 'year': 2022}, {'division': 1, 'league': 'Spain', 'year': 2017}, {'division': 1, 'league': 'Spain', 'year': 2018}, {'division': 1, 'league': 'Spain', 'year': 2019}, {'division': 1, 'league': 'Spain', 'year': 2020}, {'division': 1, 'league': 'Spain', 'year': 2021}, {'division': 1, 'league': 'Spain', 'year': 2022}, {'division': 1, 'league': 'Sweden', 'year': 2018}, {'division': 1, 'league': 'Sweden', 'year': 2019}, {'division': 1, 'league': 'Sweden', 'year': 2020}, {'division': 1, 'league': 'Sweden', 'year': 2021}, {'division': 1, 'league': 'Sweden', 'year': 2022}, {'division': 1, 'league': 'Sweden', 'year': 2023}, {'division': 1, 'league': 'Switzerland', 'year': 2018}, {'division': 1, 'league': 'Switzerland', 'year': 2019}, {'division': 1, 'league': 'Switzerland', 'year': 2020}, {'division': 1, 'league': 'Switzerland', 'year': 2021}, {'division': 1, 'league': 'Switzerland', 'year': 2022}, {'division': 1, 'league': 'Turkey', 'year': 2018}, {'division': 1, 'league': 'Turkey', 'year': 2019}, {'division': 1, 'league': 'Turkey', 'year': 2020}, {'division': 1, 'league': 'Turkey', 'year': 2021}, {'division': 1, 'league': 'Turkey', 'year': 2022}, {'division': 1, 'league': 'USA', 'year': 2018}, {'division': 1, 'league': 'USA', 'year': 2019}, {'division': 1, 'league': 'USA', 'year': 2020}, {'division': 1, 'league': 'USA', 'year': 2021}, {'division': 1, 'league': 'USA', 'year': 2022}, {'division': 1, 'league': 'USA', 'year': 2023}, {'division': 1, 'league': 'United-Soccer-League', 'year': 2019}, {'division': 1, 'league': 'United-Soccer-League', 'year': 2020}, {'division': 1, 'league': 'United-Soccer-League', 'year': 2021}, {'division': 1, 'league': 'United-Soccer-League', 'year': 2022}, 
{'division': 1, 'league': 'United-Soccer-League', 'year': 2023}, {'division': 2, 'league': 'England', 'year': 2018}, {'division': 2, 'league': 'England', 'year': 2019}, {'division': 2, 'league': 'England', 'year': 2020}, {'division': 2, 'league': 'England', 'year': 2021}, {'division': 2, 'league': 'England', 'year': 2022}, {'division': 2, 'league': 'France', 'year': 2018}, {'division': 2, 'league': 'France', 'year': 2019}, {'division': 2, 'league': 'France', 'year': 2020}, {'division': 2, 'league': 'France', 'year': 2021}, {'division': 2, 'league': 'France', 'year': 2022}, {'division': 2, 'league': 'Germany', 'year': 2018}, {'division': 2, 'league': 'Germany', 'year': 2019}, {'division': 2, 'league': 'Germany', 'year': 2020}, {'division': 2, 'league': 'Germany', 'year': 2021}, {'division': 2, 'league': 'Germany', 'year': 2022}, {'division': 2, 'league': 'Italy', 'year': 2018}, {'division': 2, 'league': 'Italy', 'year': 2019}, {'division': 2, 'league': 'Italy', 'year': 2020}, {'division': 2, 'league': 'Italy', 'year': 2021}, {'division': 2, 'league': 'Italy', 'year': 2022}, {'division': 2, 'league': 'Spain', 'year': 2018}, {'division': 2, 'league': 'Spain', 'year': 2019}, {'division': 2, 'league': 'Spain', 'year': 2020}, {'division': 2, 'league': 'Spain', 'year': 2021}, {'division': 2, 'league': 'Spain', 'year': 2022}, {'division': 3, 'league': 'England', 'year': 2019}, {'division': 3, 'league': 'England', 'year': 2020}, {'division': 3, 'league': 'England', 'year': 2021}, {'division': 3, 'league': 'England', 'year': 2022}, {'division': 4, 'league': 'England', 'year': 2019}, {'division': 4, 'league': 'England', 'year': 2020}, {'division': 4, 'league': 'England', 'year': 2021}, {'division': 4, 'league': 'England', 'year': 2022}]
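The returned value is a plain list of dictionaries, so it can be inspected with ordinary Python before building a parameter grid. The following is a minimal sketch that groups the available years of one league by division. It uses a small excerpt copied by hand from the output above, and the helper name years_by_division is hypothetical, not part of the library.

```python
from collections import defaultdict

# A small excerpt of the parameter list shown above (copied by hand).
all_params = [
    {'division': 1, 'league': 'England', 'year': 2021},
    {'division': 1, 'league': 'England', 'year': 2022},
    {'division': 2, 'league': 'England', 'year': 2021},
    {'division': 3, 'league': 'England', 'year': 2021},
    {'division': 4, 'league': 'England', 'year': 2021},
    {'division': 1, 'league': 'Spain', 'year': 2021},
]


def years_by_division(params, league):
    """Group the available years of one league by division (hypothetical helper)."""
    grouped = defaultdict(list)
    for entry in params:
        if entry['league'] == league:
            grouped[entry['division']].append(entry['year'])
    return {division: sorted(years) for division, years in sorted(grouped.items())}


england = years_by_division(all_params, 'England')
print(england)
```

With the full list in place of the excerpt, the same helper shows at a glance which divisions and years exist for a league.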
We choose to extract training data only for the year 2021, across all divisions of the English league.
param_grid = {'league': ['England'], 'year': [2021]}
dataloader = FTESoccerDataLoader(param_grid=param_grid)
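To make the selection concrete, here is a rough sketch of how a grid such as the one above can be read: every combination of the listed values is formed, and any key left out (here division) stays unconstrained. This only illustrates the behaviour described in the text and is not the library's implementation. The helpers expand_grid and select are hypothetical, and the available list is a hand-copied excerpt of the parameters listed earlier.

```python
from itertools import product

# Hand-copied excerpt of the available parameters listed earlier.
available = [
    {'division': 1, 'league': 'England', 'year': 2020},
    {'division': 1, 'league': 'England', 'year': 2021},
    {'division': 2, 'league': 'England', 'year': 2021},
    {'division': 3, 'league': 'England', 'year': 2021},
    {'division': 4, 'league': 'England', 'year': 2021},
    {'division': 1, 'league': 'France', 'year': 2021},
]


def expand_grid(param_grid):
    """Return every combination of the listed values as a dict."""
    keys = sorted(param_grid)
    return [dict(zip(keys, values))
            for values in product(*(param_grid[key] for key in keys))]


def select(available, param_grid):
    """Keep the entries that agree with at least one combination.

    Keys missing from param_grid (here 'division') are left unconstrained.
    """
    combos = expand_grid(param_grid)
    return [entry for entry in available
            if any(all(entry[k] == v for k, v in combo.items()) for combo in combos)]


param_grid = {'league': ['England'], 'year': [2021]}
selected = select(available, param_grid)
print([entry['division'] for entry in selected])
```

Under this reading, the grid matches all four English divisions for 2021 and nothing else.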
Getting the available odds types
We can list the available odds types, in order to match the output of the training data, using the get_odds_types() method.
dataloader.get_odds_types()
Out:
[]
Therefore no odds data are available.
Extracting the training data
We extract the training data using the default values for the parameters odds_type and drop_na_thres.
X_train, Y_train, _ = dataloader.extract_train_data()
The input data:
print(X_train)
Out:
year division match_quality league home_team away_team ... away_team_probability_win probability_draw home_team_projected_score away_team_projected_score home_team_match_importance away_team_match_importance
date ...
2020-09-11 2021 2 54.127384 England Watford Middlesbrough ... 0.1423 0.2190 2.06 0.85 53.0 16.5
2020-09-12 2021 2 45.561500 England Queens Park Rangers Nottingham Forest ... 0.3297 0.2717 1.49 1.33 23.4 16.2
2020-09-12 2021 2 45.738207 England Derby County Reading ... 0.2891 0.2742 1.52 1.19 16.6 22.0
2020-09-12 2021 2 53.410804 England Huddersfield Town Norwich City ... 0.4612 0.2680 1.17 1.60 18.0 43.6
2020-09-12 2021 2 54.727624 England AFC Bournemouth Blackburn ... 0.1492 0.2199 2.06 0.89 55.9 20.2
... ... ... ... ... ... ... ... ... ... ... ... ... ...
2021-05-18 2021 4 11.104789 England Newport County Forest Green Rovers ... 0.2475 0.3073 1.21 0.82 100.0 100.0
2021-05-20 2021 4 13.561136 England Tranmere Rovers Morecambe ... 0.3779 0.2818 1.19 1.27 100.0 100.0
2021-05-23 2021 4 10.938702 England Forest Green Rovers Newport County ... 0.3932 0.3134 0.92 1.11 100.0 100.0
2021-05-23 2021 4 13.488255 England Morecambe Tranmere Rovers ... 0.2586 0.2708 1.47 1.02 100.0 100.0
2021-05-31 2021 4 14.866651 England Morecambe Newport County ... 0.4691 0.0000 1.23 1.13 100.0 100.0
[2051 rows x 15 columns]
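The printed frame elides some columns, but the visible probabilities can still be sanity-checked. The sketch below derives the home win probability implied by two rows of the output above, under the assumption that the three outcome probabilities sum to one. That assumption is not stated in the source, and the helper implied_home_win is hypothetical.

```python
# Two rows copied from the printed X_train output above.
rows = [
    {'home_team': 'Watford', 'away_team': 'Middlesbrough',
     'away_team_probability_win': 0.1423, 'probability_draw': 0.2190},
    {'home_team': 'Huddersfield Town', 'away_team': 'Norwich City',
     'away_team_probability_win': 0.4612, 'probability_draw': 0.2680},
]


def implied_home_win(row):
    """Remaining probability mass, ASSUMING the three outcomes sum to 1."""
    remaining = 1.0 - row['away_team_probability_win'] - row['probability_draw']
    return round(remaining, 4)


for row in rows:
    print(row['home_team'], implied_home_win(row))
```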
The targets:
print(Y_train)
Out:
output__home_win__full_time_goals output__away_win__full_time_goals output__draw__full_time_goals ... output__under_2.5__full_time_goals output__under_3.5__full_time_goals output__under_4.5__full_time_goals
0 True False False ... True True True
1 True False False ... True True True
2 False True False ... True True True
3 False True False ... True True True
4 True False False ... False False False
... ... ... ... ... ... ... ...
2046 True False False ... True True True
2047 False True False ... False True True
2048 True False False ... False False False
2049 False False True ... True True True
2050 True False False ... True True True
[2051 rows x 11 columns]
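The target column names appear to follow the pattern output__<outcome>__<event>, judging only from the columns printed above. Assuming that pattern holds, a small hypothetical helper can split a name into its parts, which is handy when picking out a single betting market.

```python
def parse_target(column):
    """Split 'output__<outcome>__<event>' into (outcome, event).

    The naming pattern is inferred from the printed columns, not from the
    library's documentation.
    """
    prefix, outcome, event = column.split('__')
    if prefix != 'output':
        raise ValueError(f'not a target column: {column!r}')
    return outcome, event


# Column names copied from the printed Y_train output above.
columns = [
    'output__home_win__full_time_goals',
    'output__draw__full_time_goals',
    'output__under_2.5__full_time_goals',
]
parsed = [parse_target(column) for column in columns]
print(parsed)
```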
Extracting the fixtures data
We extract the fixtures data, whose columns match those of the training data. Unlike the training data, the fixtures data are not affected by the param_grid selection.
X_fix, _, _ = dataloader.extract_fixtures_data()
The input data:
print(X_fix)
Out:
year division match_quality league home_team ... probability_draw home_team_projected_score away_team_projected_score home_team_match_importance away_team_match_importance
date ...
2022-04-19 2022 1 84.389443 England Liverpool ... 0.1570 2.53 0.78 100.0 30.9
2022-04-19 2022 3 30.339354 England Oxford United ... 0.2406 1.45 1.69 20.5 100.0
2022-04-19 2022 3 26.603504 England Ipswich Town ... 0.3176 0.94 1.03 0.0 100.0
2022-04-19 2022 3 17.331843 England Cambridge United ... 0.2857 1.19 1.17 0.0 0.0
2022-04-19 2022 1 67.863289 Spain Real Betis ... 0.2207 1.93 0.69 91.1 3.2
... ... ... ... ... ... ... ... ... ... ... ...
2022-11-13 2023 1 33.604233 Norway Viking FK ... 0.2333 2.01 1.06 NaN NaN
2022-11-13 2023 1 20.219616 Norway Kristiansund BK ... 0.2707 1.62 0.85 NaN NaN
2022-11-13 2023 1 25.689516 Norway Tromso ... 0.3044 1.24 1.11 NaN NaN
2022-11-13 2023 1 28.961629 Norway Lillestrom ... 0.2336 1.93 0.80 NaN NaN
2022-12-01 2023 1 12.989485 United-Soccer-League New Mexico United ... 0.2211 1.90 0.80 NaN NaN
[2792 rows x 15 columns]
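Since the fixtures cover every league, as the output above shows, any restriction to the leagues used for training has to be applied afterwards. The sketch below does this on a few rows copied by hand from the output, stored as plain dictionaries so that it runs without any dependency. The helper keep_leagues is hypothetical. On the actual DataFrame, a boolean mask on the league column would achieve the same result.

```python
# A few fixtures copied from the printed X_fix output above.
fixtures = [
    {'date': '2022-04-19', 'division': 1, 'league': 'England', 'home_team': 'Liverpool'},
    {'date': '2022-04-19', 'division': 3, 'league': 'England', 'home_team': 'Oxford United'},
    {'date': '2022-04-19', 'division': 3, 'league': 'England', 'home_team': 'Ipswich Town'},
    {'date': '2022-04-19', 'division': 3, 'league': 'England', 'home_team': 'Cambridge United'},
    {'date': '2022-04-19', 'division': 1, 'league': 'Spain', 'home_team': 'Real Betis'},
    {'date': '2022-11-13', 'division': 1, 'league': 'Norway', 'home_team': 'Viking FK'},
]


def keep_leagues(rows, leagues):
    """Return only the fixtures whose league is in the given collection."""
    wanted = set(leagues)
    return [row for row in rows if row['league'] in wanted]


english_fixtures = keep_leagues(fixtures, ['England'])
print([row['home_team'] for row in english_fixtures])
```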