FiveThirtyEight soccer data

This example illustrates the usage of the FiveThirtyEight soccer dataloader.

# Author: Georgios Douzas <gdouzas@icloud.com>
# Licence: MIT

from sportsbet.datasets import FTESoccerDataLoader

Getting the available parameters

We can list the available parameters, which are used to select the training data to extract, with the get_all_params() class method of FTESoccerDataLoader.

Out:

[{'division': 1, 'league': 'Argentina', 'year': 2018}, {'division': 1, 'league': 'Argentina', 'year': 2019}, {'division': 1, 'league': 'Argentina', 'year': 2020}, {'division': 1, 'league': 'Argentina', 'year': 2022}, {'division': 1, 'league': 'Australia', 'year': 2019}, {'division': 1, 'league': 'Australia', 'year': 2020}, {'division': 1, 'league': 'Australia', 'year': 2021}, {'division': 1, 'league': 'Australia', 'year': 2022}, {'division': 1, 'league': 'Austria', 'year': 2018}, {'division': 1, 'league': 'Austria', 'year': 2019}, {'division': 1, 'league': 'Austria', 'year': 2020}, {'division': 1, 'league': 'Austria', 'year': 2021}, {'division': 1, 'league': 'Austria', 'year': 2022}, {'division': 1, 'league': 'Belgium', 'year': 2019}, {'division': 1, 'league': 'Belgium', 'year': 2020}, {'division': 1, 'league': 'Belgium', 'year': 2021}, {'division': 1, 'league': 'Belgium', 'year': 2022}, {'division': 1, 'league': 'Brazil', 'year': 2018}, {'division': 1, 'league': 'Brazil', 'year': 2019}, {'division': 1, 'league': 'Brazil', 'year': 2020}, {'division': 1, 'league': 'Brazil', 'year': 2021}, {'division': 1, 'league': 'Brazil', 'year': 2022}, {'division': 1, 'league': 'Brazil', 'year': 2023}, {'division': 1, 'league': 'Champions-League', 'year': 2017}, {'division': 1, 'league': 'Champions-League', 'year': 2018}, {'division': 1, 'league': 'Champions-League', 'year': 2019}, {'division': 1, 'league': 'Champions-League', 'year': 2020}, {'division': 1, 'league': 'Champions-League', 'year': 2021}, {'division': 1, 'league': 'Champions-League', 'year': 2022}, {'division': 1, 'league': 'China', 'year': 2019}, {'division': 1, 'league': 'China', 'year': 2020}, {'division': 1, 'league': 'Denmark', 'year': 2019}, {'division': 1, 'league': 'Denmark', 'year': 2020}, {'division': 1, 'league': 'Denmark', 'year': 2021}, {'division': 1, 'league': 'Denmark', 'year': 2022}, {'division': 1, 'league': 'England', 'year': 2017}, {'division': 1, 'league': 'England', 'year': 2018}, {'division': 
1, 'league': 'England', 'year': 2019}, {'division': 1, 'league': 'England', 'year': 2020}, {'division': 1, 'league': 'England', 'year': 2021}, {'division': 1, 'league': 'England', 'year': 2022}, {'division': 1, 'league': 'Europa', 'year': 2022}, {'division': 1, 'league': 'Europa-League', 'year': 2018}, {'division': 1, 'league': 'Europa-League', 'year': 2019}, {'division': 1, 'league': 'Europa-League', 'year': 2020}, {'division': 1, 'league': 'Europa-League', 'year': 2021}, {'division': 1, 'league': 'Europa-League', 'year': 2022}, {'division': 1, 'league': 'FAWSL', 'year': 2017}, {'division': 1, 'league': 'FAWSL', 'year': 2018}, {'division': 1, 'league': 'FAWSL', 'year': 2019}, {'division': 1, 'league': 'FAWSL', 'year': 2020}, {'division': 1, 'league': 'FAWSL', 'year': 2021}, {'division': 1, 'league': 'FAWSL', 'year': 2022}, {'division': 1, 'league': 'France', 'year': 2017}, {'division': 1, 'league': 'France', 'year': 2018}, {'division': 1, 'league': 'France', 'year': 2019}, {'division': 1, 'league': 'France', 'year': 2020}, {'division': 1, 'league': 'France', 'year': 2021}, {'division': 1, 'league': 'France', 'year': 2022}, {'division': 1, 'league': 'Germany', 'year': 2017}, {'division': 1, 'league': 'Germany', 'year': 2018}, {'division': 1, 'league': 'Germany', 'year': 2019}, {'division': 1, 'league': 'Germany', 'year': 2020}, {'division': 1, 'league': 'Germany', 'year': 2021}, {'division': 1, 'league': 'Germany', 'year': 2022}, {'division': 1, 'league': 'Greece', 'year': 2019}, {'division': 1, 'league': 'Greece', 'year': 2020}, {'division': 1, 'league': 'Greece', 'year': 2022}, {'division': 1, 'league': 'Italy', 'year': 2017}, {'division': 1, 'league': 'Italy', 'year': 2018}, {'division': 1, 'league': 'Italy', 'year': 2019}, {'division': 1, 'league': 'Italy', 'year': 2020}, {'division': 1, 'league': 'Italy', 'year': 2021}, {'division': 1, 'league': 'Italy', 'year': 2022}, {'division': 1, 'league': 'Japan', 'year': 2019}, {'division': 1, 'league': 'Japan', 'year': 
2020}, {'division': 1, 'league': 'Japan', 'year': 2021}, {'division': 1, 'league': 'Japan', 'year': 2022}, {'division': 1, 'league': 'Japan', 'year': 2023}, {'division': 1, 'league': 'Mexico', 'year': 2017}, {'division': 1, 'league': 'Mexico', 'year': 2018}, {'division': 1, 'league': 'Mexico', 'year': 2019}, {'division': 1, 'league': 'Mexico', 'year': 2020}, {'division': 1, 'league': 'Mexico', 'year': 2021}, {'division': 1, 'league': 'Mexico', 'year': 2022}, {'division': 1, 'league': 'NWSL', 'year': 2018}, {'division': 1, 'league': 'NWSL', 'year': 2019}, {'division': 1, 'league': 'NWSL', 'year': 2020}, {'division': 1, 'league': 'NWSL', 'year': 2021}, {'division': 1, 'league': 'NWSL', 'year': 2022}, {'division': 1, 'league': 'NWSL', 'year': 2023}, {'division': 1, 'league': 'Netherlands', 'year': 2018}, {'division': 1, 'league': 'Netherlands', 'year': 2019}, {'division': 1, 'league': 'Netherlands', 'year': 2020}, {'division': 1, 'league': 'Netherlands', 'year': 2021}, {'division': 1, 'league': 'Netherlands', 'year': 2022}, {'division': 1, 'league': 'Norway', 'year': 2018}, {'division': 1, 'league': 'Norway', 'year': 2019}, {'division': 1, 'league': 'Norway', 'year': 2020}, {'division': 1, 'league': 'Norway', 'year': 2021}, {'division': 1, 'league': 'Norway', 'year': 2022}, {'division': 1, 'league': 'Norway', 'year': 2023}, {'division': 1, 'league': 'Portugal', 'year': 2018}, {'division': 1, 'league': 'Portugal', 'year': 2019}, {'division': 1, 'league': 'Portugal', 'year': 2020}, {'division': 1, 'league': 'Portugal', 'year': 2021}, {'division': 1, 'league': 'Portugal', 'year': 2022}, {'division': 1, 'league': 'Russia', 'year': 2018}, {'division': 1, 'league': 'Russia', 'year': 2019}, {'division': 1, 'league': 'Russia', 'year': 2020}, {'division': 1, 'league': 'Russia', 'year': 2021}, {'division': 1, 'league': 'Russia', 'year': 2022}, {'division': 1, 'league': 'Scotland', 'year': 2018}, {'division': 1, 'league': 'Scotland', 'year': 2019}, {'division': 1, 'league': 
'Scotland', 'year': 2020}, {'division': 1, 'league': 'Scotland', 'year': 2021}, {'division': 1, 'league': 'Scotland', 'year': 2022}, {'division': 1, 'league': 'South-Africa', 'year': 2019}, {'division': 1, 'league': 'South-Africa', 'year': 2020}, {'division': 1, 'league': 'South-Africa', 'year': 2022}, {'division': 1, 'league': 'Spain', 'year': 2017}, {'division': 1, 'league': 'Spain', 'year': 2018}, {'division': 1, 'league': 'Spain', 'year': 2019}, {'division': 1, 'league': 'Spain', 'year': 2020}, {'division': 1, 'league': 'Spain', 'year': 2021}, {'division': 1, 'league': 'Spain', 'year': 2022}, {'division': 1, 'league': 'Sweden', 'year': 2018}, {'division': 1, 'league': 'Sweden', 'year': 2019}, {'division': 1, 'league': 'Sweden', 'year': 2020}, {'division': 1, 'league': 'Sweden', 'year': 2021}, {'division': 1, 'league': 'Sweden', 'year': 2022}, {'division': 1, 'league': 'Sweden', 'year': 2023}, {'division': 1, 'league': 'Switzerland', 'year': 2018}, {'division': 1, 'league': 'Switzerland', 'year': 2019}, {'division': 1, 'league': 'Switzerland', 'year': 2020}, {'division': 1, 'league': 'Switzerland', 'year': 2021}, {'division': 1, 'league': 'Switzerland', 'year': 2022}, {'division': 1, 'league': 'Turkey', 'year': 2018}, {'division': 1, 'league': 'Turkey', 'year': 2019}, {'division': 1, 'league': 'Turkey', 'year': 2020}, {'division': 1, 'league': 'Turkey', 'year': 2021}, {'division': 1, 'league': 'Turkey', 'year': 2022}, {'division': 1, 'league': 'USA', 'year': 2018}, {'division': 1, 'league': 'USA', 'year': 2019}, {'division': 1, 'league': 'USA', 'year': 2020}, {'division': 1, 'league': 'USA', 'year': 2021}, {'division': 1, 'league': 'USA', 'year': 2022}, {'division': 1, 'league': 'USA', 'year': 2023}, {'division': 1, 'league': 'United-Soccer-League', 'year': 2019}, {'division': 1, 'league': 'United-Soccer-League', 'year': 2020}, {'division': 1, 'league': 'United-Soccer-League', 'year': 2021}, {'division': 1, 'league': 'United-Soccer-League', 'year': 2022}, 
{'division': 1, 'league': 'United-Soccer-League', 'year': 2023}, {'division': 2, 'league': 'England', 'year': 2018}, {'division': 2, 'league': 'England', 'year': 2019}, {'division': 2, 'league': 'England', 'year': 2020}, {'division': 2, 'league': 'England', 'year': 2021}, {'division': 2, 'league': 'England', 'year': 2022}, {'division': 2, 'league': 'France', 'year': 2018}, {'division': 2, 'league': 'France', 'year': 2019}, {'division': 2, 'league': 'France', 'year': 2020}, {'division': 2, 'league': 'France', 'year': 2021}, {'division': 2, 'league': 'France', 'year': 2022}, {'division': 2, 'league': 'Germany', 'year': 2018}, {'division': 2, 'league': 'Germany', 'year': 2019}, {'division': 2, 'league': 'Germany', 'year': 2020}, {'division': 2, 'league': 'Germany', 'year': 2021}, {'division': 2, 'league': 'Germany', 'year': 2022}, {'division': 2, 'league': 'Italy', 'year': 2018}, {'division': 2, 'league': 'Italy', 'year': 2019}, {'division': 2, 'league': 'Italy', 'year': 2020}, {'division': 2, 'league': 'Italy', 'year': 2021}, {'division': 2, 'league': 'Italy', 'year': 2022}, {'division': 2, 'league': 'Spain', 'year': 2018}, {'division': 2, 'league': 'Spain', 'year': 2019}, {'division': 2, 'league': 'Spain', 'year': 2020}, {'division': 2, 'league': 'Spain', 'year': 2021}, {'division': 2, 'league': 'Spain', 'year': 2022}, {'division': 3, 'league': 'England', 'year': 2019}, {'division': 3, 'league': 'England', 'year': 2020}, {'division': 3, 'league': 'England', 'year': 2021}, {'division': 3, 'league': 'England', 'year': 2022}, {'division': 4, 'league': 'England', 'year': 2019}, {'division': 4, 'league': 'England', 'year': 2020}, {'division': 4, 'league': 'England', 'year': 2021}, {'division': 4, 'league': 'England', 'year': 2022}]
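The returned list is long, so it can help to summarise it before building a selection. The following is a minimal sketch that groups a few entries, copied from the output above, by league. The names sample_params and divisions_by_league are illustrative and not part of the library.

```python
from collections import defaultdict

# A handful of entries copied verbatim from the output above.
sample_params = [
    {'division': 1, 'league': 'England', 'year': 2022},
    {'division': 2, 'league': 'England', 'year': 2022},
    {'division': 3, 'league': 'England', 'year': 2022},
    {'division': 4, 'league': 'England', 'year': 2022},
    {'division': 1, 'league': 'Spain', 'year': 2022},
    {'division': 2, 'league': 'Spain', 'year': 2022},
    {'division': 1, 'league': 'Norway', 'year': 2023},
]

# Collect the distinct divisions that are available for each league.
divisions = defaultdict(set)
for params in sample_params:
    divisions[params['league']].add(params['division'])

# Sort leagues and divisions to get a stable, readable summary.
divisions_by_league = {
    league: sorted(values) for league, values in sorted(divisions.items())
}
print(divisions_by_league)
```

The same loop works unchanged on the full list returned by get_all_params().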

We choose to extract training data only for the year 2021, across all divisions of the English league.

param_grid = {'league': ['England'], 'year': [2021]}
dataloader = FTESoccerDataLoader(param_grid=param_grid)
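To make the effect of this selection concrete, here is a rough sketch of how such a grid could be matched against the available parameters. It assumes that param_grid is expanded into all combinations of its values and that keys left out of the grid (here division) match any value. The helpers expand_param_grid and select_params are hypothetical and the library's own selection logic may differ.

```python
from itertools import product

# Available parameters for two leagues, copied from the output above.
available_params = [
    {'division': 1, 'league': 'England', 'year': 2021},
    {'division': 1, 'league': 'England', 'year': 2022},
    {'division': 2, 'league': 'England', 'year': 2021},
    {'division': 3, 'league': 'England', 'year': 2021},
    {'division': 4, 'league': 'England', 'year': 2021},
    {'division': 1, 'league': 'Spain', 'year': 2021},
]


def expand_param_grid(param_grid):
    """Return every combination of the values listed in the grid."""
    keys = sorted(param_grid)
    value_lists = [param_grid[key] for key in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


def select_params(available, param_grid):
    """Keep the available entries that agree with at least one combination."""
    combinations = expand_param_grid(param_grid)
    return [
        params
        for params in available
        if any(
            all(params[key] == value for key, value in combination.items())
            for combination in combinations
        )
    ]


param_grid = {'league': ['England'], 'year': [2021]}
selected = select_params(available_params, param_grid)

# Division is not constrained, so every English division of 2021 is kept.
selected_divisions = sorted(params['division'] for params in selected)
print(selected_divisions)
```

Under these assumptions, the grid selects all four English divisions of 2021 and nothing from other leagues or years.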

Getting the available odds types

We can get the available odds types in order to match the output of the training data, using the get_odds_types() class method.

Out:

[]

The list is empty, so no odds data are available.

Extracting the training data

We extract the training data using the default values for the parameters odds_type and drop_na_thres.

The input data:

print(X_train)

Out:

            year  division  match_quality   league            home_team            away_team  ...  away_team_probability_win  probability_draw  home_team_projected_score  away_team_projected_score  home_team_match_importance  away_team_match_importance
date                                                                                          ...
2020-09-11  2021         2      54.127384  England              Watford        Middlesbrough  ...                     0.1423            0.2190                       2.06                       0.85                        53.0                        16.5
2020-09-12  2021         2      45.561500  England  Queens Park Rangers    Nottingham Forest  ...                     0.3297            0.2717                       1.49                       1.33                        23.4                        16.2
2020-09-12  2021         2      45.738207  England         Derby County              Reading  ...                     0.2891            0.2742                       1.52                       1.19                        16.6                        22.0
2020-09-12  2021         2      53.410804  England    Huddersfield Town         Norwich City  ...                     0.4612            0.2680                       1.17                       1.60                        18.0                        43.6
2020-09-12  2021         2      54.727624  England      AFC Bournemouth            Blackburn  ...                     0.1492            0.2199                       2.06                       0.89                        55.9                        20.2
...          ...       ...            ...      ...                  ...                  ...  ...                        ...               ...                        ...                        ...                         ...                         ...
2021-05-18  2021         4      11.104789  England       Newport County  Forest Green Rovers  ...                     0.2475            0.3073                       1.21                       0.82                       100.0                       100.0
2021-05-20  2021         4      13.561136  England      Tranmere Rovers            Morecambe  ...                     0.3779            0.2818                       1.19                       1.27                       100.0                       100.0
2021-05-23  2021         4      10.938702  England  Forest Green Rovers       Newport County  ...                     0.3932            0.3134                       0.92                       1.11                       100.0                       100.0
2021-05-23  2021         4      13.488255  England            Morecambe      Tranmere Rovers  ...                     0.2586            0.2708                       1.47                       1.02                       100.0                       100.0
2021-05-31  2021         4      14.866651  England            Morecambe       Newport County  ...                     0.4691            0.0000                       1.23                       1.13                       100.0                       100.0

[2051 rows x 15 columns]
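Since no odds data are available, one possible use of the probability columns is to turn them into fair decimal odds. The sketch below is an illustration only: it assumes that the home win, draw and away win probabilities sum to one (the home win column is hidden behind the ellipsis in the output above), and the helper fair_decimal_odds is hypothetical. The two rows are copied from the output above.

```python
def fair_decimal_odds(probability):
    """Return 1 / probability, or None when the outcome has zero probability."""
    if probability <= 0:
        return None
    return round(1 / probability, 2)


# Two rows copied from the printed input data.
rows = [
    {'home_team': 'Watford', 'away_team': 'Middlesbrough',
     'away_team_probability_win': 0.1423, 'probability_draw': 0.2190},
    {'home_team': 'Morecambe', 'away_team': 'Newport County',
     'away_team_probability_win': 0.4691, 'probability_draw': 0.0},
]

fair_odds = []
for row in rows:
    # Assumption: the three outcome probabilities sum to one.
    home_probability = 1 - row['away_team_probability_win'] - row['probability_draw']
    fair_odds.append({
        'match': f"{row['home_team']} v {row['away_team']}",
        'home_win': fair_decimal_odds(home_probability),
        'draw': fair_decimal_odds(row['probability_draw']),
        'away_win': fair_decimal_odds(row['away_team_probability_win']),
    })

for entry in fair_odds:
    print(entry)
```

The zero draw probability in the second row is handled explicitly, because a fair price is undefined for an outcome that the forecast rules out.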

The targets:

print(Y_train)

Out:

      output__home_win__full_time_goals  output__away_win__full_time_goals  output__draw__full_time_goals  ...  output__under_2.5__full_time_goals  output__under_3.5__full_time_goals  output__under_4.5__full_time_goals
0                                  True                              False                          False  ...                                True                                True                                True
1                                  True                              False                          False  ...                                True                                True                                True
2                                 False                               True                          False  ...                                True                                True                                True
3                                 False                               True                          False  ...                                True                                True                                True
4                                  True                              False                          False  ...                               False                               False                               False
...                                 ...                                ...                            ...  ...                                 ...                                 ...                                 ...
2046                               True                              False                          False  ...                                True                                True                                True
2047                              False                               True                          False  ...                               False                                True                                True
2048                               True                              False                          False  ...                               False                               False                               False
2049                              False                              False                           True  ...                                True                                True                                True
2050                               True                              False                          False  ...                                True                                True                                True

[2051 rows x 11 columns]
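The target column names appear to follow the pattern output__<market>__<event>. As a sketch of how such boolean targets could be derived from a full-time score, consider the hypothetical helper below. It covers only the columns visible in the output above, the scoreline is made up for illustration, and the library may compute its targets differently.

```python
def make_targets(home_goals, away_goals, thresholds=(2.5, 3.5, 4.5)):
    """Build boolean targets for one match from its full-time goals."""
    total_goals = home_goals + away_goals
    targets = {
        'output__home_win__full_time_goals': home_goals > away_goals,
        'output__away_win__full_time_goals': home_goals < away_goals,
        'output__draw__full_time_goals': home_goals == away_goals,
    }
    for threshold in thresholds:
        name = f'output__under_{threshold}__full_time_goals'
        targets[name] = total_goals < threshold
    return targets


def parse_target_name(name):
    """Split a target column name into its market and event parts."""
    prefix, market, event = name.split('__')
    return market, event


# Hypothetical 1-0 home win: a home win with every "under" target satisfied.
targets = make_targets(home_goals=1, away_goals=0)
for name, value in targets.items():
    print(parse_target_name(name), value)
```

Splitting on the double underscore keeps names such as full_time_goals intact, since they only contain single underscores.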

Extracting the fixtures data

We extract the fixtures data, whose columns match those of the training data. Unlike the training data, the fixtures data are not affected by the param_grid selection.

The input data:

print(X_fix)

Out:

            year  division  match_quality                league          home_team  ... probability_draw  home_team_projected_score  away_team_projected_score  home_team_match_importance  away_team_match_importance
date                                                                                ...
2022-04-19  2022         1      84.389443               England          Liverpool  ...           0.1570                       2.53                       0.78                       100.0                        30.9
2022-04-19  2022         3      30.339354               England      Oxford United  ...           0.2406                       1.45                       1.69                        20.5                       100.0
2022-04-19  2022         3      26.603504               England       Ipswich Town  ...           0.3176                       0.94                       1.03                         0.0                       100.0
2022-04-19  2022         3      17.331843               England   Cambridge United  ...           0.2857                       1.19                       1.17                         0.0                         0.0
2022-04-19  2022         1      67.863289                 Spain         Real Betis  ...           0.2207                       1.93                       0.69                        91.1                         3.2
...          ...       ...            ...                   ...                ...  ...              ...                        ...                        ...                         ...                         ...
2022-11-13  2023         1      33.604233                Norway          Viking FK  ...           0.2333                       2.01                       1.06                         NaN                         NaN
2022-11-13  2023         1      20.219616                Norway    Kristiansund BK  ...           0.2707                       1.62                       0.85                         NaN                         NaN
2022-11-13  2023         1      25.689516                Norway             Tromso  ...           0.3044                       1.24                       1.11                         NaN                         NaN
2022-11-13  2023         1      28.961629                Norway         Lillestrom  ...           0.2336                       1.93                       0.80                         NaN                         NaN
2022-12-01  2023         1      12.989485  United-Soccer-League  New Mexico United  ...           0.2211                       1.90                       0.80                         NaN                         NaN

[2792 rows x 15 columns]
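Because the fixtures include every league, they may need to be narrowed down manually to the leagues used for training. The following is a minimal sketch using a few rows copied from the output above, represented as plain dictionaries rather than a DataFrame. The names fixtures and england_fixtures are illustrative.

```python
# A few fixtures copied from the printed output.
fixtures = [
    {'year': 2022, 'division': 1, 'league': 'England', 'home_team': 'Liverpool'},
    {'year': 2022, 'division': 3, 'league': 'England', 'home_team': 'Oxford United'},
    {'year': 2022, 'division': 1, 'league': 'Spain', 'home_team': 'Real Betis'},
    {'year': 2023, 'division': 1, 'league': 'Norway', 'home_team': 'Viking FK'},
]

param_grid = {'league': ['England'], 'year': [2021]}

# Filter on league only: the fixtures belong to later seasons than the
# training year, so matching on year as well would discard every row.
leagues = set(param_grid['league'])
england_fixtures = [row for row in fixtures if row['league'] in leagues]

for row in england_fixtures:
    print(row['division'], row['home_team'])
```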
