FiveThirtyEight soccer data

This example illustrates the usage of the FiveThirtyEight soccer dataloader.

# Author: Georgios Douzas <gdouzas@icloud.com>
# Licence: MIT

from sportsbet.datasets import FTESoccerDataLoader

Getting the available parameters

We can get the available parameters, which are used to select the training data to be extracted, with the get_all_params() class method.

FTESoccerDataLoader.get_all_params()

Out:

[{'division': [1], 'league': ['Argentina'], 'year': [2018]}, {'division': [1], 'league': ['Argentina'], 'year': [2019]}, {'division': [1], 'league': ['Argentina'], 'year': [2020]}, {'division': [1], 'league': ['Argentina'], 'year': [2022]}, {'division': [1], 'league': ['Australia'], 'year': [2019]}, {'division': [1], 'league': ['Australia'], 'year': [2020]}, {'division': [1], 'league': ['Australia'], 'year': [2021]}, {'division': [1], 'league': ['Australia'], 'year': [2022]}, {'division': [1], 'league': ['Austria'], 'year': [2018]}, {'division': [1], 'league': ['Austria'], 'year': [2019]}, {'division': [1], 'league': ['Austria'], 'year': [2020]}, {'division': [1], 'league': ['Austria'], 'year': [2021]}, {'division': [1], 'league': ['Austria'], 'year': [2022]}, {'division': [1], 'league': ['Belgium'], 'year': [2019]}, {'division': [1], 'league': ['Belgium'], 'year': [2020]}, {'division': [1], 'league': ['Belgium'], 'year': [2021]}, {'division': [1], 'league': ['Belgium'], 'year': [2022]}, {'division': [1], 'league': ['Brazil'], 'year': [2018]}, {'division': [1], 'league': ['Brazil'], 'year': [2019]}, {'division': [1], 'league': ['Brazil'], 'year': [2020]}, {'division': [1], 'league': ['Brazil'], 'year': [2021]}, {'division': [1], 'league': ['Brazil'], 'year': [2022]}, {'division': [1], 'league': ['Champions-League'], 'year': [2017]}, {'division': [1], 'league': ['Champions-League'], 'year': [2018]}, {'division': [1], 'league': ['Champions-League'], 'year': [2019]}, {'division': [1], 'league': ['Champions-League'], 'year': [2020]}, {'division': [1], 'league': ['Champions-League'], 'year': [2021]}, {'division': [1], 'league': ['Champions-League'], 'year': [2022]}, {'division': [1], 'league': ['China'], 'year': [2019]}, {'division': [1], 'league': ['China'], 'year': [2020]}, {'division': [1], 'league': ['Denmark'], 'year': [2019]}, {'division': [1], 'league': ['Denmark'], 'year': [2020]}, {'division': [1], 'league': ['Denmark'], 'year': [2021]}, {'division': [1], 
'league': ['Denmark'], 'year': [2022]}, {'division': [1], 'league': ['England'], 'year': [2017]}, {'division': [1], 'league': ['England'], 'year': [2018]}, {'division': [1], 'league': ['England'], 'year': [2019]}, {'division': [1], 'league': ['England'], 'year': [2020]}, {'division': [1], 'league': ['England'], 'year': [2021]}, {'division': [1], 'league': ['England'], 'year': [2022]}, {'division': [1], 'league': ['Europa'], 'year': [2022]}, {'division': [1], 'league': ['Europa-League'], 'year': [2018]}, {'division': [1], 'league': ['Europa-League'], 'year': [2019]}, {'division': [1], 'league': ['Europa-League'], 'year': [2020]}, {'division': [1], 'league': ['Europa-League'], 'year': [2021]}, {'division': [1], 'league': ['Europa-League'], 'year': [2022]}, {'division': [1], 'league': ['FAWSL'], 'year': [2017]}, {'division': [1], 'league': ['FAWSL'], 'year': [2018]}, {'division': [1], 'league': ['FAWSL'], 'year': [2019]}, {'division': [1], 'league': ['FAWSL'], 'year': [2020]}, {'division': [1], 'league': ['FAWSL'], 'year': [2021]}, {'division': [1], 'league': ['FAWSL'], 'year': [2022]}, {'division': [1], 'league': ['France'], 'year': [2017]}, {'division': [1], 'league': ['France'], 'year': [2018]}, {'division': [1], 'league': ['France'], 'year': [2019]}, {'division': [1], 'league': ['France'], 'year': [2020]}, {'division': [1], 'league': ['France'], 'year': [2021]}, {'division': [1], 'league': ['France'], 'year': [2022]}, {'division': [1], 'league': ['Germany'], 'year': [2017]}, {'division': [1], 'league': ['Germany'], 'year': [2018]}, {'division': [1], 'league': ['Germany'], 'year': [2019]}, {'division': [1], 'league': ['Germany'], 'year': [2020]}, {'division': [1], 'league': ['Germany'], 'year': [2021]}, {'division': [1], 'league': ['Germany'], 'year': [2022]}, {'division': [1], 'league': ['Greece'], 'year': [2019]}, {'division': [1], 'league': ['Greece'], 'year': [2020]}, {'division': [1], 'league': ['Greece'], 'year': [2022]}, {'division': [1], 'league': 
['Italy'], 'year': [2017]}, {'division': [1], 'league': ['Italy'], 'year': [2018]}, {'division': [1], 'league': ['Italy'], 'year': [2019]}, {'division': [1], 'league': ['Italy'], 'year': [2020]}, {'division': [1], 'league': ['Italy'], 'year': [2021]}, {'division': [1], 'league': ['Italy'], 'year': [2022]}, {'division': [1], 'league': ['Japan'], 'year': [2019]}, {'division': [1], 'league': ['Japan'], 'year': [2020]}, {'division': [1], 'league': ['Japan'], 'year': [2021]}, {'division': [1], 'league': ['Japan'], 'year': [2022]}, {'division': [1], 'league': ['Mexico'], 'year': [2017]}, {'division': [1], 'league': ['Mexico'], 'year': [2018]}, {'division': [1], 'league': ['Mexico'], 'year': [2019]}, {'division': [1], 'league': ['Mexico'], 'year': [2020]}, {'division': [1], 'league': ['Mexico'], 'year': [2021]}, {'division': [1], 'league': ['Mexico'], 'year': [2022]}, {'division': [1], 'league': ['NWSL'], 'year': [2018]}, {'division': [1], 'league': ['NWSL'], 'year': [2019]}, {'division': [1], 'league': ['NWSL'], 'year': [2020]}, {'division': [1], 'league': ['NWSL'], 'year': [2021]}, {'division': [1], 'league': ['NWSL'], 'year': [2022]}, {'division': [1], 'league': ['Netherlands'], 'year': [2018]}, {'division': [1], 'league': ['Netherlands'], 'year': [2019]}, {'division': [1], 'league': ['Netherlands'], 'year': [2020]}, {'division': [1], 'league': ['Netherlands'], 'year': [2021]}, {'division': [1], 'league': ['Netherlands'], 'year': [2022]}, {'division': [1], 'league': ['Norway'], 'year': [2018]}, {'division': [1], 'league': ['Norway'], 'year': [2019]}, {'division': [1], 'league': ['Norway'], 'year': [2020]}, {'division': [1], 'league': ['Norway'], 'year': [2021]}, {'division': [1], 'league': ['Norway'], 'year': [2022]}, {'division': [1], 'league': ['Portugal'], 'year': [2018]}, {'division': [1], 'league': ['Portugal'], 'year': [2019]}, {'division': [1], 'league': ['Portugal'], 'year': [2020]}, {'division': [1], 'league': ['Portugal'], 'year': [2021]}, {'division': [1], 
'league': ['Portugal'], 'year': [2022]}, {'division': [1], 'league': ['Russia'], 'year': [2018]}, {'division': [1], 'league': ['Russia'], 'year': [2019]}, {'division': [1], 'league': ['Russia'], 'year': [2020]}, {'division': [1], 'league': ['Russia'], 'year': [2021]}, {'division': [1], 'league': ['Russia'], 'year': [2022]}, {'division': [1], 'league': ['Scotland'], 'year': [2018]}, {'division': [1], 'league': ['Scotland'], 'year': [2019]}, {'division': [1], 'league': ['Scotland'], 'year': [2020]}, {'division': [1], 'league': ['Scotland'], 'year': [2021]}, {'division': [1], 'league': ['Scotland'], 'year': [2022]}, {'division': [1], 'league': ['South-Africa'], 'year': [2019]}, {'division': [1], 'league': ['South-Africa'], 'year': [2020]}, {'division': [1], 'league': ['South-Africa'], 'year': [2022]}, {'division': [1], 'league': ['Spain'], 'year': [2017]}, {'division': [1], 'league': ['Spain'], 'year': [2018]}, {'division': [1], 'league': ['Spain'], 'year': [2019]}, {'division': [1], 'league': ['Spain'], 'year': [2020]}, {'division': [1], 'league': ['Spain'], 'year': [2021]}, {'division': [1], 'league': ['Spain'], 'year': [2022]}, {'division': [1], 'league': ['Sweden'], 'year': [2018]}, {'division': [1], 'league': ['Sweden'], 'year': [2019]}, {'division': [1], 'league': ['Sweden'], 'year': [2020]}, {'division': [1], 'league': ['Sweden'], 'year': [2021]}, {'division': [1], 'league': ['Sweden'], 'year': [2022]}, {'division': [1], 'league': ['Switzerland'], 'year': [2018]}, {'division': [1], 'league': ['Switzerland'], 'year': [2019]}, {'division': [1], 'league': ['Switzerland'], 'year': [2020]}, {'division': [1], 'league': ['Switzerland'], 'year': [2021]}, {'division': [1], 'league': ['Switzerland'], 'year': [2022]}, {'division': [1], 'league': ['Turkey'], 'year': [2018]}, {'division': [1], 'league': ['Turkey'], 'year': [2019]}, {'division': [1], 'league': ['Turkey'], 'year': [2020]}, {'division': [1], 'league': ['Turkey'], 'year': [2021]}, {'division': [1], 'league': 
['Turkey'], 'year': [2022]}, {'division': [1], 'league': ['USA'], 'year': [2018]}, {'division': [1], 'league': ['USA'], 'year': [2019]}, {'division': [1], 'league': ['USA'], 'year': [2020]}, {'division': [1], 'league': ['USA'], 'year': [2021]}, {'division': [1], 'league': ['USA'], 'year': [2022]}, {'division': [1], 'league': ['United-Soccer-League'], 'year': [2019]}, {'division': [1], 'league': ['United-Soccer-League'], 'year': [2020]}, {'division': [1], 'league': ['United-Soccer-League'], 'year': [2021]}, {'division': [1], 'league': ['United-Soccer-League'], 'year': [2022]}, {'division': [2], 'league': ['England'], 'year': [2018]}, {'division': [2], 'league': ['England'], 'year': [2019]}, {'division': [2], 'league': ['England'], 'year': [2020]}, {'division': [2], 'league': ['England'], 'year': [2021]}, {'division': [2], 'league': ['England'], 'year': [2022]}, {'division': [2], 'league': ['France'], 'year': [2018]}, {'division': [2], 'league': ['France'], 'year': [2019]}, {'division': [2], 'league': ['France'], 'year': [2020]}, {'division': [2], 'league': ['France'], 'year': [2021]}, {'division': [2], 'league': ['France'], 'year': [2022]}, {'division': [2], 'league': ['Germany'], 'year': [2018]}, {'division': [2], 'league': ['Germany'], 'year': [2019]}, {'division': [2], 'league': ['Germany'], 'year': [2020]}, {'division': [2], 'league': ['Germany'], 'year': [2021]}, {'division': [2], 'league': ['Germany'], 'year': [2022]}, {'division': [2], 'league': ['Italy'], 'year': [2018]}, {'division': [2], 'league': ['Italy'], 'year': [2019]}, {'division': [2], 'league': ['Italy'], 'year': [2020]}, {'division': [2], 'league': ['Italy'], 'year': [2021]}, {'division': [2], 'league': ['Italy'], 'year': [2022]}, {'division': [2], 'league': ['Spain'], 'year': [2018]}, {'division': [2], 'league': ['Spain'], 'year': [2019]}, {'division': [2], 'league': ['Spain'], 'year': [2020]}, {'division': [2], 'league': ['Spain'], 'year': [2021]}, {'division': [2], 'league': ['Spain'], 'year': 
[2022]}, {'division': [3], 'league': ['England'], 'year': [2019]}, {'division': [3], 'league': ['England'], 'year': [2020]}, {'division': [3], 'league': ['England'], 'year': [2021]}, {'division': [3], 'league': ['England'], 'year': [2022]}, {'division': [4], 'league': ['England'], 'year': [2019]}, {'division': [4], 'league': ['England'], 'year': [2020]}, {'division': [4], 'league': ['England'], 'year': [2021]}, {'division': [4], 'league': ['England'], 'year': [2022]}]
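
The full list is long, so it can help to group it before choosing a selection. The sketch below is a minimal illustration on a short excerpt copied from the output above. The names all_params and years_by_competition are hypothetical; in practice all_params would be the list returned by get_all_params().

```python
from collections import defaultdict

# Excerpt copied from the output above; the real list is much longer.
all_params = [
    {'division': [1], 'league': ['England'], 'year': [2021]},
    {'division': [1], 'league': ['England'], 'year': [2022]},
    {'division': [2], 'league': ['England'], 'year': [2021]},
    {'division': [1], 'league': ['Greece'], 'year': [2020]},
    {'division': [1], 'league': ['Greece'], 'year': [2022]},
]


def years_by_competition(params):
    """Map each (league, division) pair to the sorted years available for it."""
    grouped = defaultdict(set)
    for entry in params:
        # Every value is a single-element list in the output above.
        key = (entry['league'][0], entry['division'][0])
        grouped[key].update(entry['year'])
    return {key: sorted(years) for key, years in grouped.items()}


summary = years_by_competition(all_params)
for (league, division), years in sorted(summary.items()):
    print(league, division, years)
```

Grouping by (league, division) makes gaps easy to spot, such as a league with a missing year.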

We choose to extract training data only for the year 2021, across all divisions of the English league.

param_grid = {'league': ['England'], 'year': [2021]}
dataloader = FTESoccerDataLoader(param_grid=param_grid)
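
To make the effect of this selection concrete, the sketch below checks which of the available parameter entries a grid would match. It assumes that an entry matches when each key in the grid lists the entry's value, and that keys omitted from the grid (here division) are unrestricted. This is an illustration of the idea, not the library's implementation, and the names available and matches are hypothetical.

```python
# Excerpt of the available parameters shown in the earlier output.
available = [
    {'division': [1], 'league': ['England'], 'year': [2021]},
    {'division': [2], 'league': ['England'], 'year': [2021]},
    {'division': [3], 'league': ['England'], 'year': [2021]},
    {'division': [4], 'league': ['England'], 'year': [2021]},
    {'division': [1], 'league': ['England'], 'year': [2022]},
    {'division': [1], 'league': ['Spain'], 'year': [2021]},
]

param_grid = {'league': ['England'], 'year': [2021]}


def matches(entry, grid):
    """Return True if every key in the grid allows the entry's value."""
    return all(entry[key][0] in allowed for key, allowed in grid.items())


selected = [entry for entry in available if matches(entry, param_grid)]
divisions = sorted(entry['division'][0] for entry in selected)
print(divisions)  # [1, 2, 3, 4]
```

Under this assumption, leaving division out of the grid keeps all four English divisions for 2021.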

Getting the available odds types

We can get the available odds types, which determine the odds returned alongside the training data, with the get_odds_types() class method.

dataloader.get_odds_types()

Out:

[]

The list is empty, so no odds data are available.

Extracting the training data

We extract the training data using the default values for the parameters odds_type and drop_na_thres.

The input data:

print(X_train)

Out:

            year  division  match_quality   league          home_team          away_team  ...  away_team_probability_win  probability_draw  home_team_projected_score  away_team_projected_score  home_team_match_importance  away_team_match_importance
date                                                                                      ...
2020-09-11  2021         2      54.127384  England            Watford      Middlesbrough  ...                     0.1423            0.2190                       2.06                       0.85                        53.0                        16.5
2020-09-12  2021         2      41.720803  England           Barnsley         Luton Town  ...                     0.2443            0.2826                       1.49                       0.99                        22.1                        34.5
2020-09-12  2021         2      43.113441  England       Bristol City      Coventry City  ...                     0.2450            0.2664                       1.64                       1.08                        16.1                        29.5
2020-09-12  2021         2      49.081026  England  Preston North End       Swansea City  ...                     0.3285            0.2916                       1.30                       1.19                        16.2                        20.0
2020-09-12  2021         2      38.383306  England  Wycombe Wanderers   Rotherham United  ...                     0.3091            0.2868                       1.38                       1.17                        35.2                        31.6
...          ...       ...            ...      ...                ...                ...  ...                        ...               ...                        ...                        ...                         ...                         ...
2021-05-23  2021         1      76.750499  England      Wolverhampton  Manchester United  ...                     0.4464            0.2627                       1.16                       1.50                         0.0                         0.0
2021-05-23  2021         1      72.962664  England          Liverpool     Crystal Palace  ...                     0.0447            0.1164                       2.76                       0.50                       100.0                         0.0
2021-05-23  2021         1      81.757707  England    Manchester City            Everton  ...                     0.2148            0.2365                       1.81                       1.04                         0.0                         0.0
2021-05-23  2021         1      60.789751  England   Sheffield United            Burnley  ...                     0.3981            0.2770                       1.16                       1.32                         0.0                         0.0
2021-05-23  2021         1      80.416620  England        Aston Villa            Chelsea  ...                     0.6261            0.2238                       0.79                       1.87                         0.0                       100.0

[937 rows x 15 columns]

The targets:

print(Y_train)

Out:

     home_win__full_time_goals  away_win__full_time_goals  draw__full_time_goals  over_1.5__full_time_goals  ...  under_1.5__full_time_goals  under_2.5__full_time_goals  under_3.5__full_time_goals  under_4.5__full_time_goals
0                         True                      False                  False                      False  ...                        True                        True                        True                        True
1                        False                       True                  False                      False  ...                        True                        True                        True                        True
2                         True                      False                  False                       True  ...                       False                       False                        True                        True
3                        False                       True                  False                      False  ...                        True                        True                        True                        True
4                        False                       True                  False                      False  ...                        True                        True                        True                        True
..                         ...                        ...                    ...                        ...  ...                         ...                         ...                         ...                         ...
932                      False                       True                  False                       True  ...                       False                       False                        True                        True
933                       True                      False                  False                       True  ...                       False                        True                        True                        True
934                       True                      False                  False                       True  ...                       False                       False                       False                       False
935                       True                      False                  False                      False  ...                        True                        True                        True                        True
936                       True                      False                  False                       True  ...                       False                       False                        True                        True

[937 rows x 11 columns]
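
The target names suggest how each column is derived from the full-time score. The sketch below assumes that over_X and under_X compare the total number of goals with the threshold X, and that the columns hidden by the ellipsis follow the same naming pattern. Both are interpretations of the printed column names, not taken from the library's source, and full_time_targets is a hypothetical helper.

```python
def full_time_targets(home_goals, away_goals):
    """Build the eleven boolean full-time targets for a single match."""
    total = home_goals + away_goals
    targets = {
        'home_win__full_time_goals': home_goals > away_goals,
        'away_win__full_time_goals': home_goals < away_goals,
        'draw__full_time_goals': home_goals == away_goals,
    }
    # Assumed thresholds: 1.5 to 4.5, matching the visible under_* columns.
    for threshold in (1.5, 2.5, 3.5, 4.5):
        targets[f'over_{threshold}__full_time_goals'] = total > threshold
        targets[f'under_{threshold}__full_time_goals'] = total < threshold
    return targets


# A 1-0 home win reproduces the pattern of the first row printed above.
row = full_time_targets(1, 0)
print(row['home_win__full_time_goals'], row['over_1.5__full_time_goals'])
```

Three result columns plus four over and four under columns give the eleven columns reported in the output.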

Extracting the fixtures data

We extract the fixtures data with columns that match the columns of the training data. Note, however, that the fixtures data are not affected by the param_grid selection: they include upcoming matches from every league.

The input data:

print(X_fix)

Out:

            year  division  match_quality   league        home_team       away_team  ...  away_team_probability_win  probability_draw  home_team_projected_score  away_team_projected_score  home_team_match_importance  away_team_match_importance
date                                                                                 ...
2022-02-10  2022         1      80.492795  England        Liverpool  Leicester City  ...                     0.0965            0.1379                       2.96                       1.00                        43.4                        12.3
2022-02-10  2022         1      77.684168  England    Wolverhampton         Arsenal  ...                     0.4537            0.2931                       0.88                       1.28                        20.2                        74.4
2022-02-10  2022         1      23.210600   Greece        Atromitos           Lamia  ...                     0.2903            0.2936                       1.26                       1.01                        50.6                        21.6
2022-02-11  2022         2      32.091653    Spain          Leganes   Real Zaragoza  ...                     0.2548            0.3304                       1.14                       0.83                        22.8                        27.1
2022-02-11  2022         1      48.808070   Mexico           Puebla           Atlas  ...                     0.3237            0.3022                       1.11                       1.01                        12.2                         9.5
...          ...       ...            ...      ...              ...             ...  ...                        ...               ...                        ...                        ...                         ...                         ...
2022-05-29  2022         2      27.227074    Spain      AD Alcorcon           Eibar  ...                     0.4565            0.2792                       1.08                       1.51                         NaN                         NaN
2022-05-29  2022         2      32.948424    Spain           Burgos       Girona FC  ...                     0.4222            0.2900                       1.10                       1.39                         NaN                         NaN
2022-05-29  2022         2      33.415279    Spain         Tenerife    FC Cartagena  ...                     0.2189            0.2829                       1.51                       0.90                         NaN                         NaN
2022-05-29  2022         2      39.900775    Spain  Real Valladolid       SD Huesca  ...                     0.2144            0.2757                       1.57                       0.92                         NaN                         NaN
2022-05-29  2022         2      30.950030    Spain  Sporting Gijón      Las Palmas  ...                     0.3780            0.3088                       1.07                       1.20                         NaN                         NaN

[3064 rows x 15 columns]
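
Because the fixtures above cover every league (Greece, Spain and Mexico appear alongside England), a selection has to be applied afterwards if only some fixtures are of interest. The sketch below does this on a few rows copied from the output, represented as plain dictionaries for illustration. The names fixtures and select_fixtures are hypothetical; with the actual X_fix DataFrame the same filter would be expressed with pandas indexing.

```python
# A few fixtures copied from the output above (only some columns are kept).
fixtures = [
    {'date': '2022-02-10', 'division': 1, 'league': 'England',
     'home_team': 'Liverpool', 'away_team': 'Leicester City'},
    {'date': '2022-02-10', 'division': 1, 'league': 'England',
     'home_team': 'Wolverhampton', 'away_team': 'Arsenal'},
    {'date': '2022-02-10', 'division': 1, 'league': 'Greece',
     'home_team': 'Atromitos', 'away_team': 'Lamia'},
    {'date': '2022-02-11', 'division': 2, 'league': 'Spain',
     'home_team': 'Leganes', 'away_team': 'Real Zaragoza'},
]


def select_fixtures(rows, leagues):
    """Keep only the fixtures whose league is in the given collection."""
    return [row for row in rows if row['league'] in leagues]


english = select_fixtures(fixtures, {'England'})
for row in english:
    print(row['date'], row['home_team'], 'vs', row['away_team'])
```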

