sportsbet.datasets.FTESoccerDataLoader

class sportsbet.datasets.FTESoccerDataLoader(param_grid=None)

Dataloader for FiveThirtyEight soccer data.

It downloads historical and fixtures data from FiveThirtyEight.

Read more in the user guide.

Parameters:
param_grid : dict of str to sequence, or sequence of such dicts, default=None

It selects the type of information that the data include. The keys of the dictionaries are parameter names such as 'league' or 'division', while the values are sequences of allowed values. It works in a similar way to the param_grid parameter of the ParameterGrid class. The default value None corresponds to all parameters.
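To illustrate the expansion semantics, here is a minimal, standard-library sketch of how a grid of this shape enumerates parameter combinations, in the spirit of scikit-learn's ParameterGrid. The helper name expand_param_grid and the 'league' and 'division' values are hypothetical and are not part of sportsbet.

```python
from itertools import product


def expand_param_grid(param_grid):
    """Enumerate the parameter combinations described by a grid.

    Hypothetical helper, not part of sportsbet: a dict maps each name to
    its allowed values; a sequence of dicts is the union of their grids.
    """
    grids = [param_grid] if isinstance(param_grid, dict) else list(param_grid)
    combinations = []
    for grid in grids:
        names = sorted(grid)
        # Cartesian product of the allowed values of every parameter.
        for values in product(*(grid[name] for name in names)):
            combinations.append(dict(zip(names, values)))
    return combinations


# Illustrative values only; the real names and values come from get_all_params().
single = expand_param_grid({'league': ['England', 'Spain'], 'division': [1, 2]})
union = expand_param_grid([{'league': ['England'], 'division': [1]}, {'league': ['Spain']}])
print(len(single))
print(union)
```

A single dict yields every combination of its values, whereas a sequence of dicts yields the combinations of each dict in turn.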

Examples

>>> from sportsbet.datasets import FTESoccerDataLoader
>>> import pandas as pd
>>> # Select all training data
>>> dataloader = FTESoccerDataLoader()
>>> # Get available odds types
>>> dataloader.get_odds_types()
[]
>>> X_train, Y_train, O_train = dataloader.extract_train_data()
>>> # Extract the corresponding fixtures data
>>> X_fix, Y_fix, O_fix = dataloader.extract_fixtures_data()
>>> # Training and fixtures input data have the same column names
>>> pd.testing.assert_index_equal(X_train.columns, X_fix.columns)
>>> # Fixtures data have no output
>>> Y_fix is None
True
>>> # No odds data are available
>>> O_train is None and O_fix is None
True

extract_fixtures_data()

Extract the fixtures data.

Read more in the user guide.

It returns fixtures data that can be used to make predictions for upcoming matches based on a betting strategy.

Before calling the extract_fixtures_data() method for the first time, the extract_train_data() method should be called, in order to match the columns of the input, output and odds data.

The data contain information about the matches known before the start of the match, i.e. the input data X and the odds data O. The multi-output targets Y are always equal to None and are only included for consistency with the method extract_train_data().

The param_grid parameter of the initialization method __init__() has no effect on the fixtures data.

Returns:
(X, None, O) : tuple of DataFrame objects

The components represent the fixtures input data X, the multi-output targets Y (always None) and the corresponding odds O, respectively.
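The call-order requirement described above can be pictured with a toy stand-in. This is only a sketch under stated assumptions: the ToyDataLoader class, its row format and the RuntimeError it raises are hypothetical and do not reflect how sportsbet is implemented. It only shows why the training extraction has to come first, namely so that the fixtures columns can be aligned with the training columns.

```python
class ToyDataLoader:
    """Hypothetical stand-in that mimics only the call-order contract."""

    def __init__(self, train_rows, fixtures_rows):
        self._train_rows = train_rows
        self._fixtures_rows = fixtures_rows
        self._input_cols = None  # set by extract_train_data()

    def extract_train_data(self, target_col='outcome'):
        # Remember the input columns so that fixtures can be aligned later.
        all_cols = {col for row in self._train_rows for col in row}
        self._input_cols = sorted(all_cols - {target_col})
        X = [{col: row.get(col) for col in self._input_cols} for row in self._train_rows]
        Y = [row.get(target_col) for row in self._train_rows]
        return X, Y, None

    def extract_fixtures_data(self):
        if self._input_cols is None:
            # Assumed behaviour; the real library may signal this differently.
            raise RuntimeError('Call extract_train_data() first.')
        # Reuse the training columns; targets are unknown for fixtures.
        X = [{col: row.get(col) for col in self._input_cols} for row in self._fixtures_rows]
        return X, None, None


loader = ToyDataLoader(
    train_rows=[{'home': 'A', 'away': 'B', 'outcome': 'H'}],
    fixtures_rows=[{'home': 'C', 'away': 'D', 'extra': 1}],
)
X_train, Y_train, O_train = loader.extract_train_data()
X_fix, Y_fix, O_fix = loader.extract_fixtures_data()
print(list(X_fix[0]) == list(X_train[0]), Y_fix is None)
```

In the toy, the 'extra' field of the fixture row is discarded because it never appeared in the training data.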

extract_train_data(drop_na_thres=0.0, odds_type=None)

Extract the training data.

Read more in the user guide.

It returns historical data that can be used to create a betting strategy based on heuristics or machine learning models.

The data contain information about the matches, which falls into two categories. The first category includes any information known before the start of the match, i.e. the training data X and the odds data O. The second category includes the outcomes of the matches, i.e. the multi-output targets Y.

The method selects only the data allowed by the param_grid parameter of the initialization method __init__(). Additionally, columns with missing values are dropped according to the drop_na_thres parameter, while the type of odds returned is defined by the odds_type parameter.

Parameters:
drop_na_thres : float, default=0.0

The threshold that specifies the input columns to drop. It is a float in the [0.0, 1.0] range. Higher values result in dropping more columns. The default value drop_na_thres=0.0 keeps all columns, while the maximum value drop_na_thres=1.0 keeps only columns with no missing values.
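One plausible reading of the threshold, consistent with the two endpoints stated above, is that a column is kept when its fraction of non-missing values is at least drop_na_thres. The sketch below assumes that rule; the behaviour for intermediate values is an assumption, and the select_columns helper and the sample data are hypothetical rather than taken from the library.

```python
def select_columns(columns, drop_na_thres=0.0):
    """Return the names of columns whose non-missing fraction meets the threshold.

    Hypothetical helper: `columns` maps a column name to a non-empty list
    of values, with None marking a missing value.
    """
    if not 0.0 <= drop_na_thres <= 1.0:
        raise ValueError('drop_na_thres must lie in the [0.0, 1.0] range.')
    kept = []
    for name, values in columns.items():
        non_missing = sum(value is not None for value in values) / len(values)
        # Assumed rule: keep the column if enough of its values are present.
        if non_missing >= drop_na_thres:
            kept.append(name)
    return kept


data = {
    'complete': [1.0, 2.0, 3.0, 4.0],
    'half': [1.0, None, 3.0, None],
    'empty': [None, None, None, None],
}
keep_all = select_columns(data, 0.0)
keep_half = select_columns(data, 0.5)
keep_full = select_columns(data, 1.0)
print(keep_all, keep_half, keep_full)
```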

odds_type : str, default=None

The selected odds type. It should be one of the available odds column prefixes returned by the method get_odds_types(). If odds_type=None, no odds are returned.
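As a rough illustration of prefix-based selection, the sketch below validates an odds type and picks the matching columns. The select_odds_columns helper, the column names and the odds type names are all hypothetical; the actual naming convention of the odds columns is not specified here, and for this dataloader get_odds_types() returns an empty list.

```python
def select_odds_columns(columns, odds_type, available_odds_types):
    """Hypothetical helper: pick the odds columns that belong to one odds type."""
    if odds_type is None:
        return None  # no odds requested
    if odds_type not in available_odds_types:
        raise ValueError(f'odds_type must be one of {available_odds_types}.')
    # Assumed convention: odds columns start with the name of their odds type.
    return [col for col in columns if col.startswith(odds_type)]


# Made-up names, for illustration only.
odds_columns = ['market_average__home_win', 'market_average__draw', 'market_maximum__home_win']
odds_types = ['market_average', 'market_maximum']
selected = select_odds_columns(odds_columns, 'market_average', odds_types)
no_odds = select_odds_columns(odds_columns, None, odds_types)
print(selected, no_odds)
```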

Returns:
(X, Y, O) : tuple of DataFrame objects

The components represent the training input data X, the multi-output targets Y and the corresponding odds O, respectively.

classmethod get_all_params()

Get the available parameters.

It can be used to get the allowed names and values for the param_grid parameter of the dataloader object.

Returns:
param_grid: object

An object of the ParameterGrid class.

classmethod get_odds_types()

Get the available odds types.

It can be used to get the allowed values of the odds_type parameter of the dataloader's extract_train_data() method.

Returns:
odds_types: list of str

A list of available odds types.

save(path)

Save the dataloader object.

Parameters:
path : str

The path to save the object.

Returns:
self: object

The dataloader object.
