matminer.datasets package

Submodules

matminer.datasets.convenience_loaders module

matminer.datasets.dataset_retrieval module

matminer.datasets.dataset_retrieval.available_datasets(print_datasets=True, print_descriptions=True, sort_method='alphabetical')

Function for retrieving the datasets available within matminer.

Args:
print_datasets (bool): Whether to print info on each dataset in
addition to returning the list of dataset names.
print_descriptions (bool): Whether to print the description of each
dataset along with its name. Ignored if print_datasets is False.
sort_method (str): The metric by which to sort the datasets when
retrieving their information. 'alphabetical' sorts by dataset name;
'num_entries' sorts by number of dataset entries.

Returns: (list) the names of the available datasets
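
How the three arguments interact can be sketched without matminer installed. The block below is a minimal stand-in, not matminer's implementation: the `_METADATA` dictionary, its placeholder entries, and the function name `sketch_available_datasets` are all hypothetical, and the descending order used for `num_entries` is an assumption that the description above does not confirm.

```python
# Hypothetical metadata standing in for matminer's dataset registry.
# The names, counts, and descriptions are placeholders, not real datasets.
_METADATA = {
    "dataset_b": {"num_entries": 50, "description": "Placeholder description B."},
    "dataset_a": {"num_entries": 10, "description": "Placeholder description A."},
    "dataset_c": {"num_entries": 200, "description": "Placeholder description C."},
}


def sketch_available_datasets(print_datasets=True, print_descriptions=True,
                              sort_method="alphabetical"):
    """Return dataset names in the requested order, optionally printing info."""
    if sort_method == "alphabetical":
        names = sorted(_METADATA)
    elif sort_method == "num_entries":
        # Assumption: largest datasets first; the docs do not state a direction.
        names = sorted(_METADATA,
                       key=lambda n: _METADATA[n]["num_entries"],
                       reverse=True)
    else:
        raise ValueError(f"Unsupported sort_method: {sort_method!r}")

    if print_datasets:
        for name in names:
            print(f"{name}: {_METADATA[name]['num_entries']} entries")
            # Descriptions are printed only when dataset info is printed at all,
            # which is why print_descriptions is ignored if print_datasets is False.
            if print_descriptions:
                print(f"    {_METADATA[name]['description']}")
    return names
```

With matminer itself, a call such as `available_datasets(print_datasets=False)` would return the list of names without printing anything.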

matminer.datasets.dataset_retrieval.load_dataset(name, data_home=None, download_if_missing=True, include_metadata=False)

Loads a dataframe containing the dataset specified by the name argument.

The dataset file is stored in and loaded from data_home if specified; otherwise the folder given by the MATMINER_DATA environment variable is used if set, falling back to matminer/datasets by default.

Args:
name (str): Keyword specifying which dataset to load; run
matminer.datasets.available_datasets() for options.
data_home (str): Path to the folder in which to look for the dataset file.
download_if_missing (bool): Whether to download the dataset if it is not
found on disk.
include_metadata (bool): Optional argument for some datasets with
metadata fields.

Returns: (pd.DataFrame)
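
The folder lookup order described for load_dataset can be sketched as follows. This is an illustration of the stated precedence only, not matminer's actual code: the function name `sketch_resolve_data_home` and its `default_dir` parameter are hypothetical, and the default is written here as the relative path given in the description.

```python
import os


def sketch_resolve_data_home(data_home=None, default_dir="matminer/datasets"):
    """Pick the dataset folder using the precedence described above."""
    # 1. An explicit data_home argument takes priority.
    if data_home is not None:
        return data_home
    # 2. Otherwise use the MATMINER_DATA environment variable, if it is set.
    env_dir = os.environ.get("MATMINER_DATA")
    if env_dir:
        return env_dir
    # 3. Otherwise fall back to the package default location.
    return default_dir
```

In practice this means a call such as `load_dataset(name, data_home="/path/to/folder")` overrides any MATMINER_DATA setting for that call.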

matminer.datasets.utils module

Module contents