matminer.descriptors package

Submodules

matminer.descriptors.add_descriptors module

matminer.descriptors.bandstructure_features module

matminer.descriptors.composition_features module

matminer.descriptors.structure_features module

matminer.descriptors.structure_features.get_coulomb_matrix(struct, diag_elems=False)

This function generates the Coulomb matrix, M, of the input structure (or molecule). The Coulomb matrix was introduced by Rupp et al. (Phys. Rev. Lett. 108, 058301, 2012) and is defined by off-diagonal elements M_ij = Z_i*Z_j/|R_i-R_j| and diagonal elements M_ii = 0.5*Z_i^2.4, where Z_i and R_i denote the nuclear charge and the position of atom i, respectively.

Args:
struct (Structure or Molecule): input structure (or molecule).
diag_elems (bool): flag indicating whether (True) to use the original definition of the diagonal elements; if set to False (default), the diagonal elements are set to zero.
Returns:
(Nsites x Nsites matrix) Coulomb matrix.
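The definition above can be sketched with plain NumPy. This is an illustrative sketch only, not the module's implementation: the helper name coulomb_matrix_sketch is hypothetical, it takes nuclear charges and Cartesian coordinates directly instead of a Structure or Molecule, and it ignores periodic images.

```python
import numpy as np

def coulomb_matrix_sketch(charges, coords, diag_elems=False):
    """Hypothetical helper: Coulomb matrix for a finite set of atoms."""
    z = np.asarray(charges, dtype=float)
    r = np.asarray(coords, dtype=float)
    n = len(z)
    m = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal: 0.5 * Z_i^2.4 only if requested, otherwise zero.
                if diag_elems:
                    m[i, j] = 0.5 * z[i] ** 2.4
            else:
                # Off-diagonal: Z_i * Z_j / |R_i - R_j|
                m[i, j] = z[i] * z[j] / np.linalg.norm(r[i] - r[j])
    return m

# Two atoms (Z = 1 and Z = 8) separated by 2 distance units.
m_zero_diag = coulomb_matrix_sketch([1, 8], [[0, 0, 0], [0, 0, 2.0]])
m_full = coulomb_matrix_sketch([1, 8], [[0, 0, 0], [0, 0, 2.0]], diag_elems=True)
```

With diag_elems=False the diagonal stays zero, which mirrors the default described above.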
matminer.descriptors.structure_features.get_density(s)
matminer.descriptors.structure_features.get_min_relative_distances(struct, cutoff=10.0)

This function determines the relative distance of each site to its closest neighbor. We use the relative distance, f_ij = r_ij / (r^atom_i + r^atom_j), as a measure rather than the absolute distance, r_ij, to account for the fact that different atoms/species have different sizes. The function uses the valence-ionic radius evaluator implemented in pymatgen.

Args:
struct (Structure): input structure.
cutoff (float): (absolute) distance up to which tentative closest neighbors (on the basis of relative distances) are to be determined.
Returns:
([float]) list of all minimum relative distances (i.e., for all sites).
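The relative-distance measure can be illustrated as follows. This is a sketch under simplifying assumptions: the helper name min_relative_distances_sketch is hypothetical, the atomic radii are passed in by hand (the documented function obtains them from pymatgen), and periodic images are ignored.

```python
import numpy as np

def min_relative_distances_sketch(coords, radii, cutoff=10.0):
    """Hypothetical helper: smallest f_ij = r_ij / (r_i + r_j) per site."""
    coords = np.asarray(coords, dtype=float)
    radii = np.asarray(radii, dtype=float)
    result = []
    for i in range(len(coords)):
        best = None
        for j in range(len(coords)):
            if i == j:
                continue
            r_ij = np.linalg.norm(coords[i] - coords[j])
            if r_ij > cutoff:
                # Only neighbors within the absolute cutoff are candidates.
                continue
            f_ij = r_ij / (radii[i] + radii[j])
            if best is None or f_ij < best:
                best = f_ij
        result.append(best)
    return result

# Three sites with made-up radii (illustrative values only).
rel = min_relative_distances_sketch(
    [[0, 0, 0], [2, 0, 0], [0, 3, 0]], radii=[1.0, 1.0, 0.5]
)
```

A site with no neighbor inside the cutoff yields None in this sketch; how the documented function handles that case is not stated here.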
matminer.descriptors.structure_features.get_neighbors_of_site_with_index(struct, n, p=None)

Determine the neighbors around the site that has index n in the input Structure object struct, given the approach defined by parameters p. All supported neighbor-finding approaches are listed and explained in the following. All approaches start by creating a tentative list of neighbors using a large cutoff radius defined in parameter dictionary p via key “cutoff”.
“min_dist”: find the nearest neighbor and its distance d_nn; consider all neighbors which are within a distance of d_nn * (1 + delta), where delta is an additional parameter provided in the dictionary p via key “delta”.
“scaled_VIRE”: compute the radii, r_i, of all sites on the basis of the valence-ionic radius evaluator (VIRE); consider all neighbors for which the distance to the central site is less than the sum of the radii multiplied by an a priori chosen parameter, delta (i.e., dist < delta * (r_central + r_neighbor)).
“min_relative_VIRE”: same approach as “min_dist”, except that we use relative distances (i.e., distances divided by the sum of the atom radii from VIRE).
“min_relative_OKeeffe”: same approach as “min_relative_VIRE”, except that we use the bond valence parameters from O’Keeffe’s bond valence method (J. Am. Chem. Soc. 1991, 3226-3229) to calculate relative distances.
Args:
struct (Structure): input structure.
n (int): index of site in Structure object for which neighbors are to be determined.
p (dict): specification (via “approach” key; default is “min_dist”) and parameters of neighbor-finding approach. Default cutoff radius is 6 Angstrom (key: “cutoff”). Other default parameters are as follows. min_dist: “delta”: 0.15; min_relative_OKeeffe: “delta”: 0.05; min_relative_VIRE: “delta”: 0.05; scaled_VIRE: “delta”: 2.
Returns:
([site]) list of sites that are considered to be nearest neighbors to site with index n in Structure object struct.
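The “min_dist” approach can be sketched as below. The helper name min_dist_neighbors_sketch is hypothetical; the sketch works on bare Cartesian coordinates, ignores periodic images, and returns site indices rather than site objects. The default values follow the ones listed for p above.

```python
import numpy as np

def min_dist_neighbors_sketch(coords, n, cutoff=6.0, delta=0.15):
    """Hypothetical helper: indices of neighbors of site n ("min_dist" idea)."""
    coords = np.asarray(coords, dtype=float)
    dists = np.linalg.norm(coords - coords[n], axis=1)
    # Step 1: tentative neighbor list from the large cutoff radius.
    tentative = [i for i in range(len(coords)) if i != n and dists[i] <= cutoff]
    if not tentative:
        return []
    # Step 2: nearest-neighbor distance d_nn.
    d_nn = min(dists[i] for i in tentative)
    # Step 3: keep everything within d_nn * (1 + delta).
    return [i for i in tentative if dists[i] <= d_nn * (1.0 + delta)]

coords = [[0, 0, 0], [1, 0, 0], [0, 1.1, 0], [0, 0, 2]]
neighbors = min_dist_neighbors_sketch(coords, n=0)
```

In this toy input the nearest neighbor is at distance 1.0, so the acceptance radius is 1.15 and the site at distance 2 is excluded.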
matminer.descriptors.structure_features.get_okeeffe_distance_prediction(el1, el2)

Returns an estimate of the bond valence parameter (bond length) using the derived parameters from ‘Atom Sizes and Bond Lengths in Molecules and Crystals’ (O’Keeffe & Brese, 1991). The estimate is based on two experimental parameters: r and c. The value for r is based on the atom radius, while c is (usually) the Allred-Rochow electronegativity. The values are not generated by pymatgen; they are read from ‘okeeffe_params.json’.

Args:
el1, el2 (Element): two Element objects
Returns:
a float value of the predicted bond length
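As an illustration of how the two parameters might be combined, the sketch below uses the bond-length expression R_12 = r1 + r2 - r1*r2*(sqrt(c1) - sqrt(c2))^2 / (c1*r1 + c2*r2), which is the form commonly attributed to O’Keeffe & Brese. Whether this function uses exactly this expression is an assumption, the helper name okeeffe_distance_sketch is hypothetical, and the numeric inputs are placeholders rather than values from ‘okeeffe_params.json’.

```python
from math import sqrt

def okeeffe_distance_sketch(r1, c1, r2, c2):
    """Hypothetical helper: bond-length estimate from size (r) and
    electronegativity-related (c) parameters of two elements."""
    return r1 + r2 - r1 * r2 * (sqrt(c1) - sqrt(c2)) ** 2 / (c1 * r1 + c2 * r2)

# Placeholder parameters, chosen only to make the arithmetic easy to follow.
d_same_c = okeeffe_distance_sketch(1.2, 2.0, 0.8, 2.0)  # equal c: plain sum of r
d_diff_c = okeeffe_distance_sketch(1.0, 1.0, 1.0, 4.0)  # differing c: shortened
```

When both elements have the same c, the correction term vanishes and the estimate reduces to r1 + r2; a larger difference in c shortens the predicted bond.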
matminer.descriptors.structure_features.get_okeeffe_params(el_symbol)

Returns the elemental parameters related to atom size and electronegativity which are used for estimating bond-valence parameters (bond length) of pairs of atoms on the basis of data provided in ‘Atom Sizes and Bond Lengths in Molecules and Crystals’ (O’Keeffe & Brese, 1991).

Args:
el_symbol (str): element symbol.
Returns:
(dict): atom-size (‘r’) and electronegativity-related (‘c’) parameters.
matminer.descriptors.structure_features.get_order_parameter_feature_vectors_difference(struct1, struct2, pneighs=None, convert_none_to_zero=True, delta_op=0.01, ignore_op_types=None)

Determine the difference vector between the two order parameter-statistics feature vectors resulting from two input structures.

Args:
struct1 (Structure): first input structure.
struct2 (Structure): second input structure.
pneighs (dict): specification and parameters of neighbor-finding approach (see get_neighbors_of_site_with_index function for more details).
convert_none_to_zero (bool): flag indicating whether or not to convert None values in OPs to zero (cf., get_order_parameters function).
delta_op (float): bin size of histogram that is computed in order to identify peak locations (cf., get_order_parameter_stats function).
ignore_op_types ([str]): list of OP types to be ignored in output dictionary (cf., get_order_parameter_stats function).
Returns:
([float]) difference vector between order parameter-statistics feature vectors obtained from the two input structures (structure 1 - structure 2).
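The subtraction can be pictured with two statistics dictionaries of the shape returned by get_order_parameter_stats. The sketch below is illustrative only: the helper names are hypothetical, the ordering of entries in the feature vector (sorted OP type, then the six statistics) is an assumption, and None entries are treated as zero.

```python
STAT_KEYS = ("min", "max", "mean", "std", "peak1", "peak2")

def stats_to_vector_sketch(stats):
    """Hypothetical helper: flatten {op_type: {stat: value}} into a list."""
    vec = []
    for op_type in sorted(stats):
        for key in STAT_KEYS:
            value = stats[op_type].get(key)
            vec.append(0.0 if value is None else float(value))
    return vec

def feature_vector_difference_sketch(stats1, stats2):
    """Hypothetical helper: element-wise (structure 1 - structure 2)."""
    v1 = stats_to_vector_sketch(stats1)
    v2 = stats_to_vector_sketch(stats2)
    return [a - b for a, b in zip(v1, v2)]

# Made-up statistics for a single OP type.
s1 = {"tet": {"min": 0.5, "max": 1.0, "mean": 0.75, "std": 0.25, "peak1": 1.0, "peak2": None}}
s2 = {"tet": {"min": 0.0, "max": 0.5, "mean": 0.25, "std": 0.25, "peak1": 0.5, "peak2": None}}
diff = feature_vector_difference_sketch(s1, s2)
```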
matminer.descriptors.structure_features.get_order_parameter_stats(struct, pneighs=None, convert_none_to_zero=True, delta_op=0.01, ignore_op_types=None)

Determine the order parameter statistics accumulated across all sites in Structure object struct using the get_order_parameters function.

Args:
struct (Structure): input structure.
pneighs (dict): specification and parameters of neighbor-finding approach (see get_neighbors_of_site_with_index function for more details).
convert_none_to_zero (bool): flag indicating whether or not to convert None values in OPs to zero (cf., get_order_parameters function).
delta_op (float): bin size of histogram that is computed in order to identify peak locations.
ignore_op_types ([str]): list of OP types to be ignored in output dictionary (e.g., [“cn”, “bent”]). Default (None) will consider all OPs.
Returns:
({}) dictionary, the keys of which represent the order parameter type (e.g., “bent5”, “tet”, “sq_pyr”) and the values of which are dictionaries carrying the statistics (“min”, “max”, “mean”, “std”, “peak1”, “peak2”).
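One way to picture the per-OP statistics is sketched below for a single order parameter type. Several details are assumptions made for illustration: the helper name op_stats_sketch is hypothetical, the OP values are taken to lie in [0, 1], and “peak1”/“peak2” are taken to be the centers of the two most populated histogram bins of width delta_op.

```python
import numpy as np

def op_stats_sketch(values, delta_op=0.01):
    """Hypothetical helper: summary statistics for one OP type."""
    v = np.asarray(values, dtype=float)
    n_bins = int(round(1.0 / delta_op))
    counts, edges = np.histogram(v, bins=n_bins, range=(0.0, 1.0))
    centers = 0.5 * (edges[:-1] + edges[1:])
    order = np.argsort(counts)[::-1]  # bins sorted by descending population
    peak1 = float(centers[order[0]])
    # Report a second peak only if a second bin is actually populated.
    peak2 = float(centers[order[1]]) if counts[order[1]] > 0 else None
    return {
        "min": float(v.min()),
        "max": float(v.max()),
        "mean": float(v.mean()),
        "std": float(v.std()),
        "peak1": peak1,
        "peak2": peak2,
    }

stats = op_stats_sketch([0.25, 0.25, 0.25, 0.85, 0.85, 0.55], delta_op=0.1)
```

A smaller delta_op resolves peak positions more finely but makes the histogram noisier for structures with few sites.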
matminer.descriptors.structure_features.get_order_parameters(struct, pneighs=None, convert_none_to_zero=True)

Calculate all order parameters (OPs) for all sites in Structure object struct.

Args:
struct (Structure): input structure.
pneighs (dict): specification and parameters of neighbor-finding approach (see get_neighbors_of_site_with_index function for more details).
convert_none_to_zero (bool): flag indicating whether or not to convert None values in OPs to zero.
Returns:
([[float]]) matrix of all sites’ (1st dimension) order parameters (2nd dimension). 46 order parameters are computed per site: q_cn (coordination number), q_lin, 35 x q_bent (target angles from 5 to 175 degrees in steps of 5 degrees), q_tet, q_oct, q_bcc, q_2, q_4, q_6, q_reg_tri, q_sq, q_sq_pyr.
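The column layout of the returned matrix can be checked by enumerating the 46 entries in the order listed above. The label strings below are assumptions for illustration (only “cn”, “bent5”, “tet”, and “sq_pyr” appear elsewhere in this documentation); the counting itself follows directly from the list.

```python
# One label per order parameter, in the documented order.
op_labels = ["cn", "lin"]
# 35 bent OPs: target angles 5, 10, ..., 175 degrees.
op_labels += ["bent{}".format(angle) for angle in range(5, 180, 5)]
# Remaining nine OPs (label spellings are assumed).
op_labels += ["tet", "oct", "bcc", "q2", "q4", "q6", "reg_tri", "sq", "sq_pyr"]

n_ops = len(op_labels)
```

Under this layout, column 0 holds the coordination number and columns 2 through 36 hold the bent order parameters.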
matminer.descriptors.structure_features.get_packing_fraction(s)
matminer.descriptors.structure_features.get_prdf(structure, cutoff=20.0, bin_size=0.1)

Compute the partial radial distribution function (PRDF) for a structure.

The partial radial distribution function is the radial distribution function broken down for each pair of atom types.

The PRDF was proposed as a structural descriptor by Schütt et al. (https://journals.aps.org/prb/abstract/10.1103/PhysRevB.89.205118).

Args:
structure: pymatgen structure object
cutoff: (int/float) distance to calculate rdf up to
bin_size: (int/float) size of bin to obtain rdf for
Returns:
(tuple) First element is a dict where keys are tuples of element names and values are PRDFs,
matminer.descriptors.structure_features.get_rdf(structure, cutoff=20.0, bin_size=0.1)

Calculate the RDF fingerprint of a given structure.

Args:
structure: pymatgen structure object
cutoff: (int/float) distance to calculate rdf up to
bin_size: (int/float) size of bin to obtain rdf for

Returns: (tuple of ndarray) first element is the normalized RDF, second is the inner radius of the RDF bin
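The binning idea behind an RDF can be sketched as follows. This is not the module's implementation: the helper name rdf_sketch is hypothetical, it works on a finite set of Cartesian coordinates without periodic images, and the normalization (pair counts divided by the number of atoms and by the shell volume) is one common choice assumed here for illustration.

```python
import numpy as np

def rdf_sketch(coords, cutoff=20.0, bin_size=0.1):
    """Hypothetical helper: (rdf, inner bin radii) for a finite point set."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    n_bins = int(round(cutoff / bin_size))
    edges = np.linspace(0.0, n_bins * bin_size, n_bins + 1)
    # All pairwise distances, excluding each atom's distance to itself.
    diffs = coords[:, None, :] - coords[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    dists = dists[~np.eye(n, dtype=bool)]
    counts, _ = np.histogram(dists, bins=edges)
    # Volume of the spherical shell belonging to each bin.
    shell_vol = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    rdf = counts / (n * shell_vol)
    return rdf, edges[:-1]

# Two atoms 1.2 apart, binned in steps of 0.5 up to 2.0.
rdf, rdf_bins = rdf_sketch([[0, 0, 0], [1.2, 0, 0]], cutoff=2.0, bin_size=0.5)
```

As in the documented return value, the second array holds the inner radius of each bin.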

matminer.descriptors.structure_features.get_rdf_peaks(rdf, rdf_bins, n_peaks=2)

Get location of highest peaks in rdf of a structure.

Args:
rdf: (ndarray) as output by the function “get_rdf”
rdf_bins: (ndarray) inner radius of the rdf bin
n_peaks: (int) number of the top peaks to return

Returns: (ndarray) distances of the highest peaks, listed by descending height
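A minimal sketch of the peak selection is shown below. It assumes that a “peak” simply means one of the highest-valued bins (no local-maximum test is applied); whether the documented function applies further criteria is not stated here. The helper name rdf_peaks_sketch is hypothetical.

```python
import numpy as np

def rdf_peaks_sketch(rdf, rdf_bins, n_peaks=2):
    """Hypothetical helper: bin radii of the n_peaks largest RDF values."""
    rdf = np.asarray(rdf, dtype=float)
    rdf_bins = np.asarray(rdf_bins, dtype=float)
    # Indices of bins sorted by descending RDF value.
    order = np.argsort(rdf)[::-1][:n_peaks]
    return rdf_bins[order]

peaks = rdf_peaks_sketch([0.0, 3.0, 1.0, 5.0], [0.0, 0.1, 0.2, 0.3], n_peaks=2)
```

Here the tallest bin sits at radius 0.3 and the second tallest at 0.1, so the result lists them in that order.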

matminer.descriptors.structure_features.get_redf(struct, cutoff=None, dr=0.05)

This function permits the calculation of the crystal structure-inherent electronic radial distribution function (ReDF) according to Willighagen et al., Acta Cryst., 2005, B61, 29-36. The ReDF is a structure-integral RDF (i.e., summed over all sites) in which the positions of neighboring sites are weighted by electrostatic interactions inferred from atomic partial charges. Atomic charges are obtained from the ValenceIonicRadiusEvaluator class.

Args:
struct (Structure): input Structure object.
cutoff (float): distance up to which the ReDF is to be calculated (default: longest diagonal in primitive cell).
dr (float): width of bins (“x”-axis) of ReDF (default: 0.05 Angstrom).
Returns:
(dict) a copy of the electronic radial distribution functions (ReDF) as a dictionary. The distance list (“x”-axis values of ReDF) can be accessed via key ‘distances’; the ReDF itself via key ‘redf’.
matminer.descriptors.structure_features.get_vol_per_site(s)
matminer.descriptors.structure_features.site_is_of_motif_type(struct, n, pneighs=None, thresh=None)

Returns the motif type of site with index n in structure struct; currently featuring “tetrahedral”, “octahedral”, “bcc”, and “cp” (close-packed: fcc and hcp). If the site is not recognized or if it has been recognized as two different motif types, the function labels the site as “unrecognized”.

Args:
struct (Structure): input structure.
n (int): index of site in Structure object for which motif type is to be determined.
pneighs (dict): specification and parameters of neighbor-finding approach (cf., function get_neighbors_of_site_with_index).
thresh (dict): thresholds for motif criteria (currently, required keys and their default values are “qtet”: 0.5, “qoct”: 0.5, “qbcc”: 0.5, “q6”: 0.4).

Returns: motif type (str).
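The labeling rule (exactly one recognized motif, otherwise “unrecognized”) can be sketched as below. This is a simplification under stated assumptions: the helper name motif_from_ops_sketch is hypothetical, each motif is matched by a single order parameter exceeding its threshold, and the pairing of the “q6” threshold with the “cp” label is assumed. The actual criteria may combine several order parameters and the coordination number.

```python
def motif_from_ops_sketch(ops, thresh=None):
    """Hypothetical helper: pick a motif label from per-site OP values."""
    if thresh is None:
        # Default thresholds as documented for the thresh argument.
        thresh = {"qtet": 0.5, "qoct": 0.5, "qbcc": 0.5, "q6": 0.4}
    # Assumed pairing of threshold keys with motif labels (illustrative).
    motif_for_key = {
        "qtet": "tetrahedral",
        "qoct": "octahedral",
        "qbcc": "bcc",
        "q6": "cp",
    }
    matches = [
        motif for key, motif in motif_for_key.items()
        if ops.get(key, 0.0) > thresh[key]
    ]
    # Zero matches or more than one match both count as unrecognized.
    return matches[0] if len(matches) == 1 else "unrecognized"

label_single = motif_from_ops_sketch({"qtet": 0.9, "qoct": 0.1, "qbcc": 0.0, "q6": 0.1})
label_double = motif_from_ops_sketch({"qtet": 0.9, "qoct": 0.8, "qbcc": 0.0, "q6": 0.1})
label_none = motif_from_ops_sketch({"qtet": 0.1, "qoct": 0.1, "qbcc": 0.0, "q6": 0.1})
```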

Module contents