matminer.descriptors package

Submodules

matminer.descriptors.add_descriptors module

class matminer.descriptors.add_descriptors.AddDescriptor(df, formula_colname='pretty_formula', separator='_')

Code to add a descriptor column to a dataframe

__init__(df, formula_colname='pretty_formula', separator='_')
Args:
df: dataframe to add the descriptor column to
formula_colname: (str) name of the column containing the formula/composition
separator: (str) separator to use in naming the new descriptor column

Returns: None

add_pmgdescriptor_column(descriptor, stat_function, stat_name)
Args:
descriptor: (str) name of descriptor - must match the name in the source library
stat_function: function to approximate the descriptor, for example numpy.mean or numpy.std
stat_name: (str) name of stat function to append to new descriptor column name

Returns: dataframe with appended descriptor column
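
The sketch below illustrates the idea of reducing per-atom descriptor values with a stat function and storing the result under a derived column name. It does not use matminer or pandas: rows are plain dicts, add_stat_column is a hypothetical helper, and the naming pattern descriptor + separator + stat_name is an assumption inferred from the argument descriptions above.

```python
import statistics


def add_stat_column(rows, descriptor, stat_function, stat_name, separator="_"):
    """Hypothetical stand-in for add_pmgdescriptor_column on a list of dict rows.

    Each row is assumed to already hold a list of per-atom values under the
    key `descriptor`.
    """
    # Assumed naming pattern: <descriptor><separator><stat_name>
    column = descriptor + separator + stat_name
    for row in rows:
        row[column] = stat_function(row[descriptor])
    return rows


# Per-atom atomic numbers: NaCl (Na = 11, Cl = 17) and MgO (Mg = 12, O = 8)
rows = [
    {"pretty_formula": "NaCl", "Z": [11, 17]},
    {"pretty_formula": "MgO", "Z": [12, 8]},
]
rows = add_stat_column(rows, "Z", statistics.mean, "mean")
```

Passing a different stat function (for example statistics.pstdev with stat_name "std") would add a second column without touching the first.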

matminer.descriptors.bandstructure_features module

matminer.descriptors.bandstructure_features.absolute_band_positions_bpe(bs, target_gap=None, **kwargs)

Absolute VBM and CBM positions with respect to branch point energy

Args:
bs: Bandstructure object
target_gap: if a better band gap is known, shift band positions by this gap
**kwargs: arguments to feed into branch point energy code

Returns: (vbm, cbm) - tuple of floats

matminer.descriptors.bandstructure_features.branch_point_energy(bs, n_vb=1, n_cb=1)

Get the branch point energy as defined by: Schleife, Fuchs, Rödl, Furthmüller, Bechstedt, APL 94, 012104 (2009)

Args:
bs: (BandStructure) - uniform mesh bandstructure object
n_vb: number of valence bands to include
n_cb: number of conduction bands to include

Returns: (int) branch point energy on same energy scale as BS eigenvalues
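
A minimal sketch of the averaging involved, assuming the cited definition reduces to a k-point average of the mean of the n_cb lowest conduction-band energies and the n_vb highest valence-band energies. The function name, the plain-list inputs, and the uniform k-point weights are assumptions for illustration; the library function takes a BandStructure object instead.

```python
def branch_point_energy_sketch(valence, conduction, n_vb=1, n_cb=1):
    """Average band-edge energies over k-points (uniform weights assumed).

    valence[k] and conduction[k] are lists of band energies at k-point k.
    """
    if not valence or len(valence) != len(conduction):
        raise ValueError("need the same, non-zero number of k-points")
    total = 0.0
    for vb_k, cb_k in zip(valence, conduction):
        if len(vb_k) < n_vb or len(cb_k) < n_cb:
            raise ValueError("not enough bands at a k-point")
        top_vb = sorted(vb_k, reverse=True)[:n_vb]  # highest valence bands
        low_cb = sorted(cb_k)[:n_cb]                # lowest conduction bands
        total += sum(top_vb) / n_vb + sum(low_cb) / n_cb
    return total / (2 * len(valence))


# Two k-points, two valence and two conduction bands each (illustrative values)
valence = [[-1.0, 0.0], [-2.0, -0.5]]
conduction = [[1.0, 3.0], [1.5, 2.0]]
e_bp = branch_point_energy_sketch(valence, conduction, n_vb=1, n_cb=1)
```

The result is on the same energy scale as the input eigenvalues, which matches the return description above.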

matminer.descriptors.composition_features module

matminer.descriptors.composition_features.band_center(comp)

Estimate absolute position of band center using geometric mean of electronegativity. Ref: Butler, M. A. & Ginley, D. S. Prediction of Flatband Potentials at Semiconductor-Electrolyte Interfaces from Atomic Electronegativities. J. Electrochem. Soc. 125, 228 (1978).

Args:
comp: (Composition)

Returns: (float) band center
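
The geometric mean itself can be sketched as follows. This is only the averaging step under stated assumptions: the helper name is hypothetical, the electronegativity values are placeholders supplied by the caller, and neither the electronegativity scale nor the sign convention used by band_center is specified in this reference.

```python
import math


def geometric_mean_electronegativity(amounts, electronegativity):
    """Stoichiometry-weighted geometric mean of elemental electronegativities.

    amounts: element -> number of atoms per formula unit
    electronegativity: element -> positive electronegativity value
    """
    total_atoms = sum(amounts.values())
    # Work in log space: exp(sum(n_i * ln(X_i)) / sum(n_i))
    log_sum = sum(n * math.log(electronegativity[el]) for el, n in amounts.items())
    return math.exp(log_sum / total_atoms)


# Placeholder elements and values, chosen only to make the arithmetic obvious
gm = geometric_mean_electronegativity({"A": 1, "B": 1}, {"A": 2.0, "B": 8.0})
```

For a 1:1 compound the result is the square root of the product of the two values.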

matminer.descriptors.composition_features.get_cohesive_energy(comp)

Get cohesive energy of compound by subtracting elemental cohesive energies from the formation energy of the compound. Elemental cohesive energies are taken from http://www.knowledgedoor.com/2/elements_handbook/cohesive_energy.html. Most of them are taken from “Charles Kittel: Introduction to Solid State Physics, 8th edition. Hoboken, NJ: John Wiley & Sons, Inc, 2005, p. 50.”

Args:
comp: (str) compound composition, eg: “NaCl”

Returns: (float) cohesive energy of compound

matminer.descriptors.composition_features.get_holder_mean(data_lst, power)

Get Holder mean

Args:
data_lst: (list/array) of values
power: (int/float) non-zero real number

Returns: Holder mean
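
For reference, the Holder (power) mean of values x_1..x_n with non-zero power p is (sum(x_i^p) / n)^(1/p). A minimal sketch under that definition follows; the function name is hypothetical and the zero-power case is rejected to match the argument description above.

```python
def holder_mean(values, power):
    """Power mean: (mean of x**power) ** (1 / power), for non-zero power."""
    values = list(values)
    if not values:
        raise ValueError("values must not be empty")
    if power == 0:
        raise ValueError("power must be a non-zero real number")
    return (sum(v ** power for v in values) / len(values)) ** (1.0 / power)


arithmetic = holder_mean([1, 2, 3], 1)   # power 1 gives the arithmetic mean
harmonic = holder_mean([1, 4], -1)       # power -1 gives the harmonic mean
quadratic = holder_mean([3, 4], 2)       # power 2 gives the root mean square
```

Larger powers weight the largest values more heavily, while negative powers emphasise the smallest ones.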

matminer.descriptors.composition_features.get_magpie_descriptor(comp, descriptor_name)

Get descriptor data for elements in a compound from the Magpie data repository.

Args:

comp: (str) compound composition, eg: “NaCl”
descriptor_name: name of Magpie descriptor needed. Find the entire list at

Returns: (list) of descriptor values for each atom in the composition

matminer.descriptors.composition_features.get_pymatgen_descriptor(comp, prop)

Get descriptor data for elements in a compound from pymatgen.

Args:

comp: (str) compound composition, eg: “NaCl”
prop: (str) pymatgen element attribute, as defined in the Element class at

Returns: (list) of values containing descriptor floats for each atom in the compound
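
The "one value per atom" shape of the returned list can be sketched without pymatgen. The helper below is hypothetical: it takes an already-parsed composition (element to integer count) and a caller-supplied property lookup, whereas the library function parses the formula string and reads the attribute from pymatgen's Element class.

```python
def per_atom_values(amounts, element_property):
    """Repeat each element's property value once per atom in the formula.

    amounts: element -> integer atom count (assumed already parsed)
    element_property: element -> property value
    """
    values = []
    for element, count in amounts.items():
        values.extend([element_property[element]] * count)
    return values


# Fe2O3 with atomic numbers as the property (Fe = 26, O = 8)
fe2o3_z = per_atom_values({"Fe": 2, "O": 3}, {"Fe": 26, "O": 8})
```

A list of this shape is what a stat function such as a mean or standard deviation would then reduce to a single descriptor value.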

matminer.descriptors.structure_features module

matminer.descriptors.structure_features.get_density(s)
matminer.descriptors.structure_features.get_min_relative_distances(struct, cutoff=10.0)

This function determines the relative distance of each site to its closest neighbor. We use the relative distance, f_ij = r_ij / (r^atom_i + r^atom_j), as a measure rather than the absolute distances, r_ij, to account for the fact that different atoms/species have different sizes. The function uses the valence-ionic radius estimator implemented in pymatgen.

Args:

struct (Structure): input structure.
cutoff (float): (absolute) distance up to which tentative closest neighbors (on the basis of relative distances) are to be determined.

Returns: ([float]) list of all minimum relative distances (i.e., for all sites).

matminer.descriptors.structure_features.get_neighbors_of_site_with_index(struct, n, p={})

Determine the neighbors around the site that has index n in the input Structure object struct, given a pre-defined approach. So far, “scaled_VIRE” and “min_relative_VIRE” are implemented (VIRE = valence-ionic radius evaluator).

Args:

struct (Structure): input structure.
n (int): index of site in Structure object for which neighbors are to be determined.
p (dict): specification (“approach”) and parameters of neighbor-finding approach. min_relative_VIRE (default): “delta_scale” (0.05) and “scale_cut” (4); scaled_VIRE: “scale” (2) and “scale_cut” (4).

Returns: ([site]) list of sites that are considered to be nearest neighbors to site with index n in Structure object struct.

matminer.descriptors.structure_features.get_order_parameter_stats(struct, pneighs={}, convert_none_to_zero=True, delta_op=0.01)

Determine the order parameter statistics based on the data from the get_order_parameters function.

Args:

struct (Structure): input structure.
pneighs (dict): specification (“approach”) and parameters of neighbor-finding approach (see get_neighbors_of_site_with_index function for more details; default: min_relative_VIRE, delta_scale = 0.05, scale_cut = 4).
convert_none_to_zero (bool): flag indicating whether or not to convert None values in OPs to zero.
delta_op (float): bin size of histogram that is computed in order to identify peak locations.

Returns: ({}) dictionary, the keys of which represent the order parameter type (e.g., “bent5”, “tet”, “sq_pyr”) and the values are another dictionary carrying the statistics (“min”, “max”, “mean”, “std”, “peak1”, “peak2”).

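
The statistics for a single order parameter type can be sketched as below. Several details are assumptions, since this reference does not state them: the helper name is hypothetical, "std" is taken as the population standard deviation, peaks are the two most populated histogram bins, and each peak is reported at the left edge of its bin.

```python
import statistics
from collections import Counter


def op_stats_sketch(values, convert_none_to_zero=True, delta_op=0.01):
    """min/max/mean/std and two histogram peaks for one order parameter."""
    if convert_none_to_zero:
        values = [0.0 if v is None else v for v in values]
    else:
        values = [v for v in values if v is not None]
    if not values:
        raise ValueError("no order parameter values to summarise")
    # Histogram with bin width delta_op, keyed by integer bin index
    counts = Counter(int(v // delta_op) for v in values)
    ranked = [idx for idx, _ in counts.most_common(2)]
    peaks = [idx * delta_op for idx in ranked] + [None, None]
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),  # assumption: population std
        "peak1": peaks[0],
        "peak2": peaks[1],  # None when all values fall into one bin
    }


stats = op_stats_sketch([0.0, 0.3, 0.3, 0.3, 0.8, 0.8], delta_op=0.25)
```

Collecting one such dictionary per order parameter type would give the nested structure described in the return value above.
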
matminer.descriptors.structure_features.get_order_parameters(struct, pneighs={}, convert_none_to_zero=True)

Calculate the order parameters of every site in the input Structure object struct. Neighbors of each site are found with a pre-defined approach; so far, “scaled_VIRE” and “min_relative_VIRE” are implemented (VIRE = valence-ionic radius evaluator).

Args:

struct (Structure): input structure.
pneighs (dict): specification (“approach”) and parameters of neighbor-finding approach (see get_neighbors_of_site_with_index function for more details; default: min_relative_VIRE, delta_scale = 0.05, scale_cut = 4).
convert_none_to_zero (bool): flag indicating whether or not to convert None values in OPs to zero.

Returns: ([[float]]) matrix of all sites’ (1st dimension) order parameters (2nd dimension). 46 order parameters are computed per site: q_cn (coordination number), q_lin, 35 x q_bent (starting with a target angle of 5 degrees and, increasing by 5 degrees, until 175 degrees), q_tet, q_oct, q_bcc, q_2, q_4, q_6, q_reg_tri, q_sq, q_sq_pyr.

matminer.descriptors.structure_features.get_packing_fraction(s)
matminer.descriptors.structure_features.get_rdf(structure, cutoff=20.0, bin_size=0.1)

Calculate rdf fingerprint of a given structure

Args:
structure: pymatgen structure object
cutoff: (int/float) distance to calculate rdf up to
bin_size: (int/float) size of bin to obtain rdf for

Returns: (dict) rdf in dict format where keys indicate bin distance and values are calculated rdf for that bin.
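
The binning behind such a dictionary can be sketched with a plain pair-distance histogram. This is only the unnormalized ingredient of an RDF under simplifying assumptions: the function name is hypothetical, the input is a finite list of Cartesian coordinates with no periodic images, keys are the lower edge of each bin, and no shell-volume or density normalization is applied because this reference does not describe the one get_rdf uses.

```python
import math


def pair_distance_histogram(coords, cutoff=20.0, bin_size=0.1):
    """Count pair distances below cutoff in bins of width bin_size."""
    n_bins = int(math.ceil(cutoff / bin_size))
    counts = [0] * n_bins
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            if r < cutoff:
                # Clamp guards against floating-point round-up at the edge
                counts[min(int(r / bin_size), n_bins - 1)] += 1
    # Keys are bin lower edges, rounded to avoid float noise in the keys
    return {round(k * bin_size, 10): c for k, c in enumerate(counts)}


coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.6, 0.0)]
hist = pair_distance_histogram(coords, cutoff=2.0, bin_size=0.5)
```

The dictionary has the same key/value layout as described above, so it can be passed to any peak-finding step that expects distance keys.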

matminer.descriptors.structure_features.get_rdf_peaks(dist_rdf)

Get location of highest and second highest peaks in rdf of a structure.

Args:
dist_rdf: (dict) as output by the function “get_rdf”, keys correspond to distances and values correspond to rdf.

Returns: (tuple) of the distances of the highest and second highest peaks.
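
A minimal sketch of picking the two peak locations from such a dictionary, assuming that "peak" simply means the bins with the largest rdf values (no local-maximum check is made). The function name is hypothetical and the input values are illustrative.

```python
def two_highest_bins(dist_rdf):
    """Return the distances of the largest and second largest rdf values."""
    if len(dist_rdf) < 2:
        raise ValueError("need at least two bins")
    # Sort the distance keys by their rdf value, largest first
    ranked = sorted(dist_rdf, key=dist_rdf.get, reverse=True)
    return ranked[0], ranked[1]


dist_rdf = {1.0: 0.2, 1.5: 3.0, 2.0: 1.1, 2.5: 2.4}
peaks = two_highest_bins(dist_rdf)
```

With a fine bin size, two adjacent bins of the same physical peak can both rank highest, so a stricter definition may be preferable in practice.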

matminer.descriptors.structure_features.get_redf(struct, cutoff=None, dr=0.05)

This function permits the calculation of the crystal structure-inherent electronic radial distribution function (ReDF) according to Willighagen et al., Acta Cryst., 2005, B61, 29-36. The ReDF is a structure-integral RDF (i.e., summed over all sites) in which the positions of neighboring sites are weighted by electrostatic interactions inferred from atomic partial charges. Atomic charges are obtained from the ValenceIonicRadiusEvaluator class.

Args:

struct (Structure): input Structure object.
cutoff (float): distance up to which the ReDF is to be calculated (default: longest diagonal in primitive cell).
dr (float): width of bins (“x”-axis) of ReDF (default: 0.05 A).

Returns: (dict) a copy of the electronic radial distribution functions (ReDF) as a dictionary. The distance list (“x”-axis values of ReDF) can be accessed via key ‘distances’; the ReDF itself via key ‘redf’.

matminer.descriptors.structure_features.get_vol_per_site(s)

Module contents