kalepy full package documentation
kalepy.kde module
kalepy’s top-level KDE class which provides all direct KDE functionality.
Contents:
KDE : class for interfacing with KDEs and derived functionality.
class kalepy.kde.KDE(dataset, bandwidth=None, weights=None, kernel=None, extrema=None, points=None, reflect=None, neff=None, diagonal=False, helper=True, bw_rescale=None, **kwargs)
Bases: object
Core class and primary API for using kalepy, by constructing a KDE based on given data.
The KDE class acts as an API to the underlying kernel structures and methods. From the passed data, a ‘bandwidth’ is calculated and/or set (optionally specified through the bandwidth argument). A kernel is constructed (optionally specified through the kernel argument) which performs the calculations of the kernel density estimation.
Notes
Reflection
Reflective boundary conditions can be used to better reconstruct a PDF that is known to have finite support (i.e. boundaries outside of which the PDF should be zero).
The pdf and resample methods accept the keyword-argument (kwarg) reflect to specify that a reflecting boundary should be used.
- reflect : (D,) array_like, None (default)
Locations at which reflecting boundary conditions should be imposed. For each dimension D, a pair of boundary locations (for: lower, upper) must be specified, or None. None can also be given to specify no boundary at that location.
If a pair of boundaries are given, then the first value corresponds to the lower boundary, and the second value to the upper boundary, in that dimension. If there should only be a single lower or upper boundary, then None should be passed as the other boundary value.
For example, reflect=[None, [-1.0, 1.0], [0.0, None]], specifies that the 0th dimension has no boundaries, the 1st dimension has boundaries at both -1.0 and 1.0, and the 2nd dimension has a lower boundary at 0.0, and no upper boundary.
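The idea behind a reflecting boundary can be sketched in one dimension with plain NumPy. This is an illustration only, not kalepy's implementation: the helper names gaussian_kde_1d and reflected_kde_1d, the fixed bandwidth, and the exponential test data are all assumptions made for the example. Mirror images of the data are added on the far side of the boundary, so the probability that a plain KDE would leak outside the support is folded back inside.

```python
import numpy as np

def gaussian_kde_1d(points, data, bandwidth):
    """Plain 1D Gaussian KDE: the average of Gaussians centred on each datum."""
    zz = (points[:, np.newaxis] - data[np.newaxis, :]) / bandwidth
    kern = np.exp(-0.5 * zz**2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kern.mean(axis=1)

def reflected_kde_1d(points, data, bandwidth, lower=None, upper=None):
    """KDE with reflecting boundaries, built by adding mirror images of the data."""
    pdf = gaussian_kde_1d(points, data, bandwidth)
    outside = np.zeros(points.shape, dtype=bool)
    if lower is not None:
        pdf += gaussian_kde_1d(points, 2.0 * lower - data, bandwidth)
        outside |= points < lower
    if upper is not None:
        pdf += gaussian_kde_1d(points, 2.0 * upper - data, bandwidth)
        outside |= points > upper
    # The PDF is defined to be zero outside of the boundaries
    pdf[outside] = 0.0
    return pdf

rng = np.random.default_rng(1234)
data = rng.exponential(1.0, 2000)        # the true PDF has a hard edge at x = 0
xx = np.linspace(0.0, 12.0, 1201)
plain = gaussian_kde_1d(xx, data, 0.2)
refl = reflected_kde_1d(xx, data, 0.2, lower=0.0)

# Trapezoid-rule integrals over the support [0, 12]
dx = xx[1] - xx[0]
area_plain = np.sum(0.5 * (plain[1:] + plain[:-1])) * dx
area_refl = np.sum(0.5 * (refl[1:] + refl[:-1])) * dx
print("plain: {:.3f}, reflected: {:.3f}".format(area_plain, area_refl))
```

The plain estimate integrates to noticeably less than one over the support because part of each kernel near x = 0 extends below the boundary, while the reflected estimate recovers that missing probability.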
Projection / Marginalization
The PDF can be calculated for only particular parameters/dimensions. The pdf method accepts the keyword-argument (kwarg) params to specify particular parameters over which to calculate the PDF (i.e. the other parameters are projected over).
- params : int, array_like of int, None (default)
Only calculate the PDF for certain parameters (dimensions).
If None, then calculate PDF along all dimensions. If params is specified, then the target evaluation points pnts, must only contain the corresponding dimensions.
For example, if the dataset has shape (4, 100), but pdf is called with params=(1, 2), then the pnts array should have shape (2, M) where the two provided dimensions correspond to the 1st and 2nd variables of the dataset.
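The shape bookkeeping for a projection can be sketched with NumPy alone. The following is a hedged illustration for a Gaussian kernel, not kalepy's code: gaussian_kde_nd and the Scott-style bandwidth factor are assumptions introduced for the example. For a Gaussian kernel, projecting over the unwanted dimensions amounts to keeping the selected rows of the data and the matching sub-block of the kernel covariance.

```python
import numpy as np

def gaussian_kde_nd(pnts, data, cov):
    """Gaussian KDE with kernel covariance `cov`; pnts is (D, M), data is (D, N)."""
    ndim, ndata = data.shape
    diff = pnts[:, :, np.newaxis] - data[:, np.newaxis, :]       # (D, M, N)
    cinv = np.linalg.inv(cov)
    # Squared Mahalanobis distance between each target point and each datum
    maha = np.einsum('imn,ij,jmn->mn', diff, cinv, diff)          # (M, N)
    norm = np.sqrt((2.0 * np.pi) ** ndim * np.linalg.det(cov))
    return np.exp(-0.5 * maha).sum(axis=1) / (norm * ndata)

rng = np.random.default_rng(0)
dataset = rng.normal(size=(4, 100))          # D=4 variables, N=100 values
params = (1, 2)                              # keep the 1st and 2nd variables

# Scott-style bandwidth factor, chosen here only for illustration
factor = dataset.shape[1] ** (-1.0 / (dataset.shape[0] + 4))
kernel_cov = np.cov(dataset) * factor**2     # (4, 4)

sub_data = dataset[list(params), :]          # (2, 100)
sub_cov = kernel_cov[np.ix_(params, params)] # (2, 2)

# Target points only contain the two selected dimensions: shape (2, M)
pnts = np.array([[0.0, 0.5, 1.0],
                 [0.0, -0.5, 1.0]])
pdf = gaussian_kde_nd(pnts, sub_data, sub_cov)   # (M,)
print(pdf.shape)
```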
TO-DO: add notes on keep parameter
Dynamic Range
When the elements of the covariance matrix between data variables differ by many orders of magnitude, the KDE values (especially marginalized values) can become spurious. One solution is to use a diagonal covariance matrix by initializing the KDE instance with diagonal=True. An alternative is to transform the input data in such a way that each variable’s dynamic range becomes similar (e.g. taking the log of the values). A warning is given if the covariance matrix has a very large dynamic range, but no error is raised.
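A quick way to see the effect of such a transformation is to compare the spread of the variances before and after taking logarithms. This is a sketch under stated assumptions: the variable names, the lognormal test data, and the dynamic_range helper (which, for simplicity, only looks at the diagonal of the covariance matrix) are illustrative and say nothing about the threshold kalepy uses for its warning.

```python
import numpy as np

rng = np.random.default_rng(42)
# Two positive variables whose typical scales differ by ~9 orders of magnitude
mass = rng.lognormal(mean=np.log(1e8), sigma=1.0, size=500)
frac = rng.lognormal(mean=np.log(1e-1), sigma=0.5, size=500)
raw = np.array([mass, frac])                 # shape (D, N) = (2, 500)

def dynamic_range(cov):
    """Ratio of the largest to the smallest variance (diagonal elements only)."""
    variances = np.diag(cov)
    return variances.max() / variances.min()

range_raw = dynamic_range(np.cov(raw))
range_log = dynamic_range(np.cov(np.log10(raw)))
print("raw: {:.1e}, log10: {:.1e}".format(range_raw, range_log))
```

After the log transform the two variances are within roughly an order of magnitude of each other, which is the kind of input the note above recommends.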
Examples
Construct semi-random data:
>>> import numpy as np
>>> np.random.seed(1234)
>>> data = np.random.normal(0.0, 1.0, 1000)
Construct KDE instance using this data, and the default bandwidth and kernels.
>>> import kalepy as kale
>>> kde = kale.KDE(data)
Compare original PDF and the data to the reconstructed PDF from the KDE:
>>> xx = np.linspace(-3, 3, 400)
>>> pdf_tru = np.exp(-xx*xx/2) / np.sqrt(2*np.pi)
>>> xx, pdf_kde = kde.density(xx, probability=True)
>>> import matplotlib.pyplot as plt
>>> ll = plt.plot(xx, pdf_tru, 'k--', label='Normal PDF')
>>> _, bins, _ = plt.hist(data, bins=14, density=True, color='0.5', rwidth=0.9, alpha=0.5, label='Data')
>>> ll = plt.plot(xx, pdf_kde, 'r-', label='KDE')
>>> ll = plt.legend()
Compare the KDE reconstructed PDF to the “true” PDF, make sure the chi-squared is consistent:
>>> dof = xx.size - 1
>>> x2 = np.sum(np.square(pdf_kde - pdf_tru)/pdf_tru**2)
>>> x2 = x2 / dof
>>> x2 < 0.1
True
>>> print("Chi-Squared: {:.1e}".format(x2))
Chi-Squared: 1.7e-02
Draw new samples from the data and make sure they are consistent with the original data:
>>> import scipy as sp
>>> import scipy.stats  # make sure the stats submodule is loaded
>>> samp = kde.resample()
>>> ll = plt.hist(samp, bins=bins, density=True, color='r', alpha=0.5, rwidth=0.5, label='Samples')
>>> ks, pv = sp.stats.ks_2samp(data, samp)
>>> pv > 0.05
True
Initialize the KDE class with the given dataset and optional specifications.
- Parameters
dataset (array_like (N,) or (D,N,)) – Dataset from which to construct the kernel-density-estimate. For multivariate data with D variables and N values, the data must be shaped (D,N). For univariate (D=1) data, this can be a single array with shape (N,).
bandwidth (str, float, array of float, None [optional]) – Specification for the bandwidth, or the method by which the bandwidth should be determined. If a str is given, it must match one of the standard bandwidth determination methods. If a float is given, it is used as the bandwidth in each dimension. If an array of float is given, then each value will be used as the bandwidth for the corresponding data dimension.
weights (array_like (N,), None [optional]) – Weights corresponding to each dataset point. Must match the number of points N in the dataset. If None, weights are uniformly set to 1.0 for each value.
kernel (str, Distribution, None [optional]) – The distribution function that should be used for the kernel. This can be a str specification that must match one of the existing distribution functions, or this can be a Distribution subclass itself that overrides the _evaluate method.
neff (int, None [optional]) – An effective number of datapoints. This is used in the plugin bandwidth determination methods. If None, neff is calculated from the weights array. If weights are all uniform, then neff equals the number of datapoints N.
diagonal (bool,) – Whether the bandwidth/covariance matrix should be set as a diagonal matrix (i.e. without covariances between parameters). NOTE: see KDE docstrings, “Dynamic Range”.
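To make the relationship between weights, neff and a plug-in bandwidth concrete, here is a small sketch. It assumes the Kish definition of the effective sample size and Scott's rule of thumb; the source does not state which formulas kalepy uses, so effective_number and scott_factor are hypothetical helpers for illustration. The Kish definition is at least consistent with the statement above that uniform weights give neff equal to N.

```python
import numpy as np

def effective_number(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    weights = np.asarray(weights, dtype=float)
    return np.sum(weights) ** 2 / np.sum(weights**2)

def scott_factor(neff, ndim):
    """Scott's rule-of-thumb bandwidth factor: neff^(-1 / (ndim + 4))."""
    return neff ** (-1.0 / (ndim + 4))

uniform = np.ones(1000)
# Ten dominant points and 990 points that carry almost no weight
skewed = np.concatenate([np.ones(10), np.full(990, 1e-3)])

neff_uniform = effective_number(uniform)
neff_skewed = effective_number(skewed)
bw_uniform = scott_factor(neff_uniform, ndim=1)
bw_skewed = scott_factor(neff_skewed, ndim=1)
print(neff_uniform, neff_skewed, bw_uniform, bw_skewed)
```

A strongly non-uniform weighting lowers the effective number of points, which in turn widens a plug-in bandwidth.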
__init__(dataset, bandwidth=None, weights=None, kernel=None, extrema=None, points=None, reflect=None, neff=None, diagonal=False, helper=True, bw_rescale=None, **kwargs)
Initialize the KDE class with the given dataset and optional specifications.
property bandwidth
cdf(pnts, params=None, reflect=None)
Cumulative Distribution Function based on KDE smoothed data.
- Parameters
pnts (([D,]N,) array_like of scalar) – Target evaluation points
- Returns
cdf – CDF Values at the target points
- Return type
(N,) ndarray of scalar
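For a 1D Gaussian kernel, the CDF of the KDE is the weighted average of the kernel CDFs centred on each datum. The sketch below shows that idea with NumPy and the standard library; gaussian_kde_cdf_1d and the fixed bandwidth are assumptions for the example, and kalepy's own method may compute the CDF differently.

```python
import math
import numpy as np

def gaussian_kde_cdf_1d(pnts, data, bandwidth, weights=None):
    """CDF of a 1D Gaussian KDE evaluated at `pnts`; returns shape (N_pnts,)."""
    pnts = np.atleast_1d(np.asarray(pnts, dtype=float))
    data = np.asarray(data, dtype=float)
    if weights is None:
        weights = np.ones_like(data)
    weights = weights / np.sum(weights)
    erf = np.vectorize(math.erf)
    zz = (pnts[:, np.newaxis] - data[np.newaxis, :]) / (bandwidth * math.sqrt(2.0))
    kernel_cdf = 0.5 * (1.0 + erf(zz))       # (N_pnts, N_data)
    return kernel_cdf @ weights

rng = np.random.default_rng(1234)
data = rng.normal(0.0, 1.0, 1000)
pnts = np.linspace(-4.0, 4.0, 9)
cdf = gaussian_kde_cdf_1d(pnts, data, bandwidth=0.25)
print(np.round(cdf, 3))
```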
cdf_grid(edges, **kwargs)
NOTE: optimize: there are likely much faster methods than broadcasting and flattening; use a different method to calculate the CDF on a grid.
property covariance
property dataset
density(points=None, reflect=None, params=None, grid=False, probability=False)
Evaluate the KDE distribution at the given data-points.
This method acts as an API to the Kernel.pdf method for this instance’s kernel.
- Parameters
points (([D,]M,) array_like of float, or (D,) set of array_like point specifications) –
The locations at which the PDF should be evaluated. The number of dimensions D must match that of the dataset that initialized this class’ instance. NOTE: If the params kwarg (see below) is given, then only those dimensions of the target parameters should be specified in points.
The meaning of points depends on the value of the grid argument:
grid=True : points must be a set of (D,) array_like objects which each give the evaluation points for the corresponding dimension to produce a grid of values. For example, for a 2D dataset, points=([0.1, 0.2, 0.3], [1, 2]), would produce a grid of points with shape (3, 2): [[0.1, 1], [0.1, 2]], [[0.2, 1], [0.2, 2]], [[0.3, 1], [0.3, 2]], and the returned values would be an array of the same shape (3, 2).
grid=False : points must be an array_like (D,M) describing the position of M sample points in each of D dimensions. For example, for a 3D dataset: points=([0.1, 0.2], [1.0, 2.0], [10, 20]), describes 2 sample points at the 3D locations, (0.1, 1.0, 10) and (0.2, 2.0, 20), and the returned values would be an array of shape (2,).
reflect ((D,) array_like, None (default)) – Locations at which reflecting boundary conditions should be imposed. For each dimension D, a pair of boundary locations (for: lower, upper) must be specified, or None. None can also be given to specify no boundary at that location. See class docstrings:Reflection for more information.
params (int, array_like of int, None (default)) – Only calculate the PDF for certain parameters (dimensions). See class docstrings:Projection for more information.
grid (bool,) – Evaluate the KDE distribution at a grid of points specified by points. See points argument description above.
probability (bool,) – Normalize the results to sum to unity.
- Returns
points (array_like of scalar) – Locations at which the PDF is evaluated.
vals (array_like of scalar) – PDF evaluated at the given points
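The two layouts of points described above (grid=True versus grid=False) can be related to one another with NumPy. This sketch only demonstrates the array shapes; toy_density is a stand-in for any function mapping (D, M) points to (M,) values and is not a kalepy call.

```python
import numpy as np

# grid=True style: one array of coordinates per dimension
edges = ([0.1, 0.2, 0.3], [1.0, 2.0])
mesh = np.meshgrid(*edges, indexing='ij')    # D arrays, each of shape (3, 2)
grid_shape = mesh[0].shape

# Flatten the grid into the (D, M) layout used when grid=False
flat_points = np.array([mm.ravel() for mm in mesh])       # (2, 6)

def toy_density(points):
    """Stand-in for a density evaluation: maps (D, M) points to (M,) values."""
    return np.exp(-0.5 * np.sum(points**2, axis=0))

vals_flat = toy_density(flat_points)         # (6,)
vals_grid = vals_flat.reshape(grid_shape)    # (3, 2), same shape as the grid

# grid=False style: M=2 sample points in D=3 dimensions
scattered = np.array([[0.1, 0.2], [1.0, 2.0], [10, 20]])  # (D, M) = (3, 2)
first_point = scattered[:, 0]                # the 3D location (0.1, 1.0, 10)
print(grid_shape, flat_points.shape, first_point)
```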
property extrema
property kernel
property ndata
property ndim
property neff
pdf(*args, **kwargs)
property points
resample(size=None, keep=None, reflect=None, squeeze=True)
Draw new values from the kernel-density-estimate calculated PDF.
The KDE calculates a PDF from the given dataset. This method draws new, semi-random data points from that PDF.
- Parameters
size (int, None (default)) – The number of new data points to draw. If None, then the number of datapoints is used.
keep (int, array_like of int, None (default)) – Parameters/dimensions where the original data-values should be drawn from, instead of from the reconstructed PDF. TODO: add more information.
reflect ((D,) array_like, None (default)) – Locations at which reflecting boundary conditions should be imposed. For each dimension D, a pair of boundary locations (for: lower, upper) must be specified, or None. None can also be given to specify no boundary at that location.
squeeze (bool, (default: True)) – If the number of dimensions D is one, then return an array of shape (L,) instead of (1, L).
- Returns
samples – Newly drawn samples from the PDF, where the number of points L is determined by the size argument. If squeeze is True (default), and the number of dimensions in the original dataset D is one, then the returned array will have shape (L,).
- Return type
([D,]L) ndarray of float
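Drawing from a Gaussian KDE can be sketched as a two-step procedure: pick data points at random (optionally weighted), then add noise drawn from the kernel. The function below is an illustration under that assumption, with a hypothetical name and signature; it does not reproduce kalepy's handling of keep or reflect.

```python
import numpy as np

def resample_gaussian_kde(data, kernel_cov, size=None, weights=None,
                          squeeze=True, rng=None):
    """Draw `size` samples from a Gaussian KDE built on `data` with shape ([D,]N)."""
    rng = np.random.default_rng() if rng is None else rng
    data = np.atleast_2d(data)               # (D, N)
    ndim, ndata = data.shape
    size = ndata if size is None else size
    if weights is not None:
        weights = np.asarray(weights, dtype=float)
        weights = weights / weights.sum()
    # 1) choose data points with replacement, 2) add kernel-distributed noise
    idx = rng.choice(ndata, size=size, p=weights)
    noise = rng.multivariate_normal(np.zeros(ndim), np.atleast_2d(kernel_cov), size=size)
    samples = data[:, idx] + noise.T         # (D, L)
    if squeeze and ndim == 1:
        samples = samples[0]                 # (L,)
    return samples

rng = np.random.default_rng(1234)
data = rng.normal(0.0, 1.0, 1000)
samp = resample_gaussian_kde(data, kernel_cov=0.25**2, rng=rng)
samp_2d = resample_gaussian_kde(data, kernel_cov=0.25**2, size=50,
                                squeeze=False, rng=rng)
print(samp.shape, samp_2d.shape)
```

With squeeze left at its default the univariate result has shape (L,), and with squeeze=False it keeps the leading dimension, (1, L), mirroring the shapes described above.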
property weights
kalepy.kernels module
Kernel basis functions for KDE calculations, used by the kalepy.kde.KDE class.
Contents:
Kernel : class performing the numerical/mathematical functions of a KDE using a particular kernel-function.
Distribution : base class for kernel-function functionality
Gaussian(Distribution) : class for Gaussian kernel functions
Box_Asym(Distribution) : class for box (top-hat) kernel functions
Parabola(Distribution) : class for parabolic (Epanechnikov) kernel functions
get_distribution_class : returns the appropriate Distribution subclass matching the given string specification.
get_all_distribution_classes : returns a list of active Distribution subclasses (used for testing).
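For orientation, the textbook 1D forms of the three kernel shapes named above can be written in a few lines. These are standard definitions on a unit scale, given as an assumption-laden sketch: the scaling and support conventions used inside kalepy's Gaussian, Box_Asym and Parabola classes may differ.

```python
import numpy as np

def gaussian(xx):
    """Standard normal kernel (infinite support)."""
    return np.exp(-0.5 * xx**2) / np.sqrt(2.0 * np.pi)

def box(xx):
    """Top-hat kernel: uniform on [-1, 1], zero elsewhere (finite support)."""
    return np.where(np.abs(xx) <= 1.0, 0.5, 0.0)

def parabola(xx):
    """Epanechnikov kernel: 0.75 * (1 - x^2) on [-1, 1], zero elsewhere."""
    return np.where(np.abs(xx) <= 1.0, 0.75 * (1.0 - xx**2), 0.0)

# Each kernel should integrate to one; check with a simple Riemann sum
xx = np.linspace(-8.0, 8.0, 160001)
dx = xx[1] - xx[0]
areas = {func.__name__: np.sum(func(xx)) * dx for func in (gaussian, box, parabola)}
print(areas)
```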
class kalepy.kernels.Box_Asym
Bases: kalepy.kernels.Distribution
classmethod cdf(xx)
classmethod inside(points)
-
classmethod
-
class kalepy.kernels.Distribution
Bases: object
Distribution positional arguments (xx or yy) must be shaped as (D, N) for ‘D’ dimensions and ‘N’ data-points.
property FINITE
property SYMMETRIC
__init__()
Initialize self. See help(type(self)) for accurate signature.
cdf(xx)
property cdf_grid
classmethod evaluate(xx)
classmethod grid(edges, **kwargs)
classmethod inside(points)
classmethod name()
ppf(cd)
Percentile Point Function - the inverse of the cumulative distribution function.
NOTE: for symmetric kernels, this (effectively) uses points only with cdf in [0.0, 0.5], which produces better numerical results (unclear why).
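One way to invert a CDF numerically while only ever using the lower half of a symmetric kernel is sketched below. It is an assumption about how such a scheme could look, not a transcription of kalepy's ppf: the function names, the interpolation table, and the choice of a standard normal kernel are all illustrative.

```python
import math
import numpy as np

def gaussian_cdf(xx):
    """Standard normal CDF, written with erfc to stay accurate in the lower tail."""
    erfc = np.vectorize(math.erfc)
    return 0.5 * erfc(-np.asarray(xx, dtype=float) / math.sqrt(2.0))

def ppf_symmetric(cd, cdf_func, xmin=-8.0, num=4001):
    """Invert a symmetric kernel's CDF using only the half with cdf in [0.0, 0.5]."""
    cd = np.atleast_1d(np.asarray(cd, dtype=float))
    # Tabulate the CDF on the lower half only: x in [xmin, 0]
    xx = np.linspace(xmin, 0.0, num)
    yy = cdf_func(xx)
    # Fold values above 0.5 onto the lower half, invert, then restore the sign
    upper = cd > 0.5
    folded = np.where(upper, 1.0 - cd, cd)
    result = np.interp(folded, yy, xx)
    return np.where(upper, -result, result)

quantiles = ppf_symmetric([0.025, 0.5, 0.975], gaussian_cdf)
print(np.round(quantiles, 3))
```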
sample(size, ndim=None, squeeze=None)
-
property
-
class kalepy.kernels.Gaussian
Bases: kalepy.kernels.Distribution
cdf(yy)
classmethod inside(points)
classmethod norm(ndim=1)
-
-
class kalepy.kernels.Kernel(distribution=None, bandwidth=None, covariance=None, helper=False, chunk=100000.0)
Bases: object
property FINITE
__init__(distribution=None, bandwidth=None, covariance=None, helper=False, chunk=100000.0)
Initialize self. See help(type(self)) for accurate signature.
property bandwidth
property covariance
density(points, data, weights=None, reflect=None, params=None)
Calculate the Density Function using this Kernel.
- Parameters
points ((D, N), 2darray of float,) – N points at which to evaluate the density function over D parameters (dimensions). Locations must be specified for each dimension of the data, or for each of target params dimensions of the data.
property distribution
property matrix
property matrix_inv
property norm
resample(data, weights=None, size=None, keep=None, reflect=None, squeeze=True)
-
property
-
class kalepy.kernels.Parabola
Bases: kalepy.kernels.Distribution
classmethod cdf(xx)
-
classmethod
-
kalepy.kernels.get_all_distribution_classes()
kalepy.kernels.get_distribution_class(arg=None)
kalepy.plot module
kalepy’s plotting submodule
This submodule contains the Corner class and all plotting methods. The Corner class and additional API functions are imported into the base package namespace of kalepy, e.g. kalepy.Corner and kalepy.carpet access the kalepy.plot.Corner and kalepy.plot.carpet methods respectively.
Additional options and customization:
The core plotting routines, such as draw_hist1d, draw_hist2d, draw_contour2d, etc., include a fairly large number of keyword arguments for customization. The top-level API methods, such as corner() or Corner.plot_data(), often do not provide access to all of those arguments, but additional customization is possible by using the drawing methods directly, and optionally subclassing the Corner class to provide additional or different functionality.
Plotting API
Corner : class for corner/triangle/pair plots.
corner : method which constructs a Corner instance and plots 1D and 2D distributions.
dist1d : plot a 1D distribution with numerous possible elements (e.g. histogram, carpet, etc)
dist2d : plot a 2D distribution with numerous possible elements (e.g. histogram, contours, etc)
carpet : draw a 1D scatter-like plot to semi-quantitatively depict a distribution.
contour : draw a 2D contour plot. A wrapper of additional functionality around plt.contour
confidence : draw 1D confidence intervals using shaded bands.
hist1d : draw a 1D histogram
hist2d : draw a 2D histogram. A wrapper of additional functionality around plt.pcolormesh
class kalepy.plot.Corner(kde_data, weights=None, labels=None, limits=None, rotate=True, **kwfig)
Bases: object
Class for creating ‘corner’ / ‘pair’ plots of multidimensional covariances.
The Corner class acts as a constructor for a matplotlib figure and axes, and coordinates the plotting of 1D and 2D distributions. The kalepy.plot.dist1d() and kalepy.plot.dist2d() methods are used for plotting the distributions. The class methods provide wrappers, and default setting for those methods. The Corner.plot method is the standard plotting method with default parameters chosen for plotting a single, multidimensional dataset. For overplotting numerous datasets, the Corner.clean or Corner.plot_data methods are better.
plot : the standard plotting method which, by default, includes both KDE and data elements.
clean : minimal plots with only the KDE generated PDF in 1D and contours in 2D, by default.
hist : minimal plots with only the data based 1D and 2D histograms, by default.
plot_kde : plot elements with only KDE based info: the clean settings with a little more.
plot_data : plot elements without using KDE info.
Examples
Load some predefined 3D data, and generate a default corner plot:
>>> import kalepy as kale
>>> data = kale.utils._random_data_3d_03()
>>> corner = kale.corner(data)
Load two different datasets, and overplot them using a kalepy.Corner instance.
>>> data1 = kale.utils._random_data_3d_03(par=[0.0, 0.5], cov=0.05)
>>> data2 = kale.utils._random_data_3d_03(par=[1.0, 0.25], cov=0.5)
>>> corner = kale.Corner(3)   # construct '3' dimensional corner-plot (i.e. 3x3 axes)
>>> _ = corner.clean(data1)
>>> _ = corner.clean(data2)
Initialize Corner instance and construct figure and axes based on the given arguments.
- Parameters
kde_data (object, one of the following) –
int D, the number of parameters/dimensions to construct a DxD corner plot.
instance of kalepy.kde.KDE, providing the data and KDE to be plotted.
array_like scalar (D,N) of data with D parameters and N data points.
weights (array_like scalar (N,) or None) – The weights for each data point. NOTE: only applicable when kde_data is a (D,N) dataset.
labels (array_like string (D,)) – Names for each parameter.
limits (None, or (D,2) of scalar) – Specification for the limits of each axes (for each of D parameters):
None : the limits are determined automatically,
(D,2) : limits for each axis
rotate (bool,) – Whether or not the bottom-right-most axes should be rotated.
**kwfig (keyword-arguments passed to _figax() for constructing figure and axes.) – See kalepy.plot._figax() for specifications.
__init__(kde_data, weights=None, labels=None, limits=None, rotate=True, **kwfig)
Initialize Corner instance and construct figure and axes based on the given arguments.
clean(kde_data=None, weights=None, dist1d={}, dist2d={}, **kwargs)
Wrapper for plot_kde that sets parameters for minimalism: PDF and contours only.
- Parameters
kde_data (kalepy.KDE instance, (D,N) array_like of scalars, or None) –
instance of kalepy.kde.KDE, providing the data and KDE to be plotted.
array_like scalar (D,N) of data with D parameters and N data points.
None : use the KDE/data stored during class initialization. raises ValueError if no KDE/data was provided
weights (None or (N,) array_like of scalar) – The weighting of each data-point, used if and only if the given kde_data argument is a (D,N) array_like of scalar data.
dist1d (dict of keyword-arguments passed to the kale.plot.dist1d method.) –
dist2d (dict of keyword-arguments passed to the kale.plot.dist2d method.) –
**kwargs (additional keyword-arguments passed directly to Corner.plot_kde.) –
hist(kde_data=None, weights=None, dist1d={}, dist2d={}, **kwargs)
Wrapper for plot_data that sets parameters to only plot 1D and 2D histograms of data.
- Parameters
kde_data (kalepy.KDE instance, (D,N) array_like of scalars, or None) –
instance of kalepy.kde.KDE, providing the data and KDE to be plotted.
array_like scalar (D,N) of data with D parameters and N data points.
None : use the KDE/data stored during class initialization. raises ValueError if no KDE/data was provided
weights (None or (N,) array_like of scalar) – The weighting of each data-point, used if and only if the given kde_data argument is a (D,N) array_like of scalar data.
dist1d (dict of keyword-arguments passed to the kale.plot.dist1d method.) –
dist2d (dict of keyword-arguments passed to the kale.plot.dist2d method.) –
**kwargs (additional keyword-arguments passed directly to Corner.plot_data.) –
legend(handles, labels, index=None, loc=None, fancybox=False, borderaxespad=0, **kwargs)
plot(kde_data=None, edges=None, weights=None, quantiles=None, limit=None, color=None, cmap=None, dist1d={}, dist2d={})
Plot with standard settings for plotting a single, multidimensional dataset or KDE.
This function coordinates the drawing of a corner plot that ultimately uses the kalepy.plot.dist1d and kalepy.plot.dist2d methods to draw parameter distributions using an instance of kalepy.kde.KDE.
- Parameters
kde_data (kalepy.KDE instance, (D,N) array_like of scalars, or None) –
instance of kalepy.kde.KDE, providing the data and KDE to be plotted.
array_like scalar (D,N) of data with D parameters and N data points.
None : use the KDE/data stored during class initialization. raises ValueError if no KDE/data was provided
edges (object specifying histogram edge locations; or None) –
int : the number of bins for all dimensions, locations calculated automatically
(D,) array_like of int : the number of bins for each of D dimensions
(D,) of array_like : the bin-edge locations for each of D dimensions, e.g. ([0, 1, 2], [0.0, 0.1, 0.2, 0.3],) would describe two bins for the 0th dimension, and 3 bins for the 1st dimension.
(X,) array_like of scalar : the bin-edge locations to be used for all dimensions
None : the number and locations of bins are calculated automatically for each dim
weights (None or (N,) array_like of scalar) – The weighting of each data-point, used if and only if the given kde_data argument is a (D,N) array_like of scalar data.
quantiles (None or array_like of scalar) – Values in [0.0, 1.0] denoting the fractions of data to demarcate with contours and confidence bands.
limit (bool or None) – Whether the axes limits should be reset based on the plotted data. If None, then the limits will be readjusted unless limits were provided on class initialization.
color (matplotlib color specification (i.e. named color, hex or rgb) or None.) –
- If None:
if cmap is given, then the color will be set to the cmap midpoint.
if cmap is not given, then the color will be determined by the next value of the default matplotlib color-cycle, and cmap will be set to a matching colormap.
This parameter affects the color of 1D: histograms, confidence intervals, and carpet; 2D: scatter points.
cmap (matplotlib colormap specification, or None) –
All valid matplotlib specifications can be used, e.g. named value (like ‘Reds’ or ‘viridis’) or a matplotlib.colors.Colormap instance.
If None then a colormap is constructed based on the value of color (see above).
dist1d (dict of keyword-arguments passed to the kale.plot.dist1d method.) –
dist2d (dict of keyword-arguments passed to the kale.plot.dist2d method.) –
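The different forms accepted by edges mirror the way bins can be specified for NumPy's histogram functions. The sketch below uses np.histogram2d purely to illustrate the bin counts implied by each form; it is not a statement about how kalepy builds its edges internally, and the uniform test data is made up for the example.

```python
import numpy as np

rng = np.random.default_rng(7)
# (D, N) data with D=2: the 0th variable spans [0, 2), the 1st spans [0, 0.3)
data = np.array([rng.uniform(0.0, 2.0, 200), rng.uniform(0.0, 0.3, 200)])

# (D,) of array_like: explicit bin-edge locations for each dimension
edges = ([0, 1, 2], [0.0, 0.1, 0.2, 0.3])
hist, xedges, yedges = np.histogram2d(data[0], data[1], bins=list(edges))

# int: the same number of bins in every dimension, locations chosen automatically
hist_auto, _, _ = np.histogram2d(data[0], data[1], bins=5)

# (D,) array_like of int: a number of bins for each dimension
hist_counts, _, _ = np.histogram2d(data[0], data[1], bins=[4, 8])

print(hist.shape, hist_auto.shape, hist_counts.shape)
```

Three edges give two bins and four edges give three bins, matching the example in the edges description.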
plot_data(data=None, edges=None, weights=None, quantiles=None, limit=None, color=None, cmap=None, dist1d={}, dist2d={})
Plot with default settings to emphasize the given data (not KDE derived properties).
This function coordinates the drawing of a corner plot that ultimately uses the kalepy.plot.dist1d and kalepy.plot.dist2d methods to draw parameter distributions using an instance of kalepy.kde.KDE.
- Parameters
data ((D,N) array_like of scalars, kalepy.KDE instance, or None) –
array_like scalar (D,N) of data with D parameters and N data points.
None : use the KDE/data stored during class initialization. raises ValueError if no KDE/data was provided
instance of kalepy.kde.KDE, providing the data and KDE to be plotted.
NOTE: if a KDE instance is given, or one was stored during initialization, then the dataset is extracted from the instance.
edges (object specifying histogram edge locations; or None) –
int : the number of bins for all dimensions, locations calculated automatically
(D,) array_like of int : the number of bins for each of D dimensions
(D,) of array_like : the bin-edge locations for each of D dimensions, e.g. ([0, 1, 2], [0.0, 0.1, 0.2, 0.3],) would describe two bins for the 0th dimension, and 3 bins for the 1st dimension.
(X,) array_like of scalar : the bin-edge locations to be used for all dimensions
None : the number and locations of bins are calculated automatically for each dim
weights (None or (N,) array_like of scalar) – The weighting of each data-point, used if and only if the given data argument is a (D,N) array_like of scalar data.
quantiles (None or array_like of scalar) – Values in [0.0, 1.0] denoting the fractions of data to demarcate with contours and confidence bands.
limit (bool or None) – Whether the axes limits should be reset based on the plotted data. If None, then the limits will be readjusted unless limits were provided on class initialization.
color (matplotlib color specification (i.e. named color, hex or rgb) or None.) –
- If None:
if cmap is given, then the color will be set to the cmap midpoint.
if cmap is not given, then the color will be determined by the next value of the default matplotlib color-cycle, and cmap will be set to a matching colormap.
This parameter affects the color of 1D: histograms, confidence intervals, and carpet; 2D: scatter points.
cmap (matplotlib colormap specification, or None) –
All valid matplotlib specifications can be used, e.g. named value (like ‘Reds’ or ‘viridis’) or a matplotlib.colors.Colormap instance.
If None then a colormap is constructed based on the value of color (see above).
dist1d (dict of keyword-arguments passed to the kale.plot.dist1d method.) –
dist2d (dict of keyword-arguments passed to the kale.plot.dist2d method.) –
plot_kde(kde=None, edges=None, weights=None, quantiles=None, limit=None, ls='-', color=None, cmap=None, dist1d={}, dist2d={})
Plot with default settings to emphasize the KDE derived distributions.
This function coordinates the drawing of a corner plot that ultimately uses the kalepy.plot.dist1d and kalepy.plot.dist2d methods to draw parameter distributions using an instance of kalepy.kde.KDE.
- Parameters
kde (kalepy.KDE instance, (D,N) array_like of scalars, or None) –
instance of kalepy.kde.KDE, providing the data and KDE to be plotted.
array_like scalar (D,N) of data with D parameters and N data points.
None : use the KDE/data stored during class initialization. raises ValueError if no KDE/data was provided
edges (object specifying histogram edge locations; or None) –
int : the number of bins for all dimensions, locations calculated automatically
(D,) array_like of int : the number of bins for each of D dimensions
(D,) of array_like : the bin-edge locations for each of D dimensions, e.g. ([0, 1, 2], [0.0, 0.1, 0.2, 0.3],) would describe two bins for the 0th dimension, and 3 bins for the 1st dimension.
(X,) array_like of scalar : the bin-edge locations to be used for all dimensions
None : the number and locations of bins are calculated automatically for each dim
weights (None or (N,) array_like of scalar) – The weighting of each data-point, used if and only if the given kde argument is a (D,N) array_like of scalar data from which a KDE instance is created.
quantiles (None or array_like of scalar) – Values in [0.0, 1.0] denoting the fractions of data to demarcate with contours and confidence bands.
limit (bool or None) – Whether the axes limits should be reset based on the plotted data. If None, then the limits will be readjusted unless limits were provided on class initialization.
color (matplotlib color specification (i.e. named color, hex or rgb) or None.) –
- If None:
if cmap is given, then the color will be set to the cmap midpoint.
if cmap is not given, then the color will be determined by the next value of the default matplotlib color-cycle, and cmap will be set to a matching colormap.
This parameter affects the color of 1D: histograms, confidence intervals, and carpet; 2D: scatter points.
cmap (matplotlib colormap specification, or None) –
All valid matplotlib specifications can be used, e.g. named value (like ‘Reds’ or ‘viridis’) or a matplotlib.colors.Colormap instance.
If None then a colormap is constructed based on the value of color (see above).
dist1d (dict of keyword-arguments passed to the kale.plot.dist1d method.) –
dist2d (dict of keyword-arguments passed to the kale.plot.dist2d method.) –
kalepy.plot.carpet(xx, weights=None, ax=None, ystd=None, yave=None, shift=0.0, fancy=False, random='normal', rotate=False, **kwargs)
Draw a ‘carpet plot’ that shows semi-quantitatively the distribution of points.
The given data (xx) is plotted as scatter points, where the abscissa (typically x-values) are the actual locations of the data and the ordinates are generated randomly. The size and transparency of points are chosen based on the number of points. If weights are given, the sizes of the data points are chosen proportionally to them.
NOTE: the shift argument determines the reference ordinate-value of the distribution; this is particularly useful when numerous datasets are being overplotted.
- Parameters
xx ((N,) array_like of scalar, the data values to be plotted) –
weights (None or (N,) array_like of scalar) – The weighting of each data-point, used if and only if the given kde_data argument is a (D,N) array_like of scalar data.
ax (None or matplotlib.axes.Axes) – If None, then plt.gca() is used.
ystd (scalar or None) – A measure of the dispersion in the ordinate scatter of values. If None, then an appropriate value is guessed based on yave or the axis limits.
yave (scalar or None, the baseline at which the ordinate values are generated,) – This is very similar to the shift argument, determining the ordinate-offset, but in the case that ystd is not given but yave is given, then the yave value determines ystd.
shift (scalar,) – A systematic ordinate shift of all data-points, particularly useful when multiple datasets are being plotted, such that one carpet plot can be offset from the other(s).
fancy (bool,) – Experimental resizing of data-points to visually emphasize outliers.
random (str, one of ['normal', 'uniform'],) – How the ordinate values are randomly generated: either a uniform or normal (i.e. Gaussian).
rotate (bool, if True switch the x and y values such that x becomes the ordinate.) –
kwargs (additional keyword-arguments passed to matplotlib.axes.Axes.scatter()) –
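As a rough illustration of the description above, the ordinate values of a carpet plot could be generated along the following lines. This is a plain-Python sketch, not kalepy's implementation: the helper name `carpet_ordinates`, the choice to add `yave` and `shift` together, and the reading of `ystd` as the half-width of the uniform range are all assumptions made for illustration.

```python
import random


def carpet_ordinates(xx, ystd=0.1, yave=0.0, shift=0.0, kind="normal", seed=None):
    """Return hypothetical (abscissa, ordinate) pairs for a carpet plot."""
    rng = random.Random(seed)
    center = yave + shift  # assumed: the baseline and the shift simply add
    if kind == "normal":
        yy = [rng.gauss(center, ystd) for _ in xx]
    elif kind == "uniform":
        yy = [rng.uniform(center - ystd, center + ystd) for _ in xx]
    else:
        raise ValueError("`kind` must be 'normal' or 'uniform'")
    # The abscissa values are the data themselves; only the ordinate is random.
    return list(zip(xx, yy))


points = carpet_ordinates([0.1, 0.5, 0.9], ystd=0.05, shift=1.0, kind="uniform", seed=0)
```

Offsetting a second dataset with a different `shift` keeps the two carpets visually separate, which is the use case the note above describes.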
-
kalepy.plot.
confidence
(data, ax=None, weights=None, quantiles=[0.5, 0.9], median=True, rotate=False, **kwargs)¶ Plot 1D Confidence intervals at the given quantiles.
For each quantile q, a shaded range is plotted that includes a fraction q of data values around the median. Ultimately either plt.axhspan or plt.axvspan is used for drawing.
- Parameters
data ((N,) array_like of scalar, the data values around which to calculate confidence intervals) –
ax (None or matplotlib.axes.Axes instance, if None then plt.gca() is used.) –
weights (None or (N,) array_like of scalar, the weighting of each data-point if and) – only-if the given kde_data argument is a (D,N) array_like of scalar data.
quantiles (array_like of scalar values in [0.0, 1.0] denoting the fractions of data to mark.) –
median (bool, mark the location of the median value.) –
rotate (bool, if true switch the x and y coordinates (i.e. rotate plot 90deg clockwise)) –
**kwargs (additional keyword-arguments passed to plt.axhspan or plt.axvspan.) –
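The ranges described above can be sketched as follows: for each quantile q, the interval runs from the (0.5 - q/2) quantile to the (0.5 + q/2) quantile, so that a fraction q of the data lies around the median. This is an unweighted, plain-Python sketch; the helper names and the linear-interpolation scheme are assumptions, not kalepy's own code.

```python
def quantile(sorted_vals, p):
    """Linearly interpolated quantile of already-sorted values, with 0 <= p <= 1."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def confidence_ranges(data, quantiles=(0.5, 0.9)):
    """Return {q: (lower, upper)} enclosing a fraction q of the data around the median."""
    vals = sorted(data)
    return {q: (quantile(vals, 0.5 - q / 2), quantile(vals, 0.5 + q / 2)) for q in quantiles}


# For the integers 0..100 the 50% range is roughly (25, 75) and the 90% range roughly (5, 95).
ranges = confidence_ranges(range(101))
```

Each (lower, upper) pair is what would then be handed to plt.axvspan, or to plt.axhspan when the plot is rotated.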
-
kalepy.plot.
contour
(data, edges=None, ax=None, weights=None, color=None, cmap=None, quantiles=None, smooth=1.0, upsample=2, pad=1, **kwargs)¶ Calculate and draw 2D contours.
This is a wrapper for draw_contour, which in turn wraps plt.contour. This function constructs bin-edges and calculates the histogram from which the contours are calculated.
- Parameters
data ((2, N) array_like of scalars,) – The data from which contours should be calculated.
edges (object specifying histogram edge locations; or None) –
int : the number of bins for both dimensions, locations calculated automatically
(2,) array_like of int : the number of bins for each dimension.
(2,) of array_like : the bin-edge locations for each dimension, e.g. ([0, 1, 2], [0.0, 0.1, 0.2, 0.3],) would describe two bins for the 0th dimension, and 3 bins for the 1st dimension: i.e. 6 total.
(X,) array_like of scalar : the bin-edge locations to be used for both dimensions.
None : the number and locations of bins are calculated automatically.
ax (matplotlib.axes.Axes instance, or None; if None then plt.gca() is used.) –
weights (None or (N,) array_like of scalar, the weighting of each data-point if and) – only-if the given kde_data argument is a (D,N) array_like of scalar data.
color (matplotlib color specification (i.e. named color, hex or rgb) or None.) –
- If None:
if cmap is given, then the color will be set to the cmap midpoint;
if cmap is not given, then the color will be determined by the next value of the default matplotlib color-cycle, and cmap will be set to a matching colormap.
This parameter affects the color of 1D: histograms, confidence intervals, and carpet; 2D: scatter points.
cmap (matplotlib colormap specification, or None) –
All valid matplotlib specifications can be used, e.g. named value (like ‘Reds’ or ‘viridis’) or a matplotlib.colors.Colormap instance.
If None then a colormap is constructed based on the value of color (see above).
quantiles (None or array_like of scalar values in [0.0, 1.0] denoting the fractions of) – data to demarcate with contours and confidence bands.
smooth (scalar or None/False,) –
If scalar : the width, in histogram bins, of a Gaussian smoothing filter.
If None or False : no smoothing.
upsample (int or None/False,) –
If int : the factor by which to upsample the histogram by interpolation.
If None or False : no upsampling.
pad (int, True, or None/False,) –
If int : the number of edge bins added to the histogram to close contours hitting the edges.
If True : the default padding size is used.
If None or False : no padding is used.
**kwargs (additional keyword-arguments passed to kalepy.plot.draw_contour2d().) –
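The text does not say how the quantiles are converted into contour levels on the histogram. One common approach, shown here purely as an assumption and not as kalepy's method, is to sort the bin values from highest to lowest and pick, for each quantile q, the value at which the running sum first reaches a fraction q of the total. The function name `contour_levels` is hypothetical.

```python
def contour_levels(hist, quantiles):
    """For each q, return the bin value above which a fraction q of the total mass lies."""
    flat = sorted((v for row in hist for v in row), reverse=True)
    total = sum(flat)
    levels = []
    for q in quantiles:
        running = 0.0
        level = flat[-1]  # fall back to the lowest bin value
        for v in flat:
            running += v
            if running / total >= q:
                level = v
                break
        levels.append(level)
    return levels


# Bin values sorted high-to-low are 4, 3, 2, 1 with running fractions 0.4, 0.7, 0.9, 1.0.
levels = contour_levels([[1, 2], [3, 4]], [0.35, 0.5, 0.95])
```

Larger quantiles map to lower levels, so the outermost contour corresponds to the largest requested fraction.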
-
kalepy.plot.
corner
(kde_data, labels=None, kwcorner={}, **kwplot)¶ Simple wrapper function to construct a Corner instance and plot the given data.
See kalepy.plot.Corner and kalepy.plot.Corner.plot for more information.
- Parameters
kde_data (kalepy.KDE instance, or (D,N) array_like of scalars) –
- instance of kalepy.kde.KDE, providing the data and KDE to be plotted.
In this case the param argument selects which dimension/parameter is plotted if numerous are included in the KDE.
array_like scalar (D,N) of data with D parameters and N data points.
labels (None or (D,) array_like of string, names of each parameter being plotted.) –
kwcorner (dict, keyword-arguments passed to Corner constructor.) –
**kwplot (additional keyword-arguments passed to Corner.plot method.) –
-
kalepy.plot.
dist1d
(kde_data, ax=None, edges=None, weights=None, probability=True, param=0, rotate=False, density=None, confidence=False, hist=None, carpet=True, color=None, quantiles=None, ls=None, **kwargs)¶ Draw 1D data distributions with numerous possible components.
The components of the plot are controlled by the arguments:
density : a KDE distribution curve,
confidence : 1D confidence bands calculated from a KDE,
hist : 1D histogram from the provided data,
carpet : ‘carpet plot’ (see kalepy.plot.carpet()) showing the data as a scatter-like plot.
- Parameters
kde_data (kalepy.KDE instance, (D,N) array_like of scalars, or None) –
instance of kalepy.kde.KDE, providing the data and KDE to be plotted. In this case the param argument selects which dimension/parameter is plotted if numerous are included in the KDE.
array_like scalar (D,N) of data with D parameters and N data points.
ax (matplotlib.axes.Axes instance, or None; if None then plt.gca() is used.) –
edges (object specifying histogram edge locations; or None) –
int : the number of bins, locations calculated automatically
array_like : the bin-edge locations
None : the number and locations of bins are calculated automatically
weights (None or (N,) array_like of scalar, the weighting of each data-point if and) – only-if the given kde_data argument is a (D,N) array_like of scalar data.
probability (bool,) – Whether distributions (hist and density) are normalized such that the sum is unity.
param (int,) – If a KDE instance is provided as the kde_data argument, and it includes multiple dimensions/parameters of data, then this argument determines which parameter is plotted.
rotate (bool, if true switch the x and y coordinates (i.e. rotate plot 90deg clockwise)) –
density (bool or None, whether the density KDE distribution is plotted or not.) – If None then this is set based on what is passed as the kde_data.
confidence (bool, whether confidence intervals are plotted based on the KDE distribution,) – intervals are placed according to the quantiles argument.
hist (bool or None, whether a histogram is plotted from the given data.) – If None, then the value is chosen based on the given kde_data argument.
carpet (bool, whether or not a 'carpet plot' is shown from the given data.) –
color (matplotlib color specification (i.e. named color, hex or rgb) or None.) – If None then the color will be determined by the next value of the default matplotlib color-cycle.
quantiles (array_like of scalar values in [0.0, 1.0] denoting the fractions of data to mark.) –
ls (str or None, matplotlib linestyle specification) –
**kwargs (additional keyword-arguments passed to plt.plot command when plotting ‘density’) – and ‘hist’ components.
-
kalepy.plot.
dist2d
(kde_data, ax=None, edges=None, weights=None, params=[0, 1], quantiles=None, color=None, cmap=None, smooth=None, upsample=None, pad=True, ls='-', median=True, scatter=True, contour=True, hist=True, mask_dense=None, mask_below=True)¶ Draw 2D data distributions with numerous possible components.
The components of the plot are controlled by the arguments:
median : the median values of each coordinate in a ‘cross-hairs’ style,
scatter : 2D scatter points of the raw data,
contour : 2D contour plot from the KDE,
hist : 2D histogram of the raw data.
These components are modified by:
mask_dense : mask over scatter points within the outer-most contour interval,
mask_below : mask out (ignore) histogram bins below a certain value.
- Parameters
kde_data (kalepy.KDE instance, or (D,N) array_like of scalars) –
instance of kalepy.kde.KDE, providing the data and KDE to be plotted. In this case the params argument selects which dimensions/parameters are plotted if numerous are included in the KDE.
array_like scalar (D,N) of data with D parameters and N data points.
ax (matplotlib.axes.Axes instance, or None; if None then plt.gca() is used.) –
edges (object specifying histogram edge locations; or None) –
int : the number of bins for both dimensions, locations calculated automatically
(2,) array_like of int : the number of bins for each dimension.
(2,) of array_like : the bin-edge locations for each dimension, e.g. ([0, 1, 2], [0.0, 0.1, 0.2, 0.3],) would describe two bins for the 0th dimension, and 3 bins for the 1st dimension: i.e. 6 total.
(X,) array_like of scalar : the bin-edge locations to be used for both dimensions.
None : the number and locations of bins are calculated automatically.
weights (None or (N,) array_like of scalar, the weighting of each data-point if and) – only-if the given kde_data argument is a (D,N) array_like of scalar data.
params ((2,) array_like of int,) – If a KDE instance is provided as the kde_data argument, and it includes multiple dimensions/parameters of data, then this argument determines which parameters are plotted.
quantiles (array_like of scalar values in [0.0, 1.0] denoting the fractions of data to contour.) –
color (matplotlib color specification (i.e. named color, hex or rgb) or None.) –
- If None:
if cmap is given, then the color will be set to the cmap midpoint;
if cmap is not given, then the color will be determined by the next value of the default matplotlib color-cycle, and cmap will be set to a matching colormap.
This parameter affects the color of 1D: histograms, confidence intervals, and carpet; 2D: scatter points.
cmap (matplotlib colormap specification, or None) –
All valid matplotlib specifications can be used, e.g. named value (like ‘Reds’ or ‘viridis’) or a matplotlib.colors.Colormap instance.
If None then a colormap is constructed based on the value of color (see above).
smooth (scalar or None/False, smoothing of plotted contours (only)) –
If scalar : the width, in histogram bins, of a Gaussian smoothing filter.
If None or False : no smoothing.
upsample (int or None/False, upsampling of plotted contours (only)) –
If int : the factor by which to upsample the histogram by interpolation.
If None or False : no upsampling.
pad (int, True, or None/False,) –
If int : the number of edge bins added to the histogram to close contours hitting the edges.
If True : the default padding size is used.
If None or False : no padding is used.
ls (str or None, matplotlib linestyle specification for ‘contour’ and ‘median’ components.) –
median (bool, mark the location of the median values in both dimensions (cross-hairs style)) –
scatter (bool, whether to plot scatter points of the data points.) – The mask_dense parameter determines if some of these points are masked over.
contour (bool, whether or not contours are plotted at the given quantiles.) –
hist (bool, whether a 2D histogram is plotted from the given data.) –
mask_dense (bool, whether to mask over high-density scatter points (within the lowest contour)) –
mask_below (bool or scalar; whether, or the value below which, hist bins should be excluded.) –
If True : exclude histogram bins with less than the average weight of a data point. If weights are not given, this means exclude empty histogram bins.
If False : do not exclude any bins (i.e. include all bins).
If scalar : exclude histogram bins with values below the given value.
Notes
There is no probability argument because the normalization of the 2D distributions currently has no effect.
-
kalepy.plot.
hist1d
(data, edges=None, ax=None, weights=None, density=False, probability=False, renormalize=False, joints=True, positive=True, rotate=False, **kwargs)¶ Calculate and draw a 1D histogram.
This is a thin wrapper around the kalepy.plot.draw_hist1d() method which draws a histogram that has already been computed (e.g. with kalepy.utils.histogram or numpy.histogram).
- Parameters
data ((N,) array_like of scalar, data to be histogrammed.) –
edges (object specifying histogram edge locations; or None) –
int : the number of bins, locations calculated automatically
array_like : the bin-edge locations
None : the number and locations of bins are calculated automatically
ax (matplotlib.axes.Axes instance, or None; if None then plt.gca() is used.) –
weights (None or (N,) array_like of scalar, the weighting of each data-point if and) – only-if the given kde_data argument is a (D,N) array_like of scalar data.
density (bool or None, whether the density KDE distribution is plotted or not.) – If None then this is set based on what is passed as the kde_data.
probability (bool,) – Whether distributions (hist and density) are normalized such that the sum is unity. NOTE: this can be overridden by the renormalize argument.
renormalize (bool or scalar, whether or to what value to renormalize the histogram maximum.) –
If True : renormalize the maximum histogram value to unity.
If False : do not renormalize.
If scalar : renormalize the histogram maximum to this value.
joints (bool, plot the vertical connectors ('joints') between histogram bins; if False, only) – horizontal lines are plotted for each bin.
positive (bool, only plot bins with positive values.) –
rotate (bool, if true switch the x and y coordinates (i.e. rotate plot 90deg clockwise)) –
**kwargs (additional keyword-arguments passed to kalepy.plot.draw_hist1d().) – Any arguments not caught by draw_hist1d() are eventually passed to plt.plot() method.
Notes
TO-DO: Add scipy.binned_statistic functionality for arbitrary statistics beyond histogramming.
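The effect of the probability and renormalize options described above can be sketched without any plotting. The following plain-Python example is an illustration under assumptions, not kalepy's code: the helper names `histogram` and `normalize` are hypothetical, and the density option is deliberately left out because its exact meaning for hist1d is not spelled out here.

```python
def histogram(data, edges):
    """Count values in bins defined by ascending edges (the last bin includes its right edge)."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        for i in range(len(counts)):
            last = i == len(counts) - 1
            if edges[i] <= x < edges[i + 1] or (last and x == edges[-1]):
                counts[i] += 1
                break
    return counts


def normalize(counts, probability=False, renormalize=False):
    """Apply the `probability` and `renormalize` options to a list of bin counts."""
    vals = [float(c) for c in counts]
    if probability:
        total = sum(vals)
        vals = [v / total for v in vals]  # the bins now sum to unity
    if renormalize is not False:
        # True means a peak of unity; a scalar sets the peak to that value.
        peak = 1.0 if renormalize is True else float(renormalize)
        top = max(vals)
        vals = [v * peak / top for v in vals]
    return vals


counts = histogram([0.5, 1.2, 1.7, 2.5], edges=[0, 1, 2, 3])
```

Because the rescaling is applied last, passing both options yields a peak of unity rather than a sum of unity, which matches the note that renormalize can override probability.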
-
kalepy.plot.
hist2d
(data, edges=None, ax=None, weights=None, mask_below=False, **kwargs)¶ Calculate and draw a 2D histogram.
This is a thin wrapper around the kalepy.plot.draw_hist2d() method which draws a 2D histogram that has already been computed (e.g. with numpy.histogram2d).
- Parameters
data ((2, N) array_like of scalar, data to be histogrammed.) –
edges (object specifying histogram edge locations; or None) –
int : the number of bins for both dimensions, locations calculated automatically
(2,) array_like of int : the number of bins for each dimension.
(2,) of array_like : the bin-edge locations for each dimension, e.g. ([0, 1, 2], [0.0, 0.1, 0.2, 0.3],) would describe two bins for the 0th dimension, and 3 bins for the 1st dimension: i.e. 6 total.
(X,) array_like of scalar : the bin-edge locations to be used for both dimensions.
None : the number and locations of bins are calculated automatically.
ax (matplotlib.axes.Axes instance, or None; if None then plt.gca() is used.) –
weights (None or (N,) array_like of scalar, the weighting of each data-point if and) – only-if the given kde_data argument is a (D,N) array_like of scalar data.
mask_below (bool or scalar; whether, or the value below which, hist bins should be excluded.) –
If True : exclude histogram bins with less than the average weight of a data point. If weights are not given, this means exclude empty histogram bins.
If False : do not exclude any bins (i.e. include all bins).
If scalar : exclude histogram bins with values below the given value.
**kwargs (additional keyword-arguments passed to kalepy.plot.draw_hist2d().) – Any arguments not caught by draw_hist2d() are eventually passed to plt.pcolormesh().
Notes
TO-DO: Add scipy.binned_statistic functionality for arbitrary statistics beyond histogramming.
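To make the edges example and the mask_below=True behavior concrete, here is a small plain-Python sketch for unweighted data. It is only an illustration: the helper names are hypothetical, and using None as the stand-in for a masked bin is an assumption about presentation, not kalepy's behavior.

```python
import bisect


def hist2d(xs, ys, xedges, yedges):
    """Unweighted 2D histogram; points outside the edges, or on an upper edge, are ignored."""
    hist = [[0] * (len(yedges) - 1) for _ in range(len(xedges) - 1)]
    for x, y in zip(xs, ys):
        i = bisect.bisect_right(xedges, x) - 1
        j = bisect.bisect_right(yedges, y) - 1
        if 0 <= i < len(hist) and 0 <= j < len(hist[0]):
            hist[i][j] += 1
    return hist


def mask_below(hist, threshold=1.0):
    """Replace bins below the threshold with None (a stand-in for a masked value)."""
    return [[v if v >= threshold else None for v in row] for row in hist]


edges = ([0, 1, 2], [0.0, 0.1, 0.2, 0.3])  # 2 bins by 3 bins, i.e. 6 in total
hist = hist2d([0.5, 0.5, 1.5], [0.05, 0.05, 0.25], *edges)
# Without weights each point has weight 1, so a threshold of 1 drops only the empty bins.
masked = mask_below(hist)
```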
kalepy.utils module¶
kalepy’s internal, utility functions.
-
kalepy.utils.
add_cov
(data, cov)¶
-
kalepy.utils.
allclose
(xx, yy, msg=None, **kwargs)¶
-
kalepy.utils.
alltrue
(xx, msg=None)¶
-
kalepy.utils.
array_str
(data, num=3, format=':.2e')¶
-
kalepy.utils.
assert_true
(val, msg=None)¶
-
kalepy.utils.
bins
(*args, **kwargs)¶ Calculate np.linspace(*args) and return also centers and widths.
- Returns
xe ((N,) bin edges)
xc ((N-1,) bin centers)
dx ((N-1,) bin widths)
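The three return values are related as in the following sketch, which uses plain Python in place of np.linspace. The function name `linear_bins` and its explicit (start, stop, num) arguments are assumptions for illustration; the documented function forwards its arguments to np.linspace.

```python
def linear_bins(start, stop, num):
    """Return evenly spaced edges plus the centers and widths of the bins between them."""
    step = (stop - start) / (num - 1)
    xe = [start + i * step for i in range(num)]            # (N,) bin edges
    xc = [0.5 * (lo + hi) for lo, hi in zip(xe, xe[1:])]   # (N-1,) bin centers
    dx = [hi - lo for lo, hi in zip(xe, xe[1:])]           # (N-1,) bin widths
    return xe, xc, dx


xe, xc, dx = linear_bins(0.0, 1.0, 5)
```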
-
kalepy.utils.
bound_indices
(data, bounds, outside=False)¶ Find the indices of the data array that are bounded by the given bounds.
If outside is True, then indices for values outside of the bounds are returned.
-
kalepy.utils.
check_path
(fname)¶ Make sure the given path exists. Create directories as needed.
-
kalepy.utils.
cov_from_var_cor
(var, corr)¶
-
kalepy.utils.
cumsum
(vals, axis=None)¶ Perform a cumulative sum without flattening the input array.
See: https://stackoverflow.com/a/60647166/230468
- Parameters
vals (array_like of scalar) – Input values to sum over.
axis (None or int) – Axis over which to perform the cumulative sum.
- Returns
res – Same shape as input vals
- Return type
ndarray of scalar
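For comparison, numpy.cumsum with axis=None flattens its input before summing. A shape-preserving alternative can be sketched for a rectangular list-of-lists as below. The assumption here, not confirmed by the text above, is that axis=None means accumulating along every axis in turn; the name `cumsum2d` is hypothetical.

```python
from itertools import accumulate


def cumsum2d(vals, axis=None):
    """Cumulative sum of a rectangular list-of-lists that keeps the input shape."""
    out = [list(row) for row in vals]
    if axis in (None, 0):
        # Accumulate down each column, then transpose back.
        cols = [list(accumulate(col)) for col in zip(*out)]
        out = [list(row) for row in zip(*cols)]
    if axis in (None, 1):
        # Accumulate along each row.
        out = [list(accumulate(row)) for row in out]
    return out


result = cumsum2d([[1, 2], [3, 4]])
```

With axis=None the last element of the result equals the sum of all inputs, which is the property needed when building a multi-dimensional cumulative distribution.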
-
kalepy.utils.
cumtrapz
(pdf, edges, prepend=True, axis=None)¶ Perform a cumulative integration using the trapezoid rule.
- Parameters
pdf (array_like of scalar) – Input values (e.g. a PDF) to be integrated.
edges ([D,] list of (array_like of scalar)) – Edges defining bins along each dimension. This should be an array/list of edges for each of D dimensions.
prepend (bool) – Whether or not to prepend zero values along the integrated dimensions.
axis (None or int) – Axis/Dimension over which to integrate.
- Returns
cdf – Values integrated over the desired axes. Shape: if prepend is False, the shape of cdf will be one smaller than the input pdf in all dimensions integrated over; if prepend is True, the shape of cdf will match that of the input pdf.
- Return type
ndarray of scalar
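In one dimension, the effect of prepend on the output length can be seen in this plain-Python sketch of cumulative trapezoid integration. The function name `cumtrapz_1d` is hypothetical and the sketch covers only the 1D case.

```python
def cumtrapz_1d(pdf, edges, prepend=True):
    """Cumulative trapezoid-rule integral of `pdf` sampled at the points `edges`."""
    cdf = [0.0] if prepend else []
    total = 0.0
    for i in range(len(pdf) - 1):
        # Area of the trapezoid between neighboring sample points.
        total += 0.5 * (pdf[i] + pdf[i + 1]) * (edges[i + 1] - edges[i])
        cdf.append(total)
    return cdf


with_zero = cumtrapz_1d([1.0, 1.0, 1.0], [0.0, 1.0, 2.0])
without_zero = cumtrapz_1d([1.0, 1.0, 1.0], [0.0, 1.0, 2.0], prepend=False)
```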
-
kalepy.utils.
flatlen
(arr)¶
-
kalepy.utils.
flatten
(arr)¶ Flatten a ND array, whether jagged or not, into a 1D array.
-
kalepy.utils.
histogram
(data, bins=None, weights=None, density=False, probability=False)¶
-
kalepy.utils.
iqcenter
(data, weights=None, axis=None)¶
-
kalepy.utils.
iqrange
(data, log=False, weights=None)¶ Calculate inter-quartile range of the given data.
-
kalepy.utils.
isjagged
(arr)¶ Test if the given array is jagged.
-
kalepy.utils.
jshape
(arr, level=0, printout=False, prepend='', indent=' ')¶ Print the complete shape (even if jagged) of the given array.
-
kalepy.utils.
matrix_invert
(matrix, helper=True)¶
-
kalepy.utils.
meshgrid
(*args, indexing='ij', **kwargs)¶
-
kalepy.utils.
midpoints
(data, scale='lin', frac=0.5, axis=- 1, squeeze=True)¶ Return the midpoints between values in the given array.
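A minimal sketch of what the scale and frac arguments plausibly mean, for 1D input only: with a linear scale the point lies a fraction frac of the way between neighbors, and with a log scale the same is done in log10 space. This interpretation and the name `midpoints_1d` are assumptions; the axis and squeeze arguments are not modeled.

```python
import math


def midpoints_1d(data, scale="lin", frac=0.5):
    """Points a fraction `frac` of the way between neighbors, in linear or log10 space."""
    if scale == "log":
        logs = [math.log10(v) for v in data]
        return [10 ** (a + frac * (b - a)) for a, b in zip(logs, logs[1:])]
    return [a + frac * (b - a) for a, b in zip(data, data[1:])]


linear = midpoints_1d([1, 2, 4])
logarithmic = midpoints_1d([1, 100], scale="log")
```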
-
kalepy.utils.
minmax
(data, positive=False, prev=None, stretch=None, log_stretch=None, limit=None)¶
-
kalepy.utils.
modify_exists
(path_fname)¶ Modify the given filename if it already exists.
-
kalepy.utils.
parse_edges
(data, edges=None, extrema=None, weights=None, params=None, nmin=5, nmax=1000, pad=None, refine=1.0)¶
-
kalepy.utils.
quantiles
(values, percs=None, sigmas=None, weights=None, axis=None, values_sorted=False)¶ Compute weighted quantiles.
Taken from zcode.math.statistics. Based on @Alleo's answer: http://stackoverflow.com/a/29677616/230468
- Parameters
values ((N,)) – input data
percs ((M,) scalar [0.0, 1.0]) – Desired percentiles of the data.
weights ((N,) or None) – Weights for each input data point in values.
values_sorted (bool) – If True, then input values are assumed to already be sorted.
- Returns
percs – Array of percentiles of the weighted input data.
- Return type
(M,) float
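A weighted quantile can be sketched in plain Python as follows, in the spirit of the approach referenced above: sort the values, place each one at the cumulative weight of its center, normalize by the total weight, and interpolate. This is an illustration only; the name `weighted_quantiles` is hypothetical, and the sigmas and axis arguments are not modeled.

```python
import bisect


def weighted_quantiles(values, percs, weights=None):
    """Interpolated quantiles (fractions in [0, 1]) of weighted 1D data."""
    if weights is None:
        weights = [1.0] * len(values)
    pairs = sorted(zip(values, weights))
    vals = [v for v, _ in pairs]
    wts = [w for _, w in pairs]
    total = sum(wts)
    # Cumulative weight at the center of each point, as a fraction of the total.
    cdf, running = [], 0.0
    for w in wts:
        running += w
        cdf.append((running - 0.5 * w) / total)
    out = []
    for p in percs:
        if p <= cdf[0]:
            out.append(vals[0])
        elif p >= cdf[-1]:
            out.append(vals[-1])
        else:
            i = bisect.bisect_right(cdf, p)
            t = (p - cdf[i - 1]) / (cdf[i] - cdf[i - 1])
            out.append(vals[i - 1] + t * (vals[i] - vals[i - 1]))
    return out


unweighted = weighted_quantiles([1, 2, 3, 4], [0.0, 0.5, 1.0])
weighted = weighted_quantiles([1, 2, 3, 4], [0.75], weights=[1, 1, 1, 3])
```

Giving the largest value three times the weight pulls the 0.75 quantile up to that value, whereas with equal weights it falls between the third and fourth values.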
-
kalepy.utils.
really1d
(arr)¶ Test whether an array_like is really 1D (i.e. not a jagged ND array).
Test whether the input array is uniformly one-dimensional, as opposed to (e.g.) a ndim == 1 list or array of irregularly shaped sub-lists/sub-arrays. True for an empty list [].
- Parameters
arr (array_like) – Array to be tested.
- Returns
Whether arr is purely 1D.
- Return type
bool
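The distinction can be sketched for plain Python lists as follows. The helper name `looks_1d` is hypothetical, and the check (no element is itself a list or tuple) is a simplification of whatever the library function actually tests.

```python
def looks_1d(arr):
    """True if no element of `arr` is itself a list or tuple."""
    return not any(isinstance(item, (list, tuple)) for item in arr)


flat = looks_1d([1, 2, 3])            # uniformly one-dimensional
jagged = looks_1d([[1, 2], [3]])      # irregular sub-lists, so not really 1D
empty = looks_1d([])                  # an empty list counts as 1D
```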
-
kalepy.utils.
rem_cov
(data, cov=None)¶
-
kalepy.utils.
run_if
(func, target, *args, otherwise=None, **kwargs)¶
-
kalepy.utils.
run_if_notebook
(func, *args, otherwise=None, **kwargs)¶
-
kalepy.utils.
run_if_script
(func, *args, otherwise=None, **kwargs)¶
-
kalepy.utils.
spacing
(data, scale='log', num=None, dex=10, **kwargs)¶
-
kalepy.utils.
stats
(data, shape=True, sample=3, stats=True)¶
-
kalepy.utils.
stats_str
(data, percs=[0.0, 0.16, 0.5, 0.84, 1.0], ave=False, std=False, weights=None, format=None, log=False, label_log=True)¶ Return a string with the statistics of the given array.
- Parameters
data (ndarray of scalar) – Input data from which to calculate statistics.
percs (array_like of scalars in [0.0, 1.0]) – Which percentiles to calculate, given as fractions (see the default values in the signature).
ave (bool) – Include average value in output.
std (bool) – Include standard-deviation in output.
format (str) – Formatting for all numerical output, (e.g. “:.2f”).
log (bool) – Convert values to log10 before printing.
- Returns
out (str) – Single-line string of the desired statistics.
-
kalepy.utils.
trapz_dens_to_mass
(pdf, edges, axis=None)¶ Convert from density to mass, for values on the corner of a grid, using the trapezoid rule.
- Parameters
pdf (array_like) – Density values, computed at the grid edges specified by the edges list-of-lists.
edges (array_like of array_like) – List of edge-locations along each dimension specifying the grid of values at which the pdf values are located, e.g. [[x0, x1, … xn], [y0, y1, … ym], …]. The length of each sub-list in edges must match the shape of pdf along that dimension, e.g. if edges is a (3,) list composed of sub-lists with lengths [N, M, L], then the shape of pdf must be (N, M, L).
axis (int, array_like int, or None) – Along which axes to convert from density to mass.
- Returns
mass – The mass array has as many dimensions as pdf, with each dimension one element shorter. e.g. if the shape of pdf is (N, M, …), then the shape of mass is (N-1, M-1, …).
- Return type
array_like
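In two dimensions, applying the trapezoid rule along each axis amounts to averaging the four corner densities of every grid cell and multiplying by the cell area. The following plain-Python sketch shows the resulting shape change from (N, M) to (N-1, M-1); the name `dens_to_mass_2d` is hypothetical and only the all-axes 2D case is covered.

```python
def dens_to_mass_2d(pdf, edges):
    """Convert corner densities on a 2D grid to per-cell masses with the trapezoid rule."""
    xe, ye = edges
    mass = []
    for i in range(len(xe) - 1):
        row = []
        for j in range(len(ye) - 1):
            corners = pdf[i][j] + pdf[i + 1][j] + pdf[i][j + 1] + pdf[i + 1][j + 1]
            area = (xe[i + 1] - xe[i]) * (ye[j + 1] - ye[j])
            row.append(0.25 * corners * area)  # mean corner density times cell area
        mass.append(row)
    return mass


edges = [[0.0, 1.0, 2.0], [0.0, 0.5, 1.0, 1.5]]
pdf = [[1.0] * 4 for _ in range(3)]      # shape (3, 4), matching the edge lengths
mass = dens_to_mass_2d(pdf, edges)       # shape (2, 3)
```

For a uniform density of one, the cell masses sum to the total area of the grid (here 2.0 by 1.5).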
-
kalepy.utils.
trapz_nd
(data, edges, axis=None)¶