Core kernel classes: contains the base Kernel class and helper subclasses.
Bases: object
Covariance kernel base class. Not meant to be instantiated directly.
Initialize the kernel with the given number of input dimensions.
Parameters:
    num_dim : positive int
    num_params : non-negative int
    initial_params : Array or other Array-like, (num_params,), optional
    fixed_params : Array or other Array-like of bool, (num_params,), optional
    param_bounds : list of 2-tuples, (num_params,), optional
    param_names : list of str, (num_params,), optional
    enforce_bounds : bool, optional
    hyperprior : JointPrior instance or list, optional

Raises:
    ValueError
    GPArgumentError
Attributes:
    num_params : int
        Number of parameters.
    num_dim : int
        Number of dimensions.
    params : Array of float, (num_params,)
        Array of parameters.
    fixed_params : Array of bool, (num_params,)
        Array of booleans indicating which parameters in params are fixed.
    param_names : list of str, (num_params,)
        List of the labels for the hyperparameters.
    hyperprior : JointPrior instance
        Joint prior distribution for the hyperparameters.
Evaluate the covariance between points Xi and Xj with derivative order ni, nj.
Note that this method only returns the covariance – the hyperpriors and potentials stored in this kernel must be applied separately.
Parameters:
    Xi : Matrix or other Array-like, (M, D)
    Xj : Matrix or other Array-like, (M, D)
    ni : Matrix or other Array-like, (M, D)
    nj : Matrix or other Array-like, (M, D)
    hyper_deriv : non-negative int or None, optional
    symmetric : bool, optional

Returns:
    Kij : Array, (M,)
Notes
This is only a method stub that defines the required calling signature.
Sets the free hyperparameters to the new parameter values in new_params.
Parameters:
    new_params : Array or other Array-like, (len(self.params),)
Returns the indices of the free parameters in the main arrays of parameters, etc.
Returns the values of the free hyperparameters.
Returns:
    free_params : Array
Returns the bounds of the free hyperparameters.
Returns:
    free_param_bounds : Array
Returns the names of the free hyperparameters.
Returns:
    free_param_names : Array
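The free-parameter accessors above reduce to boolean masking of the parameter arrays. A minimal sketch of that idea, assuming `params` and `fixed_params` are NumPy arrays as described in the attributes (variable names here are illustrative, not the gptools internals):

```python
import numpy as np

# Hypothetical parameter state for a kernel with three hyperparameters,
# the second of which is fixed.
params = np.array([1.0, 0.5, 2.0])
fixed_params = np.array([False, True, False])

# Indices of the free parameters in the main parameter array.
free_idxs = np.nonzero(~fixed_params)[0]

# Values of the free hyperparameters only.
free_params = params[~fixed_params]
```

Setting the free hyperparameters works the same way in reverse: assign into `params[~fixed_params]` so fixed entries are left untouched.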
Add two Kernels together.
Parameters:
    other : Kernel

Returns:
    sum : SumKernel
Multiply two Kernels together.
Parameters:
    other : Kernel

Returns:
    prod : ProductKernel
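The `+` and `*` operators build composite kernels because sums and products of covariance functions are themselves valid covariance functions. A minimal sketch of the operator-overloading pattern (a toy `SimpleKernel` class, not the gptools implementation):

```python
class SimpleKernel:
    """Toy kernel wrapping a covariance callable k(xi, xj)."""

    def __init__(self, f):
        self.f = f

    def __call__(self, xi, xj):
        return self.f(xi, xj)

    def __add__(self, other):
        # Sum kernel: evaluates both operands and adds the results.
        return SimpleKernel(lambda xi, xj: self(xi, xj) + other(xi, xj))

    def __mul__(self, other):
        # Product kernel: evaluates both operands and multiplies the results.
        return SimpleKernel(lambda xi, xj: self(xi, xj) * other(xi, xj))
```

In gptools the analogous operators return `SumKernel` and `ProductKernel` instances that also merge the hyperparameters of the two operands.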
Bases: gptools.kernel.core.Kernel
Abstract class for binary operations on kernels (addition, multiplication, etc.).
Parameters:
    k1, k2 : Kernel instances to be combined
Notes
k1 and k2 must have the same number of dimensions.
Boolean indicating whether or not the kernel will explicitly enforce the bounds defined by the hyperprior.
Returns the bounds of the free hyperparameters.
Returns:
    free_param_bounds : Array
Bases: gptools.kernel.core.BinaryKernel
The sum of two kernels.
Evaluate the covariance between points Xi and Xj with derivative order ni, nj.
Parameters:
    Xi : Matrix or other Array-like, (M, D)
    Xj : Matrix or other Array-like, (M, D)
    ni : Matrix or other Array-like, (M, D)
    nj : Matrix or other Array-like, (M, D)
    symmetric : bool, optional

Returns:
    Kij : Array, (M,)

Raises:
    NotImplementedError
Bases: gptools.kernel.core.BinaryKernel
The product of two kernels.
Evaluate the covariance between points Xi and Xj with derivative order ni, nj.
Parameters:
    Xi : Matrix or other Array-like, (M, D)
    Xj : Matrix or other Array-like, (M, D)
    ni : Matrix or other Array-like, (M, D)
    nj : Matrix or other Array-like, (M, D)
    symmetric : bool, optional

Returns:
    Kij : Array, (M,)

Raises:
    NotImplementedError
Bases: gptools.kernel.core.Kernel
Abstract class for the common methods in creating kernels that require application of Faa di Bruno’s formula.
Evaluate the covariance between points Xi and Xj with derivative order ni, nj.
Parameters:
    Xi : Matrix or other Array-like, (M, D)
    Xj : Matrix or other Array-like, (M, D)
    ni : Matrix or other Array-like, (M, D)
    nj : Matrix or other Array-like, (M, D)
    hyper_deriv : non-negative int or None, optional
    symmetric : bool

Returns:
    Kij : Array, (M,)

Raises:
    NotImplementedError
Bases: gptools.kernel.core.Kernel
Covariance kernel from an arbitrary covariance function.
Computes derivatives using mpmath.diff() and is hence in general much slower than a hard-coded implementation of a given kernel.
Parameters:
    num_dim : positive int
    cov_func : callable, takes >= 2 args
    num_proc : int or None, optional
    num_params : int or None, optional
    **kwargs :
Attributes:
    cov_func : callable
        The covariance function.
    num_proc : non-negative int
        Number of processors to use in evaluating covariance derivatives. 0 means serial.
Evaluate the covariance between points Xi and Xj with derivative order ni, nj.
Parameters:
    Xi : Matrix or other Array-like, (M, D)
    Xj : Matrix or other Array-like, (M, D)
    ni : Matrix or other Array-like, (M, D)
    nj : Matrix or other Array-like, (M, D)
    hyper_deriv : non-negative int or None, optional
    symmetric : bool, optional

Returns:
    Kij : Array, (M,)

Raises:
    NotImplementedError
Bases: gptools.kernel.core.Kernel
Creates a kernel that is masked to operate only on certain dimensions, and/or has its inputs scaled and shifted.
This can be used, for instance, to put a squared exponential kernel in one direction and a Matern kernel in the other.
Overrides __getattribute__() and __setattr__() to make all setting/accessing go to the base kernel.
Parameters:
    base : Kernel instance
    total_dim : int, optional
    mask : list or other array-like, optional
    scale : list or other array-like, optional
Gets all attributes from the base kernel.
The exceptions are ‘base’, ‘mask’, ‘maskC’, ‘num_dim’, ‘scale’ and any special method (i.e., a method/attribute having leading and trailing double underscores), which are taken from MaskedKernel.
Sets all attributes in the base kernel.
The exceptions are ‘base’, ‘mask’, ‘maskC’, ‘num_dim’, ‘scale’ and any special method (i.e., a method/attribute having leading and trailing double underscores), which are set in MaskedKernel.
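The delegation described above can be sketched with a small proxy class. This is a simplified illustration using `__getattr__`/`__setattr__` (gptools actually overrides `__getattribute__`; the class and attribute names here are hypothetical):

```python
class MaskedProxy:
    """Toy proxy that keeps a few local attributes and forwards the rest."""

    # Attributes stored on the proxy itself rather than the base object.
    _local = {'base', 'mask'}

    def __init__(self, base, mask):
        # Bypass our own __setattr__ so 'base' and 'mask' land on the proxy.
        object.__setattr__(self, 'base', base)
        object.__setattr__(self, 'mask', mask)

    def __getattr__(self, name):
        # Only called when the attribute is not found on the proxy itself,
        # so local attributes and special methods are untouched.
        return getattr(self.base, name)

    def __setattr__(self, name, value):
        if name in self._local:
            object.__setattr__(self, name, value)
        else:
            # Everything else is set on the base object.
            setattr(self.base, name, value)
```

With this pattern, reading or writing hyperparameters on the proxy transparently reads or writes them on the wrapped base kernel.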
Evaluate the covariance between points Xi and Xj with derivative order ni, nj.
Note that in the argument specifications, D is the total_dim specified in the constructor (i.e., num_dim for the MaskedKernel instance itself).
Parameters:
    Xi : Matrix or other Array-like, (M, D)
    Xj : Matrix or other Array-like, (M, D)
    ni : Matrix or other Array-like, (M, D)
    nj : Matrix or other Array-like, (M, D)
    hyper_deriv : non-negative int or None, optional
    symmetric : bool, optional

Returns:
    Kij : Array, (M,)
Provides classes and functions for creating SE kernels with warped length scales.
Warps the X coordinate with the tanh model
Parameters:
    X : Array, (M,) or scalar float
    l1 : positive float
    l2 : positive float
    lw : positive float
    x0 : float

Returns:
    l : Array, (M,) or scalar float
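A common parameterization consistent with the descriptions above (l1 for small X, l2 for large X, transition of width lw centered at x0) can be sketched as follows. This is an illustrative implementation of the tanh warping, not necessarily the exact gptools expression:

```python
import numpy as np

def tanh_warp(X, l1, l2, lw, x0):
    """Length scale that saturates at l1 for small X and l2 for large X.

    The transition between the two plateaus is centered at x0 and has
    characteristic width lw.
    """
    return 0.5 * (l1 + l2) - 0.5 * (l1 - l2) * np.tanh((X - x0) / lw)
```

At X = x0 this returns the midpoint (l1 + l2)/2, and it approaches l1 and l2 in the small-X and large-X limits respectively.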
Warps the X coordinate with a Gaussian-shaped divot.
Parameters:
    X : Array, (M,) or scalar float
    l1 : positive float
    l2 : positive float
    lw : positive float
    x0 : float

Returns:
    l : Array, (M,) or scalar float
Bases: object
Wrapper class for the Gibbs covariance function, permits the use of arbitrary warping.
The covariance function is given by the standard Gibbs form,

    k(Xi, Xj) = sigmaf^2 * sqrt(2*l(Xi)*l(Xj) / (l(Xi)^2 + l(Xj)^2)) * exp(-(Xi - Xj)^2 / (l(Xi)^2 + l(Xj)^2)),

where the length scale l(X) is computed by the warping function.
Parameters:
    warp_function : callable
Evaluate the covariance function between points Xi and Xj.
Parameters:
    Xi, Xj : Array, mpf or scalar float
    sigmaf : scalar float
    l1, l2, lw, x0 : scalar floats

Returns:
    k : Array or mpf
Bases: gptools.kernel.core.ArbitraryKernel
Gibbs warped squared exponential covariance function in 1d.
Computes derivatives using mpmath.diff() and is hence in general much slower than a hard-coded implementation of a given kernel.
The covariance function is the Gibbs form with warped length scale l(X).
Warps the length scale using a hyperbolic tangent:

    l(X) = (l1 + l2)/2 - (l1 - l2)/2 * tanh((X - x0)/lw)
The order of the hyperparameters is:
0 | sigmaf | Amplitude of the covariance function |
1 | l1 | Small-X saturation value of the length scale. |
2 | l2 | Large-X saturation value of the length scale. |
3 | lw | Length scale of the transition between the two length scales. |
4 | x0 | Location of the center of the transition between the two length scales. |
Parameters:
    **kwargs :
Bases: gptools.kernel.core.ArbitraryKernel
Gibbs warped squared exponential covariance function in 1d.
Computes derivatives using mpmath.diff() and is hence in general much slower than a hard-coded implementation of a given kernel.
The covariance function is the Gibbs form with warped length scale l(X).
Warps the length scale using a Gaussian-shaped divot centered at x0, dipping from l1 to l2 with width lw.
The order of the hyperparameters is:
0 | sigmaf | Amplitude of the covariance function |
1 | l1 | Global value of the length scale. |
2 | l2 | Pedestal value of the length scale. |
3 | lw | Width of the dip. |
4 | x0 | Location of the center of the dip in length scale. |
Parameters:
    **kwargs :
Bases: gptools.kernel.core.Kernel
Univariate Gibbs kernel with arbitrary length scale warping for low derivatives.
The covariance function is given by the standard Gibbs form,

    k(Xi, Xj) = sigmaf^2 * sqrt(2*l(Xi)*l(Xj) / (l(Xi)^2 + l(Xj)^2)) * exp(-(Xi - Xj)^2 / (l(Xi)^2 + l(Xj)^2)),

where the length scale l(X) is computed by l_func.
The derivatives are hard-coded using expressions obtained from Mathematica.
Parameters:
    l_func : callable
    num_params : int, optional
    **kwargs :
Evaluate the covariance between points Xi and Xj with derivative order ni, nj.
Parameters:
    Xi : Matrix or other Array-like, (M, D)
    Xj : Matrix or other Array-like, (M, D)
    ni : Matrix or other Array-like, (M, D)
    nj : Matrix or other Array-like, (M, D)
    hyper_deriv : non-negative int or None, optional
    symmetric : bool, optional

Returns:
    Kij : Array, (M,)

Raises:
    NotImplementedError
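The zeroth-derivative Gibbs covariance with an arbitrary length-scale function can be sketched directly from the standard Gibbs form. This is an illustrative NumPy implementation, not the gptools code:

```python
import numpy as np

def gibbs_cov(Xi, Xj, sigmaf, l_func):
    """Univariate Gibbs covariance (no derivatives).

    l_func maps input locations to positive length scales; the covariance
    reduces to sigmaf**2 when Xi == Xj for any warping.
    """
    li = l_func(Xi)
    lj = l_func(Xj)
    denom = li**2 + lj**2
    prefactor = np.sqrt(2.0 * li * lj / denom)
    return sigmaf**2 * prefactor * np.exp(-(Xi - Xj)**2 / denom)
```

Note that the prefactor involving the two length scales is what keeps the function a valid (positive semi-definite) covariance despite the input-dependent length scale.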
Implements a tanh warping function and its derivative.
Parameters:
    x : float or array of float
    n : int
    l1 : positive float
    l2 : positive float
    lw : positive float
    x0 : float

Returns:
    l : float or array

Raises:
    NotImplementedError
Bases: gptools.kernel.gibbs.GibbsKernel1d
Gibbs warped squared exponential covariance function in 1d.
Uses hard-coded implementation up to first derivatives.
The covariance function is the Gibbs form with warped length scale l(X).
Warps the length scale using a hyperbolic tangent:

    l(X) = (l1 + l2)/2 - (l1 - l2)/2 * tanh((X - x0)/lw)
The order of the hyperparameters is:
0 | sigmaf | Amplitude of the covariance function |
1 | l1 | Small-X saturation value of the length scale. |
2 | l2 | Large-X saturation value of the length scale. |
3 | lw | Length scale of the transition between the two length scales. |
4 | x0 | Location of the center of the transition between the two length scales. |
Parameters:
    **kwargs :
Implements a sum-of-tanh warping function and its derivative.
Parameters:
    x : float or array of float
    n : int
    lcore : float
    lmid : float
    ledge : float
    la : positive float
    lb : positive float
    xa : float
    xb : float

Returns:
    l : float or array

Raises:
    NotImplementedError
Bases: gptools.kernel.gibbs.GibbsKernel1d
Gibbs warped squared exponential covariance function in 1d.
Uses hard-coded implementation up to first derivatives.
The covariance function is the Gibbs form with warped length scale l(X).
Warps the length scale using a sum of two hyperbolic tangents, transitioning from lcore to lmid at xa and from lmid to ledge at xb.
The order of the hyperparameters is:
0 | sigmaf | Amplitude of the covariance function |
1 | lcore | Core length scale |
2 | lmid | Intermediate length scale |
3 | ledge | Edge length scale |
4 | la | Width of first tanh |
5 | lb | Width of second tanh |
6 | xa | Center of first tanh |
7 | xb | Center of second tanh |
Parameters:
    **kwargs :
Warps the length scale with a piecewise cubic “bucket” shape.
Parameters:
    x : float or array-like of float
    n : non-negative int
    l1 : positive float
    l2 : positive float
    l3 : positive float
    x0 : float
    w1 : positive float
    w2 : positive float
    w3 : positive float
Bases: gptools.kernel.gibbs.GibbsKernel1d
Gibbs warped squared exponential covariance function in 1d.
Uses hard-coded implementation up to first derivatives.
The covariance function is the Gibbs form with warped length scale l(X).
Warps the length scale using a “bucket” function with cubic joins.
The order of the hyperparameters is:
0 | sigmaf | Amplitude of the covariance function |
1 | l1 | Length scale to the left of the bucket. |
2 | l2 | Length scale in the bucket. |
3 | l3 | Length scale to the right of the bucket. |
4 | x0 | Location of the center of the bucket. |
5 | w1 | Width of the left side cubic section. |
6 | w2 | Width of the bucket. |
7 | w3 | Width of the right side cubic section. |
Parameters:
    **kwargs :
Warps the length scale with a piecewise quintic “bucket” shape.
Parameters:
    x : float or array-like of float
    n : non-negative int
    l1 : positive float
    l2 : positive float
    l3 : positive float
    x0 : float
    w1 : positive float
    w2 : positive float
    w3 : positive float
Bases: gptools.kernel.gibbs.GibbsKernel1d
Gibbs warped squared exponential covariance function in 1d.
Uses hard-coded implementation up to first derivatives.
The covariance function is the Gibbs form with warped length scale l(X).
Warps the length scale using a “bucket” function with quintic joins.
The order of the hyperparameters is:
0 | sigmaf | Amplitude of the covariance function |
1 | l1 | Length scale to the left of the bucket. |
2 | l2 | Length scale in the bucket. |
3 | l3 | Length scale to the right of the bucket. |
4 | x0 | Location of the center of the bucket. |
5 | w1 | Width of the left side quintic section. |
6 | w2 | Width of the bucket. |
7 | w3 | Width of the right side quintic section. |
Parameters:
    **kwargs :
Provides the MaternKernel class which implements the anisotropic Matern kernel.
Matern covariance function of arbitrary dimension, for use with ArbitraryKernel.
The Matern kernel has the following hyperparameters, always referenced in the order listed:
0 | sigma | prefactor |
1 | nu | order of kernel |
2 | l1 | length scale for the first dimension |
3 | l2 | ...and so on for all dimensions |
The kernel is defined as:

    k(Xi, Xj) = sigma^2 * (2^(1 - nu) / Gamma(nu)) * (sqrt(2*nu) * r)^nu * K_nu(sqrt(2*nu) * r),

where r^2 = sum_d (Xi_d - Xj_d)^2 / l_d^2 and K_nu is the modified Bessel function of the second kind.
Parameters:
    Xi, Xj : Array, mpf, tuple or scalar float
    *args :
Bases: gptools.kernel.core.ArbitraryKernel
Matern covariance kernel. Supports arbitrary derivatives. Treats order as a hyperparameter.
This version of the Matern kernel is painfully slow, but uses mpmath to ensure the derivatives are computed properly, since there may be issues with the regular MaternKernel.
The Matern kernel has the following hyperparameters, always referenced in the order listed:
0 | sigma | prefactor |
1 | nu | order of kernel |
2 | l1 | length scale for the first dimension |
3 | l2 | ...and so on for all dimensions |
The kernel is defined as:

    k(Xi, Xj) = sigma^2 * (2^(1 - nu) / Gamma(nu)) * (sqrt(2*nu) * r)^nu * K_nu(sqrt(2*nu) * r),

where r^2 = sum_d (Xi_d - Xj_d)^2 / l_d^2 and K_nu is the modified Bessel function of the second kind.
Parameters:
    **kwargs :
Bases: gptools.kernel.core.Kernel
Matern covariance kernel. Only for univariate data. Only supports up to first derivatives. Treats order as a hyperparameter.
The Matern kernel has the following hyperparameters, always referenced in the order listed:
0 | sigma | prefactor |
1 | nu | order of kernel |
2 | l1 | length scale |
The kernel is defined as:

    k(Xi, Xj) = sigma^2 * (2^(1 - nu) / Gamma(nu)) * (sqrt(2*nu) * r)^nu * K_nu(sqrt(2*nu) * r),

where r = |Xi - Xj| / l1 and K_nu is the modified Bessel function of the second kind.
Note that the expressions for the derivatives break down for nu <= 1, where the process is no longer mean-square differentiable.
Parameters:
    **kwargs :
Evaluate the covariance between points Xi and Xj with derivative order ni, nj.
Parameters:
    Xi : Matrix or other Array-like, (M, D)
    Xj : Matrix or other Array-like, (M, D)
    ni : Matrix or other Array-like, (M, D)
    nj : Matrix or other Array-like, (M, D)
    hyper_deriv : non-negative int or None, optional
    symmetric : bool, optional

Returns:
    Kij : Array, (M,)

Raises:
    NotImplementedError
Bases: gptools.kernel.core.ChainRuleKernel
Matern covariance kernel. Supports arbitrary derivatives. Treats order as a hyperparameter.
EXPERIMENTAL IMPLEMENTATION! DO NOT TRUST FOR HIGHER DERIVATIVES!
The Matern kernel has the following hyperparameters, always referenced in the order listed:
0 | sigma | prefactor |
1 | nu | order of kernel |
2 | l1 | length scale for the first dimension |
3 | l2 | ...and so on for all dimensions |
The kernel is defined as:

    k(Xi, Xj) = sigma^2 * (2^(1 - nu) / Gamma(nu)) * (sqrt(2*nu) * r)^nu * K_nu(sqrt(2*nu) * r),

where r^2 = sum_d (Xi_d - Xj_d)^2 / l_d^2 and K_nu is the modified Bessel function of the second kind.
Parameters:
    num_dim : int
    **kwargs :

Raises:
    ValueError
    GPArgumentError
Bases: gptools.kernel.core.Kernel
Matern 5/2 covariance kernel. Supports only 0th and 1st derivatives and is fixed at nu=5/2. Because of these limitations, it is quite a bit faster than the more general Matern kernels.
The Matern52 kernel has the following hyperparameters, always referenced in the order listed:
0 | sigma | prefactor |
1 | l1 | length scale for the first dimension |
2 | l2 | ...and so on for all dimensions |
The kernel is defined as:

    k(Xi, Xj) = sigma^2 * (1 + sqrt(5)*r + (5/3)*r^2) * exp(-sqrt(5)*r),

where r^2 = sum_d (Xi_d - Xj_d)^2 / l_d^2.
Parameters:
    num_dim : int
    **kwargs :

Raises:
    ValueError
    GPArgumentError
Evaluate the covariance between points Xi and Xj with derivative order ni, nj.
Parameters:
    Xi : Matrix or other Array-like, (M, D)
    Xj : Matrix or other Array-like, (M, D)
    ni : Matrix or other Array-like, (M, D)
    nj : Matrix or other Array-like, (M, D)
    hyper_deriv : non-negative int or None, optional
    symmetric : bool

Returns:
    Kij : Array, (M,)

Raises:
    NotImplementedError
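The closed-form Matern 5/2 covariance is simple enough to sketch directly in NumPy. This is an illustrative zeroth-derivative implementation of the standard Matern 5/2 formula, not the gptools code:

```python
import numpy as np

def matern52(Xi, Xj, sigma, l):
    """Matern 5/2 covariance between paired rows of Xi and Xj.

    Xi, Xj : arrays of shape (M, D); l : length scales of shape (D,).
    Returns an array of shape (M,).
    """
    # Scaled Euclidean distance between each pair of points.
    r = np.sqrt(np.sum(((Xi - Xj) / l)**2, axis=-1))
    s = np.sqrt(5.0) * r
    # 1 + sqrt(5) r + (5/3) r^2 == 1 + s + s^2 / 3 since s^2 = 5 r^2.
    return sigma**2 * (1.0 + s + s**2 / 3.0) * np.exp(-s)
```

Because the formula involves only elementary functions (no Bessel functions and no hyperparameter nu), this evaluates much faster than the general Matern kernel, matching the speed claim above.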
Provides classes for implementing uncorrelated noise.
Bases: gptools.kernel.core.Kernel
Kernel that has constant, independent noise (i.e., a diagonal kernel).
Parameters:
    num_dim : positive int
    initial_noise : float, optional
    fixed_noise : bool, optional
    noise_bound : 2-tuple, optional
    n : non-negative int or tuple of non-negative ints with length equal to num_dim, optional
    hyperprior : callable, optional
Evaluate the covariance between points Xi and Xj with derivative order ni, nj.
Parameters:
    Xi : Matrix or other Array-like, (M, D)
    Xj : Matrix or other Array-like, (M, D)
    ni : Matrix or other Array-like, (M, D)
    nj : Matrix or other Array-like, (M, D)
    hyper_deriv : non-negative int or None, optional
    symmetric : bool, optional

Returns:
    Kij : Array, (M,)
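A diagonal noise kernel contributes variance only where the two evaluation points coincide. A minimal sketch of that behavior (assuming the noise hyperparameter is a standard deviation, so the covariance contribution is its square; this is an illustration, not the gptools implementation):

```python
import numpy as np

def diag_noise(Xi, Xj, noise):
    """Constant independent noise: noise**2 where Xi and Xj are the same point.

    Xi, Xj : arrays of shape (M, D); returns an array of shape (M,).
    """
    same = np.all(Xi == Xj, axis=-1)
    return noise**2 * same.astype(float)
```

When evaluated on a full grid of training points against themselves, this produces the familiar `noise**2 * I` term added to the covariance matrix.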
Bases: gptools.kernel.noise.DiagonalNoiseKernel
Kernel that always evaluates to zero, used as the default noise kernel.
Parameters:
    num_dim : positive int
Return zeros the same length as the input Xi.
Ignores all other arguments.
Parameters:
    Xi : Matrix or other Array-like, (M, D)
    Xj : Matrix or other Array-like, (M, D)
    ni : Matrix or other Array-like, (M, D)
    nj : Matrix or other Array-like, (M, D)
    hyper_deriv : non-negative int or None, optional
    symmetric : bool, optional

Returns:
    Kij : Array, (M,)
Provides the RationalQuadraticKernel class which implements the anisotropic rational quadratic (RQ) kernel.
Bases: gptools.kernel.core.ChainRuleKernel
Rational quadratic (RQ) covariance kernel. Supports arbitrary derivatives.
The RQ kernel has the following hyperparameters, always referenced in the order listed:
0 | sigma | prefactor. |
1 | alpha | order of kernel. |
2 | l1 | length scale for the first dimension. |
3 | l2 | ...and so on for all dimensions. |
The kernel is defined as:

    k(Xi, Xj) = sigma^2 * (1 + r^2 / (2*alpha))^(-alpha),

where r^2 = sum_d (Xi_d - Xj_d)^2 / l_d^2.
Parameters:
    num_dim : int
    **kwargs :

Raises:
    ValueError
    GPArgumentError
Provides the SquaredExponentialKernel class that implements the anisotropic SE kernel.
Bases: gptools.kernel.core.Kernel
Squared exponential covariance kernel. Supports arbitrary derivatives.
The squared exponential has the following hyperparameters, always referenced in the order listed:
0 | sigma | prefactor on the SE |
1 | l1 | length scale for the first dimension |
2 | l2 | ...and so on for all dimensions |
The kernel is defined as:

    k(Xi, Xj) = sigma^2 * exp(-r^2 / 2),

where r^2 = sum_d (Xi_d - Xj_d)^2 / l_d^2.
Parameters:
    num_dim : int
    **kwargs :

Raises:
    ValueError
    GPArgumentError
Evaluate the covariance between points Xi and Xj with derivative order ni, nj.
Parameters:
    Xi : Matrix or other Array-like, (M, D)
    Xj : Matrix or other Array-like, (M, D)
    ni : Matrix or other Array-like, (M, D)
    nj : Matrix or other Array-like, (M, D)
    hyper_deriv : non-negative int or None, optional
    symmetric : bool, optional

Returns:
    Kij : Array, (M,)

Raises:
    NotImplementedError
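The anisotropic SE covariance is straightforward to sketch for a full covariance matrix between two sets of points. This is an illustrative NumPy implementation of the standard SE formula (not the gptools code, which also handles derivative observations):

```python
import numpy as np

def se_cov(X1, X2, sigmaf, l):
    """Squared exponential covariance matrix.

    X1 : (M, D), X2 : (N, D), l : per-dimension length scales (D,).
    Returns the (M, N) covariance matrix.
    """
    # Broadcast to all pairs: diff has shape (M, N, D).
    diff = (X1[:, None, :] - X2[None, :, :]) / l
    r2 = np.sum(diff**2, axis=-1)
    return sigmaf**2 * np.exp(-0.5 * r2)
```

Evaluated on a set of points against itself, the result is symmetric with sigmaf**2 on the diagonal.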
Classes and functions to warp inputs to kernels to enable fitting of nonstationary data. Note that this accomplishes a similar goal as the Gibbs kernel (which is a nonstationary version of the squared exponential kernel), but with the difference that the warpings in this module can be applied to any existing kernel. Furthermore, whereas the Gibbs kernel implements nonstationarity by changing the length scale of the covariance function, the warpings in the module act by transforming the input values directly.
The module contains two core classes that work together. WarpingFunction gives you a way to wrap a given function in a way that allows you to optimize/integrate over the hyperparameters that describe the warping. WarpedKernel is an extension of the Kernel base class and is how you apply a WarpingFunction to whatever kernel you want to warp.
Two useful warpings have been implemented at this point: BetaWarpedKernel warps the inputs using the CDF of the beta distribution (i.e., the regularized incomplete beta function). This requires that the starting inputs be constrained to the unit hypercube [0, 1]. In order to get arbitrary data to this form, LinearWarpedKernel allows you to apply a linear transformation based on the known bounds of your data.
So, for example, to make a beta-warped squared exponential kernel, you simply type:

    k_SE = gptools.SquaredExponentialKernel()
    k_SE_beta = gptools.BetaWarpedKernel(k_SE)

Furthermore, if your inputs X are not confined to the unit hypercube [0, 1], you should use a linear transformation to map them to it:

    k_SE_beta_unit = gptools.LinearWarpedKernel(k_SE_beta, X.min(axis=0), X.max(axis=0))
Bases: object
Wrapper to interface a function with WarpedKernel.
This lets you define a simple function fun(X, d, n, p1, p2, ...) that operates on one dimension of X at a time and has several hyperparameters.
Parameters:
    fun : callable
    num_dim : positive int, optional
    num_params : non-negative int, optional
    initial_params : Array, (num_params,), optional
    fixed_params : Array of bool, (num_params,), optional
    param_bounds : list of 2-tuples, (num_params,), optional
    param_names : list of str, (num_params,), optional
    enforce_bounds : bool, optional
    hyperprior : JointPrior instance or list, optional
Evaluate the warping function.
Parameters:
    X : Array, (M,)
    d : non-negative int
    n : non-negative int
Sets the free hyperparameters to the new parameter values in new_params.
Parameters:
    new_params : Array or other Array-like, (len(self.params),)
Returns the indices of the free parameters in the main arrays of parameters, etc.
Returns the values of the free hyperparameters.
Returns:
    free_params : Array
Warp inputs that are confined to the unit hypercube using the regularized incomplete beta function.
Applies separately to each dimension, designed for use with WarpingFunction.
Assumes that your inputs X lie entirely within the unit hypercube [0, 1].
Note that you may experience some issues with constraining and computing derivatives at X = 0 when alpha < 1 and at X = 1 when beta < 1, since the derivative of the beta CDF diverges there. As a workaround, try mapping your data so that it does not touch the boundaries of the unit hypercube.
Parameters:
    X : array, (M,)
    d : non-negative int
    n : non-negative int
    *args : 2N scalars
References

    [R1] J. Snoek, K. Swersky, R. Zemel and R. P. Adams, “Input Warping for Bayesian Optimization of Non-Stationary Functions,” ICML (2014).
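The zeroth-order beta warping is just the regularized incomplete beta function applied elementwise. A minimal sketch using SciPy (assuming, per the Snoek et al. reference, two shape hyperparameters alpha and beta per dimension; this is not the gptools implementation):

```python
from scipy.special import betainc

def beta_warp(X, alpha, beta):
    """Warp inputs in [0, 1] through the beta CDF I_X(alpha, beta).

    betainc is the regularized incomplete beta function, i.e. the CDF of
    the beta distribution, so the output also lies in [0, 1].
    """
    return betainc(alpha, beta, X)
```

The warp fixes the endpoints (0 maps to 0, 1 maps to 1) and is monotone, so it reshapes the interior of the unit hypercube without moving the boundaries.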
Warp inputs with a linear transformation.
Applies the warping

    w(X) = (X - a) / (b - a)

to each dimension. If you set a=min(X) and b=max(X) then this is a convenient way to map your inputs to the unit hypercube.
Parameters:
    X : array, (M,)
    d : non-negative int
    n : non-negative int
    *args : 2N scalars
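The linear warp and its derivatives are simple enough to write out completely. A simplified sketch (ignoring the per-dimension index argument d; names are illustrative, not the gptools signature):

```python
import numpy as np

def linear_warp(X, a, b, n=0):
    """Linear warp w(X) = (X - a)/(b - a) and its derivatives of order n."""
    X = np.asarray(X, dtype=float)
    if n == 0:
        return (X - a) / (b - a)
    elif n == 1:
        # The derivative of an affine map is the constant slope.
        return np.full_like(X, 1.0 / (b - a))
    else:
        # All higher derivatives vanish.
        return np.zeros_like(X)
```

With a and b set to the minimum and maximum of the data, the warp sends the data range exactly onto [0, 1], which is the precondition for applying the beta warp above it.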
Bases: gptools.kernel.core.Kernel
Kernel which has had its inputs warped through a basic, elementwise warping function.
In other words, takes k(Xi, Xj) and turns it into k(w(Xi), w(Xj)), where w is the elementwise warping function.
Bases: gptools.kernel.warping.WarpedKernel
Class to warp any existing Kernel with the beta CDF.
Assumes that your inputs X lie entirely within the unit hypercube [0, 1].
Note that you may experience some issues with constraining and computing derivatives at X = 0 when alpha < 1 and at X = 1 when beta < 1, since the derivative of the beta CDF diverges there. As a workaround, try mapping your data so that it does not touch the boundaries of the unit hypercube.
Parameters:
    k : Kernel
    **w_kwargs : optional kwargs
References

    [R2] J. Snoek, K. Swersky, R. Zemel and R. P. Adams, “Input Warping for Bayesian Optimization of Non-Stationary Functions,” ICML (2014).
Bases: gptools.kernel.warping.WarpedKernel
Class to warp any existing Kernel with the linear transformation given in linear_warp().
If you set a to be the minimum of your X inputs in each dimension and b to be the maximum then you can use this to map data from an arbitrary domain to the unit hypercube [0, 1], as is required for application of the BetaWarpedKernel, for instance.
Parameters:
    k : Kernel
    a : list
    b : list
Subpackage containing a variety of covariance kernels and associated helpers.