BIP.Bayes.Samplers

MCMC

Module implementing MCMC samplers:

- Metropolis: adaptive Metropolis-Hastings sampler
- Dream: DiffeRential Evolution Adaptive Markov chain sampler
class BIP.Bayes.Samplers.MCMC.Dream(meldobj, samples, sampmax, data, t, parpriors, parnames, parlimits, likfun, likvariance, burnin, thin=5, convergenceCriteria=1.1, nCR=3, DEpairs=1, adaptationRate=0.65, eps=5e-06, mConvergence=False, mAccept=False, **kwargs)

    DiffeRential Evolution Adaptive Markov chain (DREAM) sampler.
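    DREAM-style samplers evolve several chains in parallel and build each
    proposal from the difference between two other randomly chosen chains,
    scaled by a jump factor and perturbed by a small noise term (cf. the
    DEpairs and eps arguments above). The following is a minimal sketch of
    that proposal step for a single pair of chains; the function and the
    2.38/sqrt(2d) default jump rate are illustrative conventions from the
    differential-evolution MCMC literature, not BIP's actual code:

        import numpy as np

        def de_proposal(chains, i, gamma=None, eps=5e-06, rng=None):
            """Differential-evolution jump for chain i (illustrative sketch)."""
            rng = rng or np.random.default_rng()
            nchains, ndim = chains.shape
            if gamma is None:
                gamma = 2.38 / np.sqrt(2 * ndim)  # rule-of-thumb jump rate
            # Two distinct chains, both different from i, form the jump vector.
            r1, r2 = rng.choice([j for j in range(nchains) if j != i],
                                size=2, replace=False)
            return (chains[i] + gamma * (chains[r1] - chains[r2])
                    + rng.normal(scale=eps, size=ndim))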
    delayed_rejection(xi, zi, pxi, zprob)

        Generates a second-stage proposal after the first proposal zi has
        been rejected.

        Parameters:
            xi: current state of the chains.
            zi: the rejected proposed evolution.
            pxi: posterior log-probabilities of xi.
            zprob: posterior log-probabilities of zi.
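        Delayed rejection gives a rejected move a second chance: a more
        conservative proposal is drawn and accepted with a corrected
        probability that preserves detailed balance (Tierney & Mira's
        delayed-rejection scheme). Here is a minimal single-chain sketch of
        the idea, assuming symmetric proposal kernels; log_post, log_q1 and
        draw_z2 are hypothetical helpers, and this is not BIP's
        implementation:

            import numpy as np

            def dr_step(x, z, log_post, log_q1, draw_z2, rng=None):
                """One delayed-rejection move after proposal z was
                rejected from x (illustrative sketch).

                log_post(s): posterior log-probability of state s.
                log_q1(a, b): log-density of proposing b from a (stage 1).
                draw_z2(x, z): draws the more conservative second proposal.
                Symmetric kernels are assumed, so the first-stage acceptance
                probability reduces to a posterior ratio and the second-stage
                kernel densities cancel out of the ratio.
                """
                rng = rng or np.random.default_rng()
                z2 = draw_z2(x, z)
                lp_x, lp_z, lp_z2 = log_post(x), log_post(z), log_post(z2)
                # First-stage acceptance probabilities from x and from z2;
                # z was rejected from x, so a1_fwd < 1 and log1p(-a1_fwd)
                # stays finite.
                a1_fwd = min(1.0, np.exp(lp_z - lp_x))
                a1_rev = min(1.0, np.exp(lp_z - lp_z2))
                if a1_rev >= 1.0:
                    return x  # reverse path would surely accept z: reject z2
                # Second-stage acceptance ratio, evaluated in log space.
                log_a2 = (lp_z2 - lp_x
                          + log_q1(z2, z) - log_q1(x, z)
                          + np.log1p(-a1_rev) - np.log1p(-a1_fwd))
                return z2 if np.log(rng.uniform()) < log_a2 else x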
    step()

        Runs the actual sampling loop.
class BIP.Bayes.Samplers.MCMC.Metropolis(meldobj, samples, sampmax, data, t, parpriors, parnames, parlimits, likfun, likvariance, burnin, **kwargs)

    Standard random-walk Metropolis-Hastings sampler class.

    step(nchains=1)

        Runs the actual sampling loop.
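    For intuition, the core of a random-walk Metropolis-Hastings sampler
    looks like the loop below. This is a self-contained sketch of the
    algorithm the class implements, not the class's own code; log_post and
    the Gaussian step size are placeholders:

        import numpy as np

        def random_walk_metropolis(log_post, x0, n_samples, step_sd=0.5,
                                   rng=None):
            """Minimal random-walk Metropolis-Hastings loop (illustrative)."""
            rng = rng or np.random.default_rng()
            x = np.asarray(x0, dtype=float)
            lp_x = log_post(x)
            chain = np.empty((n_samples, x.size))
            for i in range(n_samples):
                z = x + rng.normal(scale=step_sd, size=x.size)  # symmetric
                lp_z = log_post(z)
                # Accept with probability min(1, post(z) / post(x)).
                if np.log(rng.uniform()) < lp_z - lp_x:
                    x, lp_x = z, lp_z
                chain[i] = x
            return chain

    For example, sampling a standard bivariate normal:

    >>> chain = random_walk_metropolis(lambda v: -0.5 * np.sum(v**2), np.zeros(2), 5000)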
BIP.Bayes.Samplers.MCMC.model_as_ra(theta, model, phinames)

    Performs a single run of model and returns the results as a record
    array.
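    As a rough illustration of the return format, here is a sketch under
    the assumption that the model returns a 2-D array with one column per
    name in phinames; this is not guaranteed to match BIP's internals:

        import numpy as np

        def model_as_ra_sketch(theta, model, phinames):
            """Run the model once and pack its output into a numpy record
            array keyed by phinames (illustrative only)."""
            out = np.atleast_2d(model(theta))  # assumed (steps, len(phinames))
            ra = np.empty(out.shape[0],
                          dtype=[(name, float) for name in phinames])
            for j, name in enumerate(phinames):
                ra[name] = out[:, j]
            return ra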
BIP.Bayes.Samplers.MCMC.multinomial(n, pvals, size=None)

    Draw samples from a multinomial distribution.

    The multinomial distribution is a multivariate generalisation of the
    binomial distribution. Take an experiment with one of p possible
    outcomes. An example of such an experiment is throwing a die, where the
    outcome can be 1 through 6. Each sample drawn from the distribution
    represents n such experiments. Its values, X_i = [X_0, X_1, ..., X_p],
    represent the number of times the outcome was i.

    Parameters:
        n : int
            Number of experiments.
        pvals : sequence of floats, length p
            Probabilities of each of the p different outcomes. These should
            sum to 1 (however, the last element is always assumed to account
            for the remaining probability, as long as sum(pvals[:-1]) <= 1).
        size : int or tuple of ints, optional
            Output shape. If the given shape is, e.g., (m, n, k), then
            m * n * k samples are drawn. Default is None, in which case a
            single value is returned.

    Returns:
        out : ndarray
            The drawn samples, of shape size, if that was provided. If not,
            the shape is (N,). In other words, each entry out[i,j,...,:] is
            an N-dimensional value drawn from the distribution.

    Examples:

    Throw a die 20 times:

    >>> np.random.multinomial(20, [1/6.]*6, size=1)
    array([[4, 1, 7, 5, 2, 1]])

    It landed 4 times on 1, once on 2, etc.

    Now, throw the die 20 times, and 20 times again:

    >>> np.random.multinomial(20, [1/6.]*6, size=2)
    array([[3, 4, 3, 3, 4, 3],
           [2, 4, 3, 4, 0, 7]])

    For the first run, we threw 3 times 1, 4 times 2, etc. For the second,
    we threw 2 times 1, 4 times 2, etc.

    A loaded die is more likely to land on number 6:

    >>> np.random.multinomial(100, [1/7.]*5 + [2/7.])
    array([11, 16, 14, 17, 16, 26])

    The probability inputs should be normalized. As an implementation
    detail, the value of the last entry is ignored and assumed to take up
    any leftover probability mass, but this should not be relied on. A
    biased coin which has twice as much weight on one side as on the other
    should be sampled like so:

    >>> np.random.multinomial(100, [1.0 / 3, 2.0 / 3])  # RIGHT
    array([38, 62])

    not like:

    >>> np.random.multinomial(100, [1.0, 2.0])  # WRONG
    array([100, 0])
BIP.Bayes.Samplers.MCMC.multivariate_normal(mean, cov[, size])

    Draw random samples from a multivariate normal distribution.

    The multivariate normal, multinormal or Gaussian distribution is a
    generalization of the one-dimensional normal distribution to higher
    dimensions. Such a distribution is specified by its mean and covariance
    matrix. These parameters are analogous to the mean (average or "center")
    and variance (standard deviation, or "width," squared) of the
    one-dimensional normal distribution.

    Parameters:
        mean : 1-D array_like, of length N
            Mean of the N-dimensional distribution.
        cov : 2-D array_like, of shape (N, N)
            Covariance matrix of the distribution. It must be symmetric and
            positive-semidefinite for proper sampling.
        size : int or tuple of ints, optional
            Given a shape of, for example, (m,n,k), m*n*k samples are
            generated, and packed in an m-by-n-by-k arrangement. Because
            each sample is N-dimensional, the output shape is (m,n,k,N). If
            no shape is specified, a single (N-D) sample is returned.

    Returns:
        out : ndarray
            The drawn samples, of shape size, if that was provided. If not,
            the shape is (N,). In other words, each entry out[i,j,...,:] is
            an N-dimensional value drawn from the distribution.

    The mean is a coordinate in N-dimensional space, which represents the
    location where samples are most likely to be generated. This is
    analogous to the peak of the bell curve for the one-dimensional or
    univariate normal distribution.

    Covariance indicates the level to which two variables vary together.
    From the multivariate normal distribution, we draw N-dimensional
    samples, X = [x_1, x_2, ..., x_N]. The covariance matrix element C_ij is
    the covariance of x_i and x_j. The element C_ii is the variance of x_i
    (i.e. its "spread").

    Instead of specifying the full covariance matrix, popular approximations
    include:

    - Spherical covariance (cov is a multiple of the identity matrix)
    - Diagonal covariance (cov has non-negative elements, and only on the
      diagonal)

    This geometrical property can be seen in two dimensions by plotting
    generated data-points:

    >>> mean = [0, 0]
    >>> cov = [[1, 0], [0, 100]]  # diagonal covariance

    Diagonal covariance means that points are oriented along the x- or
    y-axis:

    >>> import matplotlib.pyplot as plt
    >>> x, y = np.random.multivariate_normal(mean, cov, 5000).T
    >>> plt.plot(x, y, 'x')
    >>> plt.axis('equal')
    >>> plt.show()

    Note that the covariance matrix must be positive semidefinite (a.k.a.
    nonnegative-definite). Otherwise, the behavior of this method is
    undefined and backwards compatibility is not guaranteed.
    Examples:

    >>> mean = (1, 2)
    >>> cov = [[1, 0], [0, 1]]
    >>> x = np.random.multivariate_normal(mean, cov, (3, 3))
    >>> x.shape
    (3, 3, 2)

    The following is probably true, given that 0.6 is roughly twice the
    standard deviation:

    >>> list((x[0,0,:] - mean) < 0.6)
    [True, True]
BIP.Bayes.Samplers.MCMC.normal(loc=0.0, scale=1.0, size=None)

    Draw random samples from a normal (Gaussian) distribution.

    The probability density function of the normal distribution, first
    derived by De Moivre and 200 years later by both Gauss and Laplace
    independently [2], is often called the bell curve because of its
    characteristic shape (see the example below).

    The normal distribution occurs often in nature. For example, it
    describes the commonly occurring distribution of samples influenced by
    a large number of tiny, random disturbances, each with its own unique
    distribution [2].

    Parameters:
        loc : float or array_like of floats
            Mean ("centre") of the distribution.
        scale : float or array_like of floats
            Standard deviation (spread or "width") of the distribution.
        size : int or tuple of ints, optional
            Output shape. If the given shape is, e.g., (m, n, k), then
            m * n * k samples are drawn. If size is None (default), a single
            value is returned if loc and scale are both scalars. Otherwise,
            np.broadcast(loc, scale).size samples are drawn.

    Returns:
        out : ndarray or scalar
            Drawn samples from the parameterized normal distribution.

    See also:
        scipy.stats.norm : probability density function, distribution or
        cumulative density function, etc.

    The probability density for the Gaussian distribution is

        p(x) = 1 / sqrt(2 * pi * sigma**2) * exp(-(x - mu)**2 / (2 * sigma**2)),

    where mu is the mean and sigma the standard deviation. The square of
    the standard deviation, sigma**2, is called the variance.

    The function has its peak at the mean, and its "spread" increases with
    the standard deviation (the function reaches 0.607 times its maximum at
    x = mu + sigma and x = mu - sigma [2]). This implies that
    numpy.random.normal is more likely to return samples lying close to the
    mean, rather than those far away.

    References:

    [1] Wikipedia, "Normal distribution",
        http://en.wikipedia.org/wiki/Normal_distribution
    [2] P. R. Peebles Jr., "Central Limit Theorem" in "Probability, Random
        Variables and Random Signal Principles", 4th ed., 2001, pp. 51, 51,
        125.

    Examples:

    Draw samples from the distribution:

    >>> mu, sigma = 0, 0.1  # mean and standard deviation
    >>> s = np.random.normal(mu, sigma, 1000)

    Verify the mean and the variance:

    >>> abs(mu - np.mean(s)) < 0.01
    True
    >>> abs(sigma - np.std(s, ddof=1)) < 0.01
    True

    Display the histogram of the samples, along with the probability
    density function:

    >>> import matplotlib.pyplot as plt
    >>> count, bins, ignored = plt.hist(s, 30, normed=True)
    >>> plt.plot(bins, 1/(sigma * np.sqrt(2 * np.pi)) *
    ...          np.exp(-(bins - mu)**2 / (2 * sigma**2)),
    ...          linewidth=2, color='r')
    >>> plt.show()
BIP.Bayes.Samplers.MCMC.rand(d0, d1, ..., dn)

    Random values in a given shape.

    Create an array of the given shape and populate it with random samples
    from a uniform distribution over [0, 1).

    Parameters:
        d0, d1, ..., dn : int, optional
            The dimensions of the returned array, should all be positive.
            If no argument is given a single Python float is returned.

    Returns:
        out : ndarray, shape (d0, d1, ..., dn)
            Random values.

    See also:
        random

    This is a convenience function. If you want an interface that takes a
    shape tuple as the first argument, refer to np.random.random_sample.

    Examples:

    >>> np.random.rand(3, 2)
    array([[ 0.14022471,  0.96360618],  # random
           [ 0.37601032,  0.25528411],  # random
           [ 0.49313049,  0.94909878]]) # random
BIP.Bayes.Samplers.MCMC.random()

    random_sample(size=None)

    Return random floats in the half-open interval [0.0, 1.0).

    Results are from the "continuous uniform" distribution over the stated
    interval. To sample Unif[a, b), b > a, multiply the output of
    random_sample by (b - a) and add a:

        (b - a) * random_sample() + a

    Parameters:
        size : int or tuple of ints, optional
            Output shape. If the given shape is, e.g., (m, n, k), then
            m * n * k samples are drawn. Default is None, in which case a
            single value is returned.

    Returns:
        out : float or ndarray of floats
            Array of random floats of shape size (unless size=None, in
            which case a single float is returned).

    Examples:

    >>> np.random.random_sample()
    0.47108547995356098
    >>> type(np.random.random_sample())
    <type 'float'>
    >>> np.random.random_sample((5,))
    array([ 0.30220482,  0.86820401,  0.1654503 ,  0.11659149,  0.54323428])

    Three-by-two array of random numbers from [-5, 0):

    >>> 5 * np.random.random_sample((3, 2)) - 5
    array([[-3.99149989, -0.52338984],
           [-2.99091858, -0.79479508],
           [-1.23204345, -1.75224494]])
BIP.Bayes.Samplers.MCMC.timeit(method)

    Decorator to time methods.
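    Such decorators conventionally follow the pattern below; this is a
    minimal sketch of the idea, not necessarily BIP's exact implementation
    (the print format is an assumption):

        import time
        from functools import wraps

        def timeit(method):
            """Decorator that reports how long the wrapped method took."""
            @wraps(method)
            def timed(*args, **kwargs):
                start = time.time()
                result = method(*args, **kwargs)
                print("%s took %.4f s" % (method.__name__,
                                          time.time() - start))
                return result
            return timed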