lsqfit - Nonlinear Least Squares Fitting

Introduction

This package contains tools for nonlinear least-squares curve fitting of data. In general a fit has four inputs:

  1. The dependent data y that is to be fit — typically y is a Python dictionary in an lsqfit analysis. Its values y[k] are either gvar.GVars or arrays (any shape or dimension) of gvar.GVars that specify the values of the dependent variables and their errors.
  2. A collection x of independent data — x can have any structure and contain any data, or it can be omitted.
  3. A fit function f(x, p) whose parameters p are adjusted by the fit until f(x, p) equals y to within ys errors — parameters p` are usually specified by a dictionary whose values p[k] are individual parameters or (numpy) arrays of parameters. The fit function is assumed independent of x (that is, f(p)) if x = False (or if x is omitted from the input data).
  4. Initial estimates or priors for each parameter in p — priors are usually specified using a dictionary prior whose values prior[k] are gvar.GVars or arrays of gvar.GVars that give initial estimates (values and errors) for parameters p[k].

A typical code sequence has the structure:

... collect x, y, prior ...

def f(x, p):
    ... compute fit to y[k], for all k in y, using x, p ...
    ... return dictionary containing the fit values for the y[k]s ...

fit = lsqfit.nonlinear_fit(data=(x, y), prior=prior, fcn=f)
print(fit)      # variable fit is of type nonlinear_fit

The parameters p[k] are varied until the chi**2 for the fit is minimized.

The best-fit values for the parameters are recovered after fitting using, for example, p=fit.p. Then the p[k] are gvar.GVars or arrays of gvar.GVars that give best-fit estimates and fit uncertainties in those estimates. The print(fit) statement prints a summary of the fit results.

The dependent variable y above could be an array instead of a dictionary, which is less flexible in general but possibly more convenient in simpler fits. Then the approximate y returned by fit function f(x, p) must be an array with the same shape as the dependent variable. The prior prior could also be represented by an array instead of a dictionary.

By default priors are Gaussian/normal distributions, represented by gvar.GVars. Setting nonlinear_fit parameter extend=True allows for log-normal and sqrt-normal distributions as well. The latter are indicated by replacing the prior (in a dictionary prior) with key c, for example, by a prior for the parameter’s logarithm or square root, with key log(c) or sqrt(c), respectively. nonlinear_fit adds parameter c to the parameter dictionary, deriving its value from parameter log(c) or sqrt(c). The fit function can be expressed directly in terms of parameter c and so is the same no matter which distribution is used for c. Note that a sqrt-normal distribution with zero mean is equivalent to an exponential distribution. Additional distributions can be added using gvar.add_parameter_distribution().

The lsqfit tutorial contains extended explanations and examples. The first appendix in the paper at http://arxiv.org/abs/arXiv:1406.2279 provides conceptual background on the techniques used in this module for fits and, especially, error budgets.

nonlinear_fit Objects

class lsqfit.nonlinear_fit(data, fcn, prior=None, p0=None, extend=False, svdcut=1e-12, debug=False, tol=1e-8, maxit=1000, fitter='gsl_multifit', **fitterargs)

Nonlinear least-squares fit.

lsqfit.nonlinear_fit fits a (nonlinear) function f(x, p) to data y by varying parameters p, and stores the results: for example,

fit = nonlinear_fit(data=(x, y), fcn=f, prior=prior)   # do fit
print(fit)                               # print fit results

The best-fit values for the parameters are in fit.p, while the chi**2, the number of degrees of freedom, the logarithm of Gaussian Bayes Factor, the number of iterations (or function evaluations), and the cpu time needed for the fit are in fit.chi2, fit.dof, fit.logGBF, fit.nit, and fit.time, respectively. Results for individual parameters in fit.p are of type gvar.GVar, and therefore carry information about errors and correlations with other parameters. The fit data and prior can be recovered using fit.x (equals False if there is no x), fit.y, and fit.prior; the data and prior are corrected for the SVD cut, if there is one (that is, their covariance matrices have been modified in accordance with the SVD cut).

Parameters:
  • data (dict, array or tuple) –

    Data to be fit by lsqfit.nonlinear_fit can have any of the following forms:

    data = x, y
    x is the independent data that is passed to the fit function with the fit parameters: fcn(x, p). y is a dictionary (or array) of gvar.GVars that encode the means and covariance matrix for the data that is to be fit being fit. The fit function must return a result having the same layout as y.
    data = y
    y is a dictionary (or array) of gvar.GVars that encode the means and covariance matrix for the data being fit. There is no independent data so the fit function depends only upon the fit parameters: fit(p). The fit function must return a result having the same layout as y.
    data = x, ymean, ycov
    x is the independent data that is passed to the fit function with the fit parameters: fcn(x, p). ymean is an array containing the mean values of the fit data. ycov is an array containing the covariance matrix of the fit data; ycov.shape equals 2*ymean.shape. The fit function must return an array having the same shape as ymean.
    data = x, ymean, ysdev
    x is the independent data that is passed to the fit function with the fit parameters: fcn(x, p). ymean is an array containing the mean values of the fit data. ysdev is an array containing the standard deviations of the fit data; ysdev.shape equals ymean.shape. The data are assumed to be uncorrelated. The fit function must return an array having the same shape as ymean.

    Setting x=False in the first, third or fourth of these formats implies that the fit function depends only on the fit parameters: that is, fcn(p) instead of fcn(x, p). (This is not assumed if x=None.)

  • fcn (callable) – The function to be fit to data. It is either a function of the independent data x and the fit parameters p (fcn(x, p)), or a function of just the fit parameters (fcn(p)) when there is no x data or x=False. The parameters are tuned in the fit until the function returns values that agree with the y data to within the ys’ errors. The function’s return value must have the same layout as the y data (a dictionary or an array). The fit parameters p are either: 1) a dictionary where each p[k] is a single parameter or an array of parameters (any shape); or, 2) a single array of parameters. The layout of the parameters is the same as that of prior prior if it is specified; otherwise, it is inferred from of the starting value p0 for the fit.
  • prior (dict, array, str, gvar.GVar or None) – A dictionary (or array) containing a priori estimates for all parameters p used by fit function fcn(x, p) (or fcn(p)). Fit parameters p are stored in a dictionary (or array) with the same keys and structure (or shape) as prior. The default value is None; prior must be defined if p0 is None.
  • p0 (dict, array, float or None) – Starting values for fit parameters in fit. lsqfit.nonlinear_fit adjusts p0 to make it consistent in shape and structure with prior when the latter is specified: elements missing from p0 are filled in using prior, and elements in p0 that are not in prior are discarded. If p0 is a string, it is taken as a file name and lsqfit.nonlinear_fit attempts to read starting values from that file; best-fit parameter values are written out to the same file after the fit (for priming future fits). If p0 is None or the attempt to read the file fails, starting values are extracted from prior. The default value is None; p0 must be defined if prior is None.
  • svdcut (float or None) – If svdcut is nonzero (but not None), SVD cuts are applied to every block-diagonal sub-matrix of the covariance matrix for the data y and prior (if there is a prior). The blocks are first rescaled so that all diagonal elements equal 1 – that is, the blocks are replaced by the correlation matrices for the corresponding subsets of variables. Then, if svdcut > 0, eigenvalues of the rescaled matrices that are smaller than svdcut times the maximum eigenvalue are replaced by svdcut times the maximum eigenvalue. This makes the covariance matrix less singular and less susceptible to roundoff error. When svdcut < 0, eigenvalues smaller than |svdcut| times the maximum eigenvalue are discarded and the corresponding components in y and prior are zeroed out. Default is 1e-12.
  • extend (bool) – Log-normal and sqrt-normal distributions can be used for fit priors when extend=True, provided the parameters are specified by a dictionary (as opposed to an array). To use such a distribution for a parameter 'c' in the fit prior, replace prior['c'] with a prior specifying its logarithm or square root, designated by prior['log(c)'] or prior['sqrt(c)'], respectively. The dictionaries containing parameters generated by lsqfit.nonlinear_fit will have entries for both 'c' and 'log(c)' or 'sqrt(c)', so only the prior need be changed to switch to log-normal/sqrt-normal distributions. Setting extend=False (the default) restricts all parameters to Gaussian distributions. Additional distributions can be added using gvar.add_parameter_distribution().
  • udata (dict, array or tuple) – Same as data but instructs the fitter to ignore correlations between different pieces of data. This speeds up the fit, particularly for large amounts of data, but ignores potentially valuable information if the data actually are correlated. Only one of data or udata should be specified. (Default is None.)
  • fitter (str or None) – Fitter code. Options if GSL is installed include: 'gsl_multifit' (default) and 'gsl_v1_multifit' (original fitter). Options if scipy is installed include: 'scipy_least_squares' (default if GSL not installed). gsl_multifit has many options, providing extensive user control. scipy_least_squares can be used for fits where the parameters are bounded. (Bounded parameters can also be implemented, for any of the fitters, using non-Gaussian priors — see the tutorial.)
  • tol (float or tuple) –

    Assigning tol=(xtol, gtol, ftol) causes the fit to stop searching for a minimum when any of

    1. xtol >= relative change in parameters between iterations
    2. gtol >= relative size of gradient of chi**2 function
    3. ftol >= relative change in chi**2 between iterations

    is satisfied. See the fitter documentation for detailed definitions of these stopping conditions. Typically one sets xtol=1/10**d where d is the number of digits of precision desired in the result, while gtol<<1 and ftol<<1. Setting tol=eps where eps is a number is equivalent to setting tol=(eps,1e-10,1e-10). Setting tol=(eps1,eps2) is equivalent to setting tol=(eps1,eps2,1e-10). Default is tol=1e-8. (Note: the ftol option is disabled in some versions of the GSL library.)

  • maxit (int) – Maximum number of algorithm iterations (or function evaluations for some fitters) in search for minimum; default is 1000.
  • debug (bool) – Set to True for extra debugging of the fit function and a check for roundoff errors. (Default is False.)
  • fitterargs (dict) – Dictionary of additional arguments passed through to the underlying fitter. Different fitters offer different parameters; see the documentation for each.

Objects of type lsqfit.nonlinear_fit have the following attributes:

chi2

float

The minimum chi**2 for the fit. fit.chi2 / fit.dof is usually of order one in good fits; values much less than one suggest that the actual standard deviations in the input data and/or priors are smaller than the standard deviations used in the fit.

cov

array

Covariance matrix of the best-fit parameters from the fit.

dof

int

Number of degrees of freedom in the fit, which equals the number of pieces of data being fit when priors are specified for the fit parameters. Without priors, it is the number of pieces of data minus the number of fit parameters.

error

str

Error message generated by the underlying fitter when an error occurs. None otherwise.

fitter_results

Results returned by the underlying fitter. Refer to the appropriate fitter’s documentation for details.

logGBF

float or None

The logarithm of the probability (density) of obtaining the fit data by randomly sampling the parameter model (priors plus fit function) used in the fit — that is, it is P(data|model). This quantity is useful for comparing fits of the same data to different models, with different priors and/or fit functions. The model with the largest value of fit.logGBF is the one preferred by the data. The exponential of the difference in fit.logGBF between two models is the ratio of probabilities (Bayes factor) for those models. Differences in fit.logGBF smaller than 1 are not very significant. Gaussian statistics are assumed when computing fit.logGBF.

p

dict, array or gvar.GVar

Best-fit parameters from fit. Depending upon what was used for the prior (or p0), it is either: a dictionary (gvar.BufferDict) of gvar.GVars and/or arrays of gvar.GVars; or an array (numpy.ndarray) of gvar.GVars. fit.p represents a multi-dimensional Gaussian distribution which, in Bayesian terminology, is the posterior probability distribution of the fit parameters.

pmean

dict, array or float

Means of the best-fit parameters from fit.

psdev

dict, array or float

Standard deviations of the best-fit parameters from fit.

palt

dict, array or gvar.GVar

Same as fit.p except that the errors are computed directly from fit.cov. This is faster but means that no information about correlations with the input data is retained (unlike in fit.p); and, therefore, fit.palt cannot be used to generate error budgets. fit.p and fit.palt give the same means and normally give the same errors for each parameter. They differ only when the input data’s covariance matrix is too singular to invert accurately (because of roundoff error), in which case an SVD cut is advisable.

p0

dict, array or float

The parameter values used to start the fit. This will differ from the input p0 if the latter was incomplete.

prior

dict, array, gvar.GVar or None

Prior used in the fit. This may differ from the input prior if an SVD cut is used. It is either a dictionary (gvar.BufferDict) or an array (numpy.ndarray), depending upon the input. Equals None if no prior was specified.

Q

float or None

The probability that the chi**2 from the fit could have been larger, by chance, assuming the best-fit model is correct. Good fits have Q values larger than 0.1 or so. Also called the p-value of the fit.

stopping_criterion

int

Criterion used to stop fit:

0: didn’t converge

1: xtol >= relative change in parameters between iterations

2: gtol >= relative size of gradient of chi**2

3: ftol >= relative change in chi**2 between iterations

svdcorrection

gvar.GVar

Sum of all SVD corrections, if any, added to the fit data y or the prior prior.

svdn

int

Number of eigenmodes modified (and/or deleted) by the SVD cut.

time

float

CPU time (in secs) taken by fit.

tol

tuple

Tolerance used in fit. This differs from the input tolerance if the latter was incompletely specified.

x

obj

The first field in the input data. This is sometimes the independent variable (as in ‘y vs x’ plot), but may be anything. It is set equal to False if the x field is omitted from the input data. (This also means that the fit function has no x argument: so f(p) rather than f(x,p).)

y

dict, array or gvar.GVar

Fit data used in the fit. This may differ from the input data if an SVD cut is used. It is either a dictionary (gvar.BufferDict) or an array (numpy.ndarray), depending upon the input.

nblocks

dict

nblocks[s] equals the number of block-diagonal sub-matrices of the yprior covariance matrix that are size s-by-s. This is sometimes useful for debugging.

The global defaults used by lsqfit.nonlinear_fit can be changed by changing entries in dictionary lsqfit.nonlinear_fit.DEFAULTS for keys ‘extend’, ‘svdcut’, ‘debug’, tol, ‘maxit’, and ‘fitter’. Additional defaults can be added to that dictionary to be are passed through lsqfit.nonlinear_fit to the underlying fitter (via dictionary fitterargs).

Additional methods are provided for printing out detailed information about the fit, testing fits with simulated data, doing bootstrap analyses of the fit errors, dumping (for later use) and loading parameter values, and checking for roundoff errors in the final error estimates:

format(maxline=0, pstyle='v')

Formats fit output details into a string for printing.

The output tabulates the chi**2 per degree of freedom of the fit (chi2/dof), the number of degrees of freedom, the logarithm of the Gaussian Bayes Factor for the fit (logGBF), and the number of fit- algorithm iterations needed by the fit. Optionally, it will also list the best-fit values for the fit parameters together with the prior for each (in [] on each line). Lines for parameters that deviate from their prior by more than one (prior) standard deviation are marked with asterisks, with the number of asterisks equal to the number of standard deviations (up to five). format can also list all of the data and the corresponding values from the fit, again with asterisks on lines where there is a significant discrepancy. At the end it lists the SVD cut, the number of eigenmodes modified by the SVD cut, the tolerances used in the fit, and the time in seconds needed to do the fit. The tolerance used to terminate the fit is marked with an asterisk.

Parameters:
  • maxline (integer or bool) – Maximum number of data points for which fit results and input data are tabulated. maxline<0 implies that only chi2, Q, logGBF, and itns are tabulated; no parameter values are included. Setting maxline=True prints all data points; setting it equal None or False is the same as setting it equal to -1. Default is maxline=0.
  • pstyle (‘vv’, ‘v’, or ‘m’) – Style used for parameter list. Supported values are ‘vv’ for very verbose, ‘v’ for verbose, and ‘m’ for minimal. When ‘m’ is set, only parameters whose values differ from their prior values are listed. Setting pstyle=None implies no parameters are listed.
Returns:

String containing detailed information about fit.

fmt_errorbudget(outputs, inputs, ndecimal=2, percent=True)

Tabulate error budget for outputs[ko] due to inputs[ki].

For each output outputs[ko], fmt_errorbudget computes the contributions to outputs[ko]‘s standard deviation coming from the gvar.GVars collected in inputs[ki]. This is done for each key combination (ko,ki) and the results are tabulated with columns and rows labeled by ko and ki, respectively. If a gvar.GVar in inputs[ki] is correlated with other gvar.GVars, the contribution from the others is included in the ki contribution as well (since contributions from correlated gvar.GVars cannot be distinguished). The table is returned as a string.

Parameters:
  • outputs – Dictionary of gvar.GVars for which an error budget is computed.
  • inputs – Dictionary of: gvar.GVars, arrays/dictionaries of gvar.GVars, or lists of gvar.GVars and/or arrays/dictionaries of gvar.GVars. fmt_errorbudget tabulates the parts of the standard deviations of each outputs[ko] due to each inputs[ki].
  • ndecimal (int) – Number of decimal places displayed in table.
  • percent (boolean) – Tabulate % errors if percent is True; otherwise tabulate the errors themselves.
  • colwidth (positive integer or None) – Width of each column. This is set automatically, to accommodate label widths, if colwidth=None (default).
  • verify (boolean) – If True, a warning is issued if: 1) different inputs are correlated (and therefore double count errors); or 2) the sum (in quadrature) of partial errors is not equal to the total error to within 0.1% of the error (and the error budget is incomplete or overcomplete). No checking is done if verify==False (default).
Returns:

A table (str) containing the error budget. Output variables are labeled by the keys in outputs (columns); sources of uncertainty are labeled by the keys in inputs (rows).

fmt_values(outputs, ndecimal=None)

Tabulate gvar.GVars in outputs.

Parameters:
  • outputs – A dictionary of gvar.GVar objects.
  • ndecimal (int or None) – Format values v using v.fmt(ndecimal).
Returns:

A table (str) containing values and standard deviations for variables in outputs, labeled by the keys in outputs.

simulated_fit_iter(n=None, pexact=None, **kargs)

Iterator that returns simulation copies of a fit.

Fit reliability can be tested using simulated data which replaces the mean values in self.y with random numbers drawn from a distribution whose mean equals self.fcn(pexact) and whose covariance matrix is the same as self.y‘s. Simulated data is very similar to the original fit data, self.y, but corresponds to a world where the correct values for the parameters (i.e., averaged over many simulated data sets) are given by pexact. pexact is usually taken equal to fit.pmean.

Each iteration of the iterator creates new simulated data, with different random numbers, and fits it, returning the the lsqfit.nonlinear_fit that results. The simulated data has the same covariance matrix as fit.y. Typical usage is:

...
fit = nonlinear_fit(...)
...
for sfit in fit.simulated_fit_iter(n=3):
    ... verify that sfit.p agrees with pexact=fit.pmean within errors ...

Only a few iterations are needed to get a sense of the fit’s reliability since we know the correct answer in each case. The simulated fit’s output results should agree with pexact (=fit.pmean here) within the simulated fit’s errors.

Simulated fits can also be used to estimate biases in the fit’s output parameters or functions of them, should non-Gaussian behavior arise. This is possible, again, because we know the correct value for every parameter before we do the fit. Again only a few iterations may be needed for reliable estimates.

The (possibly non-Gaussian) probability distributions for parameters, or functions of them, can be explored in more detail by setting option bootstrap=True and collecting results from a large number of simulated fits. With bootstrap=True, the means of the priors are also varied from fit to fit, as in a bootstrap simulation; the new prior means are chosen at random from the prior distribution. Variations in the best-fit parameters (or functions of them) from fit to fit define the probability distributions for those quantities. For example, one would use the following code to analyze the distribution of function g(p) of the fit parameters:

fit = nonlinear_fit(...)

...

glist = []
for sfit in fit.simulated_fit_iter(n=100, bootstrap=True):
    glist.append(g(sfit.pmean))

... analyze samples glist[i] from g(p) distribution ...

This code generates n=100 samples glist[i] from the probability distribution of g(p). If everything is Gaussian, the mean and standard deviation of glist[i] should agree with g(fit.p).mean and g(fit.p).sdev.

The only difference between simulated fits with bootstrap=True and bootstrap=False (the default) is that the prior means are varied. It is essential that they be varied in a bootstrap analysis since one wants to capture the impact of the priors on the final distributions, but it is not necessary and probably not desirable when simply testing a fit’s reliability.

Parameters:
  • n (integer or None) – Maximum number of iterations (equals infinity if None).
  • pexact (None or array or dictionary of numbers) – Fit-parameter values for the underlying distribution used to generate simulated data; replaced by self.pmean if is None (default).
  • bootstrap (bool) – Vary prior means if True; otherwise vary only the means in self.y (default).
Returns:

An iterator that returns lsqfit.nonlinear_fits for different simulated data.

Note that additional keywords can be added to overwrite keyword arguments in lsqfit.nonlinear_fit.

bootstrap_iter(n=None, datalist=None)

Iterator that returns bootstrap copies of a fit.

A bootstrap analysis involves three steps: 1) make a large number of “bootstrap copies” of the original input data and prior that differ from each other by random amounts characteristic of the underlying randomness in the original data; 2) repeat the entire fit analysis for each bootstrap copy of the data, extracting fit results from each; and 3) use the variation of the fit results from bootstrap copy to bootstrap copy to determine an approximate probability distribution (possibly non-gaussian) for the fit parameters and/or functions of them: the results from each bootstrap fit are samples from that distribution.

Bootstrap copies of the data for step 2 are provided in datalist. If datalist is None, they are generated instead from the means and covariance matrix of the fit data (assuming gaussian statistics). The maximum number of bootstrap copies considered is specified by n (None implies no limit).

Variations in the best-fit parameters (or functions of them) from bootstrap fit to bootstrap fit define the probability distributions for those quantities. For example, one could use the following code to analyze the distribution of function g(p) of the fit parameters:

fit = nonlinear_fit(...)

...

glist = []
for sfit in fit.bootstrapped_fit_iter(
    n=100, datalist=datalist, bootstrap=True
    ):
    glist.append(g(sfit.pmean))

... analyze samples glist[i] from g(p) distribution ...

This code generates n=100 samples glist[i] from the probability distribution of g(p). If everything is Gaussian, the mean and standard deviation of glist[i] should agree with g(fit.p).mean and g(fit.p).sdev.

Parameters:
  • n (integer) – Maximum number of iterations if n is not None; otherwise there is no maximum.
  • datalist (sequence or iterator or None) – Collection of bootstrap data sets for fitter.
Returns:

Iterator that returns an lsqfit.nonlinear_fit object containing results from the fit to the next data set in datalist

dump_p(filename)

Dump parameter values (fit.p) into file filename.

fit.dump_p(filename) saves the best-fit parameter values (fit.p) from a nonlinear_fit called fit. These values are recovered using p = nonlinear_fit.load_parameters(filename) where p‘s layout is the same as that of fit.p.

dump_pmean(filename)

Dump parameter means (fit.pmean) into file filename.

fit.dump_pmean(filename) saves the means of the best-fit parameter values (fit.pmean) from a nonlinear_fit called fit. These values are recovered using p0 = nonlinear_fit.load_parameters(filename) where p0‘s layout is the same as fit.pmean. The saved values can be used to initialize a later fit (nonlinear_fit parameter p0).

static load_parameters(filename)

Load parameters stored in file filename.

p = nonlinear_fit.load_p(filename) is used to recover the values of fit parameters dumped using fit.dump_p(filename) (or fit.dump_pmean(filename)) where fit is of type lsqfit.nonlinear_fit. The layout of the returned parameters p is the same as that of fit.p (or fit.pmean).

check_roundoff(rtol=0.25, atol=1e-6)

Check for roundoff errors in fit.p.

Compares standard deviations from fit.p and fit.palt to see if they agree to within relative tolerance rtol and absolute tolerance atol. Generates a warning if they do not (in which case an SVD cut might be advisable).

static set(clear=False, **defaults)

Set default parameters for lsqfit.nonlinear_fit.

Use to set default values for parameters: extend, svdcut, debug, tol, maxit, and fitter. Can also set parameters specific to the fitter specified by the fitter argument.

Sample usage:

import lsqfit

old_defaults = lsqfit.nonlinear_fit.set(
    fitter='gsl_multifit', alg='subspace2D', solver='cholesky',
    tol=1e-10, debug=True,
    )

nonlinear_fit.set() without arguments returns a dictionary containing the current defaults.

Parameters:
  • clear (bool) – If True remove earlier settings, restoring the original defaults, before adding new defaults. The default value is clear=False. nonlinear_fit.set(clear=True) restores the original defaults.
  • defaults (dict) – Dictionary containing new defaults.
Returns:

A dictionary containing the old defaults, before they were updated. These can be restored using nonlinear_fit.set(old_defaults) where old_defaults is the dictionary containint the old defaults.

Functions

lsqfit.empbayes_fit(z0, fitargs, **minargs)

Return fit and z corresponding to the fit lsqfit.nonlinear_fit(**fitargs(z)) that maximizes logGBF.

This function maximizes the logarithm of the Bayes Factor from fit lsqfit.nonlinear_fit(**fitargs(z)) by varying z, starting at z0. The fit is redone for each value of z that is tried, in order to determine logGBF.

The Bayes Factor is proportional to the probability that the data came from the model (fit function and priors) used in the fit. empbayes_fit() finds the model or data that maximizes this probability.

One application is illustrated by the following code:

import numpy as np
import gvar as gv
import lsqfit

# fit data
x = np.array([1., 2., 3., 4.])
y = np.array([3.4422, 1.2929, 0.4798, 0.1725])

# prior
prior = gv.gvar(['10(1)', '1.0(1)'])

# fit function
def fcn(x, p):
    return p[0] * gv.exp( - p[1] * x)

# find optimal dy
def fitargs(z):
    dy = y * z
    newy = gv.gvar(y, dy)
    return dict(data=(x, newy), fcn=fcn, prior=prior)

fit, z = lsqfit.empbayes_fit(0.1, fitargs)
print fit.format(True)

Here we want to fit data y with fit function fcn but we don’t know the uncertainties in our y values. We assume that the relative errors are x-independent and uncorrelated. We add the error dy that maximizes the Bayes Factor, as this is the most likely choice. This fit gives the following output:

Least Square Fit:
  chi2/dof [dof] = 0.58 [4]    Q = 0.67    logGBF = 7.4834

Parameters:
              0     9.44 (18)     [ 10.0 (1.0) ]
              1   0.9979 (69)     [  1.00 (10) ]

Fit:
     x[k]           y[k]      f(x[k],p)
---------------------------------------
        1     3.442 (54)     3.481 (45)
        2     1.293 (20)     1.283 (11)
        3    0.4798 (75)    0.4731 (41)
        4    0.1725 (27)    0.1744 (23)

Settings:
  svdcut/n = 1e-12/0    tol = (1e-08*,1e-10,1e-10)    (itns/time = 3/0.0)

We have, in effect, used the variation in the data relative to the best fit curve to estimate that the uncertainty in each data point is of order 1.6%.

Parameters:
  • z0 (number, array or dict) – Starting point for search.
  • fitargs (callable) – Function of z that returns a dictionary args containing the lsqfit.nonlinear_fit arguments corresponding to z. z should have the same layout (number, array or dictionary) as z0. fitargs(z) can instead return a tuple (args, plausibility), where args is again the dictionary for lsqfit.nonlinear_fit. plausibility is the logarithm of the a priori probabilitiy that z is sensible. When plausibility is provided, lsqfit.empbayes_fit() maximizes the sum logGBF + plausibility. Specifying plausibility is a way of steering selections away from completely implausible values for z.
  • minargs (dict) – Optional argument dictionary, passed on to lsqfit.gsl_multiminex (or lsqfit.scipy_multiminex), which finds the minimum.
Returns:

A tuple containing the best fit (object of type lsqfit.nonlinear_fit) and the optimal value for parameter z.

lsqfit.wavg(dataseq, prior=None, fast=False, **fitterargs)

Weighted average of gvar.GVars or arrays/dicts of gvar.GVars.

The weighted average of several gvar.GVars is what one obtains from a least-squares fit of the collection of gvar.GVars to the one-parameter fit function

def f(p):
    return N * [p[0]]

where N is the number of gvar.GVars. The average is the best-fit value for p[0]. gvar.GVars with smaller standard deviations carry more weight than those with larger standard deviations. The averages computed by wavg take account of correlations between the gvar.GVars.

If prior is not None, it is added to the list of data used in the average. Thus wavg([x2, x3], prior=x1) is the same as wavg([x1, x2, x3]).

Typical usage is

x1 = gvar.gvar(...)
x2 = gvar.gvar(...)
x3 = gvar.gvar(...)
xavg = wavg([x1, x2, x3])   # weighted average of x1, x2 and x3

where the result xavg is a gvar.GVar containing the weighted average.

The individual gvar.GVars in the last example can be replaced by multidimensional distributions, represented by arrays of gvar.GVars or dictionaries of gvar.GVars (or arrays of gvar.GVars). For example,

x1 = [gvar.gvar(...), gvar.gvar(...)]
x2 = [gvar.gvar(...), gvar.gvar(...)]
x3 = [gvar.gvar(...), gvar.gvar(...)]
xavg = wavg([x1, x2, x3])
    # xavg[i] is wgtd avg of x1[i], x2[i], x3[i]

where each array x1, x2 ... must have the same shape. The result xavg in this case is an array of gvar.GVars, where the shape of the array is the same as that of x1, etc.

Another example is

x1 = dict(a=[gvar.gvar(...), gvar.gvar(...)], b=gvar.gvar(...))
x2 = dict(a=[gvar.gvar(...), gvar.gvar(...)], b=gvar.gvar(...))
x3 = dict(a=[gvar.gvar(...), gvar.gvar(...)])
xavg = wavg([x1, x2, x3])
    # xavg['a'][i] is wgtd avg of x1['a'][i], x2['a'][i], x3['a'][i]
    # xavg['b'] is gtd avg of x1['b'], x2['b']

where different dictionaries can have (some) different keys. Here the result xavg is a gvar.BufferDict` having the same keys as x1, etc.

Weighted averages can become costly when the number of random samples being averaged is large (100s or more). In such cases it might be useful to set parameter fast=True. This causes wavg to estimate the weighted average by incorporating the random samples one at a time into a running average:

result = prior
for dataseq_i in dataseq:
    result = wavg([result, dataseq_i], ...)

This method is much faster when len(dataseq) is large, and gives the exact result when there are no correlations between different elements of list dataseq. The results are approximately correct when dataseq[i] and dataseq[j] are correlated for i!=j.

Parameters:
  • dataseq (list) – The gvar.GVars to be averaged. dataseq is a one-dimensional sequence of gvar.GVars, or of arrays of gvar.GVars, or of dictionaries containing gvar.GVars and/or arrays of gvar.GVars. All dataseq[i] must have the same shape.
  • prior (dict, array or gvar.GVar) – Prior values for the averages, to be included in the weighted average. Default value is None, in which case prior is ignored.
  • fast (bool) – Setting fast=True causes wavg to compute an approximation to the weighted average that is much faster to calculate when averaging a large number of samples (100s or more). The default is fast=False.
  • fitterargs (dict) – Additional arguments (e.g., svdcut) for the lsqfit.nonlinear_fit fitter used to do the averaging.

Results returned by gvar.wavg() have the following extra attributes describing the average:

chi2 - chi**2 for weighted average.

dof - Effective number of degrees of freedom.

Q - The probability that the chi**2 could have been larger,

by chance, assuming that the data are all Gaussian and consistent with each other. Values smaller than 0.1 or so suggest that the data are not Gaussian or are inconsistent with each other. Also called the p-value.

Quality factor Q (or p-value) for fit.

time - Time required to do average.

svdcorrection - The svd corrections made to the data
when svdcut is not None.

fit - Fit output from average.

lsqfit.gammaQ()

Return the normalized incomplete gamma function Q(a,x) = 1-P(a,x).

Q(a, x) = 1/Gamma(a) * \int_x^\infty dt exp(-t) t ** (a-1) = 1 - P(a, x)

Note that gammaQ(ndof/2., chi2/2.) is the probabilty that one could get a chi**2 larger than chi2 with ndof degrees of freedom even if the model used to construct chi2 is correct.

gvar.add_parameter_distribution()

Add new parameter distribution for use in fits.

This function adds new distributions for the parameters used in lsqfit.nonlinear_fit. For example, the code

import gvar as gv
gv.add_parameter_distribution('log', gv.exp)

enables the use of log-normal distributions for parameters. The log-normal distribution is invoked for a parameter p by including log(p) rather than p itself in the fit prior. log-normal, sqrt-normal, and erfinv-normal distributions are included by default. (Setting a prior prior[erfinv(w)] equal to gv.gvar('0(1)') / gv.sqrt(2) means that the prior probability for w is distributed uniformly between -1 and 1, and is zero elsewhere.)

These distributions are implemented by replacing a fit parameter p by a new fit parameter fcn(p) where fcn is some function. fcn(p) is assumed to have a Gaussian distribution, and parameter p is recovered using the inverse function invfcn where p=invfcn(fcn(p)).

Parameters:
  • name (str) – Distribution’s name.
  • invfcn – Inverse of the transformation function.
gvar.del_parameter_distribution()

Delete parameter distribution name.

gvar.add_parameter_parentheses()

Return dictionary with proper keys for parameter distributions (legacy code).

This utility function helps fix legacy code that uses parameter keys like logp or sqrtp instead of log(p) or sqrt(p), as now required. This method creates a copy of dictionary p'' but with keys like ``logp or sqrtp replaced by log(p) or sqrt(p). So setting

p = add_parameter_parentheses(p)

fixes the keys in p for log-normal and sqrt-normal parameters.

Classes for Bayesian Integrals

lsqfit provides support for doing Bayesian integrals, using results from a least-squares fit to optimize the multi-dimensional integral. This is useful for severely non-Gaussian situations. Module vegas is used to do the integrals, using an adaptive Monte Carlo algorithm.

The integrator class is:

class lsqfit.BayesIntegrator(fit, limit=1e15, scale=1, pdf=None, adapt_to_pdf=True, svdcut=1e-15)

vegas integrator for Bayesian fit integrals.

Parameters:
  • fit – Fit from nonlinear_fit.
  • limit (positive float) – Limits the integrations to a finite region of size limit times the standard deviation on either side of the mean. This can be useful if the functions being integrated misbehave for large parameter values (e.g., numpy.exp overflows for a large range of arguments). Default is 1e15.
  • scale (positive float) – The integration variables are rescaled to emphasize parameter values of order scale times the corresponding standard deviations. The rescaling does not change the value of the integral but it can reduce uncertainties in the vegas estimates. Default is 1.0.
  • pdf (callable) – Probability density function pdf(p) of the fit parameters to use in place of the normal PDF associated with the least-squares fit used to create the integrator.
  • adapt_to_pdf (bool) – vegas adapts to the PDF if True (default); otherwise it adapts to f(p) times the PDF.
  • svdcut (non-negative float or None) – If not None, replace covariance matrix of g with a new matrix whose small eigenvalues are modified: eigenvalues smaller than svdcut times the maximum eigenvalue eig_max are replaced by svdcut*eig_max. This can ameliorate problems caused by roundoff errors when inverting the covariance matrix. It increases the uncertainty associated with the modified eigenvalues and so is conservative. Setting svdcut=None or svdcut=0 leaves the covariance matrix unchanged. Default is 1e-15.

BayesIntegrator(fit) is a vegas integrator that evaluates expectation values for the multi-dimensional Bayesian distribution associated with nonlinear_fit fit: the probability density function is the exponential of the chi**2 function (times -1/2), for data and priors, used in the fit. For linear fits, it is equivalent to vegas.PDFIntegrator(fit.p), since the chi**2 function is quadratic in the fit parameters; but they can differ significantly for nonlinear fits.

BayesIntegrator integrates over the entire parameter space but first re-expresses the integrals in terms of variables that diagonalize the covariance matrix of the best-fit parameters fit.p from nonlinear_fit and are centered at the best-fit values. This greatly facilitates the integration using vegas, making integrals over 10s or more of parameters feasible. (The vegas module must be installed separately in order to use BayesIntegrator.)

A simple illustration of BayesIntegrator is given by the following code, which we use to evaluate the mean and standard deviation for s*g where s and g are fit parameters:

import lsqfit
import gvar as gv
import numpy as np

# least-squares fit
x = np.array([0.1, 1.2, 1.9, 3.5])
y = gv.gvar(['1.2(1.0)', '2.4(1)', '2.0(1.2)', '5.2(3.2)'])
prior = gv.gvar(dict(a='0(5)', s='0(2)', g='2(2)'))
def f(x, p):
    return p['a'] + p['s'] * x ** p['g']
fit = lsqfit.nonlinear_fit(data=(x,y), prior=prior, fcn=f, debug=True)
print(fit)

# Bayesian integral to evaluate expectation value of s*g
def g(p):
    sg = p['s'] * p['g']
    return [sg, sg**2]

expval = lsqfit.BayesIntegrator(fit, limit=20.)
warmup = expval(neval=4000, nitn=10)
results = expval(g, neval=4000, nitn=15, adapt=False)
print(results.summary())
print('results =', results, '\n')
sg, sg2 = results
sg_sdev = (sg2 - sg**2) ** 0.5
print('s*g from Bayes integral:  mean =', sg, '  sdev =', sg_sdev)
print('s*g from fit:', fit.p['s'] * fit.p['g'])

where the warmup calls to the integrator are used to adapt it to probability density function from the fit, and then the integrator is used to evaluate the expectation value of g(p), which is returned in array results. Here neval is the (approximate) number of function calls per iteration of the vegas algorithm and nitn is the number of iterations. We use the integrator to calculated the expectation value of s*g and (s*g)**2 so we can compute a mean and standard deviation.

The output from this code shows that the Gaussian approximation for s*g (0.76(66)) is somewhat different from the result obtained from a Bayesian integral (0.48(54)):

Least Square Fit:
  chi2/dof [dof] = 0.32 [4]    Q = 0.87    logGBF = -9.2027

Parameters:
              a    1.61 (90)     [  0.0 (5.0) ]
              s    0.62 (81)     [  0.0 (2.0) ]
              g    1.2 (1.1)     [  2.0 (2.0) ]

Settings:
  svdcut/n = 1e-15/0    reltol/abstol = 0.0001/0*    (itns/time = 10/0.0)

itn   integral        average         chi2/dof        Q
-------------------------------------------------------
  1   1.034(21)       1.034(21)           0.00     1.00
  2   1.034(21)       1.034(15)           0.56     0.64
  3   1.024(18)       1.030(12)           0.37     0.90
  4   1.010(18)       1.0254(98)          0.47     0.89
  5   1.005(17)       1.0213(85)          0.55     0.88
  6   1.013(19)       1.0199(78)          0.69     0.80
  7   0.987(16)       1.0152(70)          0.78     0.72
  8   1.002(18)       1.0135(66)          0.90     0.59
  9   1.036(20)       1.0160(62)          0.86     0.66
 10   1.060(20)       1.0204(60)          0.94     0.55

results = [0.4837(32) 0.5259(47)]

s*g from Bayes integral:  mean = 0.4837(32)   sdev = 0.5403(25)
s*g from fit: 0.78(66)

The table shows estimates of the probability density function’s normalization from each of the vegas iterations used by the integrator to estimate the final results.

In general functions being integrated can return a number, or an array of numbers, or a dictionary whose values are numbers or arrays of numbers. This allows multiple expectation values to be evaluated simultaneously.

See the documentation with the vegas module for more details on its use, and on the attributes and methods associated with integrators. The example above sets adapt=False when computing final results. This gives more reliable error estimates when neval is small. Note that neval may need to be much larger (tens or hundreds of thousands) for more difficult high-dimension integrals.

__call__(f=None, pdf=None, adapt_to_pdf=None, **kargs)

Estimate expectation value of function f(p).

Uses multi-dimensional integration modules vegas to estimate the expectation value of f(p) with respect to the probability density function associated with nonlinear_fit fit.

Parameters:
  • f (callable) – Function f(p) to integrate. Integral is the expectation value of the function with respect to the distribution. The function can return a number, an array of numbers, or a dictionary whose values are numbers or arrays of numbers. Its argument p has the same format as self.fit.pmean (that is, either a number, an array, or a dictionary). Omitting f (or setting it to None) implies that only the PDF is integrated.
  • pdf (callable) – Probability density function pdf(p) of the fit parameters to use in place of the normal PDF associated with the least-squares fit used to create the integrator. The PDF need not be normalized; vegas will normalize it. Ignored if pdf=None (the default).
  • adapt_to_pdf (bool) – vegas adapts to the PDF if True (default); otherwise it adapts to f(p) times the PDF.

All other keyword arguments are passed on to a vegas integrator; see the vegas documentation for further information.

The results returned are similar to what vegas returns but with an extra attribute: results.norm, which contains the vegas estimate for the norm of the PDF. This should equal 1 within errors if the PDF is normalized (and so can serve as a check on the integration in those cases).

A class that describes the Bayesian probability distribution associated with a fit is:

class lsqfit.BayesPDF(fit, svdcut=1e-15)

Bayesian probability density function corresponding to nonlinear_fit fit.

The probability density function is the exponential of -1/2 times the chi**2 function (data and priors) used in fit divided by norm.

Parameters:
  • fit – Fit from nonlinear_fit.
  • svdcut (non-negative float or None) – If not None, replace covariance matrix of g with a new matrix whose small eigenvalues are modified: eigenvalues smaller than svdcut times the maximum eigenvalue eig_max are replaced by svdcut*eig_max. This can ameliorate problems caused by roundoff errors when inverting the covariance matrix. It increases the uncertainty associated with the modified eigenvalues and so is conservative. Setting svdcut=None or svdcut=0 leaves the covariance matrix unchanged. Default is 1e-15.
__call__(p)

Probability density function evaluated at p.

logpdf(p)

Logarithm of the probability density function evaluated at p.

lsqfit.MultiFitter Classes

lsqfit.MultiFitter provides a framework for fitting multiple pieces of data using a set of custom-designed models, derived from lsqfit.MultiFitterModel, each of which encapsulates a particular fit function. This framework was developed to support the corrfitter module, but is more general. Instances of model classes associate specific subsets of the fit data with specific subsets of the fit parameters. This allows fit problems to be broken down down into more manageable pieces, which are then aggregated by lsqfit.MultiFitter into a single fit.

A trivial example of a model would be one that encapsulates a linear fit function:

import numpy as np
import lsqfit

class Linear(lsqfit.MultiFitterModel):
    def __init__(self, datatag, x, intercept, slope):
        super(Linear, self).__init__(datatag)
        # the independent variable
        self.x = np.array(x)
        # keys used to find the intercept and slope in a parameter dictionary
        self.intercept = intercept
        self.slope = slope

    def fitfcn(self, p):
        if self.slope in p:
            return p[self.intercept] + p[self.slope] * self.x
        else:
            # slope parameter marginalized
            return len(self.x) * [p[self.intercept]]

    def buildprior(self, prior, mopt=None, extend=False):
        " Extract the model's parameters from prior. "
        newprior = {}
        newprior[self.intercept] = prior[self.intercept]
        if mopt is None:
            # slope parameter marginalized if mopt is not None
            newprior[self.slope] = prior[self.slope]
        return newprior

    def builddata(self, data):
        " Extract the model's fit data from data. "
        return data[self.datatag]

Imagine four sets of data, each corresponding to x=1,2,3,4, all of which have the same intercept but different slopes:

data = gv.gvar(dict(
    d1=['1.154(10)', '2.107(16)', '3.042(22)', '3.978(29)'],
    d2=['0.692(10)', '1.196(16)', '1.657(22)', '2.189(29)'],
    d3=['0.107(10)', '0.030(16)', '-0.027(22)', '-0.149(29)'],
    d4=['0.002(10)', '-0.197(16)', '-0.382(22)', '-0.627(29)'],
    ))

To find the common intercept, we define a model for each set of data:

models = [
   Linear('d1', x=[1,2,3,4], intercept='a', slope='s1'),
   Linear('d2', x=[1,2,3,4], intercept='a', slope='s2'),
   Linear('d3', x=[1,2,3,4], intercept='a', slope='s3'),
   Linear('d4', x=[1,2,3,4], intercept='a', slope='s4'),
   ]

This says that data['d3'], for example, should be fit with function p['a'] + p['s3'] * np.array([1,2,3,4]) where p is a dictionary of fit parameters. The models here all share the same intercept, but have different slopes. Assume that we know a priori that the intercept and slopes are all order one:

prior = gv.gvar(dict(a='0(1)', s1='0(1)', s2='0(1)', s3='0(1)', s4='0(1)'))

Then we can fit all the data to determine the intercept:

fitter = lsqfit.MultiFitter(models=models)
fit = fitter.lsqfit(data=data, prior=prior)
print(fit)
print('intercept =', fit.p['a'])

The output from this code is:

Least Square Fit:
  chi2/dof [dof] = 0.49 [16]    Q = 0.95    logGBF = 18.793

Parameters:
              a    0.2012 (78)      [  0.0 (1.0) ]
             s1    0.9485 (53)      [  0.0 (1.0) ]
             s2    0.4927 (53)      [  0.0 (1.0) ]
             s3   -0.0847 (53)      [  0.0 (1.0) ]
             s4   -0.2001 (53)      [  0.0 (1.0) ]

Settings:
  svdcut/n = 1e-12/0    tol = (1e-08*,1e-10,1e-10)    (itns/time = 5/0.0)

intercept = 0.2012(78)

Model class Linear is configured to allow marginalization of the slope parameter, if desired. Calling fitter.lsqfit(data=data, prior=prior, mopt=True) moves the slope parameters into the data (by subtracting m.x * prior[m.slope] from the data for each model m), and does a single-parameter fit for the intercept:

Least Square Fit:
  chi2/dof [dof] = 0.49 [16]    Q = 0.95    logGBF = 18.793

Parameters:
              a   0.2012 (78)     [  0.0 (1.0) ]

Settings:
  svdcut/n = 1e-12/0    tol = (1e-08*,1e-10,1e-10)    (itns/time = 4/0.0)

intercept = 0.2012(78)

Marginalization can be useful when fitting large data sets since it reduces the number of fit parameters and simplifies the fit.

Another variation is to replace the simultaneous fit of the four models by a chained fit, where one model is fit at a time and its results are fed into the next fit through that fit’s prior. Replacing the fit code by

fitter = lsqfit.MultiFitter(models=models)
fit = fitter.chained_lsqfit(data=data, prior=prior)
print(fit.formatall())
print('slope =', fit.p['a'])

gives the following output:

========== d1
Least Square Fit:
  chi2/dof [dof] = 0.32 [4]    Q = 0.86    logGBF = 2.0969

Parameters:
              a    0.213 (16)     [  0.0 (1.0) ]
             s1   0.9432 (82)     [  0.0 (1.0) ]

Settings:
  svdcut/n = 1e-12/0    tol = (1e-08*,1e-10,1e-10)    (itns/time = 5/0.0)

========== d2
Least Square Fit:
  chi2/dof [dof] = 0.58 [4]    Q = 0.67    logGBF = 5.3792

Parameters:
              a    0.206 (11)     [  0.213 (16) ]
             s2   0.4904 (64)     [   0.0 (1.0) ]
             s1   0.9462 (64)     [ 0.9432 (82) ]

Settings:
  svdcut/n = 1e-12/0    tol = (1e-08*,1e-10,1e-10)    (itns/time = 5/0.0)

========== d3
Least Square Fit:
  chi2/dof [dof] = 0.66 [4]    Q = 0.62    logGBF = 5.3767

Parameters:
              a    0.1995 (90)      [  0.206 (11) ]
             s3   -0.0840 (57)      [   0.0 (1.0) ]
             s1    0.9493 (57)      [ 0.9462 (64) ]
             s2    0.4934 (57)      [ 0.4904 (64) ]

Settings:
  svdcut/n = 1e-12/0    tol = (1e-08*,1e-10,1e-10)    (itns/time = 4/0.0)

========== d4
Least Square Fit:
  chi2/dof [dof] = 0.41 [4]    Q = 0.81    logGBF = 5.9402

Parameters:
              a    0.2012 (78)      [  0.1995 (90) ]
             s4   -0.2001 (53)      [    0.0 (1.0) ]
             s1    0.9485 (53)      [  0.9493 (57) ]
             s2    0.4927 (53)      [  0.4934 (57) ]
             s3   -0.0847 (53)      [ -0.0840 (57) ]

Settings:
  svdcut/n = 1e-12/0    tol = (1e-08*,1e-10,1e-10)    (itns/time = 4/0.0)

intercept = 0.2012(78)

Note how the value for s1 improves with each fit despite the fact that it appears only in the first fit function. This happens because its value is correlated with that of the intercept a, which appears in every fit function.

Chained fits are most useful with very large data sets when it is possible to break the data into smaller, more manageable chunks. There are a variety of options for organizing the chain of fits that are discussed in the MultiFitter.chained_lsqfit() documentation.

class lsqfit.MultiFitter(models, mopt=None, ratio=False, fast=True, extend=False, **fitterargs)

Nonlinear least-squares fitter for a collection of models.

Parameters:
  • models – List of models, derived from modelfitter.MultiFitterModel, to be fit to the data. Individual models in the list can be replaced by lists of models or tuples of models; see below.
  • mopt (object) – Marginalization options. If not None, marginalization is used to reduce the number of fit parameters. Object mopt is passed to the models when constructing the prior for a fit; it typically indicates the degree of marginalization (in a model-dependent fashion). Setting mopt=None implies no marginalization.
  • ratio (bool) – If True, implement marginalization using ratios: data_marg = data * fitfcn(prior_marg) / fitfcn(prior). If False (default), implement using differences: data_marg = data + (fitfcn(prior_marg) - fitfcn(prior)).
  • fast (bool) – Setting fast=True (default) strips any variable not required by the fit from the prior. This speeds fits but loses information about correlations between variables in the fit and those that are not. The information can be restored using lsqfit.wavg after the fit.
  • extend (bool) – If True supports log-normal and other non-Gaussian priors. See lsqfit documentation for details. Default is False.
  • fitname (callable or None) – Individual fits in a chained fit are assigned default names, constructed from the datatags of the corresponding models, for access and reporting. These names get unwieldy when lots of models are involved. When fitname is not None (default), each default name dname is replaced by fitname(dname).
  • fitterargs – Additional arguments for the lsqfit.nonlinear_fit, such as tol, maxit, svdcut, fitter, etc., as needed.
lsqfit(data=None, prior=None, pdata=None, p0=None, **kargs)

Compute least-squares fit of models to data.

MultiFitter.lsqfit() fits all of the models together, in a single fit. It returns the lsqfit.nonlinear_fit object from the fit.

To see plots of the fit data divided by the fit function with the best-fit parameters use

fit.show_plots()

Plotting requires module matplotlib.

Parameters:
  • data – Input data. One of data or pdata must be specified but not both. pdata is obtained from data by collecting the output from m.builddata(data) for each model m and storing it in a dictionary with key m.datatag.
  • pdata – Input data that has been processed by the models using MultiFitter.process_data() or MultiFitter.process_dataset(). One of data or pdata must be specified but not both.
  • prior – Bayesian prior for fit parameters used by the models.
  • p0 – Dictionary , indexed by parameter labels, containing initial values for the parameters in the fit. Setting p0=None implies that initial values are extracted from the prior. Setting p0="filename" causes the fitter to look in the file with name "filename" for initial values and to write out best-fit parameter values after the fit (for the next call to self.lsqfit()).
  • kargs – Arguments that override parameters specified when the MultiFitter was created. Can also include additional arguments to be passed through to the lsqfit fitter.
chained_lsqfit(data=None, pdata=None, prior=None, p0=None, **kargs)

Compute chained least-squares fit of models to data.

In a chained fit to models [s1, s2, ...], the models are fit one at a time, with the fit output from one being fed into the prior for the next. This can be much faster than fitting the models together, simultaneously. The final result comes from the last fit in the chain, and includes parameters from all of the models.

The most general chain has the structure [s1, s2, s3 ...] where each sn is one of:

  1. a model (derived from multifitter.MultiFitterModel);

  2. a tuple (m1, m2, m3) of models, to be fit together in

    a single fit (i.e., simultaneously);

  3. a list [p1, p2, p3 ...] where each pn is either

    a model or a tuple of models (see #2). The pn are fit separately, and independently of each other (i.e., in parallel). Results from the separate fits are averaged at the end to provide a single composite result for the collection of fits.

The final result fit returned by MultiFitter.chained_fit() has an extra attribute fit.chained_fits which is an ordered dictionary containing fit results from each link sn in the chain, and keyed by the models’ datatags. If any of these involves parallel fits (see #3 above), it will have an extra attribute fit.chained_fits[fittag].sub_fits that contains results from the separate parallel fits. To list results from all the chained and parallel fits, use

print(fit.formatall())

To see plots of the fit data divided by the fit function with the best-fit parameters use

fit.show_plots()

Plotting requires module matplotlib.

Parameters:
  • data – Input data. One of data or pdata must be specified but not both. pdata is obtained from data by collecting the output from m.builddata(data) for each model m and storing it in a dictionary with key m.datatag.
  • pdata – Input data that has been processed by the models using MultiFitter.process_data() or MultiFitter.process_dataset(). One of data or pdata must be specified but not both.
  • prior – Bayesian prior for fit parameters used by the models.
  • p0 – Dictionary , indexed by parameter labels, containing initial values for the parameters in the fit. Setting p0=None implies that initial values are extracted from the prior. Setting p0="filename" causes the fitter to look in the file with name "filename" for initial values and to write out best-fit parameter values after the fit (for the next call to self.lsqfit()).
  • kargs – Arguments that override parameters specified when the MultiFitter was created. Can also include additional arguments to be passed through to the lsqfit fitter.
static process_data(data, models)

Convert data to processed data using models.

Data from dictionary data is processed by each model in list models, and the results collected into a new dictionary pdata for use in MultiFitter.lsqfit() and MultiFitter.chained_lsqft().

static process_dataset(dataset, models, **kargs)

Convert dataset to processed data using models.

:class;`gvar.dataset.Dataset` object dataset is processed by each model in list models, and the results collected into a new dictionary pdata for use in MultiFitter.lsqfit() and MultiFitter.chained_lsqft(). Assumes that the models have defined method MultiFitterModel.builddataset().

static show_plots(fitdata, fitval, x=None, save=False)

Show plots of fitdata[k]/fitval[k] for each key k.

Assumes matplotlib is installed (to make the plots). Plots are shown for one correlator at a time. Press key n to see the next correlator; press key p to see the previous one; press key q to quit the plot and return control to the calling program; press a digit to go directly to one of the first ten plots. Zoom, pan and save using the window controls.

Copies of the plots that are viewed can be saved by setting parameter save=prefix where prefix is a string used to create file names: the file name for the plot corresponding to key k is prefix.format(k). It is important that the filename end with a suffix indicating the type of plot file desired: e.g., prefix='plot-{}.pdf'.

lsqfit.MultiFitter models are derived from the following class. Methods buildprior, builddata, fitfcn, and builddataset are not implemented in this base class. They need to be overwritten by the derived class (except for builddataset which is optional).

class lsqfit.MultiFitterModel(datatag, ncg=1)

Base class for MultiFitter models.

Derived classes must define methods fitfcn, buildprior, and builddata, all of which are described below. In addition they have attributes:

datatag

lsqfit.MultiFitter builds fit data for the correlator by extracting the data labelled by datatag (eg, a string) from an input data set (eg, a dictionary). This label is stored in the MultiFitterModel and must be passed to its constructor. It must be a hashable quantity, like a string or number or tuple of strings and numbers.

ncg

When ncg>1, fit data and functions are coarse-grained by breaking them up into bins of of ncg values and replacing each bin by its average. This can increase the fitting speed, because their is less data, without much loss of precision if the data elements within a bin are highly correlated.

Parameters:
  • datatag – Label used to identify model’s data.
  • ncg (int) – Size of bins for coarse graining (default is ncg=1).
buildprior(prior, mopt=None, extend=False)

Extract fit prior from prior.

Returns a dictionary containing the part of dictionary prior that is relevant to this model’s fit. The code could be as simple as collecting the appropriate pieces: e.g.,

def buildprior(self, prior, mopt=None, extend=False):
    mprior = gv.BufferDict()
    model_keys = [...]
    for k in model_keys:
        mprior[k] = prior[k]
    return mprior

where model_keys is a list of keys corresponding to the model’s parameters. Supporting the extend option requires a slight modification: e.g.,

def buildprior(self, prior, mopt=None, extend=False):
    mprior = gv.BufferDict()
    model_keys = [...]
    for k in self.get_prior_keys(prior, model_keys, extend):
        mprior[k] = prior[k]
    return mprior

Marginalization involves omitting some of the fit parameters from the model’s prior. mopt=None implies no marginalization. Otherwise mopt will typically contain information about what and how much to marginalize.

Parameters:
  • prior – Dictionary containing a priori estimates of all fit parameters.
  • mopt (object) – Marginalization options. Ignore if None. Otherwise marginalize fit parameters as specified by mopt. mopt can be any type of Python object; it is used only in buildprior and is passed through to it unchanged.
  • extend (bool) – If True supports log-normal and other non-Gaussian priors. See lsqfit documentation for details.
builddata(data)

Extract fit data corresponding to this model from data set data.

The fit data is returned in a 1-dimensional array; the fitfcn must return arrays of the same length.

Parameters:data – Data set containing the fit data for all models. This is typically a dictionary, whose keys are the datatags of the models.
fitfcn(p)

Compute fit function fit for parameters p.

Results are returned in a 1-dimensional array the same length as (and corresponding to) the fit data returned by self.builddata(data).

If marginalization is supported, fitfcn must work with or without the marginalized parameters.

Parameters:p – Dictionary of parameter values.
builddataset(dataset)

Extract fit dataset from gvar.dataset.Dataset dataset.

The code

import gvar as gv

data = gv.dataset.avg_data(m.builddataset(dataset))

that builds data for model m should be functionally equivalent to

import gvar as gv

data = m.builddata(gv.dataset.avg_data(dataset))

This method is optional. It is used only by MultiFitter.process_dataset().

Parameters:datasetgvar.dataset.Dataset dataset containing the fit data for all models. This is typically a dictionary, whose keys are the datatags of the models.
static get_prior_keys(prior, keys, extend=False)

Return list of keys in dictionary prior for keys in list keys.

List keys is returned if extend=False. Otherwise the keys returned may differ from those in keys. For example, a prior that has a key log(x) would return that key in place of a key x in list keys. This support non-Gaussian priors as discussed in the lsqfit documentation.

static prior_key()

Find base key in prior corresponding to k.

lsqfit.MultiFitter was inspired by an unconventional

Requirements

lsqfit relies heavily on the gvar, and numpy modules. Also the fitting and minimization routines are from the Gnu Scientific Library (GSL) and/or the Python scipy module.