1"""
2Linear mixed effects models are regression models for dependent data.
3They can be used to estimate regression relationships involving both
4means and variances.
6These models are also known as multilevel linear models, and
7hierarchical linear models.
9The MixedLM class fits linear mixed effects models to data, and
10provides support for some common post-estimation tasks. This is a
11group-based implementation that is most efficient for models in which
12the data can be partitioned into independent groups. Some models with
13crossed effects can be handled by specifying a model with a single
14group.
16The data are partitioned into disjoint groups. The probability model
17for group i is:
19Y = X*beta + Z*gamma + epsilon
21where
23* n_i is the number of observations in group i
25* Y is a n_i dimensional response vector (called endog in MixedLM)
27* X is a n_i x k_fe dimensional design matrix for the fixed effects
28 (called exog in MixedLM)
30* beta is a k_fe-dimensional vector of fixed effects parameters
31 (called fe_params in MixedLM)
33* Z is a design matrix for the random effects with n_i rows (called
34 exog_re in MixedLM). The number of columns in Z can vary by group
35 as discussed below.
37* gamma is a random vector with mean 0. The covariance matrix for the
38 first `k_re` elements of `gamma` (called cov_re in MixedLM) is
39 common to all groups. The remaining elements of `gamma` are
40 variance components as discussed in more detail below. Each group
41 receives its own independent realization of gamma.
43* epsilon is a n_i dimensional vector of iid normal
44 errors with mean 0 and variance sigma^2; the epsilon
45 values are independent both within and between groups
47Y, X and Z must be entirely observed. beta, Psi, and sigma^2 are
48estimated using ML or REML estimation, and gamma and epsilon are
49random so define the probability model.
51The marginal mean structure is E[Y | X, Z] = X*beta. If only the mean
52structure is of interest, GEE is an alternative to using linear mixed
53models.
55Two types of random effects are supported. Standard random effects
56are correlated with each other in arbitrary ways. Every group has the
57same number (`k_re`) of standard random effects, with the same joint
58distribution (but with independent realizations across the groups).
60Variance components are uncorrelated with each other, and with the
61standard random effects. Each variance component has mean zero, and
62all realizations of a given variance component have the same variance
63parameter. The number of realized variance components per variance
64parameter can differ across the groups.
66The primary reference for the implementation details is:
68MJ Lindstrom, DM Bates (1988). "Newton Raphson and EM algorithms for
69linear mixed effects models for repeated measures data". Journal of
70the American Statistical Association. Volume 83, Issue 404, pages
711014-1022.
73See also this more recent document:
75http://econ.ucsb.edu/~doug/245a/Papers/Mixed%20Effects%20Implement.pdf
77All the likelihood, gradient, and Hessian calculations closely follow
78Lindstrom and Bates 1988, adapted to support variance components.
80The following two documents are written more from the perspective of
81users:
83http://lme4.r-forge.r-project.org/lMMwR/lrgprt.pdf
85http://lme4.r-forge.r-project.org/slides/2009-07-07-Rennes/3Longitudinal-4.pdf
87Notation:
89* `cov_re` is the random effects covariance matrix (referred to above
90 as Psi) and `scale` is the (scalar) error variance. For a single
91 group, the marginal covariance matrix of endog given exog is scale*I
92 + Z * cov_re * Z', where Z is the design matrix for the random
93 effects in one group.
95* `vcomp` is a vector of variance parameters. The length of `vcomp`
96 is determined by the number of keys in either the `exog_vc` argument
97 to ``MixedLM``, or the `vc_formula` argument when using formulas to
98 fit a model.
100Notes:
1021. Three different parameterizations are used in different places.
103The regression slopes (usually called `fe_params`) are identical in
104all three parameterizations, but the variance parameters differ. The
105parameterizations are:
107* The "user parameterization" in which cov(endog) = scale*I + Z *
108 cov_re * Z', as described above. This is the main parameterization
109 visible to the user.
111* The "profile parameterization" in which cov(endog) = I +
112 Z * cov_re1 * Z'. This is the parameterization of the profile
113 likelihood that is maximized to produce parameter estimates.
114 (see Lindstrom and Bates for details). The "user" cov_re is
115 equal to the "profile" cov_re1 times the scale.
117* The "square root parameterization" in which we work with the Cholesky
118 factor of cov_re1 instead of cov_re directly. This is hidden from the
119 user.
121All three parameterizations can be packed into a vector by
122(optionally) concatenating `fe_params` together with the lower
123triangle or Cholesky square root of the dependence structure, followed
124by the variance parameters for the variance components. The are
125stored as square roots if (and only if) the random effects covariance
126matrix is stored as its Choleky factor. Note that when unpacking, it
127is important to either square or reflect the dependence structure
128depending on which parameterization is being used.
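
As a minimal sketch of the round trip (an illustration only, assuming
one fixed effect, one random effect, no variance components, and numpy
imported as ``np``):

>>> pa = MixedLMParams.from_components(fe_params=np.r_[1.0],
...                                    cov_re=np.asarray([[2.0]]))
>>> vec = pa.get_packed(use_sqrt=True, has_fe=True)  # [1., sqrt(2.)]
>>> pa2 = MixedLMParams.from_packed(vec, k_fe=1, k_re=1,
...                                 use_sqrt=True, has_fe=True)
>>> float(pa2.cov_re[0, 0])  # squared back to the user scale
2.0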

Two score methods are implemented. One takes the score with respect
to the elements of the random effects covariance matrix (used for
inference once the MLE is reached), and the other takes the score with
respect to the parameters of the Cholesky square root of the random
effects covariance matrix (used for optimization).

The numerical optimization uses GLS to avoid explicitly optimizing
over the fixed effects parameters. The likelihood that is optimized
is profiled over both the scale parameter (a scalar) and the fixed
effects parameters (if any). As a result of this profiling, it is
difficult and unnecessary to calculate the Hessian of the profiled log
likelihood function, so that calculation is not implemented here.
Therefore, optimization methods requiring the Hessian matrix such as
the Newton-Raphson algorithm cannot be used for model fitting.
"""

import numpy as np
import statsmodels.base.model as base
from statsmodels.tools.decorators import cache_readonly
from statsmodels.tools import data as data_tools
from scipy.stats.distributions import norm
from scipy import sparse
import pandas as pd
import patsy
from collections import OrderedDict
import warnings
from statsmodels.tools.sm_exceptions import ConvergenceWarning
from statsmodels.base._penalties import Penalty


def _dot(x, y):
    """
    Returns the dot product of the arrays, works for sparse and dense.
    """

    if isinstance(x, np.ndarray) and isinstance(y, np.ndarray):
        return np.dot(x, y)
    elif sparse.issparse(x):
        return x.dot(y)
    elif sparse.issparse(y):
        return y.T.dot(x.T).T


# From numpy, adapted to work with sparse and dense arrays.
def _multi_dot_three(A, B, C):
    """
    Find best ordering for three arrays and do the multiplication.

    Doing it manually instead of using dynamic programming is
    approximately 15 times faster.
    """
    # cost1 = cost((AB)C)
    cost1 = (A.shape[0] * A.shape[1] * B.shape[1] +  # (AB)
             A.shape[0] * B.shape[1] * C.shape[1])   # (--)C
    # cost2 = cost(A(BC))
    cost2 = (B.shape[0] * B.shape[1] * C.shape[1] +  # (BC)
             A.shape[0] * A.shape[1] * C.shape[1])   # A(--)

    if cost1 < cost2:
        return _dot(_dot(A, B), C)
    else:
        return _dot(A, _dot(B, C))


def _dotsum(x, y):
    """
    Returns sum(x * y), where '*' is the pointwise product, computed
    efficiently for dense and sparse matrices.
    """

    if sparse.issparse(x):
        return x.multiply(y).sum()
    else:
        # This way usually avoids allocating a temporary.
        return np.dot(x.ravel(), y.ravel())


class VCSpec(object):
    """
    Define the variance component structure of a multilevel model.

    An instance of the class contains three attributes:

    - names : names[k] is the name of variance component k.

    - mats : mats[k][i] is the design matrix for group index
      i in variance component k.

    - colnames : colnames[k][i] is the list of column names for
      mats[k][i].

    The groups in colnames and mats must be in sorted order.
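
    As an illustrative sketch (assuming two groups, one variance
    component whose per-group design matrices have already been built,
    and numpy imported as ``np``):

    >>> names = ["classroom"]
    >>> colnames = [[["c1", "c2"], ["c1"]]]
    >>> mats = [[np.ones((5, 2)), np.ones((4, 1))]]
    >>> vcspec = VCSpec(names, colnames, mats)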
222 """
224 def __init__(self, names, colnames, mats):
225 self.names = names
226 self.colnames = colnames
227 self.mats = mats


def _get_exog_re_names(self, exog_re):
    """
    Passes through if given a list of names. Otherwise, gets pandas names
    or creates some generic variable names as needed.
    """
    if self.k_re == 0:
        return []
    if isinstance(exog_re, pd.DataFrame):
        return exog_re.columns.tolist()
    elif isinstance(exog_re, pd.Series) and exog_re.name is not None:
        return [exog_re.name]
    elif isinstance(exog_re, list):
        return exog_re

    # Default names
    defnames = ["x_re{0:1d}".format(k + 1) for k in range(exog_re.shape[1])]
    return defnames


class MixedLMParams(object):
    """
    This class represents a parameter state for a mixed linear model.

    Parameters
    ----------
    k_fe : int
        The number of covariates with fixed effects.
    k_re : int
        The number of covariates with random coefficients (excluding
        variance components).
    k_vc : int
        The number of variance components parameters.

    Notes
    -----
    This object represents the parameter state for the model in which
    the scale parameter has been profiled out.
    """

    def __init__(self, k_fe, k_re, k_vc):

        self.k_fe = k_fe
        self.k_re = k_re
        self.k_re2 = k_re * (k_re + 1) // 2
        self.k_vc = k_vc
        self.k_tot = self.k_fe + self.k_re2 + self.k_vc
        self._ix = np.tril_indices(self.k_re)

    def from_packed(params, k_fe, k_re, use_sqrt, has_fe):
        """
        Create a MixedLMParams object from packed parameter vector.

        Parameters
        ----------
        params : array_like
            The model parameters packed into a single vector.
        k_fe : int
            The number of covariates with fixed effects
        k_re : int
            The number of covariates with random effects (excluding
            variance components).
        use_sqrt : bool
            If True, the random effects covariance matrix is provided
            as its Cholesky factor, otherwise the lower triangle of
            the covariance matrix is stored.
        has_fe : bool
            If True, `params` contains fixed effects parameters.
            Otherwise, the fixed effects parameters are set to zero.

        Returns
        -------
        A MixedLMParams object.
        """
        k_re2 = int(k_re * (k_re + 1) / 2)

        # The number of covariance parameters.
        if has_fe:
            k_vc = len(params) - k_fe - k_re2
        else:
            k_vc = len(params) - k_re2

        pa = MixedLMParams(k_fe, k_re, k_vc)

        cov_re = np.zeros((k_re, k_re))
        ix = pa._ix
        if has_fe:
            pa.fe_params = params[0:k_fe]
            cov_re[ix] = params[k_fe:k_fe+k_re2]
        else:
            pa.fe_params = np.zeros(k_fe)
            cov_re[ix] = params[0:k_re2]

        if use_sqrt:
            cov_re = np.dot(cov_re, cov_re.T)
        else:
            cov_re = (cov_re + cov_re.T) - np.diag(np.diag(cov_re))

        pa.cov_re = cov_re
        if k_vc > 0:
            if use_sqrt:
                pa.vcomp = params[-k_vc:]**2
            else:
                pa.vcomp = params[-k_vc:]
        else:
            pa.vcomp = np.array([])

        return pa

    from_packed = staticmethod(from_packed)

    def from_components(fe_params=None, cov_re=None, cov_re_sqrt=None,
                        vcomp=None):
        """
        Create a MixedLMParams object from each parameter component.

        Parameters
        ----------
        fe_params : array_like
            The fixed effects parameter (a 1-dimensional array). If
            None, there are no fixed effects.
        cov_re : array_like
            The random effects covariance matrix (a square, symmetric
            2-dimensional array).
        cov_re_sqrt : array_like
            The Cholesky (lower triangular) square root of the random
            effects covariance matrix.
        vcomp : array_like
            The variance component parameters. If None, there are no
            variance components.

        Returns
        -------
        A MixedLMParams object.
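
        Examples
        --------
        A minimal sketch (assuming numpy is imported as ``np``):

        >>> pa = MixedLMParams.from_components(
        ...     fe_params=np.r_[1.0, 2.0], cov_re=np.eye(1))
        >>> pa.k_fe, pa.k_re
        (2, 1)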
363 """
365 if vcomp is None:
366 vcomp = np.empty(0)
367 if fe_params is None:
368 fe_params = np.empty(0)
369 if cov_re is None and cov_re_sqrt is None:
370 cov_re = np.empty((0, 0))
372 k_fe = len(fe_params)
373 k_vc = len(vcomp)
374 k_re = cov_re.shape[0] if cov_re is not None else cov_re_sqrt.shape[0]
376 pa = MixedLMParams(k_fe, k_re, k_vc)
377 pa.fe_params = fe_params
378 if cov_re_sqrt is not None:
379 pa.cov_re = np.dot(cov_re_sqrt, cov_re_sqrt.T)
380 elif cov_re is not None:
381 pa.cov_re = cov_re
383 pa.vcomp = vcomp
385 return pa
387 from_components = staticmethod(from_components)

    def copy(self):
        """
        Returns a copy of the object.
        """
        obj = MixedLMParams(self.k_fe, self.k_re, self.k_vc)
        obj.fe_params = self.fe_params.copy()
        obj.cov_re = self.cov_re.copy()
        obj.vcomp = self.vcomp.copy()
        return obj

    def get_packed(self, use_sqrt, has_fe=False):
        """
        Return the model parameters packed into a single vector.

        Parameters
        ----------
        use_sqrt : bool
            If True, the Cholesky square root of `cov_re` is
            included in the packed result. Otherwise the
            lower triangle of `cov_re` is included.
        has_fe : bool
            If True, the fixed effects parameters are included
            in the packed result, otherwise they are omitted.
        """

        if self.k_re > 0:
            if use_sqrt:
                L = np.linalg.cholesky(self.cov_re)
                cpa = L[self._ix]
            else:
                cpa = self.cov_re[self._ix]
        else:
            cpa = np.zeros(0)

        if use_sqrt:
            vcomp = np.sqrt(self.vcomp)
        else:
            vcomp = self.vcomp

        if has_fe:
            pa = np.concatenate((self.fe_params, cpa, vcomp))
        else:
            pa = np.concatenate((cpa, vcomp))

        return pa


def _smw_solver(s, A, AtA, Qi, di):
    r"""
    Returns a solver for the linear system:

    .. math::

        (sI + ABA^\prime) y = x

    The returned function f satisfies f(x) = y as defined above.

    B and its inverse matrix are block diagonal. The upper left block
    of :math:`B^{-1}` is Qi and its lower right block is diag(di).

    Parameters
    ----------
    s : scalar
        See above for usage
    A : ndarray
        p x q matrix, in general q << p, may be sparse.
    AtA : square ndarray
        :math:`A^\prime A`, a q x q matrix.
    Qi : square symmetric ndarray
        The matrix `B` is q x q, where q = r + d. `B` consists of an
        r x r diagonal block whose inverse is `Qi`, and a d x d
        diagonal block, whose inverse is diag(di).
    di : 1d array_like
        See documentation for Qi.

    Returns
    -------
    A function for solving a linear system, as documented above.

    Notes
    -----
    Uses Sherman-Morrison-Woodbury identity:
    https://en.wikipedia.org/wiki/Woodbury_matrix_identity
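
    As a quick self-check (a sketch, taking B to be the identity so
    that Qi = I and di = 1, with numpy imported as ``np``), the solver
    can be compared to a dense solve:

    >>> s, p, r, d = 2.0, 6, 2, 3
    >>> A = np.random.randn(p, r + d)
    >>> f = _smw_solver(s, A, A.T @ A, np.eye(r), np.ones(d))
    >>> x = np.random.randn(p)
    >>> np.allclose(f(x), np.linalg.solve(s * np.eye(p) + A @ A.T, x))
    True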
472 """
474 # Use SMW identity
475 qmat = AtA / s
476 if sparse.issparse(qmat):
477 qmat = qmat.todense()
478 m = Qi.shape[0]
479 qmat[0:m, 0:m] += Qi
480 d = qmat.shape[0]
481 qmat.flat[m*(d+1)::d+1] += di
482 if sparse.issparse(A):
483 qmati = sparse.linalg.spsolve(sparse.csc_matrix(qmat), A.T)
484 else:
485 qmati = np.linalg.solve(qmat, A.T)
487 if sparse.issparse(A):
488 def solver(rhs):
489 ql = qmati.dot(rhs)
490 ql = A.dot(ql)
491 return rhs / s - ql / s**2
492 else:
493 def solver(rhs):
494 ql = np.dot(qmati, rhs)
495 ql = np.dot(A, ql)
496 return rhs / s - ql / s**2
498 return solver


def _smw_logdet(s, A, AtA, Qi, di, B_logdet):
    r"""
    Returns the log determinant of

    .. math::

        sI + ABA^\prime

    Uses the matrix determinant lemma to accelerate the calculation.
    B is assumed to be positive definite, and s > 0, therefore the
    determinant is positive.

    Parameters
    ----------
    s : positive scalar
        See above for usage
    A : ndarray
        p x q matrix, in general q << p.
    AtA : square ndarray
        :math:`A^\prime A`, a q x q matrix.
    Qi : square symmetric ndarray
        The matrix `B` is q x q, where q = r + d. `B` consists of an
        r x r diagonal block whose inverse is `Qi`, and a d x d
        diagonal block, whose inverse is diag(di).
    di : 1d array_like
        See documentation for Qi.
    B_logdet : real
        The log determinant of B

    Returns
    -------
    The log determinant of s*I + A*B*A'.

    Notes
    -----
    Uses the matrix determinant lemma:
    https://en.wikipedia.org/wiki/Matrix_determinant_lemma
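
    A corresponding self-check (again a sketch taking B to be the
    identity, so B_logdet is 0):

    >>> s, p = 2.0, 6
    >>> A = np.random.randn(p, 3)
    >>> ld = _smw_logdet(s, A, A.T @ A, np.eye(2), np.ones(1), 0.0)
    >>> np.allclose(ld, np.linalg.slogdet(s * np.eye(p) + A @ A.T)[1])
    True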
538 """
540 p = A.shape[0]
541 ld = p * np.log(s)
542 qmat = AtA / s
543 m = Qi.shape[0]
544 qmat[0:m, 0:m] += Qi
545 d = qmat.shape[0]
546 qmat.flat[m*(d+1)::d+1] += di
547 _, ld1 = np.linalg.slogdet(qmat)
548 return B_logdet + ld + ld1


def _convert_vc(exog_vc):

    vc_names = []
    vc_colnames = []
    vc_mats = []

    # Get the groups in sorted order
    groups = set([])
    for k, v in exog_vc.items():
        groups |= set(v.keys())
    groups = list(groups)
    groups.sort()

    for k, v in exog_vc.items():
        vc_names.append(k)
        colnames, mats = [], []
        for g in groups:
            try:
                colnames.append(v[g].columns)
            except AttributeError:
                colnames.append([str(j) for j in range(v[g].shape[1])])
            mats.append(v[g])
        vc_colnames.append(colnames)
        vc_mats.append(mats)

    ii = np.argsort(vc_names)
    vc_names = [vc_names[i] for i in ii]
    vc_colnames = [vc_colnames[i] for i in ii]
    vc_mats = [vc_mats[i] for i in ii]

    return VCSpec(vc_names, vc_colnames, vc_mats)


class MixedLM(base.LikelihoodModel):
    """
    Linear Mixed Effects Model

    Parameters
    ----------
    endog : 1d array_like
        The dependent variable
    exog : 2d array_like
        A matrix of covariates used to determine the
        mean structure (the "fixed effects" covariates).
    groups : 1d array_like
        A vector of labels determining the groups -- data from
        different groups are independent
    exog_re : 2d array_like
        A matrix of covariates used to determine the variance and
        covariance structure (the "random effects" covariates). If
        None, defaults to a random intercept for each group.
    exog_vc : VCSpec instance or dict-like (deprecated)
        A VCSpec instance defines the structure of the variance
        components in the model. Alternatively, see notes below
        for a dictionary-based format. The dictionary format is
        deprecated and may be removed at some point in the future.
    use_sqrt : bool
        If True, optimization is carried out using the lower
        triangle of the square root of the random effects
        covariance matrix, otherwise it is carried out using the
        lower triangle of the random effects covariance matrix.
    missing : str
        The approach to missing data handling

    Notes
    -----
    If `exog_vc` is not a `VCSpec` instance, then it must be a
    dictionary of dictionaries. Specifically, `exog_vc[a][g]` is a
    matrix whose columns are linearly combined using independent
    random coefficients. This random term then contributes to the
    variance structure of the data for group `g`. The random
    coefficients all have mean zero, and have the same variance. The
    matrix must be `m x k`, where `m` is the number of observations in
    group `g`. The number of columns may differ among the top-level
    groups.

    The covariates in `exog`, `exog_re` and `exog_vc` may (but need
    not) partially or wholly overlap.

    `use_sqrt` should almost always be set to True. The main use case
    for use_sqrt=False is when complicated patterns of fixed values in
    the covariance structure are set (using the `free` argument to
    `fit`) that cannot be expressed in terms of the Cholesky factor L.

    Examples
    --------
    A basic mixed model with fixed effects for the columns of
    ``exog`` and a random intercept for each distinct value of
    ``group``:

    >>> model = sm.MixedLM(endog, exog, groups)
    >>> result = model.fit()

    A mixed model with fixed effects for the columns of ``exog`` and
    correlated random coefficients for the columns of ``exog_re``:

    >>> model = sm.MixedLM(endog, exog, groups, exog_re=exog_re)
    >>> result = model.fit()

    A mixed model with fixed effects for the columns of ``exog`` and
    independent random coefficients for the columns of ``exog_re``:

    >>> free = MixedLMParams.from_components(
    ...     fe_params=np.ones(exog.shape[1]),
    ...     cov_re=np.eye(exog_re.shape[1]))
    >>> model = sm.MixedLM(endog, exog, groups, exog_re=exog_re)
    >>> result = model.fit(free=free)

    A different way to specify independent random coefficients for the
    columns of ``exog_re``. In this example ``groups`` must be a
    Pandas Series with compatible indexing with ``exog_re``, and
    ``exog_re`` has two columns.

    >>> g = groups.groupby(groups).groups
    >>> vc = {}
    >>> vc['1'] = {k : exog_re.loc[g[k], 0] for k in g}
    >>> vc['2'] = {k : exog_re.loc[g[k], 1] for k in g}
    >>> model = sm.MixedLM(endog, exog, groups, exog_vc=vc)
    >>> result = model.fit()
    """

    def __init__(self, endog, exog, groups, exog_re=None,
                 exog_vc=None, use_sqrt=True, missing='none',
                 **kwargs):

        _allowed_kwargs = ["missing_idx", "design_info", "formula"]
        for x in kwargs.keys():
            if x not in _allowed_kwargs:
                raise ValueError(
                    "argument %s not permitted for MixedLM initialization" % x)

        self.use_sqrt = use_sqrt

        # Some defaults
        self.reml = True
        self.fe_pen = None
        self.re_pen = None

        if isinstance(exog_vc, dict):
            warnings.warn("Using deprecated variance components format")
            # Convert from old to new representation
            exog_vc = _convert_vc(exog_vc)

        if exog_vc is not None:
            self.k_vc = len(exog_vc.names)
            self.exog_vc = exog_vc
        else:
            self.k_vc = 0
            self.exog_vc = VCSpec([], [], [])

        # If there is one covariate, it may be passed in as a column
        # vector, convert these to 2d arrays.
        # TODO: Can this be moved up in the class hierarchy?
        #       yes, it should be done up the hierarchy
        if (exog is not None and
                data_tools._is_using_ndarray_type(exog, None) and
                exog.ndim == 1):
            exog = exog[:, None]
        if (exog_re is not None and
                data_tools._is_using_ndarray_type(exog_re, None) and
                exog_re.ndim == 1):
            exog_re = exog_re[:, None]

        # Calling super creates self.endog, etc. as ndarrays and the
        # original exog, endog, etc. are self.data.endog, etc.
        super(MixedLM, self).__init__(endog, exog, groups=groups,
                                      exog_re=exog_re, missing=missing,
                                      **kwargs)

        self._init_keys.extend(["use_sqrt", "exog_vc"])

        # Number of fixed effects parameters
        self.k_fe = exog.shape[1]

        if exog_re is None and len(self.exog_vc.names) == 0:
            # Default random effects structure (random intercepts).
            self.k_re = 1
            self.k_re2 = 1
            self.exog_re = np.ones((len(endog), 1), dtype=np.float64)
            self.data.exog_re = self.exog_re
            names = ['Group Var']
            self.data.param_names = self.exog_names + names
            self.data.exog_re_names = names
            self.data.exog_re_names_full = names

        elif exog_re is not None:
            # Process exog_re the same way that exog is handled
            # upstream
            # TODO: this is wrong and should be handled upstream wholly
            self.data.exog_re = exog_re
            self.exog_re = np.asarray(exog_re)
            if self.exog_re.ndim == 1:
                self.exog_re = self.exog_re[:, None]
            # Model dimensions
            # Number of random effect covariates
            self.k_re = self.exog_re.shape[1]
            # Number of covariance parameters
            self.k_re2 = self.k_re * (self.k_re + 1) // 2

        else:
            # All random effects are variance components
            self.k_re = 0
            self.k_re2 = 0

        if not self.data._param_names:
            # HACK: could have been set in from_formula already
            # needs refactor
            (param_names, exog_re_names,
             exog_re_names_full) = self._make_param_names(exog_re)
            self.data.param_names = param_names
            self.data.exog_re_names = exog_re_names
            self.data.exog_re_names_full = exog_re_names_full

        self.k_params = self.k_fe + self.k_re2

        # Convert the data to the internal representation, which is a
        # list of arrays, corresponding to the groups.
        group_labels = list(set(groups))
        group_labels.sort()
        row_indices = dict((s, []) for s in group_labels)
        for i, g in enumerate(groups):
            row_indices[g].append(i)
        self.row_indices = row_indices
        self.group_labels = group_labels
        self.n_groups = len(self.group_labels)

        # Split the data by groups
        self.endog_li = self.group_list(self.endog)
        self.exog_li = self.group_list(self.exog)
        self.exog_re_li = self.group_list(self.exog_re)

        # Precompute this.
        if self.exog_re is None:
            self.exog_re2_li = None
        else:
            self.exog_re2_li = [np.dot(x.T, x) for x in self.exog_re_li]

        # The total number of observations, summed over all groups
        self.nobs = len(self.endog)
        self.n_totobs = self.nobs

        # Set the fixed effects parameter names
        if self.exog_names is None:
            self.exog_names = ["FE%d" % (k + 1) for k in
                               range(self.exog.shape[1])]

        # Precompute this
        self._aex_r = []
        self._aex_r2 = []
        for i in range(self.n_groups):
            a = self._augment_exog(i)
            self._aex_r.append(a)

            # This matrix is not very sparse so convert it to dense.
            ma = _dot(a.T, a)
            if sparse.issparse(ma):
                ma = ma.todense()
            self._aex_r2.append(ma)

        # Precompute this
        self._lin, self._quad = self._reparam()

    def _make_param_names(self, exog_re):
        """
        Returns the full parameter names list, just the exogenous random
        effects variables, and the exogenous random effects variables with
        the interaction terms.
        """
        exog_names = list(self.exog_names)
        exog_re_names = _get_exog_re_names(self, exog_re)
        param_names = []

        jj = self.k_fe
        for i in range(len(exog_re_names)):
            for j in range(i + 1):
                if i == j:
                    param_names.append(exog_re_names[i] + " Var")
                else:
                    param_names.append(exog_re_names[j] + " x " +
                                       exog_re_names[i] + " Cov")
                jj += 1

        vc_names = [x + " Var" for x in self.exog_vc.names]

        return exog_names + param_names + vc_names, exog_re_names, param_names

    @classmethod
    def from_formula(cls, formula, data, re_formula=None, vc_formula=None,
                     subset=None, use_sparse=False, missing='none', *args,
                     **kwargs):
        """
        Create a Model from a formula and dataframe.

        Parameters
        ----------
        formula : str or generic Formula object
            The formula specifying the model
        data : array_like
            The data for the model. See Notes.
        re_formula : str
            A one-sided formula defining the variance structure of the
            model. The default gives a random intercept for each
            group.
        vc_formula : dict-like
            Formulas describing variance components. `vc_formula[vc]` is
            the formula for the component with variance parameter named
            `vc`. The formula is processed into a matrix, and the columns
            of this matrix are linearly combined with independent random
            coefficients having mean zero and a common variance.
        subset : array_like
            An array-like object of booleans, integers, or index
            values that indicate the subset of df to use in the
            model. Assumes df is a `pandas.DataFrame`
        missing : str
            Either 'none' or 'drop'
        args : extra arguments
            These are passed to the model
        kwargs : extra keyword arguments
            These are passed to the model with one exception. The
            ``eval_env`` keyword is passed to patsy. It can be either a
            :class:`patsy:patsy.EvalEnvironment` object or an integer
            indicating the depth of the namespace to use. For example, the
            default ``eval_env=0`` uses the calling namespace. If you wish
            to use a "clean" environment set ``eval_env=-1``.

        Returns
        -------
        model : Model instance

        Notes
        -----
        `data` must define __getitem__ with the keys in the formula
        terms; e.g., it can be a numpy structured or rec array, a
        dictionary, or a pandas DataFrame. args and kwargs are passed
        on to the model instantiation.

        If the variance component is intended to produce random
        intercepts for disjoint subsets of a group, specified by
        string labels or a categorical data value, always use '0 +' in
        the formula so that no overall intercept is included.

        If the variance components specify random slopes and you do
        not also want a random group-level intercept in the model,
        then use '0 +' in the formula to exclude the intercept.

        The variance components formulas are processed separately for
        each group. If a variable is categorical the results will not
        be affected by whether the group labels are distinct or
        re-used over the top-level groups.

        Examples
        --------
        Suppose we have data from an educational study with students
        nested in classrooms nested in schools. The students take a
        test, and we want to relate the test scores to the students'
        ages, while accounting for the effects of classrooms and
        schools. The school will be the top-level group, and the
        classroom is a nested group that is specified as a variance
        component. Note that the schools may have different numbers
        of classrooms, and the classroom labels may (but need not) be
        different across the schools.

        >>> vc = {'classroom': '0 + C(classroom)'}
        >>> MixedLM.from_formula('test_score ~ age', vc_formula=vc, \
                                 re_formula='1', groups='school', data=data)

        Now suppose we also have a previous test score called
        'pretest'. If we want the relationship between pretest
        scores and the current test to vary by classroom, we can
        specify a random slope for the pretest score

        >>> vc = {'classroom': '0 + C(classroom)', 'pretest': '0 + pretest'}
        >>> MixedLM.from_formula('test_score ~ age + pretest', vc_formula=vc, \
                                 re_formula='1', groups='school', data=data)

        The following model is almost equivalent to the previous one,
        but here the classroom random intercept and pretest slope may
        be correlated.

        >>> vc = {'classroom': '0 + C(classroom)'}
        >>> MixedLM.from_formula('test_score ~ age + pretest', vc_formula=vc, \
                                 re_formula='1 + pretest', groups='school', \
                                 data=data)
        """
936 if "groups" not in kwargs.keys():
937 raise AttributeError("'groups' is a required keyword argument " +
938 "in MixedLM.from_formula")
939 groups = kwargs["groups"]
941 # If `groups` is a variable name, retrieve the data for the
942 # groups variable.
943 group_name = "Group"
944 if isinstance(groups, str):
945 group_name = groups
946 groups = np.asarray(data[groups])
947 else:
948 groups = np.asarray(groups)
949 del kwargs["groups"]
951 # Bypass all upstream missing data handling to properly handle
952 # variance components
953 if missing == 'drop':
954 data, groups = _handle_missing(data, groups, formula, re_formula,
955 vc_formula)
956 missing = 'none'
958 if re_formula is not None:
959 if re_formula.strip() == "1":
960 # Work around Patsy bug, fixed by 0.3.
961 exog_re = np.ones((data.shape[0], 1))
962 exog_re_names = [group_name]
963 else:
964 eval_env = kwargs.get('eval_env', None)
965 if eval_env is None:
966 eval_env = 1
967 elif eval_env == -1:
968 from patsy import EvalEnvironment
969 eval_env = EvalEnvironment({})
970 exog_re = patsy.dmatrix(re_formula, data, eval_env=eval_env)
971 exog_re_names = exog_re.design_info.column_names
972 exog_re_names = [x.replace("Intercept", group_name)
973 for x in exog_re_names]
974 exog_re = np.asarray(exog_re)
975 if exog_re.ndim == 1:
976 exog_re = exog_re[:, None]
977 else:
978 exog_re = None
979 if vc_formula is None:
980 exog_re_names = [group_name]
981 else:
982 exog_re_names = []
984 if vc_formula is not None:
985 eval_env = kwargs.get('eval_env', None)
986 if eval_env is None:
987 eval_env = 1
988 elif eval_env == -1:
989 from patsy import EvalEnvironment
990 eval_env = EvalEnvironment({})
992 vc_mats = []
993 vc_colnames = []
994 vc_names = []
995 gb = data.groupby(groups)
996 kylist = sorted(gb.groups.keys())
997 vcf = sorted(vc_formula.keys())
998 for vc_name in vcf:
999 md = patsy.ModelDesc.from_formula(vc_formula[vc_name])
1000 vc_names.append(vc_name)
1001 evc_mats, evc_colnames = [], []
1002 for group_ix, group in enumerate(kylist):
1003 ii = gb.groups[group]
1004 mat = patsy.dmatrix(
1005 md,
1006 data.loc[ii, :],
1007 eval_env=eval_env,
1008 return_type='dataframe')
1009 evc_colnames.append(mat.columns.tolist())
1010 if use_sparse:
1011 evc_mats.append(sparse.csr_matrix(mat))
1012 else:
1013 evc_mats.append(np.asarray(mat))
1014 vc_mats.append(evc_mats)
1015 vc_colnames.append(evc_colnames)
1016 exog_vc = VCSpec(vc_names, vc_colnames, vc_mats)
1017 else:
1018 exog_vc = VCSpec([], [], [])
1020 kwargs["subset"] = None
1021 kwargs["exog_re"] = exog_re
1022 kwargs["exog_vc"] = exog_vc
1023 kwargs["groups"] = groups
1024 mod = super(MixedLM, cls).from_formula(
1025 formula, data, *args, **kwargs)
1027 # expand re names to account for pairs of RE
1028 (param_names,
1029 exog_re_names,
1030 exog_re_names_full) = mod._make_param_names(exog_re_names)
1032 mod.data.param_names = param_names
1033 mod.data.exog_re_names = exog_re_names
1034 mod.data.exog_re_names_full = exog_re_names_full
1036 if vc_formula is not None:
1037 mod.data.vcomp_names = mod.exog_vc.names
1039 return mod

    def predict(self, params, exog=None):
        """
        Return predicted values from a design matrix.

        Parameters
        ----------
        params : array_like
            Parameters of a mixed linear model. Can be either a
            MixedLMParams instance, or a vector containing the packed
            model parameters in which the fixed effects parameters are
            at the beginning of the vector, or a vector containing
            only the fixed effects parameters.
        exog : array_like, optional
            Design / exogenous data for the fixed effects. Model exog
            is used if None.

        Returns
        -------
        An array of fitted values. Note that these predicted values
        only reflect the fixed effects mean structure of the model.
        """
        if exog is None:
            exog = self.exog

        if isinstance(params, MixedLMParams):
            params = params.fe_params
        else:
            params = params[0:self.k_fe]

        return np.dot(exog, params)

    def group_list(self, array):
        """
        Returns `array` split into subarrays corresponding to the
        grouping structure.
        """

        if array is None:
            return None

        if array.ndim == 1:
            return [np.array(array[self.row_indices[k]])
                    for k in self.group_labels]
        else:
            return [np.array(array[self.row_indices[k], :])
                    for k in self.group_labels]

    def fit_regularized(self, start_params=None, method='l1', alpha=0,
                        ceps=1e-4, ptol=1e-6, maxit=200, **fit_kwargs):
        """
        Fit a model in which the fixed effects parameters are
        penalized. The dependence parameters are held fixed at their
        estimated values in the unpenalized model.

        Parameters
        ----------
        method : str or Penalty object
            Method for regularization. If a string, must be 'l1'.
        alpha : array_like
            Scalar or vector of penalty weights. If a scalar, the
            same weight is applied to all coefficients; if a vector,
            it contains a weight for each coefficient. If method is a
            Penalty object, the weights are scaled by alpha. For L1
            regularization, the weights are used directly.
        ceps : positive real scalar
            Fixed effects parameters smaller than this value
            in magnitude are treated as being zero.
        ptol : positive real scalar
            Convergence occurs when the sup norm difference
            between successive values of `fe_params` is less than
            `ptol`.
        maxit : int
            The maximum number of iterations.
        fit_kwargs : keywords
            Additional keyword arguments passed to fit.

        Returns
        -------
        A MixedLMResults instance containing the results.

        Notes
        -----
        The covariance structure is not updated as the fixed effects
        parameters are varied.

        The algorithm used here for L1 regularization is a "shooting"
        or cyclic coordinate descent algorithm.

        If method is 'l1', then `fe_pen` and `cov_pen` are used to
        obtain the covariance structure, but are ignored during the
        L1-penalized fitting.

        References
        ----------
        Friedman, J. H., Hastie, T. and Tibshirani, R. Regularized
        Paths for Generalized Linear Models via Coordinate
        Descent. Journal of Statistical Software, 33(1) (2008)
        http://www.jstatsoft.org/v33/i01/paper

        http://statweb.stanford.edu/~tibs/stat315a/Supplements/fuse.pdf
        """

        if isinstance(method, str) and (method.lower() != 'l1'):
            raise ValueError("Invalid regularization method")

        # If method is a smooth penalty just optimize directly.
        if isinstance(method, Penalty):
            # Scale the penalty weights by alpha
            method.alpha = alpha
            fit_kwargs.update({"fe_pen": method})
            return self.fit(**fit_kwargs)

        if np.isscalar(alpha):
            alpha = alpha * np.ones(self.k_fe, dtype=np.float64)

        # Fit the unpenalized model to get the dependence structure.
        mdf = self.fit(**fit_kwargs)
        fe_params = mdf.fe_params
        cov_re = mdf.cov_re
        vcomp = mdf.vcomp
        scale = mdf.scale
        try:
            cov_re_inv = np.linalg.inv(cov_re)
        except np.linalg.LinAlgError:
            cov_re_inv = None

        for itr in range(maxit):

            fe_params_s = fe_params.copy()
            for j in range(self.k_fe):

                if abs(fe_params[j]) < ceps:
                    continue

                # The residuals
                fe_params[j] = 0.
                expval = np.dot(self.exog, fe_params)
                resid_all = self.endog - expval

                # The loss function has the form
                # a*x^2 + b*x + pwt*|x|
                a, b = 0., 0.
                for group_ix, group in enumerate(self.group_labels):

                    vc_var = self._expand_vcomp(vcomp, group_ix)

                    exog = self.exog_li[group_ix]
                    ex_r, ex2_r = self._aex_r[group_ix], self._aex_r2[group_ix]

                    resid = resid_all[self.row_indices[group]]
                    solver = _smw_solver(scale, ex_r, ex2_r, cov_re_inv,
                                         1 / vc_var)

                    x = exog[:, j]
                    u = solver(x)
                    a += np.dot(u, x)
                    b -= 2 * np.dot(u, resid)
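
                # Closed-form (soft-thresholded) minimizer of
                # a*x^2 + b*x + pwt1*|x|: x = -(b - pwt1)/(2a) when
                # b > pwt1, x = -(b + pwt1)/(2a) when b < -pwt1, and
                # otherwise the coefficient stays at the zero value
                # set above.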
                pwt1 = alpha[j]
                if b > pwt1:
                    fe_params[j] = -(b - pwt1) / (2 * a)
                elif b < -pwt1:
                    fe_params[j] = -(b + pwt1) / (2 * a)

            if np.abs(fe_params_s - fe_params).max() < ptol:
                break

        # Replace the fixed effects estimates with their penalized
        # values, leave the dependence parameters in their unpenalized
        # state.
        params_prof = mdf.params.copy()
        params_prof[0:self.k_fe] = fe_params

        scale = self.get_scale(fe_params, mdf.cov_re_unscaled, mdf.vcomp)

        # Get the Hessian including only the nonzero fixed effects,
        # then blow back up to the full size after inverting.
        hess = self.hessian(params_prof)
        pcov = np.nan * np.ones_like(hess)
        ii = np.abs(params_prof) > ceps
        ii[self.k_fe:] = True
        ii = np.flatnonzero(ii)
        hess1 = hess[ii, :][:, ii]
        pcov[np.ix_(ii, ii)] = np.linalg.inv(-hess1)

        params_object = MixedLMParams.from_components(fe_params, cov_re=cov_re)

        results = MixedLMResults(self, params_prof, pcov / scale)
        results.params_object = params_object
        results.fe_params = fe_params
        results.cov_re = cov_re
        results.scale = scale
        results.cov_re_unscaled = mdf.cov_re_unscaled
        results.method = mdf.method
        results.converged = True
        results.cov_pen = self.cov_pen
        results.k_fe = self.k_fe
        results.k_re = self.k_re
        results.k_re2 = self.k_re2
        results.k_vc = self.k_vc

        return MixedLMResultsWrapper(results)

    def get_fe_params(self, cov_re, vcomp):
        """
        Use GLS to update the fixed effects parameter estimates.

        Parameters
        ----------
        cov_re : array_like
            The covariance matrix of the random effects.
        vcomp : array_like
            The variance component parameters.

        Returns
        -------
        The GLS estimates of the fixed effects parameters.
        """

        if self.k_fe == 0:
            return np.array([])

        if self.k_re == 0:
            cov_re_inv = np.empty((0, 0))
        else:
            cov_re_inv = np.linalg.inv(cov_re)

        # Cache these quantities that do not change.
        if not hasattr(self, "_endex_li"):
            self._endex_li = []
            for group_ix, _ in enumerate(self.group_labels):
                mat = np.concatenate(
                    (self.exog_li[group_ix],
                     self.endog_li[group_ix][:, None]), axis=1)
                self._endex_li.append(mat)

        xtxy = 0.
        for group_ix, group in enumerate(self.group_labels):
            vc_var = self._expand_vcomp(vcomp, group_ix)
            exog = self.exog_li[group_ix]
            ex_r, ex2_r = self._aex_r[group_ix], self._aex_r2[group_ix]
            solver = _smw_solver(1., ex_r, ex2_r, cov_re_inv, 1 / vc_var)
            u = solver(self._endex_li[group_ix])
            xtxy += np.dot(exog.T, u)
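
        # xtxy now holds X' V^{-1} [X, y] accumulated over the groups,
        # so this solve yields the GLS estimate
        # (X' V^{-1} X)^{-1} X' V^{-1} y.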
        fe_params = np.linalg.solve(xtxy[:, 0:-1], xtxy[:, -1])

        return fe_params

    def _reparam(self):
        """
        Returns parameters of the map converting parameters from the
        form used in optimization to the form returned to the user.

        Returns
        -------
        lin : list-like
            Linear terms of the map
        quad : list-like
            Quadratic terms of the map

        Notes
        -----
        If P are the standard form parameters and R are the
        transformed parameters (i.e. with the Cholesky square root
        covariance and square root transformed variance components),
        then P[i] = lin[i] * R + R' * quad[i] * R
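
        As a small example, with a single random effect and no fixed
        effects or variance components, R holds the Cholesky scalar L
        and the user-scale parameter is the variance L^2, so lin[0]
        is zero and quad[0] has a single unit entry.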
1306 """
1308 k_fe, k_re, k_re2, k_vc = self.k_fe, self.k_re, self.k_re2, self.k_vc
1309 k_tot = k_fe + k_re2 + k_vc
1310 ix = np.tril_indices(self.k_re)
1312 lin = []
1313 for k in range(k_fe):
1314 e = np.zeros(k_tot)
1315 e[k] = 1
1316 lin.append(e)
1317 for k in range(k_re2):
1318 lin.append(np.zeros(k_tot))
1319 for k in range(k_vc):
1320 lin.append(np.zeros(k_tot))
1322 quad = []
1323 # Quadratic terms for fixed effects.
1324 for k in range(k_tot):
1325 quad.append(np.zeros((k_tot, k_tot)))
1327 # Quadratic terms for random effects covariance.
1328 ii = np.tril_indices(k_re)
1329 ix = [(a, b) for a, b in zip(ii[0], ii[1])]
1330 for i1 in range(k_re2):
1331 for i2 in range(k_re2):
1332 ix1 = ix[i1]
1333 ix2 = ix[i2]
1334 if (ix1[1] == ix2[1]) and (ix1[0] <= ix2[0]):
1335 ii = (ix2[0], ix1[0])
1336 k = ix.index(ii)
1337 quad[k_fe+k][k_fe+i2, k_fe+i1] += 1
1338 for k in range(k_tot):
1339 quad[k] = 0.5*(quad[k] + quad[k].T)
1341 # Quadratic terms for variance components.
1342 km = k_fe + k_re2
1343 for k in range(km, km+k_vc):
1344 quad[k][k, k] = 1
1346 return lin, quad

    def _expand_vcomp(self, vcomp, group_ix):
        """
        Replicate variance parameters to match a group's design.

        Parameters
        ----------
        vcomp : array_like
            The variance parameters for the variance components.
        group_ix : int
            The group index

        Returns an expanded version of vcomp, in which each variance
        parameter is copied as many times as there are independent
        realizations of the variance component in the given group.
        """
        if len(vcomp) == 0:
            return np.empty(0)
        vc_var = []
        for j in range(len(self.exog_vc.names)):
            d = self.exog_vc.mats[j][group_ix].shape[1]
            vc_var.append(vcomp[j] * np.ones(d))
        if len(vc_var) > 0:
            return np.concatenate(vc_var)
        else:
            # Cannot reach here?
            return np.empty(0)

    def _augment_exog(self, group_ix):
        """
        Concatenate the columns for variance components to the columns
        for other random effects to obtain a single random effects
        exog matrix for a given group.
        """
        ex_r = self.exog_re_li[group_ix] if self.k_re > 0 else None
        if self.k_vc == 0:
            return ex_r

        ex = [ex_r] if self.k_re > 0 else []
        any_sparse = False
        for j, _ in enumerate(self.exog_vc.names):
            ex.append(self.exog_vc.mats[j][group_ix])
            any_sparse |= sparse.issparse(ex[-1])
        if any_sparse:
            for j, x in enumerate(ex):
                if not sparse.issparse(x):
                    ex[j] = sparse.csr_matrix(x)
            ex = sparse.hstack(ex)
            ex = sparse.csr_matrix(ex)
        else:
            ex = np.concatenate(ex, axis=1)

        return ex

    def loglike(self, params, profile_fe=True):
        """
        Evaluate the (profile) log-likelihood of the linear mixed
        effects model.

        Parameters
        ----------
        params : MixedLMParams, or array_like.
            The parameter value. If array-like, must be a packed
            parameter vector containing only the covariance
            parameters.
        profile_fe : bool
            If True, replace the provided value of `fe_params` with
            the GLS estimates.

        Returns
        -------
        The log-likelihood value at `params`.

        Notes
        -----
        The scale parameter `scale` is always profiled out of the
        log-likelihood. In addition, if `profile_fe` is true the
        fixed effects parameters are also profiled out.
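
        Profiling replaces the scale parameter by its closed-form
        maximizer, the quadratic form resid' V^{-1} resid divided by
        n - k_fe under REML or by n under ML (the quantity computed
        by `get_scale`).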
1425 """
1427 if type(params) is not MixedLMParams:
1428 params = MixedLMParams.from_packed(params, self.k_fe,
1429 self.k_re, self.use_sqrt,
1430 has_fe=False)
1432 cov_re = params.cov_re
1433 vcomp = params.vcomp
1435 # Move to the profile set
1436 if profile_fe:
1437 fe_params = self.get_fe_params(cov_re, vcomp)
1438 else:
1439 fe_params = params.fe_params
1441 if self.k_re > 0:
1442 try:
1443 cov_re_inv = np.linalg.inv(cov_re)
1444 except np.linalg.LinAlgError:
1445 cov_re_inv = None
1446 _, cov_re_logdet = np.linalg.slogdet(cov_re)
1447 else:
1448 cov_re_inv = np.zeros((0, 0))
1449 cov_re_logdet = 0
1451 # The residuals
1452 expval = np.dot(self.exog, fe_params)
1453 resid_all = self.endog - expval
1455 likeval = 0.
1457 # Handle the covariance penalty
1458 if (self.cov_pen is not None) and (self.k_re > 0):
1459 likeval -= self.cov_pen.func(cov_re, cov_re_inv)
1461 # Handle the fixed effects penalty
1462 if (self.fe_pen is not None):
1463 likeval -= self.fe_pen.func(fe_params)
1465 xvx, qf = 0., 0.
1466 for group_ix, group in enumerate(self.group_labels):
1468 vc_var = self._expand_vcomp(vcomp, group_ix)
1469 cov_aug_logdet = cov_re_logdet + np.sum(np.log(vc_var))
1471 exog = self.exog_li[group_ix]
1472 ex_r, ex2_r = self._aex_r[group_ix], self._aex_r2[group_ix]
1473 solver = _smw_solver(1., ex_r, ex2_r, cov_re_inv, 1 / vc_var)
1475 resid = resid_all[self.row_indices[group]]
1477 # Part 1 of the log likelihood (for both ML and REML)
1478 ld = _smw_logdet(1., ex_r, ex2_r, cov_re_inv, 1 / vc_var,
1479 cov_aug_logdet)
1480 likeval -= ld / 2.
1482 # Part 2 of the log likelihood (for both ML and REML)
1483 u = solver(resid)
1484 qf += np.dot(resid, u)
1486 # Adjustment for REML
1487 if self.reml:
1488 mat = solver(exog)
1489 xvx += np.dot(exog.T, mat)
1491 if self.reml:
1492 likeval -= (self.n_totobs - self.k_fe) * np.log(qf) / 2.
1493 _, ld = np.linalg.slogdet(xvx)
1494 likeval -= ld / 2.
1495 likeval -= (self.n_totobs - self.k_fe) * np.log(2 * np.pi) / 2.
1496 likeval += ((self.n_totobs - self.k_fe) *
1497 np.log(self.n_totobs - self.k_fe) / 2.)
1498 likeval -= (self.n_totobs - self.k_fe) / 2.
1499 else:
1500 likeval -= self.n_totobs * np.log(qf) / 2.
1501 likeval -= self.n_totobs * np.log(2 * np.pi) / 2.
1502 likeval += self.n_totobs * np.log(self.n_totobs) / 2.
1503 likeval -= self.n_totobs / 2.
1505 return likeval

    def _gen_dV_dPar(self, ex_r, solver, group_ix, max_ix=None):
        """
        A generator that yields the element-wise derivative of the
        marginal covariance matrix with respect to the random effects
        variance and covariance parameters.

        ex_r : array_like
            The random effects design matrix
        solver : function
            A function that given x returns V^{-1}x, where V
            is the group's marginal covariance matrix.
        group_ix : int
            The group index
        max_ix : {int, None}
            If not None, the generator ends when this index
            is reached.
        """

        axr = solver(ex_r)

        # Regular random effects
        jj = 0
        for j1 in range(self.k_re):
            for j2 in range(j1 + 1):
                if max_ix is not None and jj > max_ix:
                    return
                # Need 2d
                mat_l, mat_r = ex_r[:, j1:j1+1], ex_r[:, j2:j2+1]
                vsl, vsr = axr[:, j1:j1+1], axr[:, j2:j2+1]
                yield jj, mat_l, mat_r, vsl, vsr, j1 == j2
                jj += 1

        # Variance components
        for j, _ in enumerate(self.exog_vc.names):
            if max_ix is not None and jj > max_ix:
                return
            mat = self.exog_vc.mats[j][group_ix]
            axmat = solver(mat)
            yield jj, mat, mat, axmat, axmat, True
            jj += 1

    def score(self, params, profile_fe=True):
        """
        Returns the score vector of the profile log-likelihood.

        Notes
        -----
        The score vector that is returned is computed with respect to
        the parameterization defined by this model instance's
        `use_sqrt` attribute.
        """

        if type(params) is not MixedLMParams:
            params = MixedLMParams.from_packed(
                params, self.k_fe, self.k_re, self.use_sqrt,
                has_fe=False)

        if profile_fe:
            params.fe_params = self.get_fe_params(params.cov_re, params.vcomp)

        if self.use_sqrt:
            score_fe, score_re, score_vc = self.score_sqrt(
                params, calc_fe=not profile_fe)
        else:
            score_fe, score_re, score_vc = self.score_full(
                params, calc_fe=not profile_fe)

        if self._freepat is not None:
            score_fe *= self._freepat.fe_params
            score_re *= self._freepat.cov_re[self._freepat._ix]
            score_vc *= self._freepat.vcomp

        if profile_fe:
            return np.concatenate((score_re, score_vc))
        else:
            return np.concatenate((score_fe, score_re, score_vc))

    def score_full(self, params, calc_fe):
        """
        Returns the score with respect to untransformed parameters.

        Calculates the score vector for the profiled log-likelihood of
        the mixed effects model with respect to the parameterization
        in which the random effects covariance matrix is represented
        in its full form (not using the Cholesky factor).

        Parameters
        ----------
        params : MixedLMParams or array_like
            The parameter at which the score function is evaluated.
            If array-like, must contain the packed random effects
            parameters (cov_re and vcomp) without fe_params.
        calc_fe : bool
            If True, calculate the score vector for the fixed effects
            parameters. If False, this vector is not calculated, and
            a vector of zeros is returned in its place.

        Returns
        -------
        score_fe : array_like
            The score vector with respect to the fixed effects
            parameters.
        score_re : array_like
            The score vector with respect to the random effects
            parameters (excluding variance components parameters).
        score_vc : array_like
            The score vector with respect to variance components
            parameters.

        Notes
        -----
        `score_re` is taken with respect to the parameterization in
        which `cov_re` is represented through its lower triangle
        (without taking the Cholesky square root).
        """

        fe_params = params.fe_params
        cov_re = params.cov_re
        vcomp = params.vcomp

        try:
            cov_re_inv = np.linalg.inv(cov_re)
        except np.linalg.LinAlgError:
            cov_re_inv = None

        score_fe = np.zeros(self.k_fe)
        score_re = np.zeros(self.k_re2)
        score_vc = np.zeros(self.k_vc)

        # Handle the covariance penalty.
        if self.cov_pen is not None:
            score_re -= self.cov_pen.deriv(cov_re, cov_re_inv)

        # Handle the fixed effects penalty.
        if calc_fe and (self.fe_pen is not None):
            score_fe -= self.fe_pen.deriv(fe_params)

        # resid' V^{-1} resid, summed over the groups (a scalar)
        rvir = 0.

        # exog' V^{-1} resid, summed over the groups (a k_fe
        # dimensional vector)
        xtvir = 0.

        # exog' V^{-1} exog, summed over the groups (a k_fe x k_fe
        # matrix)
        xtvix = 0.

        # exog' V^{-1} dV/dQ_jj V^{-1} exog, where Q_jj is the jj^th
        # covariance parameter.
        xtax = [0., ] * (self.k_re2 + self.k_vc)

        # Temporary related to the gradient of log |V|
        dlv = np.zeros(self.k_re2 + self.k_vc)

        # resid' V^{-1} dV/dQ_jj V^{-1} resid (a scalar)
        rvavr = np.zeros(self.k_re2 + self.k_vc)

        for group_ix, group in enumerate(self.group_labels):

            vc_var = self._expand_vcomp(vcomp, group_ix)

            exog = self.exog_li[group_ix]
            ex_r, ex2_r = self._aex_r[group_ix], self._aex_r2[group_ix]
            solver = _smw_solver(1., ex_r, ex2_r, cov_re_inv, 1 / vc_var)

            # The residuals
            resid = self.endog_li[group_ix]
            if self.k_fe > 0:
                expval = np.dot(exog, fe_params)
                resid = resid - expval

            if self.reml:
                viexog = solver(exog)
                xtvix += np.dot(exog.T, viexog)

            # Contributions to the covariance parameter gradient
            vir = solver(resid)
            for (jj, matl, matr, vsl, vsr, sym) in\
                    self._gen_dV_dPar(ex_r, solver, group_ix):
                dlv[jj] = _dotsum(matr, vsl)
                if not sym:
                    dlv[jj] += _dotsum(matl, vsr)

                ul = _dot(vir, matl)
                ur = ul.T if sym else _dot(matr.T, vir)
                ulr = np.dot(ul, ur)
                rvavr[jj] += ulr
                if not sym:
                    rvavr[jj] += ulr.T

                if self.reml:
                    ul = _dot(viexog.T, matl)
                    ur = ul.T if sym else _dot(matr.T, viexog)
                    ulr = np.dot(ul, ur)
                    xtax[jj] += ulr
                    if not sym:
                        xtax[jj] += ulr.T

            # Contribution of log|V| to the covariance parameter
            # gradient.
            if self.k_re > 0:
                score_re -= 0.5 * dlv[0:self.k_re2]
            if self.k_vc > 0:
                score_vc -= 0.5 * dlv[self.k_re2:]

            rvir += np.dot(resid, vir)

            if calc_fe:
                xtvir += np.dot(exog.T, vir)

        fac = self.n_totobs
        if self.reml:
            fac -= self.k_fe

        if calc_fe and self.k_fe > 0:
            score_fe += fac * xtvir / rvir

        if self.k_re > 0:
            score_re += 0.5 * fac * rvavr[0:self.k_re2] / rvir
        if self.k_vc > 0:
            score_vc += 0.5 * fac * rvavr[self.k_re2:] / rvir

        if self.reml:
            xtvixi = np.linalg.inv(xtvix)
            for j in range(self.k_re2):
                score_re[j] += 0.5 * _dotsum(xtvixi.T, xtax[j])
            for j in range(self.k_vc):
                score_vc[j] += 0.5 * _dotsum(xtvixi.T, xtax[self.k_re2 + j])

        return score_fe, score_re, score_vc

    def score_sqrt(self, params, calc_fe=True):
        """
        Returns the score with respect to transformed parameters.

        Calculates the score vector with respect to the
        parameterization in which the random effects covariance matrix
        is represented through its Cholesky square root.

        Parameters
        ----------
        params : MixedLMParams or array_like
            The model parameters. If array-like must contain packed
            parameters that are compatible with this model instance.
        calc_fe : bool
            If True, calculate the score vector for the fixed effects
            parameters. If False, this vector is not calculated, and
            a vector of zeros is returned in its place.

        Returns
        -------
        score_fe : array_like
            The score vector with respect to the fixed effects
            parameters.
        score_re : array_like
            The score vector with respect to the random effects
            parameters (excluding variance components parameters).
        score_vc : array_like
            The score vector with respect to variance components
            parameters.
        """

        score_fe, score_re, score_vc = self.score_full(params, calc_fe=calc_fe)
        params_vec = params.get_packed(use_sqrt=True, has_fe=True)

        score_full = np.concatenate((score_fe, score_re, score_vc))
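
        # Apply the chain rule through the map computed by _reparam:
        # since P[i] = lin[i]'R + R' quad[i] R, the derivative of P[i]
        # with respect to R is lin[i] + 2*quad[i]*R, and the
        # transformed score is sum_i score_full[i] * dP[i]/dR.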
        scr = 0.
        for i in range(len(params_vec)):
            v = self._lin[i] + 2 * np.dot(self._quad[i], params_vec)
            scr += score_full[i] * v
        score_fe = scr[0:self.k_fe]
        score_re = scr[self.k_fe:self.k_fe + self.k_re2]
        score_vc = scr[self.k_fe + self.k_re2:]

        return score_fe, score_re, score_vc
    def hessian(self, params):
        """
        Returns the model's Hessian matrix.

        Calculates the Hessian matrix for the linear mixed effects
        model with respect to the parameterization in which the
        covariance matrix is represented directly (without square-root
        transformation).

        Parameters
        ----------
        params : MixedLMParams or array_like
            The model parameters at which the Hessian is calculated.
            If array-like, must contain the packed parameters in a
            form that is compatible with this model instance.

        Returns
        -------
        hess : 2d ndarray
            The Hessian matrix, evaluated at `params`.
        """

        if type(params) is not MixedLMParams:
            params = MixedLMParams.from_packed(params, self.k_fe, self.k_re,
                                               use_sqrt=self.use_sqrt,
                                               has_fe=True)

        fe_params = params.fe_params
        vcomp = params.vcomp
        cov_re = params.cov_re
        if self.k_re > 0:
            cov_re_inv = np.linalg.inv(cov_re)
        else:
            cov_re_inv = np.empty((0, 0))

        # Blocks for the fixed and random effects parameters.
        hess_fe = 0.
        hess_re = np.zeros((self.k_re2 + self.k_vc, self.k_re2 + self.k_vc))
        hess_fere = np.zeros((self.k_re2 + self.k_vc, self.k_fe))

        fac = self.n_totobs
        if self.reml:
            fac -= self.exog.shape[1]

        rvir = 0.
        xtvix = 0.
        xtax = [0., ] * (self.k_re2 + self.k_vc)
        m = self.k_re2 + self.k_vc
        B = np.zeros(m)
        D = np.zeros((m, m))
        F = [[0.] * m for k in range(m)]
        for group_ix, group in enumerate(self.group_labels):

            vc_var = self._expand_vcomp(vcomp, group_ix)

            exog = self.exog_li[group_ix]
            ex_r, ex2_r = self._aex_r[group_ix], self._aex_r2[group_ix]
            solver = _smw_solver(1., ex_r, ex2_r, cov_re_inv, 1 / vc_var)

            # The residuals
            resid = self.endog_li[group_ix]
            if self.k_fe > 0:
                expval = np.dot(exog, fe_params)
                resid = resid - expval

            viexog = solver(exog)
            xtvix += np.dot(exog.T, viexog)
            vir = solver(resid)
            rvir += np.dot(resid, vir)

            for (jj1, matl1, matr1, vsl1, vsr1, sym1) in\
                    self._gen_dV_dPar(ex_r, solver, group_ix):

                ul = _dot(viexog.T, matl1)
                ur = _dot(matr1.T, vir)
                hess_fere[jj1, :] += np.dot(ul, ur)
                if not sym1:
                    ul = _dot(viexog.T, matr1)
                    ur = _dot(matl1.T, vir)
                    hess_fere[jj1, :] += np.dot(ul, ur)

                if self.reml:
                    ul = _dot(viexog.T, matl1)
                    ur = ul if sym1 else np.dot(viexog.T, matr1)
                    ulr = _dot(ul, ur.T)
                    xtax[jj1] += ulr
                    if not sym1:
                        xtax[jj1] += ulr.T

                ul = _dot(vir, matl1)
                ur = ul if sym1 else _dot(vir, matr1)
                B[jj1] += np.dot(ul, ur) * (1 if sym1 else 2)

                # V^{-1} * dV/d_theta
                E = [(vsl1, matr1)]
                if not sym1:
                    E.append((vsr1, matl1))

                for (jj2, matl2, matr2, vsl2, vsr2, sym2) in\
                        self._gen_dV_dPar(ex_r, solver, group_ix, jj1):

                    re = sum([_multi_dot_three(matr2.T, x[0], x[1].T)
                              for x in E])
                    vt = 2 * _dot(_multi_dot_three(vir[None, :], matl2, re),
                                  vir[:, None])

                    if not sym2:
                        le = sum([_multi_dot_three(matl2.T, x[0], x[1].T)
                                  for x in E])
                        vt += 2 * _dot(_multi_dot_three(
                            vir[None, :], matr2, le), vir[:, None])

                    D[jj1, jj2] += vt
                    if jj1 != jj2:
                        D[jj2, jj1] += vt

                    rt = _dotsum(vsl2, re.T) / 2
                    if not sym2:
                        rt += _dotsum(vsr2, le.T) / 2

                    hess_re[jj1, jj2] += rt
                    if jj1 != jj2:
                        hess_re[jj2, jj1] += rt

                    if self.reml:
                        ev = sum([_dot(x[0], _dot(x[1].T, viexog)) for x in E])
                        u1 = _dot(viexog.T, matl2)
                        u2 = _dot(matr2.T, ev)
                        um = np.dot(u1, u2)
                        F[jj1][jj2] += um + um.T
                        if not sym2:
                            u1 = np.dot(viexog.T, matr2)
                            u2 = np.dot(matl2.T, ev)
                            um = np.dot(u1, u2)
                            F[jj1][jj2] += um + um.T

        hess_fe -= fac * xtvix / rvir
        hess_re = hess_re - 0.5 * fac * (D / rvir - np.outer(B, B) / rvir**2)
        hess_fere = -fac * hess_fere / rvir

        if self.reml:
            QL = [np.linalg.solve(xtvix, x) for x in xtax]
            for j1 in range(self.k_re2 + self.k_vc):
                for j2 in range(j1 + 1):
                    a = _dotsum(QL[j1].T, QL[j2])
                    a -= np.trace(np.linalg.solve(xtvix, F[j1][j2]))
                    a *= 0.5
                    hess_re[j1, j2] += a
                    if j1 > j2:
                        hess_re[j2, j1] += a

        # Put the blocks together to get the Hessian.
        m = self.k_fe + self.k_re2 + self.k_vc
        hess = np.zeros((m, m))
        hess[0:self.k_fe, 0:self.k_fe] = hess_fe
        hess[0:self.k_fe, self.k_fe:] = hess_fere.T
        hess[self.k_fe:, 0:self.k_fe] = hess_fere
        hess[self.k_fe:, self.k_fe:] = hess_re

        return hess
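
    # Downstream use (see `fit` below): approximate standard errors of
    # the estimates are obtained from the inverse of the negative
    # Hessian at the MLE, e.g.
    #
    #     pcov = np.linalg.inv(-hess)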
    def get_scale(self, fe_params, cov_re, vcomp):
        """
        Returns the estimated error variance based on given estimates
        of the slopes and random effects covariance matrix.

        Parameters
        ----------
        fe_params : array_like
            The regression slope estimates
        cov_re : 2d array_like
            Estimate of the random effects covariance matrix
        vcomp : array_like
            Estimate of the variance components

        Returns
        -------
        scale : float
            The estimated error variance.
        """

        try:
            cov_re_inv = np.linalg.inv(cov_re)
        except np.linalg.LinAlgError:
            cov_re_inv = None

        qf = 0.
        for group_ix, group in enumerate(self.group_labels):

            vc_var = self._expand_vcomp(vcomp, group_ix)

            exog = self.exog_li[group_ix]
            ex_r, ex2_r = self._aex_r[group_ix], self._aex_r2[group_ix]

            solver = _smw_solver(1., ex_r, ex2_r, cov_re_inv, 1 / vc_var)

            # The residuals
            resid = self.endog_li[group_ix]
            if self.k_fe > 0:
                expval = np.dot(exog, fe_params)
                resid = resid - expval

            mat = solver(resid)
            qf += np.dot(resid, mat)

        if self.reml:
            qf /= (self.n_totobs - self.k_fe)
        else:
            qf /= self.n_totobs

        return qf
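
    # In matrix form (explanatory comment): with r_i the residual
    # vector and V_i the unscaled marginal covariance for group i,
    # `get_scale` computes
    #
    #     scale = sum_i r_i' V_i^{-1} r_i / (n_totobs - k_fe)   # REML
    #     scale = sum_i r_i' V_i^{-1} r_i / n_totobs            # ML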
    def fit(self, start_params=None, reml=True, niter_sa=0,
            do_cg=True, fe_pen=None, cov_pen=None, free=None,
            full_output=False, method=None, **kwargs):
        """
        Fit a linear mixed model to the data.

        Parameters
        ----------
        start_params : array_like or MixedLMParams
            Starting values for the profile log-likelihood.  If not a
            `MixedLMParams` instance, this should be an array
            containing the packed parameters for the profile
            log-likelihood, including the fixed effects
            parameters.
        reml : bool
            If true, fit according to the REML likelihood, else
            fit the standard likelihood using ML.
        niter_sa : int
            Currently this argument is ignored and has no effect
            on the results.
        cov_pen : CovariancePenalty object
            A penalty for the random effects covariance matrix
        do_cg : bool, defaults to True
            If False, the optimization is skipped and a results
            object at the given (or default) starting values is
            returned.
        fe_pen : Penalty object
            A penalty on the fixed effects
        free : MixedLMParams object
            If not `None`, this is a mask that allows parameters to be
            held fixed at specified values.  A 1 indicates that the
            corresponding parameter is estimated, a 0 indicates that
            it is fixed at its starting value.  Setting the `cov_re`
            component to the identity matrix fits a model with
            independent random effects.  Note that not all
            optimization methods respect this constraint; bfgs and
            lbfgs both do.
        full_output : bool
            If true, attach the iteration history to the results.
        method : str
            Optimization method.  Can be a scipy.optimize method name,
            or a list of such names to be tried in sequence.

        Returns
        -------
        A MixedLMResults instance.
        """

        _allowed_kwargs = ['gtol', 'maxiter', 'eps', 'maxcor', 'ftol',
                           'tol', 'disp', 'maxls']
        for x in kwargs.keys():
            if x not in _allowed_kwargs:
                warnings.warn("Argument %s not used by MixedLM.fit" % x)

        if method is None:
            method = ['bfgs', 'lbfgs', 'cg']
        elif isinstance(method, str):
            method = [method]

        for meth in method:
            if meth.lower() in ["newton", "ncg"]:
                raise ValueError(
                    "method %s not available for MixedLM" % meth)

        self.reml = reml
        self.cov_pen = cov_pen
        self.fe_pen = fe_pen

        self._freepat = free

        if full_output:
            hist = []
        else:
            hist = None

        if start_params is None:
            params = MixedLMParams(self.k_fe, self.k_re, self.k_vc)
            params.fe_params = np.zeros(self.k_fe)
            params.cov_re = np.eye(self.k_re)
            params.vcomp = np.ones(self.k_vc)
        else:
            if isinstance(start_params, MixedLMParams):
                params = start_params
            else:
                # It's a packed array
                if len(start_params) == self.k_fe + self.k_re2 + self.k_vc:
                    params = MixedLMParams.from_packed(
                        start_params, self.k_fe, self.k_re, self.use_sqrt,
                        has_fe=True)
                elif len(start_params) == self.k_re2 + self.k_vc:
                    params = MixedLMParams.from_packed(
                        start_params, self.k_fe, self.k_re, self.use_sqrt,
                        has_fe=False)
                else:
                    raise ValueError("invalid start_params")

        if do_cg:
            kwargs["retall"] = hist is not None
            if "disp" not in kwargs:
                kwargs["disp"] = False
            packed = params.get_packed(use_sqrt=self.use_sqrt, has_fe=False)

            if niter_sa > 0:
                warnings.warn("niter_sa is currently ignored")

            # Try optimizing one or more times
            for j in range(len(method)):
                rslt = super(MixedLM, self).fit(start_params=packed,
                                                skip_hessian=True,
                                                method=method[j],
                                                **kwargs)
                if rslt.mle_retvals['converged']:
                    break
                packed = rslt.params
                if j + 1 < len(method):
                    next_method = method[j + 1]
                    warnings.warn(
                        "Retrying MixedLM optimization with %s" % next_method,
                        ConvergenceWarning)
                else:
                    msg = ("MixedLM optimization failed, " +
                           "trying a different optimizer may help.")
                    warnings.warn(msg, ConvergenceWarning)

            # Take the parameters from the final optimization run,
            # whether or not it converged.
            params = np.atleast_1d(rslt.params)
            if hist is not None:
                hist.append(rslt.mle_retvals)

        converged = rslt.mle_retvals['converged']
        if not converged:
            gn = self.score(rslt.params)
            gn = np.sqrt(np.sum(gn**2))
            msg = "Gradient optimization failed, |grad| = %f" % gn
            warnings.warn(msg, ConvergenceWarning)

        # Convert to the final parameterization (i.e. undo the square
        # root transform of the covariance matrix, and the profiling
        # over the error variance).
        params = MixedLMParams.from_packed(
            params, self.k_fe, self.k_re, use_sqrt=self.use_sqrt,
            has_fe=False)
        cov_re_unscaled = params.cov_re
        vcomp_unscaled = params.vcomp
        fe_params = self.get_fe_params(cov_re_unscaled, vcomp_unscaled)
        params.fe_params = fe_params
        scale = self.get_scale(fe_params, cov_re_unscaled, vcomp_unscaled)
        cov_re = scale * cov_re_unscaled
        vcomp = scale * vcomp_unscaled

        f1 = (self.k_re > 0) and (np.min(np.abs(np.diag(cov_re))) < 0.01)
        f2 = (self.k_vc > 0) and (np.min(np.abs(vcomp)) < 0.01)
        if f1 or f2:
            msg = "The MLE may be on the boundary of the parameter space."
            warnings.warn(msg, ConvergenceWarning)

        # Compute the Hessian at the MLE.  Note that this is the
        # Hessian with respect to the random effects covariance matrix
        # (not its square root).  It is used for obtaining standard
        # errors, not for optimization.
        hess = self.hessian(params)
        hess_diag = np.diag(hess)
        if free is not None:
            pcov = np.zeros_like(hess)
            pat = self._freepat.get_packed(use_sqrt=False, has_fe=True)
            ii = np.flatnonzero(pat)
            hess_diag = hess_diag[ii]
            if len(ii) > 0:
                hess1 = hess[np.ix_(ii, ii)]
                pcov[np.ix_(ii, ii)] = np.linalg.inv(-hess1)
        else:
            pcov = np.linalg.inv(-hess)
        if np.any(hess_diag >= 0):
            msg = ("The Hessian matrix at the estimated parameter values " +
                   "is not positive definite.")
            warnings.warn(msg, ConvergenceWarning)

        # Prepare a results class instance
        params_packed = params.get_packed(use_sqrt=False, has_fe=True)
        results = MixedLMResults(self, params_packed, pcov / scale)
        results.params_object = params
        results.fe_params = fe_params
        results.cov_re = cov_re
        results.vcomp = vcomp
        results.scale = scale
        results.cov_re_unscaled = cov_re_unscaled
        results.method = "REML" if self.reml else "ML"
        results.converged = converged
        results.hist = hist
        results.reml = self.reml
        results.cov_pen = self.cov_pen
        results.k_fe = self.k_fe
        results.k_re = self.k_re
        results.k_re2 = self.k_re2
        results.k_vc = self.k_vc
        results.use_sqrt = self.use_sqrt
        results.freepat = self._freepat

        return MixedLMResultsWrapper(results)
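
    # Usage sketch (not part of the class; `df` is a hypothetical
    # pandas DataFrame with a response "y", a covariate "x", and a
    # grouping column "g"):
    #
    #     import statsmodels.formula.api as smf
    #     model = smf.mixedlm("y ~ x", df, groups=df["g"])
    #     result = model.fit(method=["lbfgs", "bfgs"])
    #     print(result.summary())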
    def get_distribution(self, params, scale, exog):
        return _mixedlm_distribution(self, params, scale, exog)
class _mixedlm_distribution(object):
    """
    A private class for simulating data from a given mixed linear model.

    Parameters
    ----------
    model : MixedLM instance
        A mixed linear model
    params : array_like
        A parameter vector defining a mixed linear model.  See
        notes for more information.
    scale : scalar
        The unexplained variance
    exog : array_like
        An array of fixed effect covariates.  If None, model.exog
        is used.

    Notes
    -----
    The params array is a vector containing fixed effects parameters,
    random effects parameters, and variance component parameters, in
    that order.  The lower triangle of the random effects covariance
    matrix is stored.  The random effects and variance components
    parameters are divided by the scale parameter.

    This class is used in Mediation, and possibly elsewhere.
    """

    def __init__(self, model, params, scale, exog):

        self.model = model
        self.exog = exog if exog is not None else model.exog

        po = MixedLMParams.from_packed(
            params, model.k_fe, model.k_re, False, True)

        self.fe_params = po.fe_params
        self.cov_re = scale * po.cov_re
        self.vcomp = scale * po.vcomp
        self.scale = scale

        group_idx = np.zeros(model.nobs, dtype=int)
        for k, g in enumerate(model.group_labels):
            group_idx[model.row_indices[g]] = k
        self.group_idx = group_idx

    def rvs(self, n):
        """
        Return a vector of simulated values from a mixed linear
        model.

        The parameter n is ignored, but required by the interface.
        """

        model = self.model

        # Fixed effects
        y = np.dot(self.exog, self.fe_params)

        # Random effects
        u = np.random.normal(size=(model.n_groups, model.k_re))
        u = np.dot(u, np.linalg.cholesky(self.cov_re).T)
        y += (u[self.group_idx, :] * model.exog_re).sum(1)

        # Variance components
        for j, _ in enumerate(model.exog_vc.names):
            ex = model.exog_vc.mats[j]
            v = self.vcomp[j]
            for i, g in enumerate(model.group_labels):
                exg = ex[i]
                ii = model.row_indices[g]
                u = np.random.normal(size=exg.shape[1])
                y[ii] += np.sqrt(v) * np.dot(exg, u)

        # Residual variance
        y += np.sqrt(self.scale) * np.random.normal(size=len(y))

        return y
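
    # Usage sketch (hedged; `model` and `result` are assumed to come
    # from a prior MixedLM fit).  `result.params` is packed in the
    # profile parameterization that this class expects:
    #
    #     dist = model.get_distribution(result.params, result.scale,
    #                                   None)
    #     y_sim = dist.rvs(0)   # the argument to rvs is ignored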
class MixedLMResults(base.LikelihoodModelResults, base.ResultMixin):
    '''
    Class to contain results of fitting a linear mixed effects model.

    MixedLMResults inherits from statsmodels.LikelihoodModelResults

    Parameters
    ----------
    See statsmodels.LikelihoodModelResults

    Attributes
    ----------
    model : class instance
        Pointer to MixedLM model instance that called fit.
    normalized_cov_params : ndarray
        The sampling covariance matrix of the estimates
    params : ndarray
        A packed parameter vector for the profile parameterization.
        The first `k_fe` elements are the estimated fixed effects
        coefficients.  The remaining elements are the estimated
        variance parameters.  The variance parameters are all divided
        by `scale` and are not the variance parameters shown
        in the summary.
    fe_params : ndarray
        The fitted fixed-effects coefficients
    cov_re : ndarray
        The fitted random-effects covariance matrix
    bse_fe : ndarray
        The standard errors of the fitted fixed effects coefficients
    bse_re : ndarray
        The standard errors of the fitted random effects covariance
        matrix and variance components.  The first
        `k_re * (k_re + 1) / 2` parameters are the standard errors for
        the lower triangle of `cov_re`, the remaining elements are the
        standard errors for the variance components.

    See Also
    --------
    statsmodels.LikelihoodModelResults
    '''

    def __init__(self, model, params, cov_params):

        super(MixedLMResults, self).__init__(model, params,
                                             normalized_cov_params=cov_params)
        self.nobs = self.model.nobs
        self.df_resid = self.nobs - np.linalg.matrix_rank(self.model.exog)
    @cache_readonly
    def fittedvalues(self):
        """
        Returns the fitted values for the model.

        The fitted values reflect the mean structure specified by the
        fixed effects and the predicted random effects.
        """
        fit = np.dot(self.model.exog, self.fe_params)
        re = self.random_effects
        for group_ix, group in enumerate(self.model.group_labels):
            ix = self.model.row_indices[group]

            mat = []
            if self.model.exog_re_li is not None:
                mat.append(self.model.exog_re_li[group_ix])
            for j in range(self.k_vc):
                mat.append(self.model.exog_vc.mats[j][group_ix])
            mat = np.concatenate(mat, axis=1)

            fit[ix] += np.dot(mat, re[group])

        return fit
    @cache_readonly
    def resid(self):
        """
        Returns the residuals for the model.

        The residuals reflect the mean structure specified by the
        fixed effects and the predicted random effects.
        """
        return self.model.endog - self.fittedvalues

    @cache_readonly
    def bse_fe(self):
        """
        Returns the standard errors of the fixed effect regression
        coefficients.
        """
        p = self.model.exog.shape[1]
        return np.sqrt(np.diag(self.cov_params())[0:p])

    @cache_readonly
    def bse_re(self):
        """
        Returns the standard errors of the variance parameters.

        The first `k_re * (k_re + 1) / 2` elements of the returned
        array are the standard errors of the lower triangle of
        `cov_re`.  The remaining elements are the standard errors of
        the variance components.

        Note that the sampling distribution of variance parameters is
        strongly skewed unless the sample size is large, so these
        standard errors may not give meaningful confidence intervals
        or p-values if used in the usual way.
        """
        p = self.model.exog.shape[1]
        return np.sqrt(self.scale * np.diag(self.cov_params())[p:])

    def _expand_re_names(self, group_ix):
        names = list(self.model.data.exog_re_names)

        for j, v in enumerate(self.model.exog_vc.names):
            vg = self.model.exog_vc.colnames[j][group_ix]
            na = ["%s[%s]" % (v, s) for s in vg]
            names.extend(na)

        return names
    @cache_readonly
    def random_effects(self):
        """
        The conditional means of random effects given the data.

        Returns
        -------
        random_effects : dict
            A dictionary mapping the distinct `group` values to the
            conditional means of the random effects for the group
            given the data.
        """
        try:
            cov_re_inv = np.linalg.inv(self.cov_re)
        except np.linalg.LinAlgError:
            raise ValueError("Cannot predict random effects from " +
                             "singular covariance structure.")

        vcomp = self.vcomp
        k_re = self.k_re

        ranef_dict = {}
        for group_ix, group in enumerate(self.model.group_labels):

            endog = self.model.endog_li[group_ix]
            exog = self.model.exog_li[group_ix]
            ex_r = self.model._aex_r[group_ix]
            ex2_r = self.model._aex_r2[group_ix]
            vc_var = self.model._expand_vcomp(vcomp, group_ix)

            # Get the residuals relative to fixed effects
            resid = endog
            if self.k_fe > 0:
                expval = np.dot(exog, self.fe_params)
                resid = resid - expval

            solver = _smw_solver(self.scale, ex_r, ex2_r, cov_re_inv,
                                 1 / vc_var)
            vir = solver(resid)

            xtvir = _dot(ex_r.T, vir)

            xtvir[0:k_re] = np.dot(self.cov_re, xtvir[0:k_re])
            xtvir[k_re:] *= vc_var
            ranef_dict[group] = pd.Series(
                xtvir, index=self._expand_re_names(group_ix))

        return ranef_dict
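
    # Sketch of the computation above (explanatory comment): for each
    # group, the conditional mean of the random effects is
    #
    #     E[gamma | y] = G Z' V^{-1} (y - X beta),
    #
    # where G stacks `cov_re` and the variance components, and
    # V = Z G Z' + scale * I.  Typical access pattern:
    #
    #     re = result.random_effects    # dict keyed by group label
    #     re[group_label]               # pandas Series for one group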
    @cache_readonly
    def random_effects_cov(self):
        """
        Returns the conditional covariance matrix of the random
        effects for each group given the data.

        Returns
        -------
        random_effects_cov : dict
            A dictionary mapping the distinct values of the `group`
            variable to the conditional covariance matrix of the
            random effects given the data.
        """
        try:
            cov_re_inv = np.linalg.inv(self.cov_re)
        except np.linalg.LinAlgError:
            cov_re_inv = None

        vcomp = self.vcomp

        ranef_dict = {}
        for group_ix in range(self.model.n_groups):

            ex_r = self.model._aex_r[group_ix]
            ex2_r = self.model._aex_r2[group_ix]
            label = self.model.group_labels[group_ix]
            vc_var = self.model._expand_vcomp(vcomp, group_ix)

            solver = _smw_solver(self.scale, ex_r, ex2_r, cov_re_inv,
                                 1 / vc_var)

            n = ex_r.shape[0]
            m = self.cov_re.shape[0]
            mat1 = np.empty((n, m + len(vc_var)))
            mat1[:, 0:m] = np.dot(ex_r[:, 0:m], self.cov_re)
            mat1[:, m:] = np.dot(ex_r[:, m:], np.diag(vc_var))
            mat2 = solver(mat1)
            mat2 = np.dot(mat1.T, mat2)

            v = -mat2
            v[0:m, 0:m] += self.cov_re
            ix = np.arange(m, v.shape[0])
            v[ix, ix] += vc_var
            na = self._expand_re_names(group_ix)
            v = pd.DataFrame(v, index=na, columns=na)
            ranef_dict[label] = v

        return ranef_dict
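
    # Sketch of the computation above (explanatory comment): with G
    # and V as in the `random_effects` note, the conditional
    # covariance is
    #
    #     Cov[gamma | y] = G - G Z' V^{-1} Z G,
    #
    # which is `v` above, built from mat1 = Z G and
    # mat1' V^{-1} mat1 = G Z' V^{-1} Z G.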
    # Need to override since t-tests are only used for fixed effects
    # parameters.
    def t_test(self, r_matrix, scale=None, use_t=None):
        """
        Compute a t-test for each linear hypothesis of the form Rb = q.

        Parameters
        ----------
        r_matrix : array_like
            If an array is given, a p x k 2d array or length k 1d
            array specifying the linear restrictions.  It is assumed
            that the linear combination is equal to zero.
        scale : float, optional
            Deprecated and has no effect.
        use_t : bool, optional
            If use_t is None, then the default of the model is used.
            If use_t is True, then the p-values are based on the t
            distribution.
            If use_t is False, then the p-values are based on the
            normal distribution.

        Returns
        -------
        res : ContrastResults instance
            The results for the test are attributes of this results
            instance.  The available results have the same elements as
            the parameter table in `summary()`.
        """
        if scale is not None:
            warnings.warn('scale has no effect and is deprecated.  It '
                          'will be removed in the next version.',
                          DeprecationWarning)

        if r_matrix.shape[1] != self.k_fe:
            raise ValueError("r_matrix for t-test should have %d columns"
                             % self.k_fe)

        d = self.k_re2 + self.k_vc
        z0 = np.zeros((r_matrix.shape[0], d))
        r_matrix = np.concatenate((r_matrix, z0), axis=1)
        tst_rslt = super(MixedLMResults, self).t_test(r_matrix, use_t=use_t)
        return tst_rslt
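
    # Usage sketch (hypothetical names): testing whether the second
    # fixed effects coefficient is zero, given fitted results `result`:
    #
    #     r = np.zeros((1, result.k_fe))
    #     r[0, 1] = 1.0
    #     print(result.t_test(r))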
    def summary(self, yname=None, xname_fe=None, xname_re=None,
                title=None, alpha=.05):
        """
        Summarize the mixed model regression results.

        Parameters
        ----------
        yname : str, optional
            Default is `y`
        xname_fe : list[str], optional
            Fixed effects covariate names
        xname_re : list[str], optional
            Random effects covariate names
        title : str, optional
            Title for the top table.  If not None, then this replaces
            the default title
        alpha : float
            significance level for the confidence intervals

        Returns
        -------
        smry : Summary instance
            this holds the summary tables and text, which can be
            printed or converted to various output formats.

        See Also
        --------
        statsmodels.iolib.summary2.Summary : class to hold summary results
        """

        from statsmodels.iolib import summary2
        smry = summary2.Summary()

        info = OrderedDict()
        info["Model:"] = "MixedLM"
        if yname is None:
            yname = self.model.endog_names

        param_names = self.model.data.param_names[:]
        k_fe_params = len(self.fe_params)
        k_re_params = len(param_names) - len(self.fe_params)

        if xname_fe is not None:
            if len(xname_fe) != k_fe_params:
                msg = "xname_fe should be a list of length %d" % k_fe_params
                raise ValueError(msg)
            param_names[:k_fe_params] = xname_fe

        if xname_re is not None:
            if len(xname_re) != k_re_params:
                msg = "xname_re should be a list of length %d" % k_re_params
                raise ValueError(msg)
            param_names[k_fe_params:] = xname_re

        info["No. Observations:"] = str(self.model.n_totobs)
        info["No. Groups:"] = str(self.model.n_groups)

        gs = np.array([len(x) for x in self.model.endog_li])
        info["Min. group size:"] = "%.0f" % min(gs)
        info["Max. group size:"] = "%.0f" % max(gs)
        info["Mean group size:"] = "%.1f" % np.mean(gs)

        info["Dependent Variable:"] = yname
        info["Method:"] = self.method
        info["Scale:"] = self.scale
        info["Log-Likelihood:"] = self.llf
        info["Converged:"] = "Yes" if self.converged else "No"
        smry.add_dict(info)
        smry.add_title("Mixed Linear Model Regression Results")

        float_fmt = "%.3f"

        sdf = np.nan * np.ones((self.k_fe + self.k_re2 + self.k_vc, 6))

        # Coefficient estimates
        sdf[0:self.k_fe, 0] = self.fe_params

        # Standard errors
        sdf[0:self.k_fe, 1] = np.sqrt(np.diag(self.cov_params()[0:self.k_fe]))

        # Z-scores
        sdf[0:self.k_fe, 2] = sdf[0:self.k_fe, 0] / sdf[0:self.k_fe, 1]

        # p-values
        sdf[0:self.k_fe, 3] = 2 * norm.cdf(-np.abs(sdf[0:self.k_fe, 2]))

        # Confidence intervals
        qm = -norm.ppf(alpha / 2)
        sdf[0:self.k_fe, 4] = sdf[0:self.k_fe, 0] - qm * sdf[0:self.k_fe, 1]
        sdf[0:self.k_fe, 5] = sdf[0:self.k_fe, 0] + qm * sdf[0:self.k_fe, 1]

        # All random effects variances and covariances
        jj = self.k_fe
        for i in range(self.k_re):
            for j in range(i + 1):
                sdf[jj, 0] = self.cov_re[i, j]
                sdf[jj, 1] = np.sqrt(self.scale) * self.bse[jj]
                jj += 1

        # Variance components
        for i in range(self.k_vc):
            sdf[jj, 0] = self.vcomp[i]
            sdf[jj, 1] = np.sqrt(self.scale) * self.bse[jj]
            jj += 1

        sdf = pd.DataFrame(index=param_names, data=sdf)
        sdf.columns = ['Coef.', 'Std.Err.', 'z', 'P>|z|',
                       '[' + str(alpha/2), str(1-alpha/2) + ']']
        for col in sdf.columns:
            sdf[col] = [float_fmt % x if np.isfinite(x) else ""
                        for x in sdf[col]]

        smry.add_df(sdf, align='r')

        return smry
    @cache_readonly
    def llf(self):
        return self.model.loglike(self.params_object, profile_fe=False)

    @cache_readonly
    def aic(self):
        """Akaike information criterion"""
        if self.reml:
            return np.nan
        if self.freepat is not None:
            df = self.freepat.get_packed(use_sqrt=False,
                                         has_fe=True).sum() + 1
        else:
            df = self.params.size + 1
        return -2 * (self.llf - df)

    @cache_readonly
    def bic(self):
        """Bayesian information criterion"""
        if self.reml:
            return np.nan
        if self.freepat is not None:
            df = self.freepat.get_packed(use_sqrt=False,
                                         has_fe=True).sum() + 1
        else:
            df = self.params.size + 1
        return -2 * self.llf + np.log(self.nobs) * df
    def profile_re(self, re_ix, vtype, num_low=5, dist_low=1., num_high=5,
                   dist_high=1.):
        """
        Profile-likelihood inference for variance parameters.

        Parameters
        ----------
        re_ix : int
            If vtype is `re`, this value is the index of the variance
            parameter for which to construct a profile likelihood.  If
            `vtype` is 'vc' then `re_ix` is the name of the variance
            parameter to be profiled.
        vtype : str
            Either 're' or 'vc', depending on whether the profile
            analysis is for a random effect or a variance component.
        num_low : int
            The number of points at which to calculate the likelihood
            below the MLE of the parameter of interest.
        dist_low : float
            The distance below the MLE of the parameter of interest to
            begin calculating points on the profile likelihood.
        num_high : int
            The number of points at which to calculate the likelihood
            above the MLE of the parameter of interest.
        dist_high : float
            The distance above the MLE of the parameter of interest to
            begin calculating points on the profile likelihood.

        Returns
        -------
        An array with two columns.  The first column contains the
        values to which the parameter of interest is constrained.  The
        second column contains the corresponding likelihood values.

        Notes
        -----
        Only variance parameters can be profiled.
        """

        pmodel = self.model
        k_fe = pmodel.k_fe
        k_re = pmodel.k_re
        k_vc = pmodel.k_vc
        endog, exog = pmodel.endog, pmodel.exog

        # Need to permute the columns of the random effects design
        # matrix so that the profiled variable is in the first column.
        if vtype == 're':
            ix = np.arange(k_re)
            ix[0] = re_ix
            ix[re_ix] = 0
            exog_re = pmodel.exog_re.copy()[:, ix]

            # Permute the covariance structure to match the permuted
            # design matrix.
            params = self.params_object.copy()
            cov_re_unscaled = params.cov_re
            cov_re_unscaled = cov_re_unscaled[np.ix_(ix, ix)]
            params.cov_re = cov_re_unscaled
            ru0 = cov_re_unscaled[0, 0]

            # Convert dist_low and dist_high to the profile
            # parameterization
            cov_re = self.scale * cov_re_unscaled
            low = (cov_re[0, 0] - dist_low) / self.scale
            high = (cov_re[0, 0] + dist_high) / self.scale

        elif vtype == 'vc':
            re_ix = self.model.exog_vc.names.index(re_ix)
            params = self.params_object.copy()
            vcomp = self.vcomp
            low = (vcomp[re_ix] - dist_low) / self.scale
            high = (vcomp[re_ix] + dist_high) / self.scale
            ru0 = vcomp[re_ix] / self.scale

        # Define the sequence of values to which the parameter of
        # interest will be constrained.
        if low <= 0:
            raise ValueError("dist_low is too large and would result in a "
                             "negative variance.  Try a smaller value.")
        left = np.linspace(low, ru0, num_low + 1)
        right = np.linspace(ru0, high, num_high + 1)[1:]
        rvalues = np.concatenate((left, right))

        # Indicators of which parameters are free and fixed.
        free = MixedLMParams(k_fe, k_re, k_vc)
        if self.freepat is None:
            free.fe_params = np.ones(k_fe)
            vcomp = np.ones(k_vc)
            mat = np.ones((k_re, k_re))
        else:
            # If a freepat already has been specified, we add the
            # constraint to it.
            free.fe_params = self.freepat.fe_params
            vcomp = self.freepat.vcomp
            mat = self.freepat.cov_re
            if vtype == 're':
                mat = mat[np.ix_(ix, ix)]
        if vtype == 're':
            mat[0, 0] = 0
        else:
            vcomp[re_ix] = 0
        free.cov_re = mat
        free.vcomp = vcomp

        klass = self.model.__class__
        init_kwargs = pmodel._get_init_kwds()
        if vtype == 're':
            init_kwargs['exog_re'] = exog_re

        likev = []
        for x in rvalues:

            model = klass(endog, exog, **init_kwargs)

            if vtype == 're':
                cov_re = params.cov_re.copy()
                cov_re[0, 0] = x
                params.cov_re = cov_re
            else:
                params.vcomp[re_ix] = x

            # TODO should use fit_kwargs
            rslt = model.fit(start_params=params, free=free,
                             reml=self.reml, cov_pen=self.cov_pen)._results
            likev.append([x * rslt.scale, rslt.llf])

        likev = np.asarray(likev)

        return likev
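
    # Usage sketch (hypothetical plotting code): profiling the variance
    # of the first random effect of fitted results `result`:
    #
    #     import matplotlib.pyplot as plt
    #     prof = result.profile_re(0, 're', dist_low=0.1, dist_high=0.1)
    #     plt.plot(prof[:, 0], prof[:, 1])
    #     plt.xlabel("Variance of random effect")
    #     plt.ylabel("Profile log-likelihood")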
class MixedLMResultsWrapper(base.LikelihoodResultsWrapper):
    _attrs = {'bse_re': ('generic_columns', 'exog_re_names_full'),
              'fe_params': ('generic_columns', 'xnames'),
              'bse_fe': ('generic_columns', 'xnames'),
              'cov_re': ('generic_columns_2d', 'exog_re_names'),
              'cov_re_unscaled': ('generic_columns_2d', 'exog_re_names'),
              }
    _upstream_attrs = base.LikelihoodResultsWrapper._wrap_attrs
    _wrap_attrs = base.wrap.union_dicts(_attrs, _upstream_attrs)

    _methods = {}
    _upstream_methods = base.LikelihoodResultsWrapper._wrap_methods
    _wrap_methods = base.wrap.union_dicts(_methods, _upstream_methods)
def _handle_missing(data, groups, formula, re_formula, vc_formula):

    tokens = set()

    forms = [formula]
    if re_formula is not None:
        forms.append(re_formula)
    if vc_formula is not None:
        forms.extend(vc_formula.values())

    import tokenize
    from io import StringIO
    from statsmodels.compat.python import asunicode

    skiptoks = {"(", ")", "*", ":", "+", "-", "**", "/"}

    for fml in forms:
        # Unicode conversion is for Py2 compatibility
        rl = StringIO(fml)

        def rlu():
            line = rl.readline()
            return asunicode(line, 'ascii')

        g = tokenize.generate_tokens(rlu)
        for tok in g:
            # Compare the token's string value, not the token tuple
            # itself, so that operator tokens are actually skipped.
            if tok.string not in skiptoks:
                tokens.add(tok.string)
    tokens = sorted(tokens & set(data.columns))

    data = data[tokens]
    ii = pd.notnull(data).all(1)
    if not isinstance(groups, str):
        ii &= pd.notnull(groups)

    return data.loc[ii, :], groups[np.asarray(ii)]
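
# Behavior sketch for `_handle_missing` (hedged, toy data): the
# formulas are tokenized, the tokens are intersected with the frame's
# columns, and rows with missing values in any referenced column (or
# in `groups`) are dropped.  Columns that no formula references are
# dropped and do not trigger row deletion.
#
#     df = pd.DataFrame({"y": [1., np.nan, 3.],
#                        "x": [1., 2., 3.],
#                        "unused": [np.nan] * 3})
#     g = pd.Series([0, 0, 1])
#     data, groups = _handle_missing(df, g, "y ~ x", None, None)
#     # data keeps rows 0 and 2 and only the columns "x" and "y".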