cubmods package#
Submodules#
cubmods.cub module#
CUB models in Python. Module for CUB (Combination of Uniform and Binomial).
Description#
This module contains methods and classes for CUB model family.
Manual, Examples and References:#
See the Models manual
List of TODOs:#
…
Credits#
- Author:
Massimo Pierini
- Date:
2023-24
- Credits:
Domenico Piccolo, Rosaria Simone
- Contacts:
Classes and Functions#
- class cubmods.cub.CUBresCUB00(model, df, formula, m, n, sample, f, theoric, diss, est_names, estimates, e_types, varmat, stderrs, pval, wald, loglike, muloglik, loglikuni, AIC, BIC, seconds, time_exe, logliksat=None, dev=None, logliksatcov=None, niter=None, maxiter=None, tol=None, sh=None, rho=None, ass_pars=None)#
Bases:
CUBres
Object returned by the .mle() function. See the base class CUBres for details.
Methods
as_dataframe() – DataFrame of estimated parameters
as_txt() – Print the summary.
plot([ci, saveas, figsize]) – Main function to plot an object of the Class.
plot_confell([figsize, ci, equal, ...]) – Plots the estimated parameter values in the parameter space and the asymptotic confidence ellipse.
plot_ordinal([figsize, kind, ax, saveas]) – Plots relative frequencies of observed sample, estimated probability distribution and, if provided, probability distribution of a known model.
save(fname) – Save a CUBresult object to a file named fname + .cub.fit
summary() – Call as_txt()
- plot(ci=0.95, saveas=None, figsize=(7, 15))#
Main function to plot an object of the Class.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure
ci (float) – level \((1-\alpha/2)\) for the confidence ellipse
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- plot_confell(figsize=(7, 5), ci=0.95, equal=True, magnified=False, ax=None, saveas=None)#
Plots the estimated parameter values in the parameter space and the asymptotic confidence ellipse.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (useful only if ax is not None)
ci (float) – level \((1-\alpha/2)\) for the confidence ellipse
equal (bool) – whether the plot must have an equal aspect ratio
magnified (bool) – if False the limits will be the entire parameter space, otherwise let matplotlib choose the limits
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- plot_ordinal(figsize=(7, 5), kind='bar', ax=None, saveas=None)#
Plots relative frequencies of observed sample, estimated probability distribution and, if provided, probability distribution of a known model.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (useful only if ax is not None)
kind (str) – choose a barplot ('bar', default) or a scatterplot ('scatter')
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- cubmods.cub.cmf(m, pi, xi)#
Cumulative probability of a specified CUB model.
\(\Pr(R \leq r | \pmb\theta),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
- Returns:
an array of the CMF for the specified model
- Return type:
numpy array
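A minimal sketch of how the cumulative probabilities could be obtained, assuming the standard CUB mixture \(\Pr(R=r|\pmb\theta)=\pi\binom{m-1}{r-1}(1-\xi)^{r-1}\xi^{m-r}+(1-\pi)/m\). The helper names cub_pmf and cub_cmf are illustrative and not part of cubmods; plain lists stand in for numpy arrays.

```python
from itertools import accumulate
from math import comb


def cub_pmf(m, pi, xi):
    """Pr(R = r) for r = 1..m under the assumed CUB mixture."""
    return [
        pi * comb(m - 1, r - 1) * (1 - xi) ** (r - 1) * xi ** (m - r)
        + (1 - pi) / m
        for r in range(1, m + 1)
    ]


def cub_cmf(m, pi, xi):
    """Pr(R <= r) for r = 1..m, computed as running sums of the pmf."""
    return list(accumulate(cub_pmf(m, pi, xi)))


cmf_values = cub_cmf(m=7, pi=0.7, xi=0.3)
```

The last entry is 1 up to floating-point error, since the pmf sums to one.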
- cubmods.cub.draw(m, pi, xi, n, df, formula, seed=None)#
Draw a random sample from a specified CUB model.
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
n (int) – number of ordinal responses to be drawn
df (DataFrame) – original DataFrame
formula (str) – the formula used
seed (int, optional) – the seed to ensure reproducibility, defaults to None
- Returns:
an instance of
CUBsample
(see here) containing ordinal responses drawn from the specified model
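As a rough illustration of the sampling step only, the sketch below draws ordinal responses with probabilities given by the assumed CUB mixture. The function draw_cub is hypothetical: it returns a plain list rather than a CUBsample, and it ignores the df and formula arguments of the real function.

```python
import random
from math import comb


def cub_pmf(m, pi, xi):
    """Pr(R = r) for r = 1..m under the assumed CUB mixture."""
    return [
        pi * comb(m - 1, r - 1) * (1 - xi) ** (r - 1) * xi ** (m - r)
        + (1 - pi) / m
        for r in range(1, m + 1)
    ]


def draw_cub(m, pi, xi, n, seed=None):
    """Draw n ordinal responses in 1..m; a fixed seed makes the draw reproducible."""
    rng = random.Random(seed)
    categories = list(range(1, m + 1))
    return rng.choices(categories, weights=cub_pmf(m, pi, xi), k=n)


sample = draw_cub(m=7, pi=0.7, xi=0.3, n=500, seed=42)
```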
- cubmods.cub.gini(m, pi, xi)#
The Gini index of a specified CUB model.
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
- Returns:
the Gini index of the model
- Return type:
float
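The sketch below assumes that the index meant here is the normalized Gini heterogeneity index \(G = \frac{m}{m-1}\left(1-\sum_{r=1}^m p_r^2\right)\); this definition is an assumption, not taken from the package source. The name gini_index is illustrative.

```python
from math import comb


def cub_pmf(m, pi, xi):
    """Pr(R = r) for r = 1..m under the assumed CUB mixture."""
    return [
        pi * comb(m - 1, r - 1) * (1 - xi) ** (r - 1) * xi ** (m - r)
        + (1 - pi) / m
        for r in range(1, m + 1)
    ]


def gini_index(m, pi, xi):
    """Normalized Gini heterogeneity (assumed definition): 0 = degenerate, 1 = uniform."""
    sum_sq = sum(p * p for p in cub_pmf(m, pi, xi))
    return (1 - sum_sq) * m / (m - 1)


g = gini_index(m=7, pi=0.7, xi=0.3)
```

Under this definition a purely uniform model (\(\pi=0\)) gives 1 and a degenerate distribution gives 0.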
- cubmods.cub.init_theta(f, m)#
Preliminary estimators for CUB models without covariates.
Computes preliminary parameter estimates of a CUB model without covariates for given ordinal responses. These preliminary estimators are used within the package code to start the E-M algorithm.
- Parameters:
f (array of int) – array of the absolute frequencies of given ordinal responses
m (int) – number of ordinal categories
- Returns:
a tuple of \((\pi^{(0)}, \xi^{(0)})\)
- cubmods.cub.laakso(m, pi, xi)#
The Laakso index of a specified CUB model.
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
- Returns:
the Laakso index of the model
- Return type:
float
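A hedged sketch, assuming the index is the normalized Laakso-Taagepera measure \(L = \frac{1/\sum_r p_r^2 - 1}{m-1}\), which equals \(G/(m-(m-1)G)\) when \(G\) is the normalized Gini index. This definition is an assumption rather than the documented formula of the package, and laakso_index is an illustrative name.

```python
from math import comb


def cub_pmf(m, pi, xi):
    """Pr(R = r) for r = 1..m under the assumed CUB mixture."""
    return [
        pi * comb(m - 1, r - 1) * (1 - xi) ** (r - 1) * xi ** (m - r)
        + (1 - pi) / m
        for r in range(1, m + 1)
    ]


def laakso_index(m, pi, xi):
    """Normalized effective number of categories (assumed definition)."""
    sum_sq = sum(p * p for p in cub_pmf(m, pi, xi))
    return (1 / sum_sq - 1) / (m - 1)


lk = laakso_index(m=7, pi=0.7, xi=0.3)
```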
- cubmods.cub.loglik(m, pi, xi, f)#
Compute the log-likelihood function of a CUB model without covariates for a given absolute frequency distribution.
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
f (array of int) – array of absolute frequency distribution
- Returns:
the log-likelihood value
- Return type:
float
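For a frequency distribution \(f_1,\ldots,f_m\), the log-likelihood of a model without covariates reduces to \(\ell(\pmb\theta)=\sum_{r=1}^m f_r \log \Pr(R=r|\pmb\theta)\). The sketch below evaluates it under the assumed CUB mixture; loglik_cub is an illustrative name and the frequencies are made up.

```python
from math import comb, log


def cub_pmf(m, pi, xi):
    """Pr(R = r) for r = 1..m under the assumed CUB mixture."""
    return [
        pi * comb(m - 1, r - 1) * (1 - xi) ** (r - 1) * xi ** (m - r)
        + (1 - pi) / m
        for r in range(1, m + 1)
    ]


def loglik_cub(m, pi, xi, f):
    """Sum of f_r * log p_r, skipping empty categories."""
    return sum(
        fr * log(pr) for fr, pr in zip(f, cub_pmf(m, pi, xi)) if fr > 0
    )


f = [10, 20, 40, 80, 120, 90, 40]  # hypothetical absolute frequencies
ll = loglik_cub(7, 0.7, 0.3, f)
```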
- cubmods.cub.mean(m, pi, xi)#
Expected value of a specified CUB model.
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
- Returns:
the expected value of the model
- Return type:
float
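Under the assumed CUB mixture the expected value has the closed form \(E(R)=\frac{m+1}{2}+\pi(m-1)\left(\frac{1}{2}-\xi\right)\), which follows from mixing a shifted binomial mean \(1+(m-1)(1-\xi)\) with a uniform mean \((m+1)/2\). The sketch checks it against the direct sum over the pmf; function names are illustrative.

```python
from math import comb


def cub_pmf(m, pi, xi):
    """Pr(R = r) for r = 1..m under the assumed CUB mixture."""
    return [
        pi * comb(m - 1, r - 1) * (1 - xi) ** (r - 1) * xi ** (m - r)
        + (1 - pi) / m
        for r in range(1, m + 1)
    ]


def mean_from_pmf(m, pi, xi):
    """Expected value as sum of r * Pr(R = r)."""
    return sum(r * p for r, p in enumerate(cub_pmf(m, pi, xi), start=1))


def mean_closed_form(m, pi, xi):
    """Closed form derived from the two mixture components."""
    return (m + 1) / 2 + pi * (m - 1) * (0.5 - xi)


mu = mean_closed_form(7, 0.7, 0.3)
```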
- cubmods.cub.median(m, pi, xi)#
The median of a specified CUB model.
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
- Returns:
the median of the model
- Return type:
float
- cubmods.cub.mle(sample, m, df, formula, ass_pars=None, maxiter=500, tol=0.0001)#
Main function for CUB models without covariates.
Function to estimate and validate a CUB model without covariates for given ordinal responses.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
df (DataFrame) – original DataFrame
formula (str) – the formula used
ass_pars (dictionary, optional) – dictionary of hypothesized parameters, defaults to None
maxiter (int) – maximum number of iterations allowed for running the optimization algorithm
tol (float) – fixed error tolerance for final estimates
- Returns:
an instance of
CUBresCUB00
(see the Class for details)
- Return type:
object
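The estimation step can be pictured with a textbook E-M iteration for a two-component mixture. The sketch below is not the package's implementation: em_cub is a hypothetical helper, the starting values are arbitrary, and it returns only point estimates without standard errors or fit indices. It assumes the CUB mixture \(\pi\, b_r(\xi) + (1-\pi)/m\) with shifted binomial \(b_r(\xi)\).

```python
from math import comb, log


def em_cub(f, m, pi0=0.5, xi0=0.5, maxiter=500, tol=1e-4):
    """E-M sketch for a CUB model without covariates, from absolute frequencies f."""
    n = sum(f)
    pi, xi = pi0, xi0

    def components(pi, xi):
        b = [comb(m - 1, r - 1) * (1 - xi) ** (r - 1) * xi ** (m - r)
             for r in range(1, m + 1)]
        p = [pi * br + (1 - pi) / m for br in b]
        return b, p

    def loglik(p):
        return sum(fr * log(pr) for fr, pr in zip(f, p) if fr > 0)

    b, p = components(pi, xi)
    old = loglik(p)
    niter = 0
    for niter in range(1, maxiter + 1):
        # E-step: posterior probability of the shifted binomial component
        tau = [pi * br / pr for br, pr in zip(b, p)]
        # M-step: closed-form updates for pi and xi
        weight = sum(fr * tr for fr, tr in zip(f, tau))
        pi = weight / n
        avg = sum(r * fr * tr
                  for r, (fr, tr) in enumerate(zip(f, tau), start=1)) / weight
        xi = (m - avg) / (m - 1)
        b, p = components(pi, xi)
        new = loglik(p)
        if abs(new - old) < tol:
            old = new
            break
        old = new
    return pi, xi, old, niter


f = [10, 20, 40, 80, 120, 90, 40]  # hypothetical absolute frequencies
pi_hat, xi_hat, ll, niter = em_cub(f, m=7)
```

The maxiter and tol arguments play the same role as in the signature above: the loop stops when the log-likelihood gain falls below tol or after maxiter iterations.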
- cubmods.cub.pmf(m, pi, xi)#
Probability distribution of a specified CUB model.
\(\Pr(R = r | \pmb\theta),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
- Returns:
the vector of the probability distribution of a CUB model.
- Return type:
numpy array
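A minimal sketch of the probability distribution, assuming the standard CUB mixture \(\Pr(R=r|\pmb\theta)=\pi\binom{m-1}{r-1}(1-\xi)^{r-1}\xi^{m-r}+\frac{1-\pi}{m}\). The name cub_pmf is illustrative and a plain list is returned instead of a numpy array.

```python
from math import comb


def cub_pmf(m, pi, xi):
    """Return [Pr(R=1), ..., Pr(R=m)] for a CUB model without covariates."""
    uniform = 1.0 / m
    probs = []
    for r in range(1, m + 1):
        # shifted binomial component with feeling parameter xi
        shifted_binomial = (
            comb(m - 1, r - 1) * (1 - xi) ** (r - 1) * xi ** (m - r)
        )
        # mixture with the discrete uniform component, weight 1 - pi
        probs.append(pi * shifted_binomial + (1 - pi) * uniform)
    return probs


p = cub_pmf(m=7, pi=0.7, xi=0.3)
```

With \(\pi=0\) every category has probability \(1/m\); with \(\pi=1\) the distribution is a pure shifted binomial.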
- cubmods.cub.prob(m, pi, xi, r)#
Probability \(\Pr(R = r | \pmb\theta)\) of a specified CUB model.
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
r (int) – ordinal value (must be \(1 \leq r \leq m\))
- Returns:
the probability \(\Pr(R = r | \pmb\theta)\)
- Return type:
float
- cubmods.cub.skew(pi, xi)#
Normalized skewness index \(\eta\) of a specified CUB model.
- Parameters:
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
- Returns:
the skewness of the model
- Return type:
float
- cubmods.cub.std(m, pi, xi)#
Standard deviation of a specified CUB model.
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
- Returns:
the standard deviation of the model
- Return type:
float
- cubmods.cub.var(m, pi, xi)#
Variance of a specified CUB model.
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
- Returns:
the variance of the model
- Return type:
float
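Under the assumed CUB mixture, combining the component variances \((m-1)\xi(1-\xi)\) and \((m^2-1)/12\) with the squared distance between the component means gives \(Var(R)=(m-1)\left\{\pi\xi(1-\xi)+(1-\pi)\left[\frac{m+1}{12}+\pi(m-1)\left(\frac{1}{2}-\xi\right)^2\right]\right\}\). The sketch checks this against the variance computed directly from the pmf; function names are illustrative.

```python
from math import comb


def cub_pmf(m, pi, xi):
    """Pr(R = r) for r = 1..m under the assumed CUB mixture."""
    return [
        pi * comb(m - 1, r - 1) * (1 - xi) ** (r - 1) * xi ** (m - r)
        + (1 - pi) / m
        for r in range(1, m + 1)
    ]


def var_from_pmf(m, pi, xi):
    """Variance as the probability-weighted squared deviation from the mean."""
    p = cub_pmf(m, pi, xi)
    mu = sum(r * pr for r, pr in enumerate(p, start=1))
    return sum((r - mu) ** 2 * pr for r, pr in enumerate(p, start=1))


def var_closed_form(m, pi, xi):
    """Closed form from the mixture decomposition of the variance."""
    return (m - 1) * (
        pi * xi * (1 - xi)
        + (1 - pi) * ((m + 1) / 12 + pi * (m - 1) * (0.5 - xi) ** 2)
    )


v = var_closed_form(7, 0.7, 0.3)
```

The standard deviation is the square root of this quantity.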
- cubmods.cub.varcov(m, pi, xi, ordinal)#
Compute the variance-covariance matrix of parameter estimates of a CUB model without covariates.
- References:
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
ordinal (array of int) – array of ordinal responses
- Returns:
the variance-covariance matrix of the CUB model
- Return type:
numpy ndarray
cubmods.cub_0w module#
CUB models in Python. Module for CUB (Combination of Uniform and Binomial) with covariates for the feeling component.
Description:#
This module contains methods and classes for CUB_0W model family.
Manual, Examples and References:#
List of TODOs:#
…
Classes and Functions#
- class cubmods.cub_0w.CUBresCUB0W(model, df, formula, m, n, sample, f, theoric, diss, est_names, estimates, e_types, varmat, stderrs, pval, wald, loglike, muloglik, loglikuni, AIC, BIC, seconds, time_exe, logliksat=None, dev=None, logliksatcov=None, niter=None, maxiter=None, tol=None, sh=None, rho=None, ass_pars=None)#
Bases:
CUBres
Object returned by the .mle() function. See the base class CUBres for details.
Methods
as_dataframe() – DataFrame of estimated parameters
as_txt() – Print the summary.
plot([saveas, figsize]) – Main function to plot an object of the Class.
plot_ordinal([figsize, ax, kind, saveas]) – Plots relative average frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
save(fname) – Save a CUBresult object to a file named fname + .cub.fit
summary() – Call as_txt()
- plot(saveas=None, figsize=(7, 5))#
Main function to plot an object of the Class.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- plot_ordinal(figsize=(7, 5), ax=None, kind='bar', saveas=None)#
Plots relative average frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (useful only if ax is not None)
kind (str) – choose a barplot ('bar', default) or a scatterplot ('scatter')
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- cubmods.cub_0w.draw(m, pi, gamma, W, df, formula, seed=None)#
Draw a random sample from a specified CUB model with covariates for the feeling component.
- Parameters:
m (int) – number of ordinal categories
n (int) – number of ordinal responses to be drawn
pi (float) – uncertainty parameter \(\pi\)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
df (DataFrame) – original DataFrame
formula (str) – the formula used
seed (int, optional) – the seed to ensure reproducibility, defaults to None; it must be \(\neq 0\)
- Returns:
an instance of
CUBsample
(see here) containing ordinal responses drawn from the specified model
- cubmods.cub_0w.effe01(gamma, esterno01, m)#
Auxiliary function for the log-likelihood estimation of CUB models with covariates for the feeling component.
Compute the opposite of the scalar function that is maximized when running the E-M algorithm for CUB models with covariates for the feeling parameter.
It is called as an argument for minimize within the CUB functions for models with covariates for feeling or for both feeling and uncertainty.
- Parameters:
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
esterno01 – a matrix binding together: the vector \(\pmb\tau\) of the posterior probabilities that each observation has been generated by the first component distribution of the mixture, the ordinal data \(\pmb r\) and the matrix \(\pmb w\) of the selected covariates accounting for an intercept term
m (int) – number of ordinal categories
- Returns:
the expected value of the incomplete log-likelihood
- Return type:
float
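To make the objective concrete, the sketch below assumes the logistic link \(\xi_i = 1/(1+e^{-\pmb w_i \pmb\gamma})\) and evaluates \(-\sum_i \tau_i\left[(r_i-1)\log(1-\xi_i)+(m-r_i)\log\xi_i\right]\), which simplifies to \(\sum_i \tau_i\left[(r_i-1)\,\pmb w_i\pmb\gamma+(m-1)\log(1+e^{-\pmb w_i\pmb\gamma})\right]\). The binomial coefficient is dropped because it does not depend on \(\pmb\gamma\). The name effe01_sketch is hypothetical, and \(\pmb\tau\), \(\pmb r\) and the covariate rows are passed separately instead of as one bound matrix.

```python
from math import exp, log


def effe01_sketch(gamma, tau, ordinal, W_rows, m):
    """Negative tau-weighted shifted binomial log-likelihood as a function of gamma."""
    total = 0.0
    for t, r, w in zip(tau, ordinal, W_rows):
        # linear predictor with the intercept as the first entry of gamma
        eta = gamma[0] + sum(g * x for g, x in zip(gamma[1:], w))
        total += t * ((r - 1) * eta + (m - 1) * log(1 + exp(-eta)))
    return total


# hypothetical inputs: three observations, one covariate, m = 7 categories
gamma = [0.2, -0.5]
tau = [0.9, 0.4, 0.7]
ordinal = [1, 4, 7]
W_rows = [[0.0], [1.0], [2.5]]
value = effe01_sketch(gamma, tau, ordinal, W_rows, m=7)
```

A function of this shape, taking the parameter vector first, is what a numerical minimizer expects as its objective.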
- cubmods.cub_0w.init_gamma(sample, m, W)#
Preliminary parameter estimates of a CUB model with covariates for the feeling component.
Compute preliminary parameter estimates for the feeling component of a CUB model fitted to ordinal responses. These estimates are set as initial values for parameters to start the E-M algorithm.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
- Returns:
an array \(\pmb\gamma^{(0)}\)
- Return type:
array of float
- cubmods.cub_0w.loglik(sample, m, pi, gamma, W)#
Log-likelihood function of a CUB model with covariates for the feeling component
Compute the log-likelihood function of a CUB model fitting ordinal data, with covariates for explaining the feeling component.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
- Returns:
the log-likelihood value
- Return type:
float
- cubmods.cub_0w.mle(sample, m, W, df, formula, ass_pars=None, maxiter=500, tol=0.0001)#
Main function for CUB models with covariates for the feeling component.
Function to estimate and validate a CUB model for given ordinal responses, with covariates for explaining the feeling component.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
df (DataFrame) – original DataFrame
formula (str) – the formula used
ass_pars (dictionary, optional) – dictionary of hypothesized parameters, defaults to None
maxiter (int) – maximum number of iterations allowed for running the optimization algorithm
tol (float) – fixed error tolerance for final estimates
- Returns:
an instance of
CUBresCUB0W
(see the Class for details)
- Return type:
object
- cubmods.cub_0w.pmf(m, pi, gamma, W)#
Average probability distribution of a specified CUB model with covariates for the feeling component.
\(\frac{1}{n} \sum_{i=1}^n \Pr(R_i=r|\pmb\theta; \pmb T_i),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
- Returns:
the vector of the probability distribution.
- Return type:
numpy array
- cubmods.cub_0w.pmfi(m, pi, gamma, W)#
Probability distribution for each subject of a specified CUB model with covariates for the feeling component.
Auxiliary function of .draw().
\(\Pr(R_i=r|\pmb\theta; \pmb T_i),\; i=1 \ldots n ,\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
- Returns:
the matrix of the probability distribution of dimension \(n \times m\)
- Return type:
numpy ndarray
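A sketch of the per-subject distribution, assuming the logistic link \(\xi_i = 1/(1+e^{-\pmb w_i\pmb\gamma})\) with the intercept as the first entry of \(\pmb\gamma\). The names pmfi_0w and pmf_0w are illustrative, and a list of covariate rows stands in for the pandas DataFrame W.

```python
from math import comb, exp


def pmfi_0w(m, pi, gamma, W_rows):
    """One row of m probabilities per subject; xi_i depends on that subject's covariates."""
    rows = []
    for w in W_rows:
        eta = gamma[0] + sum(g * x for g, x in zip(gamma[1:], w))
        xi_i = 1 / (1 + exp(-eta))
        rows.append([
            pi * comb(m - 1, r - 1) * (1 - xi_i) ** (r - 1) * xi_i ** (m - r)
            + (1 - pi) / m
            for r in range(1, m + 1)
        ])
    return rows


def pmf_0w(m, pi, gamma, W_rows):
    """Average of the per-subject distributions, category by category."""
    rows = pmfi_0w(m, pi, gamma, W_rows)
    n = len(rows)
    return [sum(row[r] for row in rows) / n for r in range(m)]


W_rows = [[0.0], [1.0], [2.5]]  # hypothetical covariate values
matrix = pmfi_0w(m=5, pi=0.8, gamma=[0.2, -0.5], W_rows=W_rows)
average = pmf_0w(m=5, pi=0.8, gamma=[0.2, -0.5], W_rows=W_rows)
```

Each row of the matrix sums to one, and so does the averaged distribution.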
- cubmods.cub_0w.prob(sample, m, pi, gamma, W)#
Probability distribution of a CUB model with covariates for the feeling component given an observed sample
Compute the probability distribution of a CUB model with covariates for the feeling component, given an observed sample.
\(\Pr(R_i=r_i|\pmb\theta;\pmb T_i),\; i=1 \ldots n\)
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
- Returns:
the array of the probability distribution.
- Return type:
numpy array
- cubmods.cub_0w.varcov(sample, m, pi, gamma, W)#
Variance-covariance matrix of CUB models with covariates for the feeling component
Compute the variance-covariance matrix of parameter estimates of a CUB model with covariates for the feeling component.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
- Returns:
the variance-covariance matrix of the CUB model
- Return type:
numpy ndarray
cubmods.cub_y0 module#
CUB models in Python. Module for CUB (Combination of Uniform and Binomial) with covariates for the uncertainty component.
Description:#
This module contains methods and classes for CUB_Y0 model family.
Manual, Examples and References:#
List of TODOs:#
…
Classes and Functions#
- class cubmods.cub_y0.CUBresCUBY0(model, df, formula, m, n, sample, f, theoric, diss, est_names, estimates, e_types, varmat, stderrs, pval, wald, loglike, muloglik, loglikuni, AIC, BIC, seconds, time_exe, logliksat=None, dev=None, logliksatcov=None, niter=None, maxiter=None, tol=None, sh=None, rho=None, ass_pars=None)#
Bases:
CUBres
Object returned by the .mle() function. See the base class CUBres for details.
Methods
as_dataframe() – DataFrame of estimated parameters
as_txt() – Print the summary.
plot([saveas, figsize]) – Main function to plot an object of the Class.
plot_ordinal([figsize, ax, kind, saveas]) – Plots relative average frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
save(fname) – Save a CUBresult object to a file named fname + .cub.fit
summary() – Call as_txt()
- plot(saveas=None, figsize=(7, 5))#
Main function to plot an object of the Class.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- plot_ordinal(figsize=(7, 5), ax=None, kind='bar', saveas=None)#
Plots relative average frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (useful only if ax is not None)
kind (str) – choose a barplot ('bar', default) or a scatterplot ('scatter')
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- cubmods.cub_y0.draw(m, beta, xi, Y, df, formula, seed=None)#
Draw a random sample from a specified CUB model with covariates for the uncertainty component.
- Parameters:
m (int) – number of ordinal categories
n (int) – number of ordinal responses to be drawn
xi (float) – feeling parameter \(\xi\)
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
df (DataFrame) – original DataFrame
formula (str) – the formula used
seed (int, optional) – the seed to ensure reproducibility, defaults to None; it must be \(\neq 0\)
- Returns:
an instance of
CUBsample
(see here) containing ordinal responses drawn from the specified model
- cubmods.cub_y0.effe10(beta, esterno10)#
Auxiliary function for the log-likelihood estimation of CUB models.
Compute the opposite of the scalar function that is maximized when running the E-M algorithm for CUB models with covariates for the uncertainty parameter.
It is called as an argument for minimize within the CUB functions for models with covariates for uncertainty or for both feeling and uncertainty.
- Parameters:
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
esterno10 – a matrix binding together: the matrix \(\pmb y\) of the selected covariates (accounting for an intercept term) and a vector \(\pmb\tau\) (whose length equals the number of observations) of the posterior probabilities that each observation has been generated by the first component distribution of the mixture
- Returns:
the expected value of the incomplete log-likelihood
- Return type:
float
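As an illustration of this objective, the sketch below assumes the logistic link \(\pi_i = 1/(1+e^{-\pmb y_i\pmb\beta})\) and evaluates \(-\sum_i\left[\tau_i\log\pi_i+(1-\tau_i)\log(1-\pi_i)\right]\), written as \(\sum_i\left[\tau_i\log(1+e^{-\pmb y_i\pmb\beta})+(1-\tau_i)\log(1+e^{\pmb y_i\pmb\beta})\right]\). The name effe10_sketch is hypothetical, and \(\pmb\tau\) and the covariate rows are passed separately instead of as one bound matrix.

```python
from math import exp, log


def effe10_sketch(beta, tau, Y_rows):
    """Negative tau-weighted Bernoulli log-likelihood as a function of beta."""
    total = 0.0
    for t, y in zip(tau, Y_rows):
        # linear predictor with the intercept as the first entry of beta
        eta = beta[0] + sum(b * x for b, x in zip(beta[1:], y))
        total += t * log(1 + exp(-eta)) + (1 - t) * log(1 + exp(eta))
    return total


# hypothetical inputs: three observations, one covariate
beta = [0.5, 1.2]
tau = [0.9, 0.4, 0.7]
Y_rows = [[0.0], [1.0], [-0.5]]
value = effe10_sketch(beta, tau, Y_rows)
```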
- cubmods.cub_y0.loglik(m, sample, Y, beta, xi)#
Log-likelihood function of a CUB model with covariates for the uncertainty component
Compute the log-likelihood function of a CUB model fitting ordinal responses with covariates for explaining the uncertainty component.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
xi (float) – feeling parameter \(\xi\)
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
- Returns:
the log-likelihood value
- Return type:
float
- cubmods.cub_y0.mle(sample, m, Y, df, formula, ass_pars=None, maxiter=500, tol=0.0001)#
Main function for CUB models with covariates for the uncertainty component.
Estimate and validate a CUB model for given ordinal responses, with covariates for explaining the uncertainty component.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
df (DataFrame) – original DataFrame
formula (str) – the formula used
ass_pars (dictionary, optional) – dictionary of hypothesized parameters, defaults to None
maxiter (int) – maximum number of iterations allowed for running the optimization algorithm
tol (float) – fixed error tolerance for final estimates
- Returns:
an instance of
CUBresCUBY0
(see the Class for details)
- Return type:
object
- cubmods.cub_y0.pmf(m, beta, xi, Y)#
Average probability distribution of a specified CUB model with covariates.
\(\frac{1}{n} \sum_{i=1}^n \Pr(R_i=r|\pmb\theta; \pmb T_i),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
xi (float) – feeling parameter \(\xi\)
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
- Returns:
the vector of the probability distribution.
- Return type:
numpy array
- cubmods.cub_y0.pmfi(m, beta, xi, Y)#
Probability distribution for each subject of a specified CUB model with covariates.
Auxiliary function of .draw().
\(\Pr(R_i=r|\pmb\theta; \pmb T_i),\; i=1 \ldots n ,\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
xi (float) – feeling parameter \(\xi\)
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
- Returns:
the matrix of the probability distribution of dimension \(n \times m\)
- Return type:
numpy ndarray
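A sketch of the per-subject distribution when only the uncertainty component depends on covariates, assuming the logistic link \(\pi_i = 1/(1+e^{-\pmb y_i\pmb\beta})\). The shifted binomial part is shared by all subjects because \(\xi\) is constant. The names pmfi_y0 and pmf_y0 are illustrative, and a list of covariate rows stands in for the pandas DataFrame Y.

```python
from math import comb, exp


def pmfi_y0(m, beta, xi, Y_rows):
    """One row of m probabilities per subject; pi_i depends on that subject's covariates."""
    binom = [
        comb(m - 1, r - 1) * (1 - xi) ** (r - 1) * xi ** (m - r)
        for r in range(1, m + 1)
    ]
    rows = []
    for y in Y_rows:
        eta = beta[0] + sum(b * x for b, x in zip(beta[1:], y))
        pi_i = 1 / (1 + exp(-eta))
        rows.append([pi_i * b + (1 - pi_i) / m for b in binom])
    return rows


def pmf_y0(m, beta, xi, Y_rows):
    """Average of the per-subject distributions, category by category."""
    rows = pmfi_y0(m, beta, xi, Y_rows)
    n = len(rows)
    return [sum(row[r] for row in rows) / n for r in range(m)]


Y_rows = [[0.0], [1.0], [-0.5]]  # hypothetical covariate values
matrix = pmfi_y0(m=5, beta=[0.5, 1.2], xi=0.3, Y_rows=Y_rows)
average = pmf_y0(m=5, beta=[0.5, 1.2], xi=0.3, Y_rows=Y_rows)
```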
- cubmods.cub_y0.prob(m, sample, Y, beta, xi)#
Probability distribution of a CUB model with covariates for the uncertainty component given an observed sample
Compute the probability distribution of a CUB model with covariates for the uncertainty component, given an observed sample.
\(\Pr(R_i=r_i|\pmb\theta;\pmb T_i),\; i=1 \ldots n\)
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
xi (float) – feeling parameter \(\xi\)
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
- Returns:
the array of the probability distribution.
- Return type:
numpy array
- cubmods.cub_y0.varcov(m, sample, Y, beta, xi)#
Variance-covariance matrix of CUB model with covariates for the uncertainty parameter.
Compute the variance-covariance matrix of parameter estimates of a CUB model with covariates for the uncertainty component.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
xi (float) – feeling parameter \(\xi\)
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
- Returns:
the variance-covariance matrix of the CUB model
- Return type:
numpy ndarray
cubmods.cub_yw module#
CUB models in Python. Module for CUB (Combination of Uniform and Binomial) with covariates for both feeling and uncertainty.
Description:#
This module contains methods and classes for CUB_YW model family.
Manual, Examples and References:#
List of TODOs:#
…
Classes and Functions#
- class cubmods.cub_yw.CUBresCUBYW(model, df, formula, m, n, sample, f, theoric, diss, est_names, estimates, e_types, varmat, stderrs, pval, wald, loglike, muloglik, loglikuni, AIC, BIC, seconds, time_exe, logliksat=None, dev=None, logliksatcov=None, niter=None, maxiter=None, tol=None, sh=None, rho=None, ass_pars=None)#
Bases:
CUBres
Object returned by the .mle() function. See the base class CUBres for details.
Methods
as_dataframe() – DataFrame of estimated parameters
as_txt() – Print the summary.
plot([saveas, figsize]) – Main function to plot an object of the Class.
plot_ordinal([figsize, ax, kind, saveas]) – Plots relative average frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
save(fname) – Save a CUBresult object to a file named fname + .cub.fit
summary() – Call as_txt()
- plot(saveas=None, figsize=(7, 5))#
Main function to plot an object of the Class.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- plot_ordinal(figsize=(7, 5), ax=None, kind='bar', saveas=None)#
Plots relative average frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (useful only if ax is not None)
kind (str) – choose a barplot ('bar', default) or a scatterplot ('scatter')
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- cubmods.cub_yw.draw(m, beta, gamma, Y, W, df, formula, seed=None)#
Draw a random sample from a specified CUB model with covariates for both feeling and uncertainty.
- Parameters:
n (int) – number of ordinal responses to be drawn
m (int) – number of ordinal categories
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
df (DataFrame) – original DataFrame
formula (str) – the formula used
- Returns:
an instance of
CUBsample
(see here) containing ordinal responses drawn from the specified model
- cubmods.cub_yw.loglik(m, sample, Y, W, beta, gamma)#
Log-likelihood function of a CUB model with covariates for both feeling and uncertainty.
Compute the log-likelihood function of a CUB model fitting ordinal data with covariates for explaining both the feeling and the uncertainty components.
- Parameters:
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
- Returns:
the log-likelihood value
- Return type:
float
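With covariates on both components the log-likelihood is a sum over subjects, \(\ell(\pmb\theta)=\sum_{i=1}^n \log\Pr(R_i=r_i|\pmb\theta;\pmb T_i)\). The sketch below assumes logistic links for both \(\pi_i\) and \(\xi_i\), with intercepts as the first entries of \(\pmb\beta\) and \(\pmb\gamma\). The name loglik_yw is illustrative, and lists of covariate rows stand in for the pandas DataFrames Y and W.

```python
from math import comb, exp, log


def logistic(z):
    """Inverse logit."""
    return 1 / (1 + exp(-z))


def loglik_yw(m, sample, Y_rows, W_rows, beta, gamma):
    """Sum over subjects of log Pr(R_i = r_i) with subject-specific pi_i and xi_i."""
    total = 0.0
    for r, y, w in zip(sample, Y_rows, W_rows):
        pi_i = logistic(beta[0] + sum(b * x for b, x in zip(beta[1:], y)))
        xi_i = logistic(gamma[0] + sum(g * x for g, x in zip(gamma[1:], w)))
        p = (
            pi_i * comb(m - 1, r - 1) * (1 - xi_i) ** (r - 1) * xi_i ** (m - r)
            + (1 - pi_i) / m
        )
        total += log(p)
    return total


# hypothetical data: three responses on a 7-point scale, one covariate per component
sample = [2, 5, 7]
Y_rows = [[0.0], [1.0], [-0.5]]
W_rows = [[1.5], [0.0], [2.0]]
ll = loglik_yw(7, sample, Y_rows, W_rows, beta=[0.5, 1.2], gamma=[0.2, -0.5])
```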
- cubmods.cub_yw.mle(sample, m, Y, W, df, formula, ass_pars=None, maxiter=500, tol=0.0001)#
Main function for CUB models with covariates for both the uncertainty and the feeling components.
Estimate and validate a CUB model for given ordinal responses, with covariates for explaining both the feeling and the uncertainty components by means of logistic transform.
- Parameters:
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
df (DataFrame) – original DataFrame
formula (str) – the formula used
ass_pars (dictionary, optional) – dictionary of hypothesized parameters, defaults to None
maxiter (int) – maximum number of iterations allowed for running the optimization algorithm
tol (float) – fixed error tolerance for final estimates
- Returns:
an instance of
CUBresCUBYW
(see the Class for details)
- Return type:
object
- cubmods.cub_yw.pmf(m, beta, gamma, Y, W)#
Average probability distribution of a specified CUB model with covariates for both feeling and uncertainty.
\(\frac{1}{n} \sum_{i=1}^n \Pr(R_i=r|\pmb\theta; \pmb T_i),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
- Returns:
the vector of the probability distribution.
- Return type:
numpy array
- cubmods.cub_yw.pmfi(m, beta, gamma, Y, W)#
Probability distribution for each subject of a specified CUB model with covariates for both feeling and uncertainty.
Auxiliary function of .draw().
\(\Pr(R_i=r|\pmb\theta; \pmb T_i),\; i=1 \ldots n ,\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
- Returns:
the matrix of the probability distribution of dimension \(n \times m\) (one row per subject, one column per ordinal category)
- Return type:
numpy ndarray
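The relationship between the per-subject matrix and the average distribution can be sketched in plain Python. This is an illustration under stated assumptions, not the package implementation: it assumes the usual logistic links \(\pi_i = 1/(1+e^{-\pmb y_i \pmb\beta})\) and \(\xi_i = 1/(1+e^{-\pmb w_i \pmb\gamma})\) with the intercept as first entry, uses lists of rows instead of pandas dataframes, and the names `logistic`, `cub_yw_pmfi` and `cub_yw_pmf` are hypothetical.

```python
from math import comb, exp


def logistic(eta):
    """Logistic transform mapping a linear predictor to (0, 1)."""
    return 1.0 / (1.0 + exp(-eta))


def cub_yw_pmfi(m, beta, gamma, Y_rows, W_rows):
    """Per-subject CUB probabilities: one row per subject, m columns.

    Assumption: beta[0] and gamma[0] are intercepts; Y_rows and W_rows are
    lists of covariate rows (plain lists, not dataframes).
    """
    matrix = []
    for y, w in zip(Y_rows, W_rows):
        pi_i = logistic(beta[0] + sum(b * v for b, v in zip(beta[1:], y)))
        xi_i = logistic(gamma[0] + sum(g * v for g, v in zip(gamma[1:], w)))
        row = [
            # mixture of a shifted Binomial and a discrete Uniform
            pi_i * comb(m - 1, r - 1) * (1 - xi_i) ** (r - 1) * xi_i ** (m - r)
            + (1 - pi_i) / m
            for r in range(1, m + 1)
        ]
        matrix.append(row)
    return matrix


def cub_yw_pmf(m, beta, gamma, Y_rows, W_rows):
    """Average of the per-subject distributions over the n subjects."""
    rows = cub_yw_pmfi(m, beta, gamma, Y_rows, W_rows)
    n = len(rows)
    return [sum(row[j] for row in rows) / n for j in range(m)]
```

Each row of the matrix sums to one, so the column-wise average is itself a probability distribution over the \(m\) categories.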
- cubmods.cub_yw.prob(m, sample, Y, W, beta, gamma)#
Probability distribution of a CUB model with covariates for both feeling and uncertainty.
Compute the probability distribution of a CUB model with covariates for both the feeling and the uncertainty components.
\(\Pr(R_i=r_i|\pmb\theta;\pmb T_i),\; i=1 \ldots n\)
- Parameters:
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
- Returns:
the array of the probability distribution.
- Return type:
numpy array
- cubmods.cub_yw.varcov(m, sample, Y, W, beta, gamma)#
Variance-covariance matrix of a CUB model with covariates for both uncertainty and feeling.
Compute the variance-covariance matrix of parameter estimates of a CUB model with covariates for both the uncertainty and the feeling components.
- Parameters:
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
- Returns:
the variance-covariance matrix of the CUB model
- Return type:
numpy ndarray
cubmods.cube module#
CUB models in Python. Module for CUBE (Combination of Uniform and Beta-Binomial).
Description:#
This module contains methods and classes for CUBE model family.
Manual, Examples and References:#
List of TODOs:#
TODO: adjust 3d plots legend
Credits#
- Author:
Massimo Pierini
- Date:
2023-24
- Credits:
Domenico Piccolo, Rosaria Simone
- Contacts:
Classes and Functions#
- class cubmods.cube.CUBresCUBE(model, df, formula, m, n, sample, f, theoric, diss, est_names, estimates, e_types, varmat, stderrs, pval, wald, loglike, muloglik, loglikuni, AIC, BIC, seconds, time_exe, logliksat=None, dev=None, logliksatcov=None, niter=None, maxiter=None, tol=None, sh=None, rho=None, ass_pars=None)#
Bases:
CUBres
Object returned by the .mle() function. See the Base class for details.
Methods
as_dataframe() – DataFrame of estimated parameters
as_txt() – Print the summary.
plot([ci, saveas, confell, test3, figsize]) – Main function to plot an object of the Class.
plot3d(ax[, ci, magnified]) – Plots the estimated parameter values in the parameter space and the asymptotic confidence ellipsoid with its projections.
plot_ordinal([figsize, ax, kind, saveas]) – Plots relative frequencies of observed sample, estimated probability distribution and, if provided, probability distribution of a known model.
save(fname) – Save a CUBresult object to file named fname + .cub.fit
summary() – Call as_txt()
- plot(ci=0.95, saveas=None, confell=False, test3=True, figsize=(7, 15))#
Main function to plot an object of the Class.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure
ci (float) – level \((1-\alpha/2)\) for the confidence ellipsoid
confell (bool) – DEPRECATED, defaults to False
test3 (bool) – DEPRECATED, defaults to True
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- plot3d(ax, ci=0.95, magnified=False)#
Plots the estimated parameter values in the parameter space and the asymptotic confidence ellipsoid with its projections.
- Parameters:
ci (float) – level \((1-\alpha/2)\) for the confidence ellipsoid
magnified (bool) – if False the limits will be the entire parameter space, otherwise let matplotlib choose the limits
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
- plot_ordinal(figsize=(7, 5), ax=None, kind='bar', saveas=None)#
Plots relative frequencies of observed sample, estimated probability distribution and, if provided, probability distribution of a known model.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (useful only if ax is not None)
kind (str) – choose a barplot ('bar', default) or a scatterplot ('scatter')
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- cubmods.cube.betar(m, xi, phi)#
Beta-Binomial distribution.
Return the Beta-Binomial distribution with given parameters.
- Parameters:
m (int) – number of ordinal categories
xi (float) – feeling parameter \(\xi\)
phi (float) – overdispersion parameter \(\phi\)
- Returns:
array of length \(m\) of the Beta-Binomial distribution.
- Return type:
numpy array
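As an illustration of what such a distribution looks like, here is a minimal plain-Python sketch. It assumes the parametrization commonly used in the CUBE literature, in which \(\phi = 0\) gives back the shifted Binomial \(\binom{m-1}{r-1}(1-\xi)^{r-1}\xi^{m-r}\); whether `betar` uses exactly this form is an assumption, and the helper names `_rising` and `beta_binomial` are hypothetical.

```python
from math import comb


def _rising(start, step, count):
    """Product start * (start + step) * ... with `count` factors (1.0 if count == 0)."""
    out = 1.0
    for k in range(count):
        out *= start + step * k
    return out


def beta_binomial(m, xi, phi):
    """Beta-Binomial probabilities Pr(R = r) for r = 1..m (assumed parametrization)."""
    den = _rising(1.0, phi, m - 1)
    return [
        comb(m - 1, r - 1) * _rising(1 - xi, phi, r - 1) * _rising(xi, phi, m - r) / den
        for r in range(1, m + 1)
    ]
```

With `phi = 0` every factor loses its overdispersion term, so the result coincides with the shifted Binomial of a CUB model with the same `xi`.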
- cubmods.cube.cmf(m, pi, xi, phi)#
Cumulative probability of a specified CUBE model.
\(\Pr(R \leq r | \pmb\theta),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
phi (float) – overdispersion parameter \(\phi\)
- Returns:
array of length \(m\) of the cumulative probability of a CUBE model without covariates.
- Return type:
numpy array
- cubmods.cube.draw(m, pi, xi, phi, n, df, formula, seed=None)#
Draw a random sample from a specified CUBE model.
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
phi (float) – overdispersion parameter \(\phi\)
n (int) – number of ordinal responses to be drawn
df (DataFrame) – original DataFrame
formula (str) – the formula used
seed (int, optional) – the seed to ensure reproducibility, defaults to None
- Returns:
an instance of CUBsample (see here) containing ordinal responses drawn from the specified model
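Drawing from a specified model amounts to sampling the categories \(1 \ldots m\) with the model probabilities as weights. The sketch below shows only that idea and returns a plain list rather than a CUBsample object; the Beta-Binomial parametrization and the names `cube_pmf` and `draw_cube` are assumptions made for illustration.

```python
import random
from math import comb


def _rising(start, step, count):
    """Product start * (start + step) * ... with `count` factors."""
    out = 1.0
    for k in range(count):
        out *= start + step * k
    return out


def cube_pmf(m, pi, xi, phi):
    """Mixture of a Beta-Binomial (weight pi) and a discrete Uniform (weight 1 - pi)."""
    den = _rising(1.0, phi, m - 1)
    return [
        pi * comb(m - 1, r - 1) * _rising(1 - xi, phi, r - 1) * _rising(xi, phi, m - r) / den
        + (1 - pi) / m
        for r in range(1, m + 1)
    ]


def draw_cube(m, pi, xi, phi, n, seed=None):
    """Draw n ordinal responses in 1..m; a fixed seed makes the draw reproducible."""
    rng = random.Random(seed)
    return rng.choices(range(1, m + 1), weights=cube_pmf(m, pi, xi, phi), k=n)
```

Using a local `random.Random(seed)` instance keeps the draw reproducible without touching the global random state.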
- cubmods.cube.effecube(params, tau, f, m)#
Auxiliary function for the log-likelihood estimation of CUBE models without covariates.
Define the opposite of the scalar function that is maximized when running the E-M algorithm for CUBE models without covariates.
- Parameters:
params (array of float) – array of initial estimates for the feeling and the overdispersion parameters
tau (array) – a column vector of length \(m\) containing the posterior probabilities that each observed category has been generated by the first component distribution of the mixture
f (array) – array of the absolute frequencies of the observations
m (int) – number of ordinal categories
- Returns:
the expected value of the incomplete log-likelihood
- Return type:
float
- cubmods.cube.init_theta(sample, m)#
Naive estimates for CUBE models without covariates.
Compute naive parameter estimates of a CUBE model without covariates for given ordinal responses. These preliminary estimators are used within the package code to start the E-M algorithm.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
- Returns:
a tuple of \((\pi^{(0)}, \xi^{(0)}, \phi^{(0)})\)
- Return type:
tuple of float
- cubmods.cube.loglik(m, pi, xi, phi, f)#
Log-likelihood function of a CUBE model without covariates.
Compute the log-likelihood function of a CUBE model without covariates fitting the given absolute frequency distribution.
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
phi (float) – overdispersion parameter \(\phi\)
f (array of int) – array of absolute frequency distribution
- Returns:
the log-likelihood value
- Return type:
float
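Because the model has no covariates, the log-likelihood depends on the data only through the absolute frequencies: \(\ell(\pmb\theta) = \sum_{r=1}^m f_r \log \Pr(R=r|\pmb\theta)\). A minimal sketch follows, assuming the Beta-Binomial parametrization common in the CUBE literature; the names `cube_pmf` and `cube_loglik` are hypothetical and this is not the package code.

```python
from math import comb, log


def _rising(start, step, count):
    """Product start * (start + step) * ... with `count` factors."""
    out = 1.0
    for k in range(count):
        out *= start + step * k
    return out


def cube_pmf(m, pi, xi, phi):
    """CUBE probabilities Pr(R = r), r = 1..m (assumed parametrization)."""
    den = _rising(1.0, phi, m - 1)
    return [
        pi * comb(m - 1, r - 1) * _rising(1 - xi, phi, r - 1) * _rising(xi, phi, m - r) / den
        + (1 - pi) / m
        for r in range(1, m + 1)
    ]


def cube_loglik(m, pi, xi, phi, f):
    """Log-likelihood from absolute frequencies f[0..m-1] of categories 1..m."""
    probs = cube_pmf(m, pi, xi, phi)
    # categories with zero frequency contribute nothing to the sum
    return sum(fr * log(pr) for fr, pr in zip(f, probs) if fr > 0)
```

When `pi = 0` the model is purely Uniform, so the value reduces to \(-n \log m\), which is a convenient sanity check.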
- cubmods.cube.mean(m, pi, xi)#
Mean of a CUBE model.
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
- Returns:
the expected value of the model
- Return type:
float
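The signature takes no overdispersion parameter because the Beta-Binomial component has the same mean as the shifted Binomial, \(1 + (m-1)(1-\xi)\), whatever the value of \(\phi\). Mixing it with the Uniform mean \((m+1)/2\) gives \(E(R) = \frac{m+1}{2} + \pi (m-1)\left(\frac{1}{2} - \xi\right)\). A one-function sketch of this closed form, with the hypothetical name `cube_mean`:

```python
def cube_mean(m, pi, xi):
    """Expected value of a CUBE (or CUB) model without covariates.

    Mixture mean: pi * [1 + (m - 1) * (1 - xi)] + (1 - pi) * (m + 1) / 2,
    simplified algebraically.
    """
    return (m + 1) / 2 + pi * (m - 1) * (0.5 - xi)
```

For instance, with `m = 5`, `pi = 0.5` and `xi = 0.2` the Binomial-type component has mean 4.2 and the Uniform has mean 3, so the mixture mean is 3.6.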
- cubmods.cube.mle(sample, m, df, formula, ass_pars=None, maxiter=1000, tol=1e-06)#
Main function for CUBE models without covariates.
Estimate and validate a CUBE model without covariates.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
df (DataFrame) – original DataFrame
formula (str) – the formula used
ass_pars (dictionary, optional) – dictionary of hypothesized parameters, defaults to None
maxiter (int) – maximum number of iterations allowed for running the optimization algorithm
tol (float) – fixed error tolerance for final estimates
- Returns:
an instance of CUBresCUBE (see the Class for details)
- Return type:
object
- cubmods.cube.pmf(m, pi, xi, phi)#
Probability distribution of a specified CUBE model.
\(\Pr(R = r | \pmb\theta),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
phi (float) – overdispersion parameter \(\phi\)
- Returns:
array of length \(m\) of the distribution of a CUBE model without covariates.
- Return type:
numpy array
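A CUBE model mixes a Beta-Binomial with a discrete Uniform: \(\Pr(R=r|\pmb\theta) = \pi\, Be(r; \xi, \phi) + (1-\pi)/m\). The sketch below builds that distribution and its cumulative sums in plain Python. The Beta-Binomial parametrization is the one commonly found in the CUBE literature and is an assumption here, as are the names `cube_pmf` and `cube_cmf`.

```python
from itertools import accumulate
from math import comb


def _rising(start, step, count):
    """Product start * (start + step) * ... with `count` factors."""
    out = 1.0
    for k in range(count):
        out *= start + step * k
    return out


def cube_pmf(m, pi, xi, phi):
    """Pr(R = r), r = 1..m: weight pi on the Beta-Binomial, 1 - pi on the Uniform."""
    den = _rising(1.0, phi, m - 1)
    return [
        pi * comb(m - 1, r - 1) * _rising(1 - xi, phi, r - 1) * _rising(xi, phi, m - r) / den
        + (1 - pi) / m
        for r in range(1, m + 1)
    ]


def cube_cmf(m, pi, xi, phi):
    """Pr(R <= r), r = 1..m, as running sums of the probability distribution."""
    return list(accumulate(cube_pmf(m, pi, xi, phi)))
```

Setting `phi = 0` recovers a CUB model, and setting `pi = 1` additionally recovers a pure shifted Binomial.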
- cubmods.cube.prob(m, pi, xi, phi, r)#
Probability \(\Pr(R = r | \pmb\theta)\) of a CUBE model without covariates.
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
phi (float) – overdispersion parameter \(\phi\)
r (int) – ordinal response
- Returns:
the probability \(\Pr(R = r | \pmb\theta)\) of a CUBE model without covariates.
- Return type:
numpy array
- cubmods.cube.var(m, pi, xi, phi)#
Variance of a CUBE model.
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
phi (float) – overdispersion parameter \(\phi\)
- Returns:
the variance of the model
- Return type:
float
- cubmods.cube.varcov(m, pi, xi, phi, sample)#
Variance-covariance matrix for CUBE models based on the observed information matrix.
Compute the variance-covariance matrix of parameter estimates for a CUBE model without covariates as the inverse of the observed information matrix.
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
xi (float) – feeling parameter \(\xi\)
phi (float) – overdispersion parameter \(\phi\)
sample (array of int) – array of ordinal responses
- Returns:
the variance-covariance matrix of the CUBE model
- Return type:
numpy ndarray
cubmods.cube_0w0 module#
CUB models in Python. Module for CUBE (Combination of Uniform and Beta-Binomial) with covariates for the feeling component.
Description:#
This module contains methods and classes for CUBE_0W0 model family.
Manual, Examples and References:#
List of TODOs:#
…
Credits#
- Author:
Massimo Pierini
- Date:
2023-24
- Credits:
Domenico Piccolo, Rosaria Simone
- Contacts:
Classes and Functions#
- class cubmods.cube_0w0.CUBresCUBE0W0(model, df, formula, m, n, sample, f, theoric, diss, est_names, estimates, e_types, varmat, stderrs, pval, wald, loglike, muloglik, loglikuni, AIC, BIC, seconds, time_exe, logliksat=None, dev=None, logliksatcov=None, niter=None, maxiter=None, tol=None, sh=None, rho=None, ass_pars=None)#
Bases:
CUBres
Object returned by the .mle() function. See the Base class for details.
Methods
as_dataframe() – DataFrame of estimated parameters
as_txt() – Print the summary.
plot([saveas, figsize]) – Main function to plot an object of the Class.
plot_ordinal([figsize, ax, kind, saveas]) – Plots relative average frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
save(fname) – Save a CUBresult object to file named fname + .cub.fit
summary() – Call as_txt()
- plot(saveas=None, figsize=(7, 5))#
Main function to plot an object of the Class.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- plot_ordinal(figsize=(7, 5), ax=None, kind='bar', saveas=None)#
Plots relative average frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (useful only if ax is not None)
kind (str) – choose a barplot ('bar', default) or a scatterplot ('scatter')
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- cubmods.cube_0w0.betabinomialxi(m, sample, xivett, phi)#
Beta-Binomial probabilities of ordinal responses, given feeling parameter for each observation.
Compute the Beta-Binomial probabilities of given ordinal responses, with feeling parameter specified for each observation, and with the same overdispersion parameter for all the responses.
- Parameters:
m (int) – number of ordinal categories
sample (array) – array of ordinal responses. Missing values are not allowed: they should be preliminarily deleted
xivett (array) – array of feeling parameters of the Beta-Binomial distribution for given ordinal responses
phi (float) – overdispersion parameter \(\phi\)
- Returns:
array of the same length as sample: each entry is the Beta-Binomial probability of the given observation for the corresponding feeling and overdispersion parameters.
- Return type:
array
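The computation can be pictured as evaluating one Beta-Binomial probability per subject, each with its own feeling parameter and a shared overdispersion. The following plain-Python sketch assumes the parametrization usual in the CUBE literature (which reduces to the shifted Binomial when \(\phi = 0\)); the name `beta_binomial_xi` is hypothetical and the code is not taken from the package.

```python
from math import comb


def _rising(start, step, count):
    """Product start * (start + step) * ... with `count` factors."""
    out = 1.0
    for k in range(count):
        out *= start + step * k
    return out


def beta_binomial_xi(m, sample, xivett, phi):
    """Beta-Binomial probability of each observed response.

    sample[i] is the response of subject i (in 1..m), xivett[i] the feeling
    parameter of that subject; phi is common to all subjects.
    """
    den = _rising(1.0, phi, m - 1)
    return [
        comb(m - 1, r - 1) * _rising(1 - xi, phi, r - 1) * _rising(xi, phi, m - r) / den
        for r, xi in zip(sample, xivett)
    ]
```

The output has one entry per observation, which is the shape needed to build a log-likelihood as a sum of logarithms over subjects.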
- cubmods.cube_0w0.draw(m, pi, gamma, phi, W, df, formula, seed=None)#
Draw a random sample from a specified CUBE model.
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
phi (float) – overdispersion parameter \(\phi\)
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
n (int) – number of ordinal responses to be drawn
df (DataFrame) – original DataFrame
formula (str) – the formula used
seed (int, optional) – the seed to ensure reproducibility, defaults to None
- Returns:
an instance of CUBsample (see here) containing ordinal responses drawn from the specified model
- cubmods.cube_0w0.effe(pars, sample, W, m)#
Auxiliary function for the log-likelihood estimation of CUBE models with covariates only for the feeling component.
Compute the opposite of the scalar function that is maximized when running the E-M algorithm for CUBE models with covariates only for the feeling component.
- Parameters:
pars (array) – array of length equal to W.index.size+3 whose entries are the initial parameter estimates
sample (array of int) – array of ordinal responses
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
m (int) – number of ordinal categories
- Returns:
negative log-likelihood
- Return type:
float
- cubmods.cube_0w0.init_theta(m, sample, W, maxiter, tol)#
Preliminary estimates of parameters for CUBE models with covariates only for feeling.
Compute preliminary parameter estimates of a CUBE model with covariates only for feeling, given ordinal responses. These estimates are set as initial values to start the corresponding E-M algorithm within the package. Preliminary estimates for the uncertainty and the overdispersion parameters are computed by short runs of E-M. As to the feeling component, it considers the nested CUB model with covariates and calls inibestgama to derive initial estimates for the coefficients of the selected covariates for feeling.
- Parameters:
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
maxiter (int) – maximum number of iterations allowed for preliminary iterations
tol (float) – fixed error tolerance for final estimates for preliminary iterations
- Returns:
a tuple of \((\pi^{(0)}, \pmb \gamma^{(0)}, \phi^{(0)})\), where \(\pi^{(0)}\) is the initial estimate for the uncertainty parameter, \(\pmb \gamma^{(0)}\) is the vector of initial estimates for the feeling component (including an intercept term in the first entry), and \(\phi^{(0)}\) is the initial estimate for the overdispersion parameter.
- Return type:
tuple
- cubmods.cube_0w0.loglik(m, sample, W, pi, gamma, phi)#
Log-likelihood function of CUBE model with covariates only for feeling.
Compute the log-likelihood function of a CUBE model for ordinal data with subjects’ covariates only for feeling.
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
phi (float) – overdispersion parameter \(\phi\)
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
sample (array of int) – array of ordinal responses
- Returns:
the log-likelihood value
- Return type:
float
- cubmods.cube_0w0.mle(sample, m, W, df, formula, ass_pars=None, maxiter=1000, tol=1e-06)#
Main function for CUBE models with covariates only for feeling.
Estimate and validate a CUBE model for ordinal data, with covariates only for explaining the feeling component.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
df (DataFrame) – original DataFrame
formula (str) – the formula used
ass_pars (dictionary, optional) – dictionary of hypothesized parameters, defaults to None
maxiter (int) – maximum number of iterations allowed for preliminary iterations
tol (float) – fixed error tolerance for final estimates for preliminary iterations; the information matrix (to compute the variance-covariance matrix) is approximated with approx_hess() (see statsmodels.tools.numdiff for details)
- Returns:
an instance of CUBresCUBE0W0 (see the Class for details)
- Return type:
object
- cubmods.cube_0w0.pmf(m, pi, gamma, phi, W)#
Average probability distribution of a specified CUBE model with covariates for the feeling component.
\(\frac{1}{n} \sum_{i=1}^n \Pr(R_i=r|\pmb\theta; \pmb T_i),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
phi (float) – overdispersion parameter \(\phi\)
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
- Returns:
the array of the average probability distribution
- Return type:
numpy array
- cubmods.cube_0w0.pmfi(m, pi, gamma, phi, W)#
Probability distribution for each subject of a specified CUBE model with covariates for feeling only.
Auxiliary function of .draw().
\(\Pr(R_i=r|\pmb\theta; \pmb T_i),\; i=1 \ldots n ,\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
phi (float) – overdispersion parameter \(\phi\)
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
- Returns:
the matrix of the probability distribution of dimension \(n \times m\) (one row per subject, one column per ordinal category)
- Return type:
numpy ndarray
- cubmods.cube_0w0.prob(m, sample, W, pi, gamma, phi)#
Probability distribution of a CUBE model with covariates for feeling.
Compute the probability distribution of a CUBE model with covariates for the feeling component. Auxiliary function of .loglik().
\(\Pr(R_i=r_i|\pmb\theta;\pmb T_i),\; i=1 \ldots n\)
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
phi (float) – overdispersion parameter \(\phi\)
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
sample (array of int) – array of ordinal responses
- Returns:
the array of the probability distribution.
- Return type:
numpy array
cubmods.cube_ywz module#
CUB models in Python. Module for CUBE (Combination of Uniform and Beta-Binomial) with covariates.
Description:#
This module contains methods and classes for CUBE_YWZ model family.
Manual, Examples and References:#
List of TODOs:#
…
Credits#
- Author:
Massimo Pierini
- Date:
2023-24
- Credits:
Domenico Piccolo, Rosaria Simone
- Contacts:
Classes and Functions#
- class cubmods.cube_ywz.CUBresCUBEYWZ(model, df, formula, m, n, sample, f, theoric, diss, est_names, estimates, e_types, varmat, stderrs, pval, wald, loglike, muloglik, loglikuni, AIC, BIC, seconds, time_exe, logliksat=None, dev=None, logliksatcov=None, niter=None, maxiter=None, tol=None, sh=None, rho=None, ass_pars=None)#
Bases:
CUBres
Object returned by the .mle() function. See the Base class for details.
Methods
as_dataframe() – DataFrame of estimated parameters
as_txt() – Print the summary.
plot([saveas, figsize]) – Main function to plot an object of the Class.
plot_ordinal([figsize, ax, kind, saveas]) – Plots relative average frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
save(fname) – Save a CUBresult object to file named fname + .cub.fit
summary() – Call as_txt()
- plot(saveas=None, figsize=(7, 5))#
Main function to plot an object of the Class.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- plot_ordinal(figsize=(7, 5), ax=None, kind='bar', saveas=None)#
Plots relative average frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (useful only if ax is not None)
kind (str) – choose a barplot ('bar', default) or a scatterplot ('scatter')
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- cubmods.cube_ywz.Qdue(pars, tauno, sample, W, Z, m)#
Auxiliary function for the log-likelihood estimation of CUBE models with covariates.
Define the opposite of one of the two scalar functions that are maximized when running the E-M algorithm for CUBE models with covariates for feeling, uncertainty and overdispersion.
- Parameters:
pars (array) – array of initial estimates of parameters for the feeling component and the overdispersion effect
tauno (array) – the column vector of the posterior probabilities that each observed rating has been generated by the distribution of the first component of the mixture
sample (array of int) – array of ordinal responses
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
Z (pandas dataframe) – dataframe of covariates for explaining the overdispersion
m (int) – number of ordinal categories
- cubmods.cube_ywz.Quno(beta, esterno1)#
Auxiliary function for the log-likelihood estimation of CUBE models with covariates.
Define the opposite of one of the two scalar functions that are maximized when running the E-M algorithm for CUBE models with covariates for feeling, uncertainty and overdispersion.
It is iteratively called as an argument of “optim” within CUBE function (with covariates) as the function to minimize to compute the maximum likelihood estimates for the feeling and the overdispersion components.
- Parameters:
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
esterno1 (ndarray) – matrix binding together the column vector of the posterior probabilities that each observed rating has been generated by the first component distribution of the mixture, with the matrix \(\pmb y\) of explicative variables for the uncertainty component, expanded with a unitary vector in the first column to consider also an intercept term
- cubmods.cube_ywz.auxmat(m, xi, phi, a, b, c, d, e)#
Auxiliary matrix.
Returns an auxiliary matrix needed for computing the variance-covariance matrix of a CUBE model with covariates.
- Parameters:
m (int) – number of ordinal categories
xi (array of float) – feeling parameters \(\pmb\xi\)
phi (array of float) – overdispersion parameter \(\pmb\phi\)
a,b,c,d,e (float) – see the reference paper Piccolo, 2015 for details
- cubmods.cube_ywz.betabinomial(m, sample, xi, phi)#
Beta-Binomial probabilities of ordinal responses, with feeling and overdispersion parameters for each observation.
Compute the Beta-Binomial probabilities of ordinal responses, given feeling and overdispersion parameters for each observation.
The Beta-Binomial distribution is the Binomial distribution in which the probability of success at each trial is random and follows the Beta distribution. It is frequently used in Bayesian statistics, empirical Bayes methods and classical statistics as an overdispersed binomial distribution.
- Parameters:
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
xi (float) – feeling parameter \(\xi\)
phi (float) – overdispersion parameter \(\phi\)
- Returns:
array of the same length as sample, containing the Beta-Binomial probabilities of each observation, for the corresponding feeling and overdispersion parameters.
- Return type:
array
- cubmods.cube_ywz.draw(m, beta, gamma, alpha, df, formula, Y, W, Z, seed=None)#
Draw a random sample from a specified CUBE model.
- Parameters:
m (int) – number of ordinal categories
n (int) – number of ordinal responses to be drawn
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
alpha (array of float) – array \(\pmb \alpha\) of parameters for the overdispersion, whose length equals Z.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
Z (pandas dataframe) – dataframe of covariates for explaining the overdispersion
df (DataFrame) – original DataFrame
formula (str) – the formula used
seed (int, optional) – the seed to ensure reproducibility, defaults to None
- Returns:
an instance of CUBsample (see here) containing ordinal responses drawn from the specified model
- cubmods.cube_ywz.init_theta(m, sample, W, p, v)#
Preliminary parameter estimates for CUBE models with covariates.
Compute preliminary parameter estimates for a CUBE model with covariates for all the three parameters. These estimates are set as initial values to start the E-M algorithm within maximum likelihood estimation.
- Parameters:
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
p (int) – number of covariates for the uncertainty component
v (int) – number of covariates for the overdispersion
- Returns:
a tuple \((\pmb \beta^{(0)}, \pmb \gamma^{(0)}, \pmb \alpha^{(0)})\) of preliminary estimates of the parameter vectors for \(\pi=\pi(\pmb{\beta}),\; \xi=\xi(\pmb{\gamma}),\; \phi=\phi(\pmb{\alpha})\), respectively, of a CUBE model with covariates for all the three parameters. In detail, they have length equal to Y.columns.size+1, W.columns.size+1 and Z.columns.size+1, respectively, to account for an intercept term for each component.
- Return type:
tuple of arrays
- cubmods.cube_ywz.loglik(m, sample, Y, W, Z, beta, gamma, alpha)#
Log-likelihood function of a CUBE model with covariates.
Compute the log-likelihood function of a CUBE model for ordinal responses, with covariates for explaining all the three parameters.
- Parameters:
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
Z (pandas dataframe) – dataframe of covariates for explaining the overdispersion
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
alpha (array of float) – array \(\pmb \alpha\) of parameters for the overdispersion, whose length equals Z.columns.size+1 to include an intercept term in the model (first entry)
- Returns:
the log-likelihood value
- Return type:
float
- cubmods.cube_ywz.mle(m, sample, Y, W, Z, df, formula, ass_pars=None, maxiter=1000, tol=0.01)#
Main function for CUBE models with covariates.
Function to estimate and validate a CUBE model with explicative covariates for all the three parameters.
- Parameters:
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
Z (pandas dataframe) – dataframe of covariates for explaining the overdispersion
df (DataFrame) – original DataFrame
formula (str) – the formula used
ass_pars (dictionary, optional) – dictionary of hypothesized parameters, defaults to None
maxiter (int) – maximum number of iterations allowed for running the optimization algorithm
tol (float) – fixed error tolerance for final estimates
- Returns:
an instance of CUBresCUBEYWZ (see the Class for details)
- Return type:
object
- cubmods.cube_ywz.pmf(m, beta, gamma, alpha, Y, W, Z)#
Average probability distribution of a specified CUBE model with covariates for all the three parameters.
\(\frac{1}{n} \sum_{i=1}^n \Pr(R_i=r|\pmb\theta; \pmb T_i),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
alpha (array of float) – array \(\pmb \alpha\) of parameters for the overdispersion, whose length equals Z.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
Z (pandas dataframe) – dataframe of covariates for explaining the overdispersion
- Returns:
the array of the average probability distribution
- Return type:
numpy array
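The averaging formula above can be illustrated with a minimal sketch. It assumes the per-subject probabilities are already available as an \(n \times m\) list of rows (one row per subject); the helper name `average_pmf` is hypothetical and not part of cubmods.

```python
def average_pmf(prob_matrix):
    """Average n per-subject distributions into one distribution over m categories.

    prob_matrix is a list of n rows, each holding Pr(R_i = r) for r = 1..m.
    Hypothetical helper for illustration only, not the cubmods implementation.
    """
    n = len(prob_matrix)
    m = len(prob_matrix[0])
    # column-wise mean: (1/n) * sum_i Pr(R_i = r)
    return [sum(row[r] for row in prob_matrix) / n for r in range(m)]


# two subjects, m = 2 categories
avg = average_pmf([[0.2, 0.8], [0.6, 0.4]])
```

Since every row sums to one, the averaged vector sums to one as well.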
- cubmods.cube_ywz.pmfi(m, beta, gamma, alpha, Y, W, Z)#
Probability distribution for each subject of a specified CUBE model with covariates.
Auxiliary function of .draw().
\(\Pr(R_i=r|\pmb\theta; \pmb T_i),\; i=1 \ldots n ,\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
alpha (array of float) – array \(\pmb \alpha\) of parameters for the overdispersion, whose length equals Z.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
Z (pandas dataframe) – dataframe of covariates for explaining the overdispersion
- Returns:
the matrix of the probability distribution of dimension \(n \times m\)
- Return type:
numpy ndarray
- cubmods.cube_ywz.varcov(m, sample, beta, gamma, alpha, Y, W, Z)#
Variance-covariance matrix of a CUBE model with covariates.
Compute the variance-covariance matrix of parameter estimates of a CUBE model with covariates for all the three parameters.
- Parameters:
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
Z (pandas dataframe) – dataframe of covariates for explaining the overdispersion
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
alpha (array of float) – array \(\pmb \alpha\) of parameters for the overdispersion, whose length equals Z.columns.size+1 to include an intercept term in the model (first entry)
- Returns:
the variance-covariance matrix
- Return type:
ndarray
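A variance-covariance matrix is typically turned into standard errors and Wald statistics, which the result classes expose as `stderrs` and `wald`. The sketch below shows that generic computation under the usual asymptotic assumptions; `wald_table` is a hypothetical helper, and the package may compute these quantities differently.

```python
from math import sqrt


def wald_table(estimates, varmat):
    """Standard errors and Wald statistics from a variance-covariance matrix.

    Hypothetical helper: the standard error of parameter i is the square root
    of the i-th diagonal entry, and the Wald statistic is estimate / stderr.
    """
    stderrs = [sqrt(varmat[i][i]) for i in range(len(estimates))]
    wald = [est / se for est, se in zip(estimates, stderrs)]
    return stderrs, wald


# made-up numbers, for illustration only
stderrs, wald = wald_table([1.0, -0.5], [[0.04, 0.0], [0.0, 0.01]])
```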
cubmods.cubsh module#
CUB models in Python. Module for CUBSH (Combination of Uniform and Binomial with Shelter Effect).
Description:#
This module contains methods and classes for CUBSH model family.
Manual, Examples and References:#
List of TODOs:#
TODO: fix 3d plots legend
TODO: test all
def _*():
(optional functions)
Credits#
- Author:
Massimo Pierini
- Date:
2023-24
- Credits:
Domenico Piccolo, Rosaria Simone
- Contacts:
Classes and Functions#
- class cubmods.cubsh.CUBresCUBSH(model, df, formula, m, n, sample, f, theoric, diss, est_names, estimates, e_types, varmat, stderrs, pval, wald, loglike, muloglik, loglikuni, AIC, BIC, seconds, time_exe, logliksat=None, dev=None, logliksatcov=None, niter=None, maxiter=None, tol=None, sh=None, rho=None, ass_pars=None)#
Bases: CUBres
Object returned by .mle() function. See here the Base for details.
Methods
as_dataframe() – DataFrame of estimated parameters
as_txt() – Print the summary.
plot([ci, saveas, confell, debug, test3, ...]) – Main function to plot an object of the Class.
plot3d(ax[, ci, magnified]) – Plots the estimated parameter values in the parameter space and the asymptotic confidence ellipsoid with its projections.
plot_ordinal([figsize, ax, kind, saveas]) – Plots relative frequencies of observed sample, estimated probability distribution and, if provided, probability distribution of a known model.
save(fname) – Save a CUBresult object to file named fname + .cub.fit
summary() – Call as_txt()
- plot(ci=0.95, saveas=None, confell=False, debug=False, test3=True, figsize=(7, 15))#
Main function to plot an object of the Class.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure
ci (float) – level \((1-\alpha/2)\) for the confidence ellipsoid
confell (bool) – DEPRECATED, defaults to False
test3 (bool) – DEPRECATED, defaults to True
debug (bool) – DEPRECATED, defaults to False
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- plot3d(ax, ci=0.95, magnified=False)#
Plots the estimated parameter values in the parameter space and the asymptotic confidence ellipsoid with its projections.
- Parameters:
ci (float) – level \((1-\alpha/2)\) for the confidence ellipsoid
magnified (bool) – if False the limits will be the entire parameter space, otherwise let matplotlib choose the limits
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
- plot_ordinal(figsize=(7, 5), ax=None, kind='bar', saveas=None)#
Plots relative frequencies of observed sample, estimated probability distribution and, if provided, probability distribution of a known model.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (useful only if ax is not None)
kind (str) – choose a barplot ('bar', default) or a scatterplot ('scatter')
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- cubmods.cubsh.cmf(m, sh, pi1, pi2, xi)#
Cumulative probability of a specified CUBSH model, using alternative parametrization \((\pi_1, \pi_2)\).
\(\Pr(R \leq r | \pmb\theta),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
pi1 (float) – Mixing coefficient for the shifted Binomial component of the mixture distribution \(\pi_1\)
pi2 (float) – Mixing coefficient for the discrete Uniform component of the mixture distribution \(\pi_2\)
xi (float) – feeling parameter \(\xi\)
- Returns:
the cumulative probability distribution
- Return type:
array
- cubmods.cubsh.cmf_delta(m, sh, pi, xi, delta)#
Cumulative probability of a specified CUBSH model, using canonic parametrization \((\pi, \delta)\).
\(\Pr(R \leq r | \pmb\theta),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
pi (float) – uncertainty parameter \(\pi\)
delta (float) – shelter choice parameter \(\delta\)
xi (float) – feeling parameter \(\xi\)
- Returns:
the cumulative probability distribution
- Return type:
array
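The cumulative probabilities \(\Pr(R \leq r | \pmb\theta)\) are the running sums of the probability distribution. A minimal sketch, assuming the probabilities \(\Pr(R = r | \pmb\theta)\) are already given as a plain list (the helper name `cumulative` and the example values are hypothetical):

```python
from itertools import accumulate


def cumulative(pmf_values):
    """Running sums of a probability distribution over categories 1..m."""
    return list(accumulate(pmf_values))


# an arbitrary distribution over m = 5 categories, for illustration only
pmf_values = [0.1, 0.2, 0.4, 0.2, 0.1]
cmf_values = cumulative(pmf_values)
```

The last entry equals one up to floating-point rounding, and the sequence is non-decreasing.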
- cubmods.cubsh.draw(m, sh, pi, xi, delta, n, df, formula, seed=None)#
Draw a random sample from a specified CUBSH model, using canonic parametrization \((\pi, \delta)\).
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
pi (float) – uncertainty parameter \(\pi\)
delta (float) – shelter choice parameter \(\delta\)
xi (float) – feeling parameter \(\xi\)
n (int) – number of ordinal responses
df (DataFrame) – original DataFrame
formula (str) – the formula used
seed (int, optional) – the seed to ensure reproducibility, defaults to None
- Returns:
an instance of
CUBsample
(see here) containing ordinal responses drawn from the specified model
- cubmods.cubsh.draw2(m, sh, pi1, pi2, xi, n, df, formula, seed=None)#
Draw a random sample from a specified CUBSH model, using alternative parametrization \((\pi_1, \pi_2)\).
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
pi1 (float) – Mixing coefficient for the shifted Binomial component of the mixture distribution \(\pi_1\)
pi2 (float) – Mixing coefficient for the discrete Uniform component of the mixture distribution \(\pi_2\)
xi (float) – feeling parameter \(\xi\)
n (int) – number of ordinal responses
df (DataFrame) – original DataFrame
formula (str) – the formula used
seed (int, optional) – the seed to ensure reproducibility, defaults to None
- Returns:
an instance of
CUBsample
(see here) containing ordinal responses drawn from the specified model
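The role of the `seed` argument can be illustrated with a standard-library sketch that samples ordinal responses from a given probability distribution. This is only an assumption about the general technique: `draw_ordinal` is a hypothetical helper that returns a plain list, whereas the package functions return a CUBsample object.

```python
import random


def draw_ordinal(pmf_values, n, seed=None):
    """Draw n ordinal responses in 1..m with probabilities pmf_values."""
    rng = random.Random(seed)  # a dedicated generator makes the draw reproducible
    categories = range(1, len(pmf_values) + 1)
    return rng.choices(categories, weights=pmf_values, k=n)


# an arbitrary distribution over m = 5 categories, for illustration only
pmf_values = [0.1, 0.2, 0.4, 0.2, 0.1]
sample = draw_ordinal(pmf_values, n=50, seed=42)
```

Calling the helper again with the same seed reproduces the same sample; with `seed=None` the draws differ between runs.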
- cubmods.cubsh.init_theta(f, m, sh)#
Preliminary estimators for CUBSH models.
Computes preliminary parameter estimates of a CUBSH model without covariates for given ordinal responses. These preliminary estimators are used within the package code to start the E-M algorithm.
- Parameters:
f (array of int) – array of the absolute frequencies of given ordinal responses
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
- Returns:
a tuple of \((\pi_1^{(0)}, \pi_2^{(0)}, \xi^{(0)})\)
- cubmods.cubsh.loglik(m, sh, pi1, pi2, xi, f)#
Log-likelihood of a CUB model with shelter effect
Compute the log-likelihood of a CUB model with a shelter effect for the given absolute frequency distribution.
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
pi1 (float) – Mixing coefficient for the shifted Binomial component of the mixture distribution \(\pi_1\)
pi2 (float) – Mixing coefficient for the discrete Uniform component of the mixture distribution \(\pi_2\)
xi (float) – feeling parameter \(\xi\)
f (array) – Vector of the absolute frequency distribution
- Returns:
the log-likelihood value
- Return type:
float
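For a model without covariates, the log-likelihood of an absolute frequency distribution \(f\) is \(\sum_{r=1}^m f_r \log \Pr(R=r|\pmb\theta)\). A minimal sketch of that sum, assuming the model probabilities are already available as a list (`loglik_from_frequencies` is a hypothetical name, not the package function):

```python
from math import log


def loglik_from_frequencies(pmf_values, f):
    """Sum of f_r * log(p_r) over the categories r = 1..m."""
    # categories never observed (f_r == 0) contribute nothing and are skipped,
    # which also avoids log(0) when such a category has zero probability
    return sum(f_r * log(p_r) for f_r, p_r in zip(f, pmf_values) if f_r > 0)


# discrete Uniform distribution on m = 5 categories, 10 observations in total
m = 5
uniform_pmf = [1 / m] * m
f = [2, 2, 2, 2, 2]
value = loglik_from_frequencies(uniform_pmf, f)
```

With a Uniform distribution the value reduces to \(-n \log m\), which is a convenient sanity check.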
- cubmods.cubsh.mean_delta(m, sh, pi, xi, delta)#
Expected value of a specified CUBSH model, using canonic parametrization \((\pi, \delta)\).
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
pi (float) – uncertainty parameter \(\pi\)
delta (float) – shelter choice parameter \(\delta\)
xi (float) – feeling parameter \(\xi\)
- Returns:
the expected value of the model
- Return type:
float
- cubmods.cubsh.mle(sample, m, sh, df, formula, maxiter=500, tol=0.0001, ass_pars=None)#
Main function for CUB models with a shelter effect
Estimate and validate a CUB model with a shelter effect.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
df (DataFrame) – original DataFrame
formula (str) – the formula used
ass_pars (dictionary, optional) – dictionary of hypothesized parameters, defaults to None
maxiter (int) – maximum number of iterations allowed for running the optimization algorithm
tol (float) – fixed error tolerance for final estimates
- Returns:
an instance of CUBresCUBSH (see the Class for details)
- Return type:
object
- Raise:
Exception if \(m \leq 4\)
- cubmods.cubsh.pi1pi2_to_pidelta(pi1, pi2)#
Compute \((\pi, \delta)\) from \((\pi_1, \pi_2)\)
\(\pi = \dfrac{\pi_1}{\pi_1 + \pi_2}\)
\(\delta = 1 - \pi_1 - \pi_2\)
- Parameters:
pi1 (float) – Mixing coefficient for the shifted Binomial component of the mixture distribution \(\pi_1\)
pi2 (float) – Mixing coefficient for the discrete Uniform component of the mixture distribution \(\pi_2\)
- Returns:
a tuple of \((\pi, \delta)\) the parameters of uncertainty and shelter choice, respectively
- Return type:
tuple
- cubmods.cubsh.pidelta_to_pi1pi2(pi, delta)#
Compute \((\pi_1, \pi_2)\) from \((\pi, \delta)\)
\(\pi_1 = (1 - \delta) \pi\)
\(\pi_2 = (1 - \delta)(1 - \pi)\)
- Parameters:
pi (float) – uncertainty parameter \(\pi\)
delta (float) – shelter choice parameter \(\delta\)
- Returns:
a tuple of \((\pi_1, \pi_2)\) the mixing coefficient of the shifted Binomial and the Uniform components, respectively
- Return type:
tuple
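The two conversions are inverse to each other. The sketch below re-implements the formulas stated above with hypothetical helper names (`to_pi1pi2`, `to_pidelta`) and checks the round trip; it is an illustration, not the package code.

```python
def to_pi1pi2(pi, delta):
    """(pi, delta) -> (pi1, pi2), using pi1 = (1-delta)*pi and pi2 = (1-delta)*(1-pi)."""
    pi1 = (1 - delta) * pi
    pi2 = (1 - delta) * (1 - pi)
    return pi1, pi2


def to_pidelta(pi1, pi2):
    """(pi1, pi2) -> (pi, delta), using pi = pi1/(pi1+pi2) and delta = 1-pi1-pi2."""
    pi = pi1 / (pi1 + pi2)
    delta = 1 - pi1 - pi2
    return pi, delta


pi1, pi2 = to_pi1pi2(pi=0.6, delta=0.25)
pi_back, delta_back = to_pidelta(pi1, pi2)
```

Note that the backward conversion divides by \(\pi_1 + \pi_2 = 1 - \delta\), so it is undefined when \(\delta = 1\).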
- cubmods.cubsh.plot_simplex(pi1pi2list, ax=None, fname=None)#
Plot simplex of parameters of a CUBSH model.
Note
see the reference Iannario, 2012 for details
Warning
this function still needs several fixes
- Parameters:
pi1pi2list (list) – list of [pi1, pi2] parameters
ax – matplotlib axis
fname (str) – if provided, save the plot to fname, defaults to None
- cubmods.cubsh.pmf(m, sh, pi1, pi2, xi)#
Probability distribution of a specified CUBSH model, using alternative parametrization \((\pi_1, \pi_2)\).
\(\Pr(R = r | \pmb\theta),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
pi1 (float) – Mixing coefficient for the shifted Binomial component of the mixture distribution \(\pi_1\)
pi2 (float) – Mixing coefficient for the discrete Uniform component of the mixture distribution \(\pi_2\)
xi (float) – feeling parameter \(\xi\)
- Returns:
the probability distribution
- Return type:
array
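As a rough illustration of the alternative parametrization, the sketch below assumes the usual formulation of a CUB model with shelter effect: a mixture of a shifted Binomial with weight \(\pi_1\), a discrete Uniform with weight \(\pi_2\), and a degenerate distribution at the shelter category with weight \(1 - \pi_1 - \pi_2\). The function name `cubsh_pmf_sketch` is hypothetical and the code is not taken from the package.

```python
from math import comb


def cubsh_pmf_sketch(m, sh, pi1, pi2, xi):
    """Return [Pr(R=1), ..., Pr(R=m)] under the assumed three-component mixture."""
    probs = []
    for r in range(1, m + 1):
        # shifted Binomial component with feeling parameter xi
        shifted_binomial = comb(m - 1, r - 1) * (1 - xi) ** (r - 1) * xi ** (m - r)
        uniform = 1 / m
        # degenerate component: all mass on the shelter category sh
        shelter = 1.0 if r == sh else 0.0
        probs.append(pi1 * shifted_binomial + pi2 * uniform
                     + (1 - pi1 - pi2) * shelter)
    return probs


p = cubsh_pmf_sketch(m=7, sh=1, pi1=0.45, pi2=0.30, xi=0.4)
```

Each component sums to one over the \(m\) categories, so the mixture does too; the shelter category receives the extra mass \(1 - \pi_1 - \pi_2\).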
- cubmods.cubsh.pmf_delta(m, sh, pi, xi, delta)#
Probability distribution of a specified CUBSH model, using canonic parametrization \((\pi, \delta)\).
\(\Pr(R = r | \pmb\theta),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
pi (float) – uncertainty parameter \(\pi\)
delta (float) – shelter choice parameter \(\delta\)
xi (float) – feeling parameter \(\xi\)
- Returns:
the probability distribution
- Return type:
array
- cubmods.cubsh.prob(m, sh, pi1, pi2, xi, r)#
Probability \(\Pr(R = r | \pmb\theta)\) of a CUBSH model without covariates, using alternative parametrization \((\pi_1, \pi_2)\).
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
pi1 (float) – Mixing coefficient for the shifted Binomial component of the mixture distribution \(\pi_1\)
pi2 (float) – Mixing coefficient for the discrete Uniform component of the mixture distribution \(\pi_2\)
xi (float) – feeling parameter \(\xi\)
r (int) – ordinal response
- Returns:
the probability \(\Pr(R = r | \pmb\theta)\)
- Return type:
float
- cubmods.cubsh.proba_delta(m, sh, pi, xi, delta, r)#
Probability \(\Pr(R = r | \pmb\theta)\) of a CUBSH model without covariates, using canonic parametrization \((\pi, \delta)\).
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
pi (float) – uncertainty parameter \(\pi\)
delta (float) – shelter choice parameter \(\delta\)
xi (float) – feeling parameter \(\xi\)
r (int) – ordinal response
- Returns:
the probability \(\Pr(R = r | \pmb\theta)\)
- Return type:
float
- cubmods.cubsh.std_delta(m, pi, xi, delta)#
Standard deviation of a specified CUBSH model, using canonic parametrization \((\pi, \delta)\).
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
delta (float) – shelter choice parameter \(\delta\)
xi (float) – feeling parameter \(\xi\)
- Returns:
the standard deviation of the model
- Return type:
float
- cubmods.cubsh.var_delta(m, pi, xi, delta)#
Variance of a specified CUBSH model, using canonic parametrization \((\pi, \delta)\).
- Parameters:
m (int) – number of ordinal categories
pi (float) – uncertainty parameter \(\pi\)
delta (float) – shelter choice parameter \(\delta\)
xi (float) – feeling parameter \(\xi\)
- Returns:
the variance of the model
- Return type:
float
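Expected value, variance and standard deviation of any ordinal model can be cross-checked numerically from its probability distribution. The sketch below computes them by direct summation over the categories \(1 \ldots m\); `ordinal_moments` is a hypothetical helper, and the package functions may use closed-form expressions instead.

```python
from math import sqrt


def ordinal_moments(pmf_values):
    """Mean, variance and standard deviation of a distribution on 1..m."""
    mean = sum(r * p for r, p in enumerate(pmf_values, start=1))
    var = sum((r - mean) ** 2 * p for r, p in enumerate(pmf_values, start=1))
    return mean, var, sqrt(var)


# discrete Uniform on m = 5 categories: mean 3, variance (m^2 - 1)/12 = 2
mean, var, std = ordinal_moments([0.2] * 5)
```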
- cubmods.cubsh.varcov(m, sh, pi1, pi2, xi, n)#
Variance-covariance matrix for CUB models with shelter effect, using alternative parametrization \((\pi_1, \pi_2)\).
Compute the variance-covariance matrix of parameter estimates of a CUB model with shelter effect.
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
pi1 (float) – Mixing coefficient for the shifted Binomial component of the mixture distribution \(\pi_1\)
pi2 (float) – Mixing coefficient for the discrete Uniform component of the mixture distribution \(\pi_2\)
xi (float) – feeling parameter \(\xi\)
n (int) – number of ordinal responses
- Returns:
the variance-covariance matrix
- Return type:
numpy ndarray
- cubmods.cubsh.varcov_pxd(m, sh, pi, xi, de, n)#
Variance-covariance matrix for CUB models with shelter effect, using canonic parametrization \((\pi, \delta)\).
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
pi (float) – uncertainty parameter \(\pi\)
de (float) – shelter choice parameter \(\delta\)
xi (float) – feeling parameter \(\xi\)
n (int) – number of ordinal responses
- Returns:
the variance-covariance matrix
- Return type:
numpy ndarray
cubmods.cubsh_ywx module#
CUB models in Python. Module for CUBSH (Combination of Uniform and Binomial with Shelter Effect) with covariates.
Description:#
This module contains methods and classes for CUBSH_YWX model family.
Manual, Examples and References:#
List of TODOs:#
…
Credits#
- Author:
Massimo Pierini
- Date:
2023-24
- Credits:
Domenico Piccolo, Rosaria Simone
- Contacts:
Classes and Functions#
- class cubmods.cubsh_ywx.CUBresCUBSHYWX(model, df, formula, m, n, sample, f, theoric, diss, est_names, estimates, e_types, varmat, stderrs, pval, wald, loglike, muloglik, loglikuni, AIC, BIC, seconds, time_exe, logliksat=None, dev=None, logliksatcov=None, niter=None, maxiter=None, tol=None, sh=None, rho=None, ass_pars=None)#
Bases: CUBres
Object returned by .mle() function. See here the Base for details.
Methods
as_dataframe() – DataFrame of estimated parameters
as_txt() – Print the summary.
plot([saveas, figsize]) – Main function to plot an object of the Class.
plot_ordinal([figsize, ax, kind, saveas]) – Plots relative average frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
save(fname) – Save a CUBresult object to file named fname + .cub.fit
summary() – Call as_txt()
- plot(saveas=None, figsize=(7, 5))#
Main function to plot an object of the Class.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- plot_ordinal(figsize=(7, 5), ax=None, kind='bar', saveas=None)#
Plots relative average frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (useful only if ax is not None)
kind (str) – choose a barplot ('bar', default) or a scatterplot ('scatter')
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- cubmods.cubsh_ywx.Q1(param, dati1, p)#
Auxiliary function for the log-likelihood estimation of GeCUB models.
Defines the negative of one of the two scalar functions that are maximized when running the E-M algorithm for GeCUB models with covariates for feeling, uncertainty and shelter effect.
- Parameters:
param (array) – array of initial estimates of parameters for the uncertainty component
dati1 (ndarray or dataframe) – auxiliary matrix
p (int) – number of covariates for the uncertainty component
- cubmods.cubsh_ywx.Q2(param, dati2, m)#
Auxiliary function for the log-likelihood estimation of GeCUB models.
Defines the negative of one of the two scalar functions that are maximized when running the E-M algorithm for GeCUB models with covariates for feeling, uncertainty and shelter effect.
- Parameters:
param (array) – array of initial estimates of parameters for the feeling component
dati2 (ndarray or dataframe) – auxiliary matrix
m (int) – number of ordinal categories
- cubmods.cubsh_ywx.draw(m, sh, beta, gamma, omega, Y, W, X, df, formula, seed=None)#
Draw a random sample from a specified CUBSH model with covariates (aka GeCUB model).
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
omega (array) – array \(\pmb \omega\) of parameters for the shelter effect, whose length equals X.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
X (pandas dataframe) – dataframe of covariates for explaining the shelter effect
n (int) – number of ordinal responses to be drawn
df (DataFrame) – original DataFrame
formula (str) – the formula used
seed (int, optional) – the seed to ensure reproducibility, defaults to None
- Returns:
an instance of
CUBsample
(see here) containing ordinal responses drawn from the specified model
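The parameter arrays carry the intercept as first entry, followed by one coefficient per covariate column. The sketch below shows how such an array could map covariate rows to subject-level parameters, assuming a logistic link, which is the customary choice for CUB-type models but is an assumption here rather than something stated in this reference. The helper name `logistic_link` is hypothetical.

```python
from math import exp


def logistic_link(coefs, covariate_rows):
    """Map each covariate row to a value in (0, 1).

    coefs[0] is the intercept; coefs[1:] match the covariate columns.
    Assumed link: 1 / (1 + exp(-linear predictor)).
    """
    values = []
    for row in covariate_rows:
        eta = coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))
        values.append(1 / (1 + exp(-eta)))
    return values


# one covariate, two subjects: the array has length 1 + 1
values = logistic_link([0.0, 1.0], [[0.0], [2.0]])
```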
- cubmods.cubsh_ywx.init_theta(m, sample, p, s, W)#
Preliminary estimators for CUBSH models with covariates.
Computes preliminary parameter estimates of a CUBSH model with covariates for given ordinal responses. These preliminary estimators are used within the package code to start the E-M algorithm.
- Parameters:
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
p (int) – number of covariates for the uncertainty component
s (int) – number of covariates for the shelter effect
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
- Returns:
a tuple \((\pmb \beta^{(0)}, \pmb \gamma^{(0)}, \pmb \omega^{(0)})\) of preliminary estimates of the parameter vectors for \(\pi = \pi(\pmb{\beta}),\; \xi=\xi(\pmb{\gamma}),\; \delta=\delta(\pmb{\omega})\), respectively, of a CUBSH model with covariates for all the three parameters. In detail, they have length equal to Y.columns.size+1, W.columns.size+1 and X.columns.size+1, respectively, to account for an intercept term for each component.
- Return type:
tuple of arrays
- cubmods.cubsh_ywx.loglik(m, sample, sh, Y, W, X, beta, gamma, omega)#
Log-likelihood function of a CUBSH model with covariates.
Compute the log-likelihood function of a CUBSH model for ordinal responses, with covariates for explaining all the three parameters (GeCUB model).
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
omega (array) – array \(\pmb \omega\) of parameters for the shelter effect, whose length equals X.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
X (pandas dataframe) – dataframe of covariates for explaining the shelter effect
sample (array of int) – array of ordinal responses
- Returns:
the log-likelihood value
- Return type:
float
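With covariates, each subject has its own distribution, so the log-likelihood is \(\sum_{i=1}^n \log \Pr(R_i=r_i|\pmb\theta;\pmb T_i)\). A minimal sketch, assuming the subject-level distributions are already available as an \(n \times m\) list of rows (`loglik_from_subjects` is a hypothetical name and the numbers are made up):

```python
from math import log


def loglik_from_subjects(prob_matrix, sample):
    """Sum over subjects of log Pr(R_i = r_i), with responses coded 1..m."""
    # row[r - 1] picks the probability of the category actually observed
    return sum(log(row[r - 1]) for row, r in zip(prob_matrix, sample))


# two subjects, m = 3 categories
prob_matrix = [[0.5, 0.3, 0.2],
               [0.1, 0.6, 0.3]]
sample = [1, 2]
value = loglik_from_subjects(prob_matrix, sample)
```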
- cubmods.cubsh_ywx.mle(m, sample, sh, Y, W, X, df, formula, ass_pars=None, maxiter=500, tol=0.0001)#
Main function for CUBSH models with covariates for all the components
Function to estimate and validate a CUBSH model for given ordinal responses, with covariates for explaining all the components and the shelter effect.
- Parameters:
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
sh (int) – Category corresponding to the shelter choice \([1,m]\)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
X (pandas dataframe) – dataframe of covariates for explaining the shelter effect
df (DataFrame) – original DataFrame
formula (str) – the formula used
ass_pars (dictionary, optional) – dictionary of hypothesized parameters, defaults to None
maxiter (int) – maximum number of iterations allowed for running the optimization algorithm
tol (float) – fixed error tolerance for final estimates
- Returns:
an instance of CUBresCUBSHYWX (see the Class for details)
- Return type:
object
- cubmods.cubsh_ywx.pmf(m, sh, beta, gamma, omega, Y, W, X)#
Average probability distribution of a specified CUBSH model with covariates (aka GeCUB model).
\(\frac{1}{n} \sum_{i=1}^n \Pr(R_i=r|\pmb\theta; \pmb T_i),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
omega (array) – array \(\pmb \omega\) of parameters for the shelter effect, whose length equals X.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
X (pandas dataframe) – dataframe of covariates for explaining the shelter effect
- Returns:
the probability distribution
- Return type:
array
- cubmods.cubsh_ywx.pmfi(m, sh, beta, gamma, omega, Y, W, X)#
Probability distribution for each subject of a specified CUBSH model with covariates (aka GeCUB model).
Auxiliary function of .draw().
\(\Pr(R_i=r|\pmb\theta; \pmb T_i),\; i=1 \ldots n ,\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
omega (array) – array \(\pmb \omega\) of parameters for the shelter effect, whose length equals X.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
X (pandas dataframe) – dataframe of covariates for explaining the shelter effect
- Returns:
the matrix of the probability distribution of dimension \(n \times m\)
- Return type:
numpy ndarray
- cubmods.cubsh_ywx.prob(m, sample, sh, Y, W, X, beta, gamma, omega)#
Probability distribution of a CUBSH model with covariates.
Compute the probability distribution of a CUBSH model with covariates.
\(\Pr(R_i=r_i|\pmb\theta;\pmb T_i),\; i=1 \ldots n\)
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
omega (array) – array \(\pmb \omega\) of parameters for the shelter effect, whose length equals X.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
X (pandas dataframe) – dataframe of covariates for explaining the shelter effect
sample (array of int) – array of ordinal responses
- Returns:
the array of the probability distribution.
- Return type:
numpy array
- cubmods.cubsh_ywx.varcov(sample, m, sh, Y, W, X, beta, gamma, omega)#
Variance-covariance matrix of a CUBSH model with covariates
Compute the variance-covariance matrix of parameter estimates of a CUBSH model with covariates.
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
beta (array of float) – array \(\pmb \beta\) of parameters for the uncertainty component, whose length equals Y.columns.size+1 to include an intercept term in the model (first entry)
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals W.columns.size+1 to include an intercept term in the model (first entry)
omega (array) – array \(\pmb \omega\) of parameters for the shelter effect, whose length equals X.columns.size+1 to include an intercept term in the model (first entry)
Y (pandas dataframe) – dataframe of covariates for explaining the uncertainty component
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
X (pandas dataframe) – dataframe of covariates for explaining the shelter effect
sample (array of int) – array of ordinal responses
- Returns:
the variance-covariance matrix of the model
- Return type:
numpy ndarray
cubmods.cush module#
CUB models in Python. Module for CUSH (Combination of Uniform and Shelter effect).
Description:#
This module contains methods and classes for CUSH model family.
Manual, Examples and References:#
List of TODOs:#
TODO: check and fix gini & laakso
Credits#
- Author:
Massimo Pierini
- Date:
2023-24
- Credits:
Domenico Piccolo, Rosaria Simone
- Contacts:
Classes and Functions#
- class cubmods.cush.CUBresCUSH(model, df, formula, m, n, sample, f, theoric, diss, est_names, estimates, e_types, varmat, stderrs, pval, wald, loglike, muloglik, loglikuni, AIC, BIC, seconds, time_exe, logliksat=None, dev=None, logliksatcov=None, niter=None, maxiter=None, tol=None, sh=None, rho=None, ass_pars=None)#
Bases: CUBres
Object returned by .mle() function. See here the Base for details.
Methods
as_dataframe() – DataFrame of estimated parameters
as_txt() – Print the summary.
plot([ci, saveas, figsize]) – Main function to plot an object of the Class.
plot_estim([ci, ax, magnified, figsize, saveas]) – Plots the estimated parameter values in the parameter space and the asymptotic standard error.
plot_ordinal([figsize, kind, ax, saveas]) – Plots relative frequencies of observed sample, estimated probability distribution and, if provided, probability distribution of a known model.
save(fname) – Save a CUBresult object to file named fname + .cub.fit
summary() – Call as_txt()
- plot(ci=0.95, saveas=None, figsize=(7, 8))#
Main function to plot an object of the Class.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure
ci (float) – level \((1-\alpha/2)\) for the standard error
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- plot_estim(ci=0.95, ax=None, magnified=False, figsize=(7, 7), saveas=None)#
Plots the estimated parameter values in the parameter space and the asymptotic standard error.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (used only if ax is None)
ci (float) – level \((1-\alpha/2)\) for the confidence ellipse
magnified (bool) – if False the limits will be the entire parameter space, otherwise let matplotlib choose the limits
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- plot_ordinal(figsize=(7, 7), kind='bar', ax=None, saveas=None)#
Plots relative frequencies of observed sample, estimated probability distribution and, if provided, probability distribution of a known model.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (used only if ax is None)
kind (str) – choose a barplot ('bar', default) or a scatterplot ('scatter')
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- cubmods.cush.LRT(m, fc, n)#
Likelihood Ratio Test between the CUSH model and the null model.
- Parameters:
m (int) – number of ordinal categories
fc (float) – relative frequency of the shelter category
n (int) – number of observations
- Returns:
the value of the LRT
- Return type:
float
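A minimal sketch of how such a statistic can be computed, assuming the usual CUSH parametrization: at the maximum likelihood estimate the fitted probability of the shelter category equals fc, and the remaining mass is spread evenly over the other m-1 categories. The name cush_lrt_sketch is hypothetical, and whether cubmods.cush.LRT uses exactly this expression is an assumption.

```python
import math

def cush_lrt_sketch(m, fc, n):
    """Hypothetical helper: 2 * (loglik at the MLE - loglik of the uniform model).

    Assumes 0 < fc < 1 so that both logarithms are defined.
    """
    # Log-likelihood at the MLE: shelter category fitted at fc,
    # every other category fitted at (1 - fc) / (m - 1).
    loglik_cush = n * (fc * math.log(fc)
                       + (1.0 - fc) * math.log((1.0 - fc) / (m - 1)))
    # Null model: discrete uniform over the m categories.
    loglik_uniform = -n * math.log(m)
    return 2.0 * (loglik_cush - loglik_uniform)

lrt_value = cush_lrt_sketch(m=7, fc=0.4, n=200)
print(lrt_value)
```

When fc equals 1/m the two models coincide and the statistic is zero, which is a quick sanity check for the formula.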
- cubmods.cush.draw(m, sh, delta, n, df, formula, seed=None)#
Draw a random sample from a specified CUSH model.
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
delta (float) – shelter choice parameter \(\delta\)
n (int) – number of ordinal responses
df (DataFrame) – original DataFrame
formula (str) – the formula used
seed (int, optional) – the seed to ensure reproducibility, defaults to None
- Returns:
an instance of CUBsample containing ordinal responses drawn from the specified model
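As an illustration of what drawing from a CUSH model involves, the sketch below samples n ordinal responses with the standard library only. It assumes the mixture Pr(R=r) = δ·[r=sh] + (1-δ)/m; the name cush_draw_sketch is hypothetical, and unlike the library function it returns a plain list rather than a CUBsample instance.

```python
import random

def cush_draw_sketch(m, sh, delta, n, seed=None):
    """Hypothetical helper: draw n responses in 1..m from a CUSH model."""
    rng = random.Random(seed)  # local generator, so the seed gives reproducibility
    uniform = (1.0 - delta) / m
    # Category weights: uniform share everywhere, plus delta on the shelter.
    weights = [uniform + (delta if r == sh else 0.0) for r in range(1, m + 1)]
    return rng.choices(range(1, m + 1), weights=weights, k=n)

drawn = cush_draw_sketch(m=7, sh=1, delta=0.15, n=500, seed=42)
print(drawn[:10])
```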
- cubmods.cush.gini(delta)#
The Gini index of a specified CUSH model.
- Parameters:
delta (float) – shelter choice parameter \(\delta\)
- Returns:
the Gini index of the model
- Return type:
float
- cubmods.cush.laakso(m, delta)#
The Laakso index of a specified CUSH model.
- Parameters:
m (int) – number of ordinal categories
delta (float) – shelter choice parameter \(\delta\)
- Returns:
the Laakso index of the model
- Return type:
float
- cubmods.cush.loglik(sample, m, sh, delta)#
Log-likelihood function for a CUSH model without covariates.
Compute the log-likelihood function for a CUSH model without covariates for the given ordinal responses.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
delta (float) – shelter choice parameter \(\delta\)
- Returns:
the log-likelihood value
- Return type:
float
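The computation can be sketched as follows, assuming the usual CUSH mixture in which the shelter category has probability δ + (1-δ)/m and every other category has probability (1-δ)/m. The name cush_loglik_sketch is hypothetical and the code is not the library's implementation.

```python
import math

def cush_loglik_sketch(sample, m, sh, delta):
    """Hypothetical helper: log-likelihood of a CUSH model without covariates."""
    p_shelter = delta + (1.0 - delta) / m   # probability of the shelter category
    p_other = (1.0 - delta) / m             # probability of any other category
    n_shelter = sum(1 for r in sample if r == sh)
    n_other = len(sample) - n_shelter
    # Only two distinct probabilities occur, so the sum collapses to two terms.
    return n_shelter * math.log(p_shelter) + n_other * math.log(p_other)

responses = [1, 1, 3, 7, 1, 4, 2, 1]
ll = cush_loglik_sketch(responses, m=7, sh=1, delta=0.3)
print(ll)
```

With delta equal to 0 the model reduces to the discrete uniform and the value becomes -n·log(m).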
- cubmods.cush.mean(m, sh, delta)#
Expected value of a specified CUSH model.
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
delta (float) – shelter choice parameter \(\delta\)
- Returns:
the expected value of the model
- Return type:
float
- cubmods.cush.mle(sample, m, sh, df, formula, ass_pars=None, maxiter=None, tol=None)#
Main function for CUSH model without covariates.
Estimate and validate a CUSH model for given ordinal responses, without covariates.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
df (DataFrame) – original DataFrame
formula (str) – the formula used
ass_pars (dictionary, optional) – dictionary of hypothesized parameters, defaults to None
maxiter (None) – defaults to None; ensures compatibility with gem.from_formula()
tol (None) – defaults to None; ensures compatibility with gem.from_formula()
- Returns:
an instance of CUBresCUSH (see the Class for details)
- Return type:
object
- cubmods.cush.pmf(m, sh, delta)#
Probability distribution of a specified CUSH model.
\(\Pr(R = r | \pmb\theta),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
delta (float) – shelter choice parameter \(\delta\)
- Returns:
the probability distribution
- Return type:
array
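For reference, the CUSH distribution is usually written as \(\Pr(R=r) = \delta\,D_r^{(sh)} + (1-\delta)/m\), where \(D_r^{(sh)}\) equals 1 when r is the shelter category and 0 otherwise. The sketch below assumes that parametrization; cush_pmf_sketch is a hypothetical helper that returns a plain list rather than an array.

```python
def cush_pmf_sketch(m, sh, delta):
    """Hypothetical helper: CUSH probabilities for r = 1..m as a list."""
    if not 1 <= sh <= m:
        raise ValueError("sh must lie in [1, m]")
    uniform = (1.0 - delta) / m  # uniform share given to every category
    # The shelter category receives the extra mass delta.
    return [uniform + (delta if r == sh else 0.0) for r in range(1, m + 1)]

probs = cush_pmf_sketch(m=7, sh=1, delta=0.15)
print(probs)
```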
- cubmods.cush.var(m, sh, delta)#
Variance of a specified CUSH model.
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
delta (float) – shelter choice parameter \(\delta\)
- Returns:
the variance of the model
- Return type:
float
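Both moments can be obtained directly from the probability distribution. The sketch below assumes the mixture Pr(R=r) = δ·[r=sh] + (1-δ)/m and uses a hypothetical helper, cush_moments_sketch; it is not the library's code.

```python
def cush_moments_sketch(m, sh, delta):
    """Hypothetical helper: (mean, variance) of a CUSH model."""
    uniform = (1.0 - delta) / m
    probs = [uniform + (delta if r == sh else 0.0) for r in range(1, m + 1)]
    categories = range(1, m + 1)
    mean = sum(r * p for r, p in zip(categories, probs))
    second_moment = sum(r * r * p for r, p in zip(categories, probs))
    # Variance as E[R^2] - (E[R])^2.
    return mean, second_moment - mean ** 2

cush_mean, cush_var = cush_moments_sketch(m=7, sh=1, delta=0.15)
print(cush_mean, cush_var)
```

Under this parametrization the mean has the closed form δ·sh + (1-δ)(m+1)/2, and for δ = 0 the variance is that of the discrete uniform, (m²-1)/12.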
cubmods.cush2 module#
CUB models in Python. Module for CUSH2 (Combination of Uniform and 2 Shelter Choices).
Description:#
This module contains methods and classes for CUSH2 model family.
Manual, Examples and References:#
List of TODOs:#
…
Credits#
- Author:
Massimo Pierini
- Date:
2023-24
- Credits:
Domenico Piccolo, Rosaria Simone
- Contacts:
Classes and Functions#
- class cubmods.cush2.CUBresCUSH2(model, df, formula, m, n, sample, f, theoric, diss, est_names, estimates, e_types, varmat, stderrs, pval, wald, loglike, muloglik, loglikuni, AIC, BIC, seconds, time_exe, logliksat=None, dev=None, logliksatcov=None, niter=None, maxiter=None, tol=None, sh=None, rho=None, ass_pars=None)#
Bases: CUBres
Object returned by the .mle() function. See the base class CUBres for details.
Methods
as_dataframe() – DataFrame of estimated parameters
as_txt() – Print the summary.
plot([ci, saveas, figsize]) – Main function to plot an object of the Class.
plot_ordinal([figsize, ax, kind, saveas]) – Plots relative frequencies of observed sample, estimated probability distribution and, if provided, probability distribution of a known model.
plot_par_space([figsize, ax, ci, saveas]) – Plots the estimated parameter values in the parameter space and the asymptotic standard error.
save(fname) – Save a CUBresult object to a file named fname + .cub.fit
summary() – Call as_txt()
- plot(ci=0.95, saveas=None, figsize=(7, 11))#
Main function to plot an object of the Class.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure
ci (float) – level \((1-\alpha/2)\) for the standard error
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- plot_ordinal(figsize=(7, 5), ax=None, kind='bar', saveas=None)#
Plots relative frequencies of observed sample, estimated probability distribution and, if provided, probability distribution of a known model.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (used only if ax is None)
kind (str) – choose a barplot ('bar', default) or a scatterplot ('scatter')
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- plot_par_space(figsize=(7, 5), ax=None, ci=0.95, saveas=None)#
Plots the estimated parameter values in the parameter space and the asymptotic standard error.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (used only if ax is None)
ci (float) – level \((1-\alpha/2)\) for the confidence ellipse
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- cubmods.cush2.draw(m, sh1, sh2, df, formula, delta1, delta2, n, seed=None)#
Draw a random sample from a specified CUSH2 model.
- Parameters:
m (int) – number of ordinal categories
sh1 (int) – Category corresponding to the 1st shelter choice \([1,m]\)
sh2 (int) – Category corresponding to the 2nd shelter choice \([1,m]\)
delta1 (float) – 1st shelter choice parameter \(\delta_1\)
delta2 (float) – 2nd shelter choice parameter \(\delta_2\)
n (int) – number of ordinal responses
df (DataFrame) – original DataFrame
formula (str) – the formula used
seed (int, optional) – the seed to ensure reproducibility, defaults to None
- Returns:
an instance of CUBsample containing ordinal responses drawn from the specified model
- cubmods.cush2.loglik(sample, m, c1, c2)#
Log-likelihood function for a CUSH2 model without covariates.
Compute the log-likelihood function for a CUSH2 model without covariates for the given ordinal responses.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
c1 (int) – Category corresponding to the 1st shelter choice \([1,m]\)
c2 (int) – Category corresponding to the 2nd shelter choice \([1,m]\)
- Returns:
the log-likelihood value
- Return type:
float
- cubmods.cush2.mle(sample, m, c1, c2, df, formula, ass_pars=None, maxiter=None, tol=None)#
Main function for CUSH2 models without covariates.
Estimate and validate a CUSH2 model for ordinal responses, without covariates.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
c1 (int) – Category corresponding to the 1st shelter choice \([1,m]\)
c2 (int) – Category corresponding to the 2nd shelter choice \([1,m]\)
df (DataFrame) – original DataFrame
formula (str) – the formula used
ass_pars (dictionary, optional) – dictionary of hypothesized parameters, defaults to None
maxiter (None) – defaults to None; ensures compatibility with gem.from_formula()
tol (None) – defaults to None; ensures compatibility with gem.from_formula()
- Returns:
an instance of CUBresCUSH2 (see the Class for details)
- Return type:
object
- cubmods.cush2.pmf(m, c1, c2, d1, d2)#
Probability distribution of a specified CUSH2 model.
\(\Pr(R = r | \pmb\theta),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
c1 (int) – Category corresponding to the 1st shelter choice \([1,m]\)
c2 (int) – Category corresponding to the 2nd shelter choice \([1,m]\)
d1 (float) – 1st shelter choice parameter \(\delta_1\)
d2 (float) – 2nd shelter choice parameter \(\delta_2\)
- Returns:
the probability distribution
- Return type:
array
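A minimal sketch of this distribution, assuming the common CUSH2 parametrization \(\Pr(R=r) = \delta_1 D_r^{(c_1)} + \delta_2 D_r^{(c_2)} + (1-\delta_1-\delta_2)/m\) with two distinct shelter categories. The name cush2_pmf_sketch and the input checks are illustrative additions, not part of the library.

```python
def cush2_pmf_sketch(m, c1, c2, d1, d2):
    """Hypothetical helper: CUSH2 probabilities for r = 1..m as a list."""
    if c1 == c2:
        raise ValueError("the two shelter categories must differ")
    if d1 < 0 or d2 < 0 or d1 + d2 > 1:
        raise ValueError("d1 and d2 must be non-negative with d1 + d2 <= 1")
    uniform = (1.0 - d1 - d2) / m   # mass left to the uniform component
    probs = [uniform] * m
    probs[c1 - 1] += d1             # extra mass on the 1st shelter category
    probs[c2 - 1] += d2             # extra mass on the 2nd shelter category
    return probs

probs2 = cush2_pmf_sketch(m=7, c1=1, c2=7, d1=0.1, d2=0.2)
print(probs2)
```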
- cubmods.cush2.varcov(m, n, d1, d2, fc1, fc2)#
Compute the variance-covariance matrix of parameter estimates of a CUSH2 model without covariates.
- Parameters:
m (int) – number of ordinal categories
n (int) – number of ordinal responses
d1 (float) – 1st shelter choice parameter \(\delta_1\)
d2 (float) – 2nd shelter choice parameter \(\delta_2\)
fc1 (float) – relative frequency of 1st shelter choice
fc2 (float) – relative frequency of 2nd shelter choice
- Returns:
the variance-covariance matrix
- Return type:
numpy ndarray
cubmods.cush2_x0 module#
CUB models in Python. Module for CUSH2 (Combination of Uniform and 2 Shelter Choices) with covariates for the 1st shelter choice.
Description:#
This module contains methods and classes for CUSH2 model family.
Manual, Examples and References:#
List of TODOs:#
…
Credits#
- Author:
Massimo Pierini
- Date:
2023-24
- Credits:
Domenico Piccolo, Rosaria Simone
- Contacts:
Classes and Functions#
- class cubmods.cush2_x0.CUBresCUSH2X0(model, df, formula, m, n, sample, f, theoric, diss, est_names, estimates, e_types, varmat, stderrs, pval, wald, loglike, muloglik, loglikuni, AIC, BIC, seconds, time_exe, logliksat=None, dev=None, logliksatcov=None, niter=None, maxiter=None, tol=None, sh=None, rho=None, ass_pars=None)#
Bases: CUBres
Object returned by the .mle() function. See the base class CUBres for details.
Methods
as_dataframe() – DataFrame of estimated parameters
as_txt() – Print the summary.
plot([saveas, figsize]) – Main function to plot an object of the Class.
plot_ordinal([figsize, ax, kind, saveas]) – Plots relative average frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
save(fname) – Save a CUBresult object to a file named fname + .cub.fit
summary() – Call as_txt()
- plot(saveas=None, figsize=(7, 5))#
Main function to plot an object of the Class.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- plot_ordinal(figsize=(7, 5), ax=None, kind='bar', saveas=None)#
Plots relative average frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (used only if ax is None)
kind (str) – choose a barplot ('bar', default) or a scatterplot ('scatter')
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- cubmods.cush2_x0.draw(m, sh1, sh2, omega1, delta2, X1, df, formula, seed=None)#
Draw a random sample from a specified CUSH2 model, with covariates for the 1st shelter choice only.
- Parameters:
m (int) – number of ordinal categories
sh1 (int) – Category corresponding to the 1st shelter choice \([1,m]\)
sh2 (int) – Category corresponding to the 2nd shelter choice \([1,m]\)
omega1 (array) – array \(\pmb \omega_1\) of parameters for the 1st shelter effect, whose length equals X1.columns.size+1 to include an intercept term in the model (first entry)
delta2 (float) – 2nd shelter choice parameter \(\delta_2\)
X1 (DataFrame) – dataframe of covariates for explaining the 1st shelter effect
df (DataFrame) – original DataFrame
formula (str) – the formula used
seed (int, optional) – the seed to ensure reproducibility, defaults to None
- Returns:
an instance of CUBsample containing ordinal responses drawn from the specified model
- cubmods.cush2_x0.effe(pars, sample, m, sh1, sh2, X1)#
Auxiliary function for the log-likelihood estimation of CUSH2 models.
Compute the opposite of the scalar function that is maximized when running the E-M algorithm for CUSH2 models with covariates for the 1st shelter choice.
- Parameters:
pars (array) – array of parameters
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
sh1 (int) – Category corresponding to the 1st shelter choice \([1,m]\)
sh2 (int) – Category corresponding to the 2nd shelter choice \([1,m]\)
X1 (DataFrame) – dataframe of covariates for explaining the 1st shelter effect
- cubmods.cush2_x0.loglik(sample, m, sh1, sh2, omega1, delta2, X1)#
Log-likelihood function for a CUSH2 model with covariates for the 1st shelter choice only.
Compute the log-likelihood function for a CUSH2 model with covariates for the 1st shelter choice only, for the given ordinal responses.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
sh1 (int) – Category corresponding to the 1st shelter choice \([1,m]\)
sh2 (int) – Category corresponding to the 2nd shelter choice \([1,m]\)
omega1 (array) – array \(\pmb \omega_1\) of parameters for the 1st shelter effect, whose length equals X1.columns.size+1 to include an intercept term in the model (first entry)
delta2 (float) – 2nd shelter choice parameter \(\delta_2\)
X1 (DataFrame) – dataframe of covariates for explaining the 1st shelter effect
- Returns:
the log-likelihood value
- Return type:
float
- cubmods.cush2_x0.mle(sample, m, sh1, sh2, X1, df, formula, ass_pars=None)#
Main function for CUSH2 models with covariates for the 1st shelter choice only.
Estimate and validate a CUSH2 model for given ordinal responses, with covariates for the 1st shelter choice only.
- Parameters:
sample (array of int) – array of ordinal responses
sh1 (int) – Category corresponding to the 1st shelter choice \([1,m]\)
sh2 (int) – Category corresponding to the 2nd shelter choice \([1,m]\)
X1 (DataFrame) – dataframe of covariates for explaining the 1st shelter effect
df (DataFrame) – original DataFrame
formula (str) – the formula used
ass_pars (dictionary, optional) – dictionary of hypothesized parameters, defaults to None
- Returns:
an instance of CUBresCUSH2X0 (see the Class for details)
- Return type:
object
- cubmods.cush2_x0.pmf(m, sh1, sh2, omega1, delta2, X1)#
Average probability distribution of a specified CUSH2 model with covariates for the 1st shelter choice.
\(\frac{1}{n} \sum_{i=1}^n \Pr(R_i=r|\pmb\theta; \pmb T_i),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
sh1 (int) – Category corresponding to the 1st shelter choice \([1,m]\)
sh2 (int) – Category corresponding to the 2nd shelter choice \([1,m]\)
omega1 (array) – array \(\pmb \omega_1\) of parameters for the 1st shelter effect, whose length equals X1.columns.size+1 to include an intercept term in the model (first entry)
delta2 (float) – 2nd shelter choice parameter \(\delta_2\)
X1 (DataFrame) – dataframe of covariates for explaining the 1st shelter effect
- Returns:
the average probability distribution
- Return type:
array
- cubmods.cush2_x0.pmfi(m, sh1, sh2, omega1, delta2, X1)#
Probability distribution for each subject of a specified CUSH2 model with covariates for the first shelter choice only.
Auxiliary function of .draw().
\(\Pr(R_i=r|\pmb\theta; \pmb T_i),\; i=1 \ldots n ,\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
sh1 (int) – Category corresponding to the 1st shelter choice \([1,m]\)
sh2 (int) – Category corresponding to the 2nd shelter choice \([1,m]\)
omega1 (array) – array \(\pmb \omega_1\) of parameters for the 1st shelter effect, whose length equals X1.columns.size+1 to include an intercept term in the model (first entry)
delta2 (float) – 2nd shelter choice parameter \(\delta_2\)
X1 (DataFrame) – dataframe of covariates for explaining the 1st shelter effect
- Returns:
the matrix of the probability distribution of dimension \(n \times m\)
- Return type:
numpy ndarray
cubmods.cush2_xx module#
CUB models in Python. Module for CUSH2 (Combination of Uniform and 2 Shelter Choices) with covariates.
Description:#
This module contains methods and classes for CUSH2 model family with covariates for both shelter choices.
Manual, Examples and References:#
List of TODOs:#
…
Credits#
- Author:
Massimo Pierini
- Date:
2023-24
- Credits:
Domenico Piccolo, Rosaria Simone
- Contacts:
Classes and Functions#
- class cubmods.cush2_xx.CUBresCUSH2XX(model, df, formula, m, n, sample, f, theoric, diss, est_names, estimates, e_types, varmat, stderrs, pval, wald, loglike, muloglik, loglikuni, AIC, BIC, seconds, time_exe, logliksat=None, dev=None, logliksatcov=None, niter=None, maxiter=None, tol=None, sh=None, rho=None, ass_pars=None)#
Bases: CUBres
Object returned by the .mle() function. See the base class CUBres for details.
Methods
as_dataframe() – DataFrame of estimated parameters
as_txt() – Print the summary.
plot([saveas, figsize]) – Main function to plot an object of the Class.
plot_ordinal([figsize, ax, kind, saveas]) – Plots relative average frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
save(fname) – Save a CUBresult object to a file named fname + .cub.fit
summary() – Call as_txt()
- plot(saveas=None, figsize=(7, 5))#
Main function to plot an object of the Class.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- plot_ordinal(figsize=(7, 5), ax=None, kind='bar', saveas=None)#
Plots relative average frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (used only if ax is None)
kind (str) – choose a barplot ('bar', default) or a scatterplot ('scatter')
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- cubmods.cush2_xx.draw(m, sh1, sh2, omega1, omega2, X1, X2, df, formula, seed=None)#
Draw a random sample from a specified CUSH2 model, with covariates for both shelter choices.
- Parameters:
m (int) – number of ordinal categories
sh1 (int) – Category corresponding to the 1st shelter choice \([1,m]\)
sh2 (int) – Category corresponding to the 2nd shelter choice \([1,m]\)
omega1 (array) – array \(\pmb \omega_1\) of parameters for the 1st shelter effect, whose length equals X1.columns.size+1 to include an intercept term in the model (first entry)
omega2 (array) – array \(\pmb \omega_2\) of parameters for the 2nd shelter effect, whose length equals X2.columns.size+1 to include an intercept term in the model (first entry)
X1 (DataFrame) – dataframe of covariates for explaining the 1st shelter effect
X2 (DataFrame) – dataframe of covariates for explaining the 2nd shelter effect
df (DataFrame) – original DataFrame
formula (str) – the formula used
seed (int, optional) – the seed to ensure reproducibility, defaults to None
- Returns:
an instance of CUBsample containing ordinal responses drawn from the specified model
- cubmods.cush2_xx.effe(pars, sample, m, sh1, sh2, X1, X2)#
Auxiliary function for the log-likelihood estimation of CUSH2 models.
Compute the opposite of the scalar function that is maximized when running the E-M algorithm for CUSH2 models with covariates for both shelter choices.
- Parameters:
pars (array) – array of parameters
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
sh1 (int) – Category corresponding to the 1st shelter choice \([1,m]\)
sh2 (int) – Category corresponding to the 2nd shelter choice \([1,m]\)
X1 (DataFrame) – dataframe of covariates for explaining the 1st shelter effect
X2 (DataFrame) – dataframe of covariates for explaining the 2nd shelter effect
- cubmods.cush2_xx.loglik(sample, m, sh1, sh2, omega1, omega2, X1, X2)#
Log-likelihood function for a CUSH2 model with covariates for both shelter choices.
Compute the log-likelihood function for a CUSH2 model with covariates for both shelter choices, for the given ordinal responses.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
sh1 (int) – Category corresponding to the 1st shelter choice \([1,m]\)
sh2 (int) – Category corresponding to the 2nd shelter choice \([1,m]\)
omega1 (array) – array \(\pmb \omega_1\) of parameters for the 1st shelter effect, whose length equals X1.columns.size+1 to include an intercept term in the model (first entry)
omega2 (array) – array \(\pmb \omega_2\) of parameters for the 2nd shelter effect, whose length equals X2.columns.size+1 to include an intercept term in the model (first entry)
X1 (DataFrame) – dataframe of covariates for explaining the 1st shelter effect
X2 (DataFrame) – dataframe of covariates for explaining the 2nd shelter effect
- Returns:
the log-likelihood value
- Return type:
float
- cubmods.cush2_xx.mle(sample, m, sh1, sh2, X1, X2, df, formula, ass_pars=None)#
Main function for CUSH2 models with covariates for both shelter choices.
Estimate and validate a CUSH2 model for given ordinal responses, with covariates for both shelter choices.
- Parameters:
sample (array of int) – array of ordinal responses
sh1 (int) – Category corresponding to the 1st shelter choice \([1,m]\)
sh2 (int) – Category corresponding to the 2nd shelter choice \([1,m]\)
X1 (DataFrame) – dataframe of covariates for explaining the 1st shelter effect
X2 (DataFrame) – dataframe of covariates for explaining the 2nd shelter effect
df (DataFrame) – original DataFrame
formula (str) – the formula used
ass_pars (dictionary, optional) – dictionary of hypothesized parameters, defaults to None
- Returns:
an instance of CUBresCUSH2XX (see the Class for details)
- Return type:
object
- cubmods.cush2_xx.pmf(m, sh1, sh2, omega1, omega2, X1, X2)#
Average probability distribution of a specified CUSH2 model with covariates for both shelter choices.
\(\frac{1}{n} \sum_{i=1}^n \Pr(R_i=r|\pmb\theta; \pmb T_i),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
sh1 (int) – Category corresponding to the 1st shelter choice \([1,m]\)
sh2 (int) – Category corresponding to the 2nd shelter choice \([1,m]\)
omega1 (array) – array \(\pmb \omega_1\) of parameters for the 1st shelter effect, whose length equals X1.columns.size+1 to include an intercept term in the model (first entry)
omega2 (array) – array \(\pmb \omega_2\) of parameters for the 2nd shelter effect, whose length equals X2.columns.size+1 to include an intercept term in the model (first entry)
X1 (DataFrame) – dataframe of covariates for explaining the 1st shelter effect
X2 (DataFrame) – dataframe of covariates for explaining the 2nd shelter effect
- Returns:
the average probability distribution
- Return type:
array
- cubmods.cush2_xx.pmfi(m, sh1, sh2, omega1, omega2, X1, X2)#
Probability distribution for each subject of a specified CUSH2 model with covariates for both shelter choices.
Auxiliary function of .draw().
\(\Pr(R_i=r|\pmb\theta; \pmb T_i),\; i=1 \ldots n ,\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
sh1 (int) – Category corresponding to the 1st shelter choice \([1,m]\)
sh2 (int) – Category corresponding to the 2nd shelter choice \([1,m]\)
omega1 (array) – array \(\pmb \omega_1\) of parameters for the 1st shelter effect, whose length equals X1.columns.size+1 to include an intercept term in the model (first entry)
omega2 (array) – array \(\pmb \omega_2\) of parameters for the 2nd shelter effect, whose length equals X2.columns.size+1 to include an intercept term in the model (first entry)
X1 (DataFrame) – dataframe of covariates for explaining the 1st shelter effect
X2 (DataFrame) – dataframe of covariates for explaining the 2nd shelter effect
- Returns:
the matrix of the probability distribution of dimension \(n \times m\)
- Return type:
numpy ndarray
cubmods.cush_x module#
CUB models in Python. Module for CUSH (Combination of Uniform and Shelter effect) with covariates.
Description:#
This module contains methods and classes for CUSH model family.
Manual, Examples and References:#
List of TODOs:#
…
Credits#
- Author:
Massimo Pierini
- Date:
2023-24
- Credits:
Domenico Piccolo, Rosaria Simone
- Contacts:
Classes and Functions#
- class cubmods.cush_x.CUBresCUSHX(model, df, formula, m, n, sample, f, theoric, diss, est_names, estimates, e_types, varmat, stderrs, pval, wald, loglike, muloglik, loglikuni, AIC, BIC, seconds, time_exe, logliksat=None, dev=None, logliksatcov=None, niter=None, maxiter=None, tol=None, sh=None, rho=None, ass_pars=None)#
Bases: CUBres
Object returned by the .mle() function. See the base class CUBres for details.
Methods
as_dataframe() – DataFrame of estimated parameters
as_txt() – Print the summary.
plot([saveas, figsize]) – Main function to plot an object of the Class.
plot_ordinal([figsize, ax, kind, saveas]) – Plots average relative frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
save(fname) – Save a CUBresult object to a file named fname + .cub.fit
summary() – Call as_txt()
- plot(saveas=None, figsize=(7, 5))#
Main function to plot an object of the Class.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- plot_ordinal(figsize=(7, 5), ax=None, kind='bar', saveas=None)#
Plots average relative frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (used only if ax is None)
kind (str) – choose a barplot ('bar', default) or a scatterplot ('scatter')
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- cubmods.cush_x.draw(m, sh, omega, X, df, formula, seed=None)#
Draw a random sample from a specified CUSH model with covariates.
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
omega (array) – array \(\pmb \omega\) of parameters for the shelter effect, whose length equals X.columns.size+1 to include an intercept term in the model (first entry)
X (pandas dataframe) – dataframe of covariates for explaining the shelter effect
df (DataFrame) – original DataFrame
formula (str) – the formula used
seed (int, optional) – the seed to ensure reproducibility, defaults to None
- Returns:
an instance of CUBsample containing ordinal responses drawn from the specified model
- cubmods.cush_x.effe(pars, esterno, m, sh)#
Auxiliary function for the log-likelihood estimation of CUSH models with covariates.
Compute the opposite of the log-likelihood function for CUSH models with covariates to explain the shelter effect. It is passed as an argument to “optim” within the .mle() function, as the function to minimize.
- Parameters:
pars (array) – array of the initial parameter estimates
esterno (ndarray) – matrix binding together the vector of ordinal data and the matrix XX of explanatory variables, whose first column is a column of ones needed to consider an intercept term
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
- cubmods.cush_x.loglik(m, sample, X, omega, sh)#
Log-likelihood function for CUSH models with covariates.
Compute the log-likelihood function for CUSH models with covariates to explain the shelter effect.
- Parameters:
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
sh (int) – Category corresponding to the shelter choice \([1,m]\)
omega (array) – array \(\pmb \omega\) of parameters for the shelter effect, whose length equals X.columns.size+1 to include an intercept term in the model (first entry)
X (pandas dataframe) – dataframe of covariates for explaining the shelter effect
- Returns:
the log-likelihood value
- Return type:
float
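The log-likelihood with covariates can be sketched as below, assuming the logistic link \(\delta_i = 1/(1+e^{-(\omega_0 + \pmb x_i \pmb\omega)})\) that is customary for CUB-type models. The helper names are hypothetical, and X is taken as a list of covariate rows (without the intercept column) instead of a pandas DataFrame, to keep the example dependency-free.

```python
import math

def cush_x_loglik_sketch(m, sample, X, omega, sh):
    """Hypothetical helper: log-likelihood of a CUSH model with covariates."""
    total = 0.0
    for r, x in zip(sample, X):
        # Linear predictor: intercept omega[0] plus covariate effects.
        eta = omega[0] + sum(w * v for w, v in zip(omega[1:], x))
        delta_i = 1.0 / (1.0 + math.exp(-eta))  # subject-specific shelter weight
        p = (1.0 - delta_i) / m + (delta_i if r == sh else 0.0)
        total += math.log(p)
    return total

def cush_x_neg_loglik_sketch(omega, m, sample, X, sh):
    """Opposite of the log-likelihood, the quantity a minimizer would work on."""
    return -cush_x_loglik_sketch(m, sample, X, omega, sh)

obs = [1, 2]
covs = [[0.5], [1.5]]
ll_x = cush_x_loglik_sketch(m=4, sample=obs, X=covs, omega=[0.0, 0.0], sh=1)
print(ll_x)
```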
- cubmods.cush_x.mle(m, sample, X, sh, df, formula, ass_pars=None, maxiter=None, tol=None)#
Main function for CUSH models with covariates.
Estimate and validate a CUSH model for ordinal responses, with covariates to explain the shelter effect.
- Parameters:
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
sh (int) – Category corresponding to the shelter choice \([1,m]\)
X (pandas dataframe) – dataframe of covariates for explaining the shelter effect
df (DataFrame) – original DataFrame
formula (str) – the formula used
ass_pars (dictionary, optional) – dictionary of hypothesized parameters, defaults to None
maxiter (None) – defaults to None; ensures compatibility with gem.from_formula()
tol (None) – defaults to None; ensures compatibility with gem.from_formula()
- Returns:
an instance of CUBresCUSHX (see the Class for details)
- Return type:
object
- cubmods.cush_x.pmf(m, sh, omega, X)#
Average probability distribution of a specified CUSH model with covariates.
\(\frac{1}{n} \sum_{i=1}^n \Pr(R_i=r|\pmb\theta; \pmb T_i),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
omega (array) – array \(\pmb \omega\) of parameters for the shelter effect, whose length equals X.columns.size+1 to include an intercept term in the model (first entry)
X (pandas dataframe) – dataframe of covariates for explaining the shelter effect
- Returns:
the probability distribution
- Return type:
array
- cubmods.cush_x.pmfi(m, sh, omega, X)#
Probability distribution for each subject of a specified CUSH model with covariates
\(\Pr(R_i=r|\pmb\theta; \pmb T_i),\; i=1 \ldots n ,\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
sh (int) – Category corresponding to the shelter choice \([1,m]\)
omega (array) – array \(\pmb \omega\) of parameters for the shelter effect, whose length equals
X.columns.size+1
to include an intercept term in the model (first entry)
X (pandas dataframe) – dataframe of covariates for explaining the shelter effect
- Returns:
the matrix of the probability distribution of dimension \(n \times m\)
- Return type:
numpy ndarray
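As a rough illustration of the per-subject and average distributions described above, the sketch below assumes the usual CUSH mixture, \(\Pr(R_i=r) = \delta_i D_r + (1-\delta_i)/m\), with \(\delta_i\) obtained from \(\pmb\omega\) through a logistic link. The function names and the use of plain lists in place of a DataFrame are hypothetical; this is not the package's own code.

```python
import math

def cush_pmfi_sketch(m, sh, omega, X):
    """Per-subject CUSH probabilities (assumed form); X is a list of covariate rows."""
    rows = []
    for x in X:
        # Linear predictor with intercept omega[0], then logistic link for delta_i.
        eta = omega[0] + sum(w * v for w, v in zip(omega[1:], x))
        delta = 1.0 / (1.0 + math.exp(-eta))
        # Mixture of a point mass at the shelter category and a discrete Uniform.
        rows.append([delta * (r == sh) + (1.0 - delta) / m for r in range(1, m + 1)])
    return rows

def cush_pmf_sketch(m, sh, omega, X):
    """Average of the per-subject distributions over the n rows."""
    P = cush_pmfi_sketch(m, sh, omega, X)
    n = len(P)
    return [sum(row[r] for row in P) / n for r in range(m)]
```

Each row of the per-subject matrix sums to one, so the average is itself a probability distribution over the \(m\) categories.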
- cubmods.cush_x.prob(m, sample, X, omega, sh)#
Probability distribution of a specified CUSH model with covariates.
\(\Pr(R_i=r_i|\pmb\theta;\pmb T_i),\;i = 1 \ldots n\)
- Parameters:
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
sh (int) – Category corresponding to the shelter choice \([1,m]\)
omega (array) – array \(\pmb \omega\) of parameters for the shelter effect, whose length equals
X.columns.size+1
to include an intercept term in the model (first entry)
X (pandas dataframe) – dataframe of covariates for explaining the shelter effect
- Returns:
the probability array \(\Pr(R = r | \pmb\theta)\) for observed responses
- Return type:
array
cubmods.gem module#
CUB models in Python. Module for GEM (Generalized Mixtures).
Description:#
This module contains methods and classes for GEM maximum likelihood estimation and sample drawing.
Manual, Examples and References:#
List of TODOs:#
TODO: implement best shelter search
Credits#
- Author:
Massimo Pierini
- Date:
2023-24
- Credits:
Domenico Piccolo, Rosaria Simone
- Contacts:
Classes and Functions#
- cubmods.gem.draw(formula, df=None, m=7, model='cub', n=500, sh=None, seed=None, **params)#
Main function to draw a sample from GEneralized Mixture models.
- Parameters:
formula (str) – a formula used to draw the sample, see Manual for details
df (DataFrame) – the DataFrame with covariates (if any)
m (int) – number of ordinal categories
model (str) – the model family; defaults to "cub"; options "cube" and "cush"
sh (int) – category corresponding to the shelter choice \([1,m]\)
n (int) – number of ordinal responses; it is only effective if the model is without covariates
ass_pars (dictionary, optional) – dictionary of hypothesized parameters, defaults to None
options (dict) – a dictionary of extra options maxiter and tol; see the reference guide for details
seed (int, optional) – the seed to ensure reproducibility, defaults to None
- Returns:
an instance of CUBsample (see here) containing ordinal responses drawn from the specified model
- Return type:
obj
- cubmods.gem.estimate(formula, df, m=None, model='cub', sh=None, ass_pars=None, options={})#
Main function to estimate and validate GEneralized Mixture models.
- Parameters:
formula (str) – a formula used to estimate the model’s parameters, see Manual for details
df (DataFrame) – the DataFrame with observed ordinal sample and covariates (if any)
m (int) – number of ordinal categories
model (str) – the model family; defaults to "cub"; options "cube" and "cush"
sh (int) – category corresponding to the shelter choice \([1,m]\)
ass_pars (dictionary, optional) – dictionary of hypothesized parameters, defaults to None
options (dict) – a dictionary of extra options maxiter and tol; see the reference guide for details
- Returns:
an instance of the Base Class CUBres extended by the family module; see each module for details
- Return type:
obj
cubmods.general module#
CUB models in Python. Module for General functions.
Description:#
This module contains methods and classes for general functions.
List of TODOs:#
…
Credits#
- Author:
Massimo Pierini
- Date:
2023-24
- Credits:
Domenico Piccolo, Rosaria Simone
- Contacts:
Classes and Functions#
- exception cubmods.general.InvalidCategoriesError(m, model)#
Bases:
Exception
Exception: if m is not suitable for model.
- exception cubmods.general.InvalidSampleSizeError(n)#
Bases:
Exception
Exception: if the sample size is not strictly greater than zero.
- exception cubmods.general.NoShelterError(model)#
Bases:
Exception
Exception: if a shelter choice is needed but it hasn’t been provided.
- exception cubmods.general.NotImplementedModelError(model, formula)#
Bases:
Exception
Exception: if the requested model is known but not yet implemented.
- exception cubmods.general.ParameterOutOfBoundsError(param, value)#
Bases:
Exception
Exception: if the provided parameter value is out of bounds.
- exception cubmods.general.ShelterGreaterThanM(m, sh)#
Bases:
Exception
Exception: if the provided shelter choice is greater than \(m\).
- exception cubmods.general.UnknownModelError(model)#
Bases:
Exception
Exception: if the requested family is unknown.
- cubmods.general.addones(A)#
Expand with a unitary vector in the first column of the given matrix to consider also an intercept term for CUB models with covariates.
- Parameters:
A – a matrix to be expanded
- Returns:
the expanded matrix
- Return type:
same as A
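A minimal sketch of the idea, assuming the matrix is given as a list of rows (the function name is hypothetical, and the package itself may operate on arrays or DataFrames instead):

```python
def addones_sketch(A):
    """Return a copy of A (a list of rows) with a leading column of ones."""
    # The leading 1 in each row multiplies the intercept coefficient.
    return [[1.0] + list(row) for row in A]
```

For instance, a two-column covariate matrix becomes a three-column design matrix whose first column is constant.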
- cubmods.general.aic(l, p)#
Akaike Information Criterion.
- Parameters:
l (float) – log-likelihood
p (int) – number of parameters
- Returns:
the AIC value
- Return type:
float
- cubmods.general.bic(l, p, n)#
Bayesian Information Criterion.
- Parameters:
l (float) – log-likelihood
p (int) – number of parameters
n (int) – number of observations
- Returns:
the BIC value
- Return type:
float
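For reference, both criteria can be sketched from their textbook definitions, \(AIC = -2\ell + 2p\) and \(BIC = -2\ell + p\log n\). The function names below are hypothetical stand-ins, not the package's implementation.

```python
import math

def aic_sketch(l, p):
    """Akaike Information Criterion from log-likelihood l and p parameters."""
    return -2.0 * l + 2.0 * p

def bic_sketch(l, p, n):
    """Bayesian Information Criterion; the penalty grows with log(n)."""
    return -2.0 * l + p * math.log(n)
```

Lower values indicate a better trade-off between fit and complexity; BIC penalizes extra parameters more heavily than AIC once \(\log n > 2\).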
- cubmods.general.bitgamma(sample, m, W, gamma)#
Shifted Binomial distribution with covariates.
Return the shifted Binomial probabilities of ordinal responses where the feeling component is explained by covariates via a logistic link.
- Parameters:
sample (array) – array of ordinal responses
m (int) – number of ordinal categories
W (pandas dataframe) – dataframe of covariates for explaining the feeling component
gamma (array of float) – array \(\pmb \gamma\) of parameters for the feeling component, whose length equals
W.columns.size+1
to include an intercept term in the model (first entry)
- Returns:
an array of the same length as sample, where each entry is the shifted Binomial probability for the corresponding observation and feeling value.
- Return type:
array
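The following sketch shows one way the computation could look, assuming the logistic link \(\xi_i = 1/(1+e^{-(\gamma_0 + \pmb w_i \pmb\gamma)})\) and the shifted Binomial form \(\binom{m-1}{r-1}(1-\xi)^{r-1}\xi^{m-r}\). The function name and the list-of-rows input are hypothetical.

```python
import math

def bitgamma_sketch(sample, m, W, gamma):
    """Shifted Binomial probability of each response, with xi_i driven by covariates."""
    probs = []
    for r, w in zip(sample, W):
        # gamma[0] is the intercept; the remaining entries pair with the covariates.
        eta = gamma[0] + sum(g * v for g, v in zip(gamma[1:], w))
        xi = 1.0 / (1.0 + math.exp(-eta))
        probs.append(math.comb(m - 1, r - 1) * (1.0 - xi) ** (r - 1) * xi ** (m - r))
    return probs
```

With all coefficients equal to zero every subject has \(\xi_i = 0.5\), so the result reduces to a symmetric shifted Binomial evaluated at each observed response.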
- cubmods.general.bitxi(m, sample, xi)#
Shifted Binomial probabilities of ordinal responses
Compute the shifted Binomial probabilities of ordinal responses.
- Parameters:
m (int) – number of ordinal categories
sample (array) – array of ordinal responses
xi (float) – feeling parameter \(\xi\)
- Returns:
A vector of the same length as sample, where each entry is the shifted Binomial probability of the corresponding observation.
- Return type:
array
- cubmods.general.choices(m)#
Array of ordinal categories.
- Parameters:
m (int) – number of ordinal categories
- Returns:
array of int from 1 to m
- Return type:
array
- cubmods.general.colsof(A)#
Number of columns of the given matrix or dataframe.
- Parameters:
A (ndarray, dataframe) – the matrix or dataframe
- Returns:
number of columns
- Return type:
int
- cubmods.general.conf_border(Sigma, mx, my, ax, conf=0.95, plane='z', xyz0=(0, 0, 0))#
Plot the bivariate projection of a trivariate confidence ellipse on a plane.
Auxiliary function of plot_ellipsoid().
Note
Solution by https://gist.github.com/randolf-scholz.
- Parameters:
Sigma (ndarray) – bivariate variance-covariance matrix
mx (float) – center of the ellipse on the \(x\) axis
my (float) – center of the ellipse on the \(y\) axis
ax – matplotlib axis
conf (float) – confidence level of the trivariate ellipsoid.
plane (str) – plane for the projection; could be x, y or z
xyz0 (tuple) – tuple of the bivariate ellipse position
- cubmods.general.conf_ell(vcov, mux, muy, ci, ax, color='b', label=True, alpha=0.25)#
Plot bivariate confidence ellipse of estimated parameters at level ci \(=(1 - \alpha/2)\)
- Parameters:
vcov (ndarray) – Variance-covariance matrix \(2 \times 2\)
mux (float) – estimate of first parameter
muy (float) – estimate of second parameter
ci (float) – confidence level \(=(1 - \alpha/2)\)
ax – matplotlib axis
color (str) – color of confidence ellipse
label (bool) – whether to add a label of confidence level
alpha (float) – transparency of confidence ellipse
- cubmods.general.dissimilarity(p_obs, p_est)#
Normalized dissimilarity measure.
Compute the normalized dissimilarity measure between observed relative frequencies and estimated (theoretical) probabilities of a discrete distribution.
- Parameters:
p_obs (array) – Vector of observed relative frequencies
p_est (array) – Vector of estimated (theoretical) probabilities
- Returns:
Numeric value of the dissimilarity index, assessing the distance to a perfect fit.
- Return type:
float
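A short sketch, assuming the index takes the usual form \(Diss = \frac{1}{2}\sum_{r=1}^{m} |f_r - p_r|\) (the function name is hypothetical):

```python
def dissimilarity_sketch(p_obs, p_est):
    """Half the L1 distance between two discrete distributions on the same support."""
    return 0.5 * sum(abs(o - e) for o, e in zip(p_obs, p_est))
```

Under this definition the index lies in \([0, 1]\): it is 0 for a perfect fit and 1 when the two distributions have disjoint supports.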
- cubmods.general.dummies2(df, DD)#
Create dummy variables from polychotomous variables.
Auxiliary function of cubmods.gem.from_formula(). A dummy variable is created for all polychotomous variables named C(<varname>).
- Parameters:
df (DataFrame) – a DataFrame with all the covariates and the ordinal response
DD (list) – the list of all covariates for each component
- Returns:
a tuple of the DataFrame with the dummy variables and the column names
- Return type:
tuple
- cubmods.general.equal3d(ax)#
Equalize 3d axes.
Auxiliary function of
.plot_ellipsoid()
.
- cubmods.general.expit(x)#
Expit function.
It is the inverse of logit. Aka sigmoid or standard logistic.
- Parameters:
x (float) – the argument
- Returns:
the expit of x
- Return type:
float
- cubmods.general.formula_parser(formula, model='cub')#
Parse a CUB class formula.
Auxiliary function of cubmods.gem functions.
TODO: add specific Exceptions for formula
- Parameters:
formula (str) – the formula to be parsed
model (str) – the model family
- Returns:
a tuple of the ordinal response column name and a list of all covariates’ column names for each component
- Return type:
tuple
- cubmods.general.freq(sample, m, dataframe=False)#
Absolute frequencies of an observed sample of ordinal responses.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
dataframe (bool) – if True return a DataFrame instead of an array, defaults to False
- Returns:
the absolute frequencies of the observed sample
- Return type:
array or dataframe
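As a sketch of the counting step only (the DataFrame option is left out, and the function name is hypothetical), categories that never occur must still appear with frequency zero:

```python
from collections import Counter

def freq_sketch(sample, m):
    """Absolute frequency of each category 1..m, including unobserved ones."""
    counts = Counter(sample)
    return [counts.get(r, 0) for r in range(1, m + 1)]
```

Keeping the zero entries ensures the result always has length \(m\), so it can be compared position by position with a fitted distribution.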
- cubmods.general.get_cov_ellipsoid(cov, mu=array([0., 0., 0.]), ci=0.95)#
Return the 3d points representing the covariance matrix cov centred at mu, at confidence level ci \(=(1 - \alpha/2)\).
Auxiliary function of .plot_ellipsoid().
- Parameters:
cov (ndarray) – Variance-covariance matrix \(3 \times 3\)
mu (array) – ellipsoid center \((x_0, y_0, z_0)\)
ci (float) – confidence level \(=(1 - \alpha/2)\)
- Returns:
a tuple of 3d points
(X, Y, Z)
- Return type:
tuple
- cubmods.general.get_minor(A, i, j)#
Get a minor of a matrix.
Auxiliary function of .plot_ellipsoid().
Note
Solution by PaulDong
- Parameters:
A (ndarray) – a generic matrix
i (int) – row of the minor
j (int) – column of the minor
- Returns:
the minor of
A
- Return type:
ndarray
- cubmods.general.hadprod(Amat, xvett)#
Hadamard product of a matrix with a vector
Return the Hadamard product between the given matrix and vector: this operation corresponds to multiplying every row of the matrix by the corresponding element of the vector, and it is equivalent to the standard matrix product of the diagonal matrix whose diagonal is the given vector with the matrix (the diagonal matrix on the left). It is possible only if the length of the vector equals the number of rows of the matrix. It is an auxiliary function needed for computing the variance-covariance matrix of the estimated model with covariates.
Note
if xvett is a row vector, reshapes it to column vector
- Parameters:
Amat (ndarray) – A generic matrix
xvett (array) – A generic vector
- Returns:
the Hadamard product \(\pmb A \odot \pmb x\)
- Return type:
ndarray
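The row-scaling interpretation can be checked with a small pure-Python sketch. Both function names are hypothetical, and the matrices are plain lists of rows rather than the arrays the package works with.

```python
def hadprod_sketch(Amat, xvett):
    """Multiply row i of Amat by xvett[i]."""
    if len(Amat) != len(xvett):
        raise ValueError("vector length must equal the number of rows")
    return [[x * a for a in row] for row, x in zip(Amat, xvett)]

def diag_times_sketch(xvett, Amat):
    """Explicit product diag(xvett) @ Amat, for comparison only."""
    n, cols = len(xvett), len(Amat[0])
    D = [[xvett[i] if i == j else 0 for j in range(n)] for i in range(n)]
    return [[sum(D[i][k] * Amat[k][j] for k in range(n)) for j in range(cols)]
            for i in range(n)]
```

The two functions return the same matrix, but the row-wise version avoids building the \(n \times n\) diagonal matrix.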
- cubmods.general.kkk(sample, m)#
Sequence of combinatorial coefficients
Compute the sequence of binomial coefficients \(\binom{m-1}{r-1}\), for \(r= 1, \ldots, m\), and then return a vector of the same length as sample, whose i-th component is the corresponding binomial coefficient \(\binom{m-1}{r_i-1}\)
- Parameters:
sample (array) – array of ordinal responses
m (int) – number of ordinal categories
- Returns:
an array of \(\binom{m-1}{r_i-1}\)
- Return type:
array
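A minimal sketch using the standard library (the function name is hypothetical):

```python
import math

def kkk_sketch(sample, m):
    """Binomial coefficient C(m-1, r_i-1) for each observed response r_i."""
    return [math.comb(m - 1, r - 1) for r in sample]
```

The output has one entry per observation, not one per category, so repeated responses yield repeated coefficients.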
- cubmods.general.load_object(fname)#
Load a saved object from file.
It can be used to load a CUBsample or a CUBres object, previously saved on a file.
Note
see the Classes for details about these objects
- Parameters:
fname (str) – filename
- Returns:
the loaded object, instance of CUBsample or CUBres
- Return type:
object
- cubmods.general.logis(Y, param)#
The logistic transform.
Create a matrix YY binding array Y with a vector of ones, placed as the first column of YY. It applies the logistic transform componentwise to the standard matrix multiplication between YY and param.
- Parameters:
Y (ndarray, dataframe) – A generic matrix or a dataframe
param (array) – Vector of coefficients, whose length is
Y.columns.size+1
(to consider also an intercept term)
- Returns:
a vector whose length is Y.index.size and whose i-th component is the logistic function evaluated at the i-th element of the product between YY and param
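The transform described above can be sketched as follows, with Y given as a list of rows (the function name and input type are hypothetical):

```python
import math

def logis_sketch(Y, param):
    """Logistic transform of [1, Y] @ param, one value per row of Y."""
    out = []
    for row in Y:
        yy = [1.0] + list(row)  # prepend the intercept column
        eta = sum(p * v for p, v in zip(param, yy))
        out.append(1.0 / (1.0 + math.exp(-eta)))
    return out
```

Every returned value lies strictly between 0 and 1, which is what makes the result usable as a subject-specific probability parameter.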
- cubmods.general.logit(x)#
Logit function.
It is the inverse of the standard logistic function, aka log-odds.
- Parameters:
x (float) – the argument
- Returns:
the logit of x
- Return type:
float
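The inverse relationship between the two functions, \(\mathrm{expit}(x) = 1/(1+e^{-x})\) and \(\mathrm{logit}(p) = \log\frac{p}{1-p}\), can be verified with a scalar sketch (hypothetical names, standard definitions):

```python
import math

def expit_sketch(x):
    """Standard logistic function, mapping the real line into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def logit_sketch(p):
    """Log-odds of p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))
```

Composing the two in either order returns the original argument up to floating-point error.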
- cubmods.general.lsat(f, n)#
Log-likelihood of saturated model.
Saturated level, that is, the theoretical maximum information that can be obtained by a model using as many parameters as possible. The saturated log-likelihood is computed by assuming that the model is specified by as many parameters as available observations. This is the extreme benchmark for comparing previous log-likelihood quantities.
- Parameters:
f (array) – absolute frequencies of observed ordinal responses
n (int) – number of observations
- Returns:
log-likelihood of saturated model
- Return type:
float
- cubmods.general.luni(m, n)#
Log-likelihood of null model.
Null level, that is, when no structure is searched for. Specifically, this is equivalent to assuming a discrete Uniform over the support so that every category has the same probability.
- Parameters:
m (int) – number of ordinal categories
n (int) – number of observations
- Returns:
the log-likelihood of null model
- Return type:
float
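The null and saturated benchmarks can be sketched from their usual closed forms, \(\ell_{null} = -n\log m\) and \(\ell_{sat} = \sum_{r: f_r>0} f_r \log(f_r/n)\). The function names are hypothetical and the formulas are assumed to match what the package computes.

```python
import math

def luni_sketch(m, n):
    """Log-likelihood when every category has probability 1/m."""
    return -n * math.log(m)

def lsat_sketch(f, n):
    """Log-likelihood when each category's probability equals its relative frequency."""
    # Empty categories contribute nothing (0 * log 0 is taken as 0).
    return sum(fr * math.log(fr / n) for fr in f if fr > 0)
```

Any fitted model's log-likelihood should fall between these two values; they coincide only when the observed frequencies are exactly uniform.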
- cubmods.general.plot_ellipsoid(V, E, ax, zlabel, ci=0.95, magnified=False)#
Plot a trivariate confidence ellipsoid.
- Parameters:
V (ndarray) – Variance-covariance matrix
E (array) – Vector of estimated parameters
ax – matplotlib axis
zlabel (str) – label for \(z\) axis
ci (float) – confidence level \((1 - \alpha/2)\)
magnified (bool) – if
False
plots in the full parameter space
- cubmods.general.probbit(m, xi)#
Probability distribution of shifted binomial random variable.
- Parameters:
m (int) – number of ordinal categories
xi (float) – feeling parameter \(\xi\)
- Returns:
the vector of the probability distribution of a shifted Binomial model.
- Return type:
array
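A compact sketch of the whole distribution, assuming the parameterization \(\Pr(R=r) = \binom{m-1}{r-1}(1-\xi)^{r-1}\xi^{m-r}\) for \(r = 1, \ldots, m\) (the function name is hypothetical):

```python
import math

def probbit_sketch(m, xi):
    """Shifted Binomial probabilities over the categories 1..m."""
    return [math.comb(m - 1, r - 1) * (1.0 - xi) ** (r - 1) * xi ** (m - r)
            for r in range(1, m + 1)]
```

Under this convention a value of \(\xi\) close to 1 concentrates the mass on the lowest category, and a value close to 0 on the highest.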
- cubmods.general.unique(l)#
Unique elements in a 3-dimensional list.
Auxiliary function of
.dummies2()
.- Parameters:
l (list) – the list to analyze
- Returns:
the list of unique elements
- Return type:
list
cubmods.ihg module#
CUB models in Python. Module for IHG (Inverse HyperGeometric).
Description:#
This module contains methods and classes for IHG model family without covariates.
Manual, Examples and References:#
List of TODOs:#
…
Credits#
- Author:
Massimo Pierini
- Date:
2023-24
- Credits:
Domenico Piccolo, Rosaria Simone
- Contacts:
Classes and Functions#
- class cubmods.ihg.CUBresIHG(model, df, formula, m, n, sample, f, theoric, diss, est_names, estimates, e_types, varmat, stderrs, pval, wald, loglike, muloglik, loglikuni, AIC, BIC, seconds, time_exe, logliksat=None, dev=None, logliksatcov=None, niter=None, maxiter=None, tol=None, sh=None, rho=None, ass_pars=None)#
Bases:
CUBres
Object returned by .mle() function. See the Base Class for details.
Methods
as_dataframe() – DataFrame of estimated parameters
as_txt() – Print the summary.
plot([ci, saveas, figsize]) – Main function to plot an object of the Class.
plot_estim([ci, ax, magnified]) – Plots the estimated parameter values in the parameter space and the asymptotic standard error.
plot_ordinal([figsize, ax, kind, saveas]) – Plots relative frequencies of observed sample, estimated probability distribution and, if provided, probability distribution of a known model.
save(fname) – Save a CUBresult object to file named fname + .cub.fit
summary() – Call as_txt()
- plot(ci=0.95, saveas=None, figsize=(7, 8))#
Main function to plot an object of the Class.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure
ci (float) – level \((1-\alpha/2)\) for the standard error
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- plot_estim(ci=0.95, ax=None, magnified=False)#
Plots the estimated parameter values in the parameter space and the asymptotic standard error.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (useful only if ax is not None)
ci (float) – level \((1-\alpha/2)\) for the confidence ellipse
magnified (bool) – if False the limits will be the entire parameter space, otherwise let matplotlib choose the limits
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
- Returns:
ax or a tuple (fig, ax)
- plot_ordinal(figsize=(7, 5), ax=None, kind='bar', saveas=None)#
Plots relative frequencies of observed sample, estimated probability distribution and, if provided, probability distribution of a known model.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (useful only if ax is not None)
kind (str) – choose a barplot ('bar', default) or a scatterplot ('scatter')
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- cubmods.ihg.draw(m, theta, n, df, formula, seed=None)#
Draw a random sample from a specified IHG model.
- Parameters:
m (int) – number of ordinal categories
theta (float) – parameter \(\theta\) (probability of 1st shelter category)
n (int) – number of ordinal responses to be drawn
df (DataFrame) – original DataFrame
formula (str) – the formula used
seed (int, optional) – the seed to ensure reproducibility, defaults to None
- Returns:
an instance of
CUBsample
(see here) containing ordinal responses drawn from the specified model
- cubmods.ihg.effe(theta, m, f)#
Compute the negative log-likelihood function of an IHG model without covariates for a given absolute frequency distribution. Auxiliary function of mle() for the optimization algorithm.
- Parameters:
theta (float) – parameter \(\theta\) (probability of 1st shelter category)
m (int) – number of ordinal categories
f (array of int) – array of absolute frequency distribution
- Returns:
the negative log-likelihood value
- Return type:
float
- cubmods.ihg.init_theta(m, f)#
Preliminary estimators for IHG models without covariates.
Computes preliminary parameter estimates of a IHG model without covariates for given ordinal responses. These preliminary estimators are used within the package code to start the E-M algorithm.
- Parameters:
f (array of int) – array of the absolute frequencies of given ordinal responses
m (int) – number of ordinal categories
- Returns:
the value of \(\theta^{(0)}\)
- cubmods.ihg.loglik(m, theta, f)#
Compute the log-likelihood function of a IHG model without covariates for a given absolute frequency distribution.
- Parameters:
theta (float) – parameter \(\theta\) (probability of 1st shelter category)
m (int) – number of ordinal categories
f (array of int) – array of absolute frequency distribution
- Returns:
the log-likelihood value
- Return type:
float
- cubmods.ihg.mle(m, sample, df, formula, ass_pars=None)#
Main function for IHG models without covariates.
Function to estimate and validate an IHG model without covariates for given ordinal responses.
- Parameters:
sample (array of int) – array of ordinal responses
m (int) – number of ordinal categories
df (DataFrame) – original DataFrame
formula (str) – the formula used
ass_pars (dictionary, optional) – dictionary of hypothesized parameters, defaults to None
- Returns:
an instance of CUBresIHG (see the Class for details)
- Return type:
object
- cubmods.ihg.pmf(m, theta)#
Probability distribution of a specified IHG model without covariates.
\(\Pr(R = r | \pmb\theta),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
theta (float) – parameter \(\theta\) (probability of 1st shelter category)
- Returns:
the vector of the probability distribution of an IHG model.
- Return type:
numpy array
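One way to compute such a distribution is the recursion commonly given for the Inverse HyperGeometric model: \(\Pr(R=1)=\theta\) and \(\Pr(R=r+1) = \Pr(R=r)\,(1-\theta)\,\frac{m-r}{m-r-1+r\theta}\). The sketch below assumes that recursion and uses a hypothetical function name; whether the package follows the same route is not stated here.

```python
def ihg_pmf_sketch(m, theta):
    """IHG probabilities for categories 1..m via the assumed recursion (needs 0 < theta <= 1)."""
    pr = [theta]  # Pr(R = 1) equals theta
    for j in range(1, m):
        # Each step rescales the previous probability; theta > 0 keeps the denominator positive.
        pr.append(pr[-1] * (1.0 - theta) * (m - j) / (m - j - 1 + j * theta))
    return pr
```

The resulting probabilities sum to one, and the first entry equals \(\theta\) by construction.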
- cubmods.ihg.var(m, theta)#
Variance of a specified IHG model.
- Parameters:
m (int) – number of ordinal categories
theta (float) – parameter \(\theta\) (probability of 1st shelter category)
- Returns:
the variance of the model
- Return type:
float
cubmods.ihg_v module#
CUB models in Python. Module for IHG (Inverse HyperGeometric) with covariates.
Description:#
This module contains methods and classes for IHG model family with covariates.
Manual, Examples and References:#
List of TODOs:#
…
Credits#
- Author:
Massimo Pierini
- Date:
2023-24
- Credits:
Domenico Piccolo, Rosaria Simone
- Contacts:
Classes and Functions#
- class cubmods.ihg_v.CUBresIHGV(model, df, formula, m, n, sample, f, theoric, diss, est_names, estimates, e_types, varmat, stderrs, pval, wald, loglike, muloglik, loglikuni, AIC, BIC, seconds, time_exe, logliksat=None, dev=None, logliksatcov=None, niter=None, maxiter=None, tol=None, sh=None, rho=None, ass_pars=None)#
Bases:
CUBres
Methods
as_dataframe() – DataFrame of estimated parameters
as_txt() – Print the summary.
plot([saveas, figsize]) – Main function to plot an object of the Class.
plot_ordinal([figsize, ax, kind, saveas]) – Plots average relative frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
save(fname) – Save a CUBresult object to file named fname + .cub.fit
summary() – Call as_txt()
- plot(saveas=None, figsize=(7, 5))#
Main function to plot an object of the Class.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- plot_ordinal(figsize=(7, 5), ax=None, kind='bar', saveas=None)#
Plots average relative frequencies of observed sample, estimated average probability distribution and, if provided, average probability distribution of a known model.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (useful only if ax is not None)
kind (str) – choose a barplot ('bar', default) or a scatterplot ('scatter')
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- cubmods.ihg_v.draw(m, nu, V, df, formula, seed=None)#
Draw a random sample from a specified IHG model with covariates
- Parameters:
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
nu (array) – array \(\pmb \nu\) of parameters for \(\theta\), whose length equals
V.columns.size+1
to include an intercept term in the model (first entry)
V (pandas dataframe) – dataframe of covariates for explaining the parameter \(\theta\)
df (DataFrame) – original DataFrame
formula (str) – the formula used
seed (int, optional) – the seed to ensure reproducibility, defaults to None
- Returns:
an instance of
CUBsample
(see here) containing ordinal responses drawn from the specified model
- cubmods.ihg_v.effe(nu, m, sample, V)#
Auxiliary function for the log-likelihood estimation of IHG models with covariates
Compute the opposite of the loglikelihood function for IHG models with covariates. It is called as an argument for “optim” within the .mle() function as the function to minimize.
- Parameters:
nu (float) – initial parameter estimate
V (pandas dataframe) – dataframe of covariates for explaining the parameter \(\theta\)
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
- cubmods.ihg_v.init_theta(m, f)#
Preliminary estimators for IHG models without covariates.
Computes preliminary parameter estimates of a IHG model without covariates for given ordinal responses. These preliminary estimators are used within the package code to start the E-M algorithm.
- Parameters:
f (array of int) – array of the absolute frequencies of given ordinal responses
m (int) – number of ordinal categories
- Returns:
the array of \(\pmb\nu^{(0)}\)
- cubmods.ihg_v.loglik(m, sample, V, nu)#
Log-likelihood function for IHG models with covariates.
Compute the log-likelihood function for IHG models with covariates to explain the parameter \(\theta\).
- Parameters:
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
nu (array) – array \(\pmb \nu\) of parameters for \(\theta\), whose length equals
V.columns.size+1
to include an intercept term in the model (first entry)
V (pandas dataframe) – dataframe of covariates for explaining the parameter \(\theta\)
- Returns:
the log-likelihood value
- Return type:
float
- cubmods.ihg_v.mle(m, sample, V, df, formula, ass_pars=None)#
Main function for IHG models with covariates.
Estimate and validate a IHG model for ordinal responses, with covariates.
- Parameters:
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
V (pandas dataframe) – dataframe of covariates for explaining the parameter \(\theta\)
df (DataFrame) – original DataFrame
formula (str) – the formula used
ass_pars (dictionary, optional) – dictionary of hypothesized parameters, defaults to None
- Returns:
an instance of CUBresIHGV (see the Class for details)
- Return type:
object
- cubmods.ihg_v.pmf(m, V, nu)#
Average probability distribution of a specified IHG model with covariates.
\(\frac{1}{n} \sum_{i=1}^n \Pr(R_i=r|\pmb\theta; \pmb T_i),\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
nu (array) – array \(\pmb \nu\) of parameters for \(\theta\), whose length equals
V.columns.size+1
to include an intercept term in the model (first entry)
V (pandas dataframe) – dataframe of covariates for explaining the parameter \(\theta\)
- Returns:
the probability distribution
- Return type:
array
- cubmods.ihg_v.pmfi(m, V, nu)#
Probability distribution for each subject of a specified IHG model with covariates
\(\Pr(R_i=r|\pmb\theta; \pmb T_i),\; i=1 \ldots n ,\; r=1 \ldots m\)
- Parameters:
m (int) – number of ordinal categories
nu (array) – array \(\pmb \nu\) of parameters for \(\theta\), whose length equals
V.columns.size+1
to include an intercept term in the model (first entry)
V (pandas dataframe) – dataframe of covariates for explaining the parameter \(\theta\)
- Returns:
the matrix of the probability distribution of dimension \(n \times m\)
- Return type:
numpy ndarray
- cubmods.ihg_v.prob(m, sample, V, nu)#
Probability distribution of a IHG model with covariates given an observed sample.
Compute the probability distribution of a IHG model with covariates, given an observed sample.
\(\Pr(R_i=r_i|\pmb\theta;\pmb T_i),\; i=1 \ldots n\)
- Parameters:
m (int) – number of ordinal categories
sample (array of int) – array of ordinal responses
nu (array) – array \(\pmb \nu\) of parameters for \(\theta\), whose length equals
V.columns.size+1
to include an intercept term in the model (first entry)
V (pandas dataframe) – dataframe of covariates for explaining the parameter \(\theta\)
- Returns:
the array of the probability distribution.
- Return type:
numpy array
cubmods.multicub module#
CUB models in Python. Module for MULTICUB and MULTICUBE.
Description:#
This module contains methods and classes for MULTICUB and MULTICUBE tool.
Manual, Examples and References:#
List of TODOs:#
…
Credits#
- Author:
Massimo Pierini
- Date:
2023-24
- Credits:
Domenico Piccolo, Rosaria Simone
- Contacts:
Classes and Functions#
- cubmods.multicub.multi(ords, ms=None, model='cub', title=None, labels=None, shs=None, plot=True, print_res=False, pos=None, xlim=(0, 1), ylim=(0, 1), equal=True, confell=True, alpha=0.2, ci=0.95, figsize=(7, 7), ax=None)#
Joint plot of estimated CUB models in the parameter space
Return a plot of estimated CUB models represented as points in the parameter space.
- Parameters:
ords (list) – list of arrays of observed ordinal responses
model (str) – model; defaults to cub; options cube
title (str) – title of the plot
labels (list) – labels of the points
shs (int or list) – shelter effect(s); can be an int if the same shelter effect is valid for all samples or a list to specify different shelter choices
plot (bool) – if True (default) plot the results
print_res (bool) – if True print the results; defaults to False
pos (list) – position of the \(\delta\) or \(\phi\) estimated values
xlim (tuple) – x-axis limits
ylim (tuple) – y-axis limits
equal (bool) – if the plot must have equal aspect; defaults to
True
alpha (float) – confidence ellipse transparency
confell (bool) – if True (default) plot confidence ellipse (for CUB model only)
ci (float) – level \((1-\alpha/2)\) for the confidence ellipse
figsize (tuple of float) – tuple of (length, height) for the figure (useful only if ax is not None)
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
- Returns:
ax
- cubmods.multicub.pos_kwargs(pos)#
Position of the \(\delta\) or \(\phi\) estimated values
1 8 2 7 @ 3 6 4 5
- Parameters:
pos (int) – position (1..8)
- Returns:
a dictionary for
matplotlib
- Return type:
dict
cubmods.smry module#
CUB models in Python. Module for summary tools.
Description:#
This module contains methods and classes for summary tools.
List of TODOs:#
TODO: inferential results as a DataFrame in the Manual and in the examples
TODO: optional bounds in CUBE mle (CUBSH too?)
TODO: 2 decimals in 3d plots?
TODO: dissimilarity in multicub plot (add an option)
TODO: size of the phi points in multicube
Credits#
- Author:
Massimo Pierini
- Date:
2023-24
- Credits:
Domenico Piccolo, Rosaria Simone
- Contacts:
Classes and Functions#
- class cubmods.smry.CUBres(model, df, formula, m, n, sample, f, theoric, diss, est_names, estimates, e_types, varmat, stderrs, pval, wald, loglike, muloglik, loglikuni, AIC, BIC, seconds, time_exe, logliksat=None, dev=None, logliksatcov=None, niter=None, maxiter=None, tol=None, sh=None, rho=None, ass_pars=None)#
Bases:
object
Default Class for MLE results; each model module extends this Class to an ad hoc Class with specific functions. An instance of the extended Class is returned by .mle() functions of model modules.
- Variables:
model – the model family
df – the original DataFrame with observed sample and covariates (if any)
formula – the formula used to fit the data
m – number of ordinal categories
n – number of observed ordinal responses
sample – the observed sample of ordinal responses
f – absolute frequencies of the sample
theoric – estimated probability distribution
diss – dissimilarity index
est_names – name of estimated parameters
estimates – values of estimated parameters
e_types – parameters’ component
varmat – variance-covariance matrix of estimated parameters
stderrs – standard errors of estimated parameters
pval – p-values of estimated parameters
wald – Wald test statistics of estimated parameters
loglike – log-likelihood value
muloglik – average log-likelihood for each observation
loglikuni – log-likelihood of null model
AIC – Akaike Information Criterion
BIC – Bayesian Information Criterion
seconds – execution time of the algorithm
time_exe – when the algorithm has been executed
logliksat – log-likelihood of saturated model (for models without covariates only)
logliksatcov – deprecated
dev – deviance
niter – number of iterations of the EM algorithm
maxiter – maximum number of iterations of the EM algorithm
tol – fixed error tolerance
sh – shelter choice(s), if any
rho – coefficient of correlation between \(\pi\) and \(\xi\)
ass_pars – parameters of known model to be compared with the estimates
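As a rough illustration of how several of these attributes relate to one another, the following minimal sketch recomputes them with the textbook definitions for a model without covariates. All input values are made up, the variable names simply mirror the attribute names above, and the formulas are assumptions based on standard definitions (the treatment of the null model as a discrete uniform is also an assumption); cubmods may compute these quantities differently.

```python
import math

# Hypothetical inputs for illustration only (not produced by cubmods).
f = [10, 20, 40, 20, 10]                  # absolute frequencies, m = 5
theoric = [0.12, 0.22, 0.32, 0.22, 0.12]  # estimated probability distribution
estimates = [0.80, 0.50]                  # e.g. estimates of (pi, xi)
varmat = [[0.0025, 0.0001],
          [0.0001, 0.0004]]               # variance-covariance matrix

m = len(f)          # number of ordinal categories
n = sum(f)          # number of observed responses
p = len(estimates)  # number of estimated parameters

# Dissimilarity index: half the L1 distance between the relative
# frequencies and the estimated probabilities.
diss = 0.5 * sum(abs(fr / n - pr) for fr, pr in zip(f, theoric))

# Standard errors are the square roots of the diagonal of varmat.
stderrs = [math.sqrt(varmat[i][i]) for i in range(p)]

# Wald statistics and two-sided p-values under a standard normal:
# 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
wald = [est / se for est, se in zip(estimates, stderrs)]
pval = [math.erfc(abs(z) / math.sqrt(2)) for z in wald]

# Log-likelihood of the fitted model (no covariates) and derived criteria.
loglike = sum(fr * math.log(pr) for fr, pr in zip(f, theoric))
muloglik = loglike / n
AIC = -2 * loglike + 2 * p
BIC = -2 * loglike + p * math.log(n)

# Assumed null model: discrete uniform over the m categories.
loglikuni = -n * math.log(m)

# Saturated model (no covariates): probabilities equal to the relative
# frequencies; the deviance compares it with the fitted model.
logliksat = sum(fr * math.log(fr / n) for fr in f if fr > 0)
dev = 2 * (logliksat - loglike)

print(f"diss={diss:.4f}  AIC={AIC:.2f}  BIC={BIC:.2f}  dev={dev:.3f}")
```

In practice these values are read from the attributes of the object returned by `.mle()` rather than recomputed by hand.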
Methods
as_dataframe
()DataFrame of estimated parameters
as_txt
()Print the summary.
save
(fname)Save a CUBresult object to file named
fname
+.cub.fit
summary
()Call
as_txt()
- as_dataframe()#
DataFrame of estimated parameters
- as_txt()#
Print the summary. Auxiliary function of
summary()
.
- save(fname)#
Save a CUBresult object to file named
fname
+.cub.fit
- summary()#
Call
as_txt()
- class cubmods.smry.CUBsample(rv, m, pars, model, df, formula, diss, theoric, par_names, p_types, sh=None, seed=None)#
Bases:
object
An instance of this Class is returned by
.draw()
functions. See the corresponding model’s function for details.
- Variables:
rv – array of drawn ordinal responses
m – number of ordinal categories
n – number of drawn responses
p – number of model’s parameters
pars – parameters’ values array
model – the model family
df – original DataFrame (if provided) with a column of the drawn sample
formula – the formula used to draw the sample
diss – dissimilarity index between drawn and theoretical distribution
theoric – theoretical distribution
par_names – names of the parameters
p_types – parameters’ component
sh – shelter choice(s), if any
seed – the
seed
used to ensure reproducibility
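To make the roles of `rv`, `theoric`, `diss` and `seed` concrete, here is a minimal sketch that draws a sample from a CUB model without covariates using only the standard library. The helper names `cub_pmf` and `draw_sample` are hypothetical and are not part of cubmods; the probability mass function is the usual CUB mixture of a shifted Binomial and a discrete Uniform, and the library's own `.draw()` functions may work differently.

```python
import math
import random


def cub_pmf(m, pi, xi):
    """CUB probabilities for categories 1..m (no covariates).

    Mixture of a shifted Binomial (weight pi) and a discrete Uniform
    (weight 1 - pi).
    """
    return [
        pi * math.comb(m - 1, r - 1) * (1 - xi) ** (r - 1) * xi ** (m - r)
        + (1 - pi) / m
        for r in range(1, m + 1)
    ]


def draw_sample(m, pi, xi, n, seed=None):
    """Hypothetical helper: draw n ordinal responses and compare them
    with the theoretical distribution."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    theoric = cub_pmf(m, pi, xi)
    rv = rng.choices(range(1, m + 1), weights=theoric, k=n)
    # Absolute frequencies of the drawn sample.
    f = [rv.count(r) for r in range(1, m + 1)]
    # Dissimilarity between drawn and theoretical distribution.
    diss = 0.5 * sum(abs(fr / n - pr) for fr, pr in zip(f, theoric))
    return rv, theoric, diss


rv, theoric, diss = draw_sample(m=7, pi=0.7, xi=0.3, n=500, seed=42)
rv_again, _, _ = draw_sample(m=7, pi=0.7, xi=0.3, n=500, seed=42)
print(f"same draw with same seed: {rv == rv_again}, diss={diss:.4f}")
```

Passing the same `seed` twice yields the same drawn responses, which is the reproducibility role described for the `seed` attribute above.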
Methods
as_dataframe
()The parameters’ values specified.
plot
([figsize, kind, ax, saveas])Basic plot function.
save
(fname)Save a CUBsample object to file named
fname
+cub.sample
summary
()Print the summary of the drawn sample.
- as_dataframe()#
The parameters’ values specified.
- Returns:
a DataFrame with parameters’ names and values
- Return type:
DataFrame
- plot(figsize=(7, 5), kind='bar', ax=None, saveas=None)#
Basic plot function.
- Parameters:
figsize (tuple of float) – tuple of (length, height) for the figure (useful only if ax is not None)
kind (str) – choose a barplot ('bar', default) or a scatterplot ('scatter')
ax (matplotlib ax, optional) – matplotlib axis, if None a new figure will be created, defaults to None
saveas (str) – if provided, name of the file to save the plot
- Returns:
ax or a tuple (fig, ax)
- save(fname)#
Save a CUBsample object to file named
fname
+cub.sample
- summary()#
Print the summary of the drawn sample.