inmoose.edgepy.mglmOneWay

inmoose.edgepy.mglmOneWay(y, design=None, group=None, dispersion=0, offset=0, weights=None, coef_start=None, maxit=50, tol=1e-10)

Fit multiple negative binomial GLMs with a log link by Fisher scoring, for models with a single explanatory factor.

This is a low-level work-horse used by higher-level functions, especially glmFit(). Most users will not need to call this function directly.

This function fits a negative binomial GLM to each row of y. The row-wise GLMs all have the same design matrix but possibly different dispersions, offsets and weights. It is low-level in that it operates on atomic objects (matrices and vectors).

Specifically, it fits a oneway layout to each response vector (each row of y): the libraries are treated as belonging to a number of groups, and mglmOneGroup() is called for each group. The dispersion parameter of the negative binomial distribution is treated as a known input, not estimated.
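The group-by-group strategy described above can be sketched in plain NumPy. This is an illustration under stated assumptions, not the inmoose implementation: the helper names fit_one_group and fit_one_way are hypothetical, weights are omitted, and a group whose counts are all zero is given a coefficient of -inf here, which may differ from what the library returns.

```python
import numpy as np

def fit_one_group(y, dispersion=0.0, offset=0.0, maxit=50, tol=1e-10):
    """Hypothetical helper: intercept-only NB GLM per row, by Fisher scoring."""
    y = np.asarray(y, dtype=float)
    offset = np.broadcast_to(np.asarray(offset, dtype=float), y.shape)
    phi = np.broadcast_to(np.asarray(dispersion, dtype=float), (y.shape[0],))[:, None]

    all_zero = y.sum(axis=1) == 0
    # Start from the log of the mean offset-adjusted count
    # (placeholder value 1.0 for rows with no counts at all).
    start = np.where(all_zero, 1.0, (y / np.exp(offset)).mean(axis=1))
    beta = np.log(start)
    for _ in range(maxit):
        mu = np.exp(beta[:, None] + offset)
        score = ((y - mu) / (1.0 + phi * mu)).sum(axis=1)  # d loglik / d beta
        info = (mu / (1.0 + phi * mu)).sum(axis=1)         # expected information
        step = np.where(all_zero, 0.0, score / info)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    beta[all_zero] = -np.inf  # assumption of this sketch only
    return beta

def fit_one_way(y, group, dispersion=0.0, offset=0.0):
    """Hypothetical helper: one coefficient per gene and per group."""
    y = np.asarray(y, dtype=float)
    offset = np.broadcast_to(np.asarray(offset, dtype=float), y.shape)
    levels, idx = np.unique(np.asarray(group), return_inverse=True)
    coef = np.empty((y.shape[0], len(levels)))
    for k in range(len(levels)):
        cols = idx == k  # libraries belonging to group k
        coef[:, k] = fit_one_group(y[:, cols], dispersion, offset[:, cols])
    fitted = np.exp(coef[:, idx] + offset)  # same shape as y
    return coef, fitted

counts = np.array([[10, 12, 30, 28], [0, 0, 5, 7]])
coef, fitted = fit_one_way(counts, group=["a", "a", "b", "b"], dispersion=0.1)
```

When all offsets within a group are equal to zero, the fitted value for each gene is simply the mean count of that group, so the iteration stops after the first step. Unequal offsets are the case where the Fisher scoring loop actually has work to do.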

Parameters:

y (array_like) – matrix of negative binomial counts. Rows for genes and columns for libraries.
design (array_like, optional) – design matrix of the GLM. Assumed to be full column rank. Defaults to ~ 0 + group if group is specified, otherwise to ~ 1.
group (Factor) – group memberships for oneway layout. If both design and group are specified, then they must agree in terms of designAsFactor(). If design = None, then a group-means design matrix is implied.
dispersion (float or array_like) – scalar or vector giving the dispersion parameter for each GLM. Can be a scalar giving one value for all genes, or a vector of length equal to the number of genes giving genewise dispersions.
offset (array_like) – scalar, vector or matrix giving the offset to be included in the log-linear model predictor. Can be a scalar, a vector of length equal to the number of libraries, or a matrix of the same shape as y.
weights (matrix, optional) – vector or matrix of non-negative quantitative weights. Can be a vector of length equal to the number of libraries, or a matrix of the same shape as y.
coef_start (array_like, optional) – matrix of starting values for the linear model coefficients. Number of rows should agree with y and number of columns should agree with design. This argument does not usually need to be set as the automatic starting values perform well.
maxit (int) – the maximum number of iterations for the Fisher scoring algorithm. The iteration will be stopped when this limit is reached even if the convergence criterion has not been satisfied.
tol (float) – the convergence tolerance.
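The shapes that the parameters above are allowed to take can be illustrated with NumPy alone. The sketch below builds a group-means design matrix (the ~ 0 + group default) from group labels and expands a scalar dispersion, a per-library offset and scalar weights to the shapes described above. The variable names, the example counts and the use of log library sizes as offsets are assumptions made for illustration; nothing here calls inmoose.

```python
import numpy as np

# Example counts: 2 genes (rows) x 4 libraries (columns)
y = np.array([[10, 12, 30, 28], [3, 0, 5, 7]])
n_genes, n_libs = y.shape

# Group-means design: one indicator column per group level
group = np.array(["ctrl", "ctrl", "trt", "trt"])
levels, idx = np.unique(group, return_inverse=True)
design = np.eye(len(levels))[idx]  # shape (n_libs, n_groups)

# Offset given as a vector of length n_libs, expanded to the shape of y.
# Log library size is a common choice, assumed here for illustration.
offset = np.broadcast_to(np.log(y.sum(axis=0)), y.shape)

# Scalar dispersion expanded to one value per gene
dispersion = np.broadcast_to(0.1, (n_genes,))

# Unit weights with the same shape as y
weights = np.broadcast_to(1.0, y.shape)

# Starting values, if supplied: rows agree with y, columns with design
coef_start = np.zeros((n_genes, design.shape[1]))
```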

Returns:

tuple with the following components:

matrix of estimated coefficients for the linear models. Rows correspond to rows of y and columns to columns of design.
matrix of fitted values. Same shape as y.

Return type:

tuple
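Because the model uses a log link with the offset in the linear predictor, the two returned matrices are related by fitted = exp(coef · designᵀ + offset). A minimal NumPy check of that relationship, using made-up coefficients, not output from inmoose:

```python
import numpy as np

# Group-means design for 4 libraries in 2 groups
design = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])

# Made-up coefficients: 2 genes (rows) x 2 design columns
coef = np.log(np.array([[11.0, 29.0], [2.0, 6.0]]))

# Zero offsets with the same shape as the count matrix
offset = np.zeros((2, 4))

# Fitted means on the count scale, one per gene and library
fitted = np.exp(coef @ design.T + offset)
```

With a group-means design and zero offsets, exp(coef) holds the fitted group means directly, and every library in a group receives the same fitted value.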