inmoose.edgepy.mglmOneWay

inmoose.edgepy.mglmOneWay(y, design=None, group=None, dispersion=0, offset=0, weights=None, coef_start=None, maxit=50, tol=1e-10)

Fit multiple negative binomial GLMs with a log link by Fisher scoring, for models with a single explanatory factor.

This is a low-level workhorse used by higher-level functions, especially glmFit(). Most users will not need to call this function directly.

This function fits a negative binomial GLM to each row of y. The row-wise GLMs all have the same design matrix but possibly different dispersions, offsets and weights. It is low-level in that it operates on atomic objects (matrices and vectors).

Specifically, it fits a oneway layout to each response vector: the libraries are treated as belonging to a number of groups, and mglmOneGroup() is called for each group. The dispersion parameter of the negative binomial distribution is treated as a known input, not estimated.
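The per-group fitting described above can be sketched for a single gene in plain Python. This is a minimal illustration, not the inmoose implementation: the helper names fit_one_group and fit_one_way are hypothetical, the starting value and the handling of all-zero groups are assumptions of this sketch, and weights are ignored. It applies Fisher scoring to the log-mean of each group, using the negative binomial score (y - mu) / (1 + dispersion * mu) and expected information mu / (1 + dispersion * mu).

```python
import math


def fit_one_group(counts, dispersion=0.0, offsets=None, maxit=50, tol=1e-10):
    """Fisher scoring for the log-mean of one group (hypothetical helper)."""
    if offsets is None:
        offsets = [0.0] * len(counts)
    if sum(counts) == 0:
        # Assumption of this sketch: an all-zero group has log-mean -infinity.
        return -math.inf
    # Start from the Poisson (dispersion = 0) solution, which has a closed form.
    beta = math.log(sum(counts) / sum(math.exp(o) for o in offsets))
    for _ in range(maxit):
        mu = [math.exp(beta + o) for o in offsets]
        # Score and expected information of the NB log-likelihood w.r.t. beta.
        score = sum((y - m) / (1.0 + dispersion * m) for y, m in zip(counts, mu))
        info = sum(m / (1.0 + dispersion * m) for m in mu)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta


def fit_one_way(counts, group, dispersion=0.0, offsets=None):
    """Fit one gene: one log-mean coefficient per group level."""
    if offsets is None:
        offsets = [0.0] * len(counts)
    coefs = {}
    for level in sorted(set(group)):
        idx = [j for j, g in enumerate(group) if g == level]
        coefs[level] = fit_one_group(
            [counts[j] for j in idx], dispersion, [offsets[j] for j in idx]
        )
    return coefs


# Five libraries in two groups, with a known dispersion of 0.1.
counts = [10, 14, 12, 40, 44]
group = ["a", "a", "a", "b", "b"]
coefs = fit_one_way(counts, group, dispersion=0.1)
```

When all offsets within a group are equal, the score is zero at the group's sample mean, so each coefficient in this example is the log of the group mean count and the loop stops after the first step. With unequal offsets the iteration takes several steps.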

Parameters:
  • y (array_like) – matrix of negative binomial counts. Rows for genes and columns for libraries.

  • design (array_like, optional) – design matrix of the GLM. Assumed to be full column rank. Defaults to ~ 0 + group if group is specified, otherwise to ~ 1.

  • group (Factor, optional) – group memberships for oneway layout. If both design and group are specified, then they must agree in terms of designAsFactor(). If design = None, then a group-means design matrix is implied.

  • dispersion (float or array_like) – scalar or vector giving the dispersion parameter for each GLM. Can be a scalar giving one value for all genes, or a vector of length equal to the number of genes giving genewise dispersions.

  • offset (array_like) – scalar, vector or matrix giving the offset to be included in the log-linear model predictor. Can be a scalar, a vector of length equal to the number of libraries, or a matrix of the same shape as y.

  • weights (array_like, optional) – vector or matrix of non-negative quantitative weights. Can be a vector of length equal to the number of libraries, or a matrix of the same shape as y.

  • coef_start (array_like, optional) – matrix of starting values for the linear model coefficients. Number of rows should agree with y and number of columns should agree with design. This argument does not usually need to be set as the automatic starting values perform well.

  • maxit (int) – the maximum number of iterations for the Fisher scoring algorithm. The iteration will be stopped when this limit is reached even if the convergence criterion has not been satisfied.

  • tol (float) – the convergence tolerance.
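The group-means design matrix implied by group when design is None (the ~ 0 + group default) has one indicator column per group level. The sketch below builds such a matrix from plain labels. The helper name group_means_design is hypothetical, and ordering the columns by sorted level is an assumption of this sketch. The library may order levels differently.

```python
def group_means_design(group):
    """Build a group-means indicator matrix (hypothetical helper).

    Returns the levels and a matrix with one row per library and one
    column per level, where entry [j][k] is 1.0 if library j belongs
    to level k and 0.0 otherwise.
    """
    levels = sorted(set(group))  # assumption: columns follow sorted level order
    design = [[1.0 if g == level else 0.0 for level in levels] for g in group]
    return levels, design


levels, design = group_means_design(["a", "a", "a", "b", "b"])
```

Each row contains exactly one 1.0, so each coefficient in a fit against this matrix is the log-mean of one group.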

Returns:

tuple with the following components:

  • matrix of estimated coefficients for the linear models. Rows correspond to rows of y and columns to columns of design.

  • matrix of fitted values. Same shape as y.

Return type:

tuple
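For a log-link GLM the two returned components are related by fitted = exp(design · coefficients + offset). The sketch below recomputes fitted values from a coefficient matrix in plain Python, and shows how a scalar, per-library vector or full-matrix offset can be expanded to the shape of y. The helper names expand_offset and fitted_values are hypothetical, and the sketch assumes coefficients on the natural-log scale.

```python
import math


def expand_offset(offset, n_genes, n_libs):
    """Expand a scalar, per-library vector or matrix to n_genes x n_libs."""
    if isinstance(offset, (int, float)):
        return [[float(offset)] * n_libs for _ in range(n_genes)]
    if isinstance(offset[0], (int, float)):
        return [[float(o) for o in offset] for _ in range(n_genes)]
    return [[float(o) for o in row] for row in offset]


def fitted_values(coef, design, offset=0.0):
    """Compute exp(design . coef + offset) for every gene and library."""
    n_genes, n_libs = len(coef), len(design)
    off = expand_offset(offset, n_genes, n_libs)
    return [
        [
            math.exp(sum(c * d for c, d in zip(coef[i], design[j])) + off[i][j])
            for j in range(n_libs)
        ]
        for i in range(n_genes)
    ]


# One gene, three libraries, two groups.
design = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
coef = [[math.log(12.0), math.log(42.0)]]
mu = fitted_values(coef, design)
mu_offset = fitted_values(coef, design, offset=[0.0, math.log(2.0), 0.0])
```

Here mu is approximately [[12, 12, 42]], and a per-library offset of log(2) doubles the fitted value for the second library.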