inmoose.limma.lmFit

inmoose.limma.lmFit(obj, design=None, ndups=None, spacing=None, block=None, correlation=None, weights=None, method='ls')

Fit linear models for each gene given a series of arrays

This function fits multiple linear models by weighted or generalized least squares. It accepts data from an experiment involving a series of microarrays with the same set of probes. A linear model is fitted to the expression data of each probe. The expression data should be log-ratios for two-color array platforms or log-expression values for one-channel platforms. To fit linear models to the individual channels of two-color array data, see lmscFit(). The coefficients of the fitted models describe the differences between the RNA sources hybridized to the arrays. The probe-wise fitted model results are stored in a compact form suitable for further processing by other functions in the limma package.
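As a rough illustration of what "a linear model is fitted to the expression data of each probe" means, the following NumPy sketch fits ordinary least squares to every row of a small expression matrix in one call. It does not call lmFit() and is not inmoose's implementation. The data, the two-group design and the helper name fit_probewise_ols are made up for the example.

```python
import numpy as np


def fit_probewise_ols(expr, design):
    """Fit expr[g, :] ~ design for every probe g by ordinary least squares.

    expr   : (n_probes, n_arrays) matrix of log-expression values
    design : (n_arrays, n_coefs) design matrix
    Returns an (n_probes, n_coefs) matrix of coefficients.
    """
    # lstsq solves design @ B = expr.T for all probes at once;
    # column g of B holds the coefficients of probe g.
    coefs, _, _, _ = np.linalg.lstsq(design, expr.T, rcond=None)
    return coefs.T


# Hypothetical data: 3 probes measured on 4 arrays (2 control, 2 treated).
expr = np.array([
    [1.0, 1.0, 3.0, 3.0],
    [5.0, 5.0, 5.0, 5.0],
    [2.0, 4.0, 0.0, 2.0],
])
# Columns: intercept (control mean) and treated-minus-control difference.
design = np.array([
    [1.0, 0.0],
    [1.0, 0.0],
    [1.0, 1.0],
    [1.0, 1.0],
])
coefficients = fit_probewise_ols(expr, design)
```

With this design, the second column of coefficients is the per-probe difference between the two RNA sources, which is the kind of quantity the fitted coefficients describe.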

This function allows for missing values and accepts quantitative precision weights through the weights argument. It also supports two different correlation structures. If block is not None, then different arrays are assumed to be correlated. If block is None and ndups is greater than one, then replicate spots on the same array are assumed to be correlated. It is not possible at this time to fit models with a block structure and a duplicate-spot correlation structure simultaneously.
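To make the role of missing values and precision weights more concrete, here is a minimal NumPy sketch of weighted least squares for a single probe. It illustrates the general technique and is not the code path used by lmFit(). The helper name fit_one_probe_wls and the data are hypothetical, and dropping arrays with non-positive weights is an assumption of this sketch.

```python
import numpy as np


def fit_one_probe_wls(y, design, weights=None):
    """Weighted least squares for one probe, skipping missing values."""
    y = np.asarray(y, dtype=float)
    design = np.asarray(design, dtype=float)
    w = np.ones_like(y) if weights is None else np.asarray(weights, dtype=float)
    # Keep only arrays with an observed value and a usable weight
    # (assumption: non-positive or NaN weights mean "ignore this array").
    keep = ~np.isnan(y) & ~np.isnan(w) & (w > 0)
    sqrt_w = np.sqrt(w[keep])
    # Scaling each row by sqrt(weight) turns WLS into an ordinary
    # least-squares problem.
    coefs, _, _, _ = np.linalg.lstsq(
        design[keep] * sqrt_w[:, None], y[keep] * sqrt_w, rcond=None
    )
    return coefs


design = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 3.0, np.nan, 6.0])   # third array is missing
w = np.array([3.0, 1.0, 1.0, 2.0])      # higher weight = more precise
coefs = fit_one_probe_wls(y, design, w)
```

Here the control mean is pulled towards the more heavily weighted first array, and the treated group is estimated from its single observed value.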

If obj is a matrix then it should contain log-ratios or log-expression data with rows corresponding to probes and columns to arrays. A vector is treated the same as a matrix with a single column. For objects of other classes, a matrix of expression values is taken from the appropriate component or slot of the object. If obj is of class MAList or marrayNorm, then the matrix of log-ratios (M-values) is extracted.

The arguments design, ndups, spacing and weights will be extracted from the data obj if available. On the other hand, if any of these are set to a non-None value in the function call then this value will override the value found in obj.

If the argument block is used, then it is assumed that ndups=1.

The correlation argument has a default value of 0.75, but in normal use this default value should not be relied on and the correlation value should be estimated using the function duplicateCorrelation. In particular, the default value is likely to be too high when used with the block argument.

The actual linear model computations are done by passing the data to one of the lower-level functions lm_series(), gls_series() or mrlm(). The function mrlm() is used if method="robust". If method="ls", then gls_series() is used if a correlation structure has been specified, i.e. if ndups > 1 or block is not None, and correlation is different from zero. If method="ls" and there is no correlation structure, lm_series() is used.
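The selection rule described above can be sketched as a small pure-Python function. This is only a reading of the paragraph, not inmoose's source code. The helper name choose_fitting_routine is hypothetical, the condition is interpreted as "(ndups > 1 or block is given) and correlation is non-zero", and the case where correlation is None is deliberately left out.

```python
def choose_fitting_routine(method, ndups, block, correlation):
    """Return the name of the lower-level routine the text describes.

    Assumes `correlation` is a number; behaviour for None is not modelled.
    """
    if method == "robust":
        return "mrlm"
    if method != "ls":
        raise ValueError(f"unknown method: {method!r}")
    # A correlation structure exists when there are duplicate spots
    # or a blocking variable on the arrays.
    has_structure = ndups > 1 or block is not None
    if has_structure and correlation != 0:
        return "gls_series"
    return "lm_series"


# Hypothetical calls covering the three branches.
plain = choose_fitting_routine("ls", ndups=1, block=None, correlation=0.3)
blocked = choose_fitting_routine("ls", ndups=1, block=[1, 1, 2, 2], correlation=0.3)
robust = choose_fitting_routine("robust", ndups=2, block=None, correlation=0.3)
```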

An overview of linear model functions in limma is given by Linear Models for Microarrays.

See also

getEAWP

extract expression values, gene annotation and so on from the data obj.

Parameters:
  • obj (matrix-like) – a matrix-like data object containing log-ratios or log-expression values for a series of arrays, with rows corresponding to genes and columns to samples. Any type of data object that can be processed by getEAWP() is acceptable.

  • design (patsy formula-like) – the design matrix of the microarray experiment, with rows corresponding to samples and columns to coefficients to be estimated. Defaults to obj.design if not None, otherwise to the unit vector, meaning that all samples will be treated as replicates of a single treatment group.

  • ndups (int) – positive integer giving the number of times each distinct probe is printed on each array

  • spacing (int) – positive integer giving the spacing between duplicate occurrences of the same probe, spacing=1 for consecutive rows.

  • block (array-like) – vector or factor specifying a blocking variable on the arrays. Has length equal to the number of arrays. Must be None if ndups > 2.

  • correlation – the inter-duplicate or inter-technical replicate correlation

  • weights – non-negative precision weights. Can be a numeric matrix of individual weights of the same size as the expression matrix, or a numeric vector of gene weights with length equal to the number of rows of the expression matrix.

  • method ({ "ls", "robust" }) – fitting method: "ls" for least squares or "robust" for robust regression
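For the ndups and spacing parameters above, the following pure-Python sketch shows one way to read "spacing between duplicate occurrences of the same probe". It assumes that the rows of one array are organised in consecutive blocks of ndups * spacing rows, with duplicates of a probe sitting spacing rows apart inside a block. That layout is an assumption of this sketch and is not stated on this page. The helper name duplicate_rows is hypothetical.

```python
def duplicate_rows(n_rows, ndups, spacing):
    """Group 0-based row indices that hold the same probe on one array."""
    block_size = ndups * spacing
    if n_rows % block_size != 0:
        raise ValueError("n_rows must be a multiple of ndups * spacing")
    groups = []
    for start in range(0, n_rows, block_size):
        for offset in range(spacing):
            first = start + offset
            # Duplicates of one probe are `spacing` rows apart.
            groups.append([first + d * spacing for d in range(ndups)])
    return groups


consecutive = duplicate_rows(6, ndups=2, spacing=1)
interleaved = duplicate_rows(8, ndups=2, spacing=2)
```

With spacing=1 the duplicates occupy consecutive rows, matching the description of the spacing parameter. With spacing=2 each probe's second copy sits two rows below the first.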

Returns:

object containing the result of the fits. The row names of obj are preserved in the fit object and can be retrieved by fit.index where fit is the output of lmFit(). The column names of design are preserved as column names and can be retrieved by fit.columns.

Return type:

MArrayLM