inmoose.deseq2.estimateSizeFactorsForMatrix

inmoose.deseq2.estimateSizeFactorsForMatrix(counts, type_='ratio', locfunc=<function median>, geoMeans=None, controlGenes=None)

Low-level function to estimate size factors with robust regression

Given a matrix or data frame of count data, this function estimates the size factors as follows: the geometric mean of each gene (column) is computed across samples, and each row is divided element-wise by these geometric means. The median (or, if requested, another location estimator) of the resulting ratios, skipping genes with a geometric mean of zero, is used as the size factor for that row. Typically, one does not call this function directly but uses DESeqDataSet.estimateSizeFactors() instead.
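The procedure described above can be sketched in plain Python. This is only an illustration of the median-of-ratios idea, not the inmoose implementation: the helper name median_ratio_sketch and the list-of-lists input are assumptions made for the example, and the library may differ in details (for instance by working on the log scale).

```python
import math
from statistics import median


def median_ratio_sketch(counts, locfunc=median):
    """Hypothetical helper: counts is a list of rows (samples) of per-gene counts."""
    n_samples = len(counts)
    n_genes = len(counts[0])

    # Geometric mean of each gene (column); a single zero count makes it zero.
    geo_means = []
    for g in range(n_genes):
        column = [row[g] for row in counts]
        if min(column) > 0:
            geo_means.append(math.exp(sum(math.log(c) for c in column) / n_samples))
        else:
            geo_means.append(0.0)

    # Size factor of a sample: location (median by default) of its ratios,
    # skipping genes whose geometric mean is zero.
    return [
        locfunc([c / gm for c, gm in zip(row, geo_means) if gm > 0])
        for row in counts
    ]


# Two samples, three genes; the second gene is skipped because of its zero count.
example = [[1, 0, 4], [2, 3, 8]]
size_factors = median_ratio_sketch(example)
```

In this toy input the second sample has twice the counts of the first on the usable genes, so its size factor comes out twice as large.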

Parameters:
  • counts (array-like) – matrix of raw counts. One column per gene, one row per sample.

  • type_ ("ratio" or "poscounts") – the algorithm used to estimate the size factors: the standard median ratio ("ratio"), or a variant where the geometric mean is only calculated over positive counts ("poscounts").

  • locfunc (callable) – a function that computes a location estimate from the ratios of a sample. By default, the median is used.

  • geoMeans (ndarray, optional) – by default, the geometric means of the counts are calculated within the function. A vector of geometric means from another count matrix can be provided for a “frozen” size factor calculation.

  • controlGenes (array-like, optional) – index vector specifying the genes to use for size factor estimation (e.g. housekeeping or spike-in genes).
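To illustrate how geoMeans and controlGenes change the calculation, here is a minimal plain-Python sketch. The helper frozen_size_factors_sketch and its argument names are hypothetical and not part of inmoose. It assumes that supplied geometric means simply replace the ones computed from the counts and that control genes restrict which ratios are used; the library may apply further steps (such as rescaling the result) that are not shown here.

```python
from statistics import median


def frozen_size_factors_sketch(counts, geo_means, control_genes=None, locfunc=median):
    """Hypothetical helper: size factors from externally supplied geometric means."""
    # Use every gene unless an index list of control genes is given.
    genes = range(len(geo_means)) if control_genes is None else control_genes
    return [
        locfunc([row[g] / geo_means[g] for g in genes if geo_means[g] > 0])
        for row in counts
    ]


# Geometric means taken from some reference count matrix (made-up numbers).
reference_geo_means = [10.0, 20.0, 40.0, 5.0]
new_samples = [[20, 40, 120, 50], [5, 10, 20, 5]]

all_genes = frozen_size_factors_sketch(new_samples, reference_geo_means)
controls_only = frozen_size_factors_sketch(
    new_samples, reference_geo_means, control_genes=[0, 1, 2]
)
```

Restricting the estimate to the first three genes removes the influence of the fourth gene, whose ratio is an outlier in the first sample.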

Returns:

the estimated size factors, one element per row of counts

Return type:

ndarray