inmoose.edgepy.aveLogCPM
- inmoose.edgepy.aveLogCPM(y, lib_size=None, offset=None, prior_count=2, dispersion=None, weights=None)
Compute average log2 counts per million for each row of counts.
This function uses mglmOneGroup() to compute average counts per million (AveCPM) for each row of counts, and returns log2(AveCPM). An average value of prior_count is added to the counts before running mglmOneGroup(). If prior_count is a vector, each entry will be added to all counts in the corresponding row of y, as described in addPriorCount().
This function is similar to log2(rowMeans(cpm(y, ...))), but with the refinement that larger library sizes are given more weight in the average. The two versions will agree for large values of the dispersion.
See also
cpm: for individual logCPM values, rather than genewise averages
addPriorCount: uses the same strategy to add the prior counts
mglmOneGroup: computations for this function rely on mglmOneGroup()
- Parameters:
y (matrix) – matrix of counts. Rows for genes and columns for libraries.
lib_size (array_like, optional) – vector of library sizes. Defaults to np.sum(y, axis=0). Ignored if offset is not None.
offset (matrix, optional) – matrix of offsets for the log-linear models. Defaults to None.
prior_count (float or array_like, optional) – scalar or vector of length y.shape[0], containing the average value(s) to be added to each count to avoid infinite values on the log-scale. Defaults to 2.
dispersion (float or array_like, optional) – scalar or vector of negative binomial dispersions.
weights (matrix, optional) – matrix of observation weights
- Returns:
numeric vector giving log2(AveCPM) for each row of y
- Return type:
ndarray
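The remark that larger library sizes are given more weight can be illustrated with a small NumPy sketch. It does not call inmoose and is not the library's implementation: the two helper names below are hypothetical, the prior count is left out, and the weighted version assumes the zero-dispersion (Poisson) limit, where an intercept-only log-linear fit with log library sizes as offsets reduces to the total row count divided by the total library size.

```python
import numpy as np


def naive_ave_log_cpm(y, lib_size=None):
    """Hypothetical helper: log2 of the unweighted row mean of per-library CPM."""
    y = np.asarray(y, dtype=float)
    lib_size = y.sum(axis=0) if lib_size is None else np.asarray(lib_size, dtype=float)
    cpm = y / lib_size * 1e6  # broadcasts library sizes across columns
    return np.log2(cpm.mean(axis=1))


def pooled_ave_log_cpm(y, lib_size=None):
    """Hypothetical helper: log2 of pooled CPM (row total / total library size).

    Assumption: this is the zero-dispersion limit only. It is equivalent to
    averaging per-library CPM values with weights proportional to library size.
    """
    y = np.asarray(y, dtype=float)
    lib_size = y.sum(axis=0) if lib_size is None else np.asarray(lib_size, dtype=float)
    return np.log2(y.sum(axis=1) / lib_size.sum() * 1e6)


# Two genes (rows), two libraries (columns); the second library is 10x deeper.
counts = np.array([[10, 300],
                   [20, 200]])
lib_size = np.array([1e6, 1e7])

naive = naive_ave_log_cpm(counts, lib_size)
pooled = pooled_ave_log_cpm(counts, lib_size)
print(naive)
print(pooled)
```

For the first gene the per-library CPM values are 10 and 30, so the unweighted mean is 20 while the pooled value is 310/11, pulled toward the deeper library. For the second gene both libraries give a CPM of 20, and the two versions coincide.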