inmoose.edgepy.predFC

inmoose.edgepy.predFC(y, design, prior_count=0.125, offset=None, dispersion=0, weights=None)

Compute estimated coefficients for a negative binomial GLM in such a way that the log-fold-changes are shrunk towards zero.

This function computes predictive log-fold changes (pfc) for a NB GLM. The pfc are posterior Bayesian estimators of the true log-fold-changes. They are predictive of values that might be replicated in a future experiment.

Specifically, the function adds a small prior count to each observation before fitting the GLM (see addPriorCount() for details). The actual prior count that is added is proportional to the library size. This has the effect that any log-fold-change that was zero prior to augmentation remains zero and non-zero log-fold-changes are shrunk towards zero.
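The scaling described above can be illustrated with a minimal NumPy sketch. It does not call inmoose. It assumes the edgeR convention for addPriorCount(), in which the prior count is multiplied by each library size divided by the mean library size, and twice that scaled prior is added to the library size. The helper names add_scaled_prior and log2_fold_change are hypothetical, and a simple two-sample ratio of proportions stands in for the GLM fit.

```python
import numpy as np

def add_scaled_prior(counts, lib_size, prior_count=0.125):
    """Hypothetical helper: add a prior count proportional to library size.

    Assumes the edgeR-style convention: the prior is scaled by
    lib_size / mean(lib_size), and library sizes grow by twice that amount.
    """
    counts = np.asarray(counts, dtype=float)
    lib_size = np.asarray(lib_size, dtype=float)
    scaled_prior = prior_count * lib_size / lib_size.mean()
    aug_counts = counts + scaled_prior        # broadcast over genes (rows)
    aug_lib_size = lib_size + 2.0 * scaled_prior
    return aug_counts, aug_lib_size

def log2_fold_change(counts, lib_size):
    """Log2 ratio of proportions, sample 2 versus sample 1."""
    prop = counts / lib_size
    return np.log2(prop[:, 1] / prop[:, 0])

# Two samples with different sequencing depths.
lib_size = np.array([1_000_000.0, 2_000_000.0])
counts = np.array([
    [10.0, 20.0],   # same proportion in both samples: log-fold-change is zero
    [10.0, 80.0],   # four-fold higher proportion in sample 2
])

raw_lfc = log2_fold_change(counts, lib_size)
aug_counts, aug_lib_size = add_scaled_prior(counts, lib_size, prior_count=2.0)
shrunk_lfc = log2_fold_change(aug_counts, aug_lib_size)

print("raw:   ", raw_lfc)
print("shrunk:", shrunk_lfc)
```

Because the prior is proportional to library size, the first gene keeps a log-fold-change of zero after augmentation, while the second gene's log-fold-change moves towards zero. A prior count that was the same for every sample would instead introduce a spurious non-zero log-fold-change for the first gene.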

The prior counts can be viewed as equivalent to a prior belief that the log-fold-changes are small, and the output can be viewed as posterior log-fold-changes from this Bayesian viewpoint. The output coefficients are called predictive log-fold-changes because, depending on the prior, they may be a better prediction of the true log-fold-changes than the raw estimates.

Log-fold-changes for genes with low counts are shrunk more than those for genes with high counts. In particular, infinite log-fold-changes arising from zero counts are avoided. The exact degree of shrinkage depends on the negative binomial dispersion.
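The count-dependence of the shrinkage can be seen in a small NumPy sketch. It is an illustration only, not the library's implementation: it assumes two samples with equal library sizes (so the prior count is added unscaled and the library sizes cancel in the ratio), and it uses a plain log2 ratio of augmented counts in place of the NB GLM fit, so the effect of the dispersion is not shown.

```python
import numpy as np

prior_count = 0.125  # default value from the function signature

# Three genes, two samples with equal library sizes (assumption).
counts = np.array([
    [0.0, 4.0],       # zero count in sample 1
    [2.0, 8.0],       # low counts, four-fold difference
    [200.0, 800.0],   # high counts, same four-fold difference
])

# Raw log2 fold changes: the first gene divides by zero and is infinite.
with np.errstate(divide="ignore"):
    raw_lfc = np.log2(counts[:, 1] / counts[:, 0])

# Adding the prior count to every observation keeps all ratios finite.
augmented = counts + prior_count
shrunk_lfc = np.log2(augmented[:, 1] / augmented[:, 0])

# Amount by which each finite log-fold-change moved towards zero.
shrinkage = raw_lfc[1:] - shrunk_lfc[1:]

print("raw:      ", raw_lfc)
print("shrunk:   ", shrunk_lfc)
print("shrinkage:", shrinkage)
```

The gene with a zero count gets a finite value, and the low-count gene is shrunk more than the high-count gene even though both have the same raw four-fold difference.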

Parameters:
  • y (array_like) – matrix of counts

  • design (array_like) – the design matrix for the experiment

  • prior_count (float) – the average prior count to be added to each observation. Larger values produce more shrinkage.

  • offset (array_like) – vector or matrix giving the offset in the log-linear model predictor, as in glmFit(). Usually equal to log library size.

  • dispersion (array_like) – vector of negative binomial dispersions

  • weights (array_like, optional) – observation weights

Returns:

matrix of (shrunk) linear model coefficients on the log2 scale

Return type:

ndarray

References

B. Phipson. 2013. Empirical Bayes modelling of expression profiles and their associations. PhD thesis. University of Melbourne, Australia. http://repository.unimelb.edu.au/10187/17614