inmoose.limma.squeezeVar
- inmoose.limma.squeezeVar(var, df, covariate=None, robust=False, winsor_tail_p=(0.05, 0.1))
Squeeze a set of sample variances together by computing empirical Bayes posterior means. This function implements the empirical Bayes algorithms proposed by [Smyth2004] and [Phipson2016].
A conjugate Bayesian hierarchical model is assumed for a set of sample variances. The hyperparameters are estimated by fitting a scaled F-distribution to the sample variances. The function returns the posterior variances and the estimated hyperparameters.
Specifically, the sample variances var are assumed to follow scaled chi-squared distributions, conditional on the true variances, and a scaled inverse chi-squared prior is assumed for the true variances. The scale and degrees of freedom of this prior distribution are estimated from the values of var.
The effect of this function is to squeeze the variances towards a common value, or to a global trend if a covariate is provided. The squeezed variances have a smaller expected mean square error to the true variances than do the sample variances themselves.
If covariate is not None, then the scale parameter of the prior distribution is assumed to depend on the covariate. If the covariate is average log-expression, then the effect is an intensity-dependent trend similar to that in [Sartor2006].
robust=True implements the robust empirical Bayes procedure of [Phipson2016], which allows some of the var values to be outliers.
- Parameters:
var (array_like) – 1-D array of independent sample variances
df (array_like) – 1-D array of degrees of freedom for the sample variances
covariate – if not None, var_prior will depend on this numeric covariate. Otherwise, var_prior is constant.
robust (bool) – whether the estimation of df_prior and var_prior should be robustified against outlier sample variances
winsor_tail_p (float or pair of floats) – left and right tail proportions of x to Winsorize. Only used when robust=True
- Returns:
a dictionary with keys:
"var_post", 1-D array of posterior variances. Same length as var.
"var_prior", location or scale of prior distribution. 1-D array of same length as var if covariate is not None, otherwise a single value.
"df_prior", degrees of freedom of prior distribution. 1-D array of same length as var if robust=True, otherwise a single value.
- Return type:
dict
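The squeezing step can be illustrated with a minimal sketch. Under the conjugate model of [Smyth2004], each posterior variance is a degrees-of-freedom-weighted average of the sample variance and the prior variance. The helper squeeze_var_sketch below is hypothetical and is not the library's implementation: it assumes var_prior and df_prior are already known scalars, whereas squeezeVar estimates them from var, and it ignores the covariate and robust options.

```python
import math


def squeeze_var_sketch(var, df, var_prior, df_prior):
    """Hypothetical helper: shrink sample variances toward var_prior.

    var_prior and df_prior are assumed to be given scalars here;
    the real squeezeVar estimates them from the data.
    """
    var_post = []
    for s2, d in zip(var, df):
        if math.isinf(df_prior):
            # Limit of the weighted average as df_prior grows without bound:
            # the posterior variance equals the prior variance.
            var_post.append(var_prior)
        else:
            # Weighted average of the sample variance (weight d)
            # and the prior variance (weight df_prior).
            var_post.append((d * s2 + df_prior * var_prior) / (d + df_prior))
    # Mirror the keys of the documented return value.
    return {"var_post": var_post, "var_prior": var_prior, "df_prior": df_prior}


result = squeeze_var_sketch(
    var=[0.5, 1.0, 4.0], df=[4, 4, 4], var_prior=1.0, df_prior=4.0
)
print(result["var_post"])  # [0.75, 1.0, 2.5]
```

With equal weights on data and prior (df and df_prior both 4), each sample variance moves halfway toward the prior value of 1.0. A larger df_prior would pull the variances closer to the common value.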