inmoose.limma.classifyTestsF
- inmoose.limma.classifyTestsF(self, cor_matrix=None, df=inf, p_value=0.01, fstat_only=False)
For each gene, classify a series of related t-statistics as significantly up or down using nested F-tests.
This function implements the nestedF multiple testing option offered by decideTests(). Users should generally use decideTests() rather than calling classifyTestsF() directly because, by itself, classifyTestsF() does not incorporate any multiple testing adjustment across genes. Instead, it simply tests across contrasts for each gene individually.
classifyTestsF() uses a nested F-test approach giving particular attention to correctly classifying genes that have two or more significant t-statistics, i.e. which are differentially expressed in two or more conditions. For each row of the t-statistics matrix, the overall F-statistic is constructed from the t-statistics as for FStat. At least one contrast will be classified as significant if and only if the overall F-statistic is significant. If the overall F-statistic is significant, then the function makes a best choice as to which t-statistics contributed to this result. The methodology is based on the principle that any t-statistic should be called significant if the F-test is still significant for that row when all the larger t-statistics are set to the same absolute size as the t-statistic in question.
Compared to conventional multiple testing methods, the nested F-test approach achieves better consistency between related contrasts. (For example, if B is judged to be different from C, then at least one of B or C should be different from A.) The approach was first used by [Michaud2008]. The nested F-test approach provides weak control of the family-wise error rate, i.e. it correctly controls the type I error rate of calling any contrast as significant if all the null hypotheses are true. In other words, it provides error rate control at the overall F-test level but does not provide strict error rate control at the individual contrast level.
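The step-down rule described above can be sketched in plain Python for a single gene. This is a minimal illustration rather than the library's implementation: it assumes uncorrelated t-statistics (an identity cor_matrix), so that the overall F-statistic reduces to the mean of the squared t-statistics, and it takes the critical value of the F-test as an argument instead of computing it. The names sign, overall_f, nested_f_classify and f_crit are hypothetical and not part of inmoose.

```python
import math


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def overall_f(tstats):
    """Overall F-statistic for uncorrelated t-statistics: the mean of t^2."""
    return sum(t * t for t in tstats) / len(tstats)


def nested_f_classify(tstats, f_crit):
    """Classify one gene's t-statistics as -1, 0 or 1 (hypothetical helper).

    f_crit is the critical value of the overall F-test, supplied by the caller.
    """
    x = list(tstats)
    result = [0] * len(x)
    # Nothing is called significant unless the overall F-test rejects.
    if overall_f(x) <= f_crit:
        return result
    # Visit the t-statistics from largest to smallest absolute value.
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    result[order[0]] = sign(x[order[0]])
    for j in range(1, len(order)):
        cur = order[j]
        # Shrink all larger t-statistics to the absolute size of the current one,
        # keeping their signs.
        for i in order[:j]:
            x[i] = math.copysign(abs(x[cur]), x[i])
        if overall_f(x) > f_crit:
            result[cur] = sign(x[cur])
        else:
            break
    return result


# With two contrasts and infinite degrees of freedom, 2F follows a chi-squared
# distribution with 2 degrees of freedom, so the critical value is -log(p).
f_crit = -math.log(0.01)
print(nested_f_classify([4.0, 2.5], f_crit))   # [1, 1]
print(nested_f_classify([4.0, -1.0], f_crit))  # [1, 0]
print(nested_f_classify([2.0, -2.0], f_crit))  # [0, 0]
```

In the second call only the larger t-statistic is called significant: once it is shrunk to absolute size 1.0, the F-test for that gene no longer rejects.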
Usually, self is a limma linear model fitted object, from which a matrix of t-statistics can be extracted, but it can also be a numeric matrix of t-statistics. In either case, rows correspond to genes and columns to coefficients or contrasts. The cor_matrix is the same as the correlation matrix of the coefficients from which the t-statistics were calculated, and df is the degrees of freedom of the t-statistics. All statistics for the same gene must have the same degrees of freedom.
If fstat_only=True, this function just returns the vector of overall F-statistics for each gene.
- Parameters:
self (MArrayLM or ndarray) – matrix of t-statistics, or an MArrayLM object from which the t-statistics may be extracted
cor_matrix (ndarray) – covariance matrix of each row of t-statistics. Will be extracted automatically from the MArrayLM object, but otherwise defaults to the identity matrix.
df (array_like) – array of degrees of freedom for the t-statistics. Should be broadcastable to the shape of the t-statistics matrix. Will be extracted automatically from the MArrayLM object, but otherwise defaults to np.inf.
p_value (float) – value between 0 and 1 giving the desired size of the test
fstat_only (bool) – if True, then return the overall F-statistic as for FStat instead of classifying the test results.
- Returns:
If fstat_only=False, then an object of class TestResults, which is essentially a matrix with elements -1, 0 or 1 depending on whether each t-statistic is classified as significantly negative, not significant or significantly positive respectively. If fstat_only=True, then an array of F-statistics is returned with attributes df1 and df2 giving the corresponding degrees of freedom.
- Return type:
TestResults or ndarray
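For the fstat_only=True case, the following sketch shows one way the overall F-statistic could be computed from a single gene's t-statistics when the contrasts are correlated. It assumes a full-rank correlation matrix, in which case the statistic is the quadratic form t' C^-1 t divided by the number of contrasts. inmoose may compute it differently, for example to handle rank-deficient matrices. The helper names solve and overall_f_stat are hypothetical and not part of the library.

```python
def solve(a, b):
    """Solve the linear system a y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    y = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * y[c] for c in range(i + 1, n))
        y[i] = (m[i][n] - s) / m[i][i]
    return y


def overall_f_stat(tstats, cor):
    """Overall F-statistic t' C^-1 t / r, assuming cor is full rank (hypothetical helper)."""
    y = solve(cor, tstats)
    return sum(t * v for t, v in zip(tstats, y)) / len(tstats)


identity = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[1.0, 0.5], [0.5, 1.0]]

print(overall_f_stat([3.0, 4.0], identity))     # 12.5, the mean of t^2
print(overall_f_stat([2.0, 2.0], correlated))   # about 2.667
print(overall_f_stat([2.0, -2.0], correlated))  # about 8.0
```

Under these assumptions the numerator degrees of freedom equal the number of contrasts and the denominator degrees of freedom are those of the t-statistics, which is presumably what the df1 and df2 attributes report. The last two calls show why the correlation matters: with positively correlated contrasts, two t-statistics of opposite sign are stronger evidence than two of the same sign.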