inmoose.edgepy.topTags
- inmoose.edgepy.topTags(self, n=10, adjust_method='fdr_bh', sort_by='PValue', p_value=1)
Extract the most differentially expressed genes (or sequence tags) from a test object, ranked either by p-value or by absolute log-fold-change.
This function accepts a test statistic object created by any of the functions exactTest(), glmLRT(), glmTreat() or glmQLFTest() and extracts a readable dataframe of the most differentially expressed genes. The dataframe collates the annotation and differential expression statistics for the top genes. The dataframe is wrapped in a TopTags object that records the test statistic used and the multiple testing adjustment method.
topTags() permits ranking by fold-change but the authors do not recommend fold-change ranking or fold-change cutoffs for routine RNA-Seq analysis. The p-value ranking is intended to be more biologically meaningful, especially if the p-values were computed using glmTreat().
- Parameters:
self (DGEExact or DGELRT) – object containing test statistics and p-values
n (int) – maximum number of genes/tags to return
adjust_method (str) – the method used to adjust p-values for multiple testing. See statsmodels.stats.multitest() for possible values. Also accepts the values accepted by p.adjust from the stats package.
sort_by ({"PValue", "logFC", "none"}) – the sort method:
"PValue" to sort by p-value
"logFC" to sort by absolute log-fold-change
"none" for no sorting
p_value (float) – cutoff value for adjusted p-values. Only tags with an adjusted p-value less than or equal to this cutoff are returned.
- Returns:
a dataframe containing differential expression results for the top genes in sorted order. The number of rows is the smaller of n and the number of genes with adjusted p-value less than or equal to p_value. The dataframe includes all the annotation columns from self.genes and all statistic columns from self plus one of:
"FDR", false discovery rate (only when adjust_method is "fdr_bh" or "fdr_by")
"FWER", family-wise error rate (only when adjust_method is "holm", "simes-hochberg", "hommel" or "bonferroni")
For consistency with other modules, the dataframe also contains an "adj_pvalue" column with the same content as the "FDR" or "FWER" column.
The object also contains the following components:
adjust_method, string specifying the method used to adjust p-values for multiple testing, same as input argument
comparison, the names of the two groups being compared (for DGEExact objects) or the glm contrast being tested (for DGELRT objects)
test, string stating the name of the test
- Return type:
TopTags
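The selection rules described above (adjust the p-values, sort, apply the p_value cutoff, keep at most n rows) can be illustrated with a small pure-Python sketch. This is not the inmoose implementation: the helpers bh_adjust and top_tags_sketch and the toy rows data are hypothetical names introduced only for illustration. The sketch covers the "fdr_bh" adjustment only and works on plain dictionaries rather than on a DGEExact or DGELRT object.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (what "fdr_bh" refers to).

    Hypothetical helper for illustration; not part of inmoose.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the
    # adjusted values stay monotone in the raw p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def top_tags_sketch(rows, n=10, sort_by="PValue", p_value=1):
    """Mimic the documented row selection on a list of dicts.

    Each row is assumed to have the keys "gene", "logFC" and "PValue".
    """
    fdr = bh_adjust([r["PValue"] for r in rows])
    # Add both the "FDR" column and its "adj_pvalue" duplicate.
    table = [dict(r, FDR=q, adj_pvalue=q) for r, q in zip(rows, fdr)]

    if sort_by == "PValue":
        table.sort(key=lambda r: r["PValue"])
    elif sort_by == "logFC":
        table.sort(key=lambda r: abs(r["logFC"]), reverse=True)
    elif sort_by != "none":
        raise ValueError(f"unknown sort_by: {sort_by!r}")

    # Keep rows at or below the cutoff, then at most n of them, so the
    # row count is min(n, number of rows passing the cutoff).
    table = [r for r in table if r["adj_pvalue"] <= p_value]
    return table[:n]


# Toy data, made up for this example.
rows = [
    {"gene": "g1", "logFC": 2.0, "PValue": 0.001},
    {"gene": "g2", "logFC": -3.0, "PValue": 0.04},
    {"gene": "g3", "logFC": 0.5, "PValue": 0.03},
    {"gene": "g4", "logFC": 0.1, "PValue": 0.5},
]

by_pvalue = top_tags_sketch(rows, n=2, p_value=0.1)
print([r["gene"] for r in by_pvalue])  # ['g1', 'g3']

by_logfc = top_tags_sketch(rows, sort_by="logFC", p_value=0.1)
print([r["gene"] for r in by_logfc])  # ['g2', 'g1', 'g3']
```

In the second call, g4 is dropped by the cutoff even though n would allow it, and the remaining rows are ordered by absolute log-fold-change, so the negative fold-change of g2 ranks first.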