inmoose.deseq2.DESeqDataSet.DESeqDataSet.results
- DESeqDataSet.results(contrast=None, name=None, lfcThreshold=0, altHypothesis='greaterAbs', listValues=(1, -1), cooksCutoff=None, independentFiltering=True, alpha=0.1, filter=None, theta=None, pAdjustMethod='fdr_bh', filterRun=None, saveCols=None, test=None, addMLE=False, tidy=False, parallel=False, minmu=0.5, confint=False)
Extract results from a DESeq() analysis.
This function extracts a result table from a DESeq analysis giving base means across samples, log2 fold changes, standard errors, test statistics, p-values and adjusted p-values.
The results table, when printed, provides information about the comparison, e.g. “log2 fold change (MAP): condition treated vs. untreated”, meaning that the estimates are of log2(treated / untreated), as would be returned using contrast=["condition", "treated", "untreated"]. Multiple results can be returned for analyses beyond a simple two-group comparison, so this function takes the arguments contrast and name to help the user pick out the comparisons of interest for printing a results table. The use of the contrast argument is recommended for exact specification of the levels which should be compared and their order.
If results() is run without specifying contrast or name, it returns the comparison of the last level of the last variable in the design formula over the first level of this variable. For example, for a simple two-group comparison, this returns the log2 fold changes of the second group over the first group (the reference level).
The argument contrast can be used to generate results tables for any comparison of interest, for example the log2 fold change between two levels of a factor; its usage is described below. It can also accommodate more complicated numeric comparisons. Note that contrast will set to 0 the estimated LFC in a comparison of two groups where all the counts in the two groups are equal to 0 (while other groups have positive counts), while name will not automatically set these LFC to 0. The test statistic used for a contrast is \(c^t \beta / \sqrt{c^t \Sigma c}\), where \(c\) is the numeric contrast vector, \(\beta\) the vector of estimated coefficients and \(\Sigma\) their covariance matrix.
The argument name can be used to generate results tables for individual effects, which must be individual elements of obj.resultsNames(). These individual effects could represent continuous covariates, effects for individual levels, or individual interaction effects.
Information on the comparison which was used to build the results table, and on the statistical test which was used for p-values (Wald test or likelihood ratio test), is stored within the object returned by results(). This information is stored in the columns of the results table (see DESeqResults).
On p-values
By default, independent filtering is performed to select a set of genes for multiple testing correction which maximizes the number of adjusted p-values less than a given critical value alpha (by default 0.1). See the reference in this page for details on independent filtering. The filter used for maximizing the number of rejections is the mean of normalized counts for all samples in the dataset. Several arguments from filtered_p() of the genefilter package (used within results()) are provided here to control the independent filtering behavior. (Note that the code of filtered_p() is copied into this package to avoid extra dependencies.)
The threshold that is chosen is the lowest quantile of the filter for which the number of rejections is close to the peak of a curve fit to the number of rejections over the filter quantiles. “Close to” is defined as within 1 residual standard deviation. The adjusted p-values for the genes which do not pass the filter threshold are set to nan.
By default, results() assigns a p-value of nan to genes containing count outliers, as identified using Cook’s distance. See the cooksCutoff argument for control of this behavior. Cook’s distances for each sample are accessible as obj.layers["cooks"]. This measure is useful for identifying rows where the observed counts might not fit a Negative Binomial distribution.
For analyses using the likelihood ratio test (using nbinomLRT()), the p-values are determined solely by the difference in deviance between the full and reduced model formula. A single log2 fold change is printed in the results table for consistency with other results table outputs; however, the test statistic and p-values may nevertheless involve the testing of one or more log2 fold changes. Which log2 fold change is printed in the results table can be controlled using the name argument; by default this will be the estimated coefficient for the last element of obj.resultsNames().
If useT=True was specified when running DESeq() or nbinomWaldTest(), then the p-values generated by results() will also make use of the t-distribution for the Wald statistic, using the degrees of freedom in obj.var["tDegreesFreedom"].
- Parameters:
obj (DESeqDataSet) – a DESeqDataSet on which one of the following functions has already been called: DESeq(), nbinomWaldTest() or nbinomLRT().
contrast – this argument specifies what comparison to extract from obj to build a results table. One of either:
- a list of exactly three strings (simplest case):
  - the name of a factor in the design formula
  - the name of the numerator level for the fold change
  - the name of the denominator level for the fold change
- a list of two string lists (more general case):
  - the names of the numerators for LFC
  - the names of the denominators for LFC
  These names should be elements of obj.resultsNames(). If the list has a single element (the numerators) then the denominator list is considered empty.
- a numeric list with one element for each element in obj.resultsNames() (most general case)
If specified, the name argument is ignored.
name (str) – the name of the individual effect (coefficient) for building a results table. Use this argument rather than contrast for continuous variables, individual effects or for individual interaction terms. The value provided to name must be an element of obj.resultsNames().
lfcThreshold (float) – a non-negative value which specifies a log2 fold change threshold. The default value is 0, corresponding to a test that the log2 fold changes are equal to zero. The user can specify the alternative hypothesis using the altHypothesis argument, which defaults to testing for log2 fold changes greater in absolute value than a given threshold. If lfcThreshold is specified, the results are for Wald tests, and LRT p-values will be overwritten.
altHypothesis (str) – specifies the alternative hypothesis, i.e. those values of log2 fold change which the user is interested in finding. The complement of this set of values is the null hypothesis which will be tested. If the log2 fold change specified by name or by contrast is written as \(\beta\), then the possible values for altHypothesis represent the following alternative hypotheses:
- "greaterAbs": \(|\beta| > \text{lfcThreshold}\), and p-values are two-tailed
- "lessAbs": \(|\beta| < \text{lfcThreshold}\), and p-values are the maximum of the upper and lower tests. The Wald statistic given is positive, an SE-scaled distance from the closest boundary
- "greater": \(\beta > \text{lfcThreshold}\)
- "less": \(\beta < -\text{lfcThreshold}\)
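To make the four alternative hypotheses concrete, here is a minimal sketch that computes a Wald p-value from a log2 fold change estimate and its standard error. It is an illustration only, not the package's implementation: the function wald_pvalue and its argument names are hypothetical, a standard Normal reference distribution is assumed (the useT option is ignored), and "less" is assumed to test against the lower boundary -lfcThreshold, following the convention of upstream DESeq2.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard Normal, mean 0 and sd 1


def _upper_tail(z):
    """P(Z > z) for a standard Normal variable Z."""
    return 1.0 - _STD_NORMAL.cdf(z)


def wald_pvalue(lfc, se, lfc_threshold=0.0, alt_hypothesis="greaterAbs"):
    """Hypothetical helper: p-value for one log2 fold change estimate."""
    t = lfc_threshold
    if alt_hypothesis == "greaterAbs":
        # two-tailed test of |beta| > t
        return min(1.0, 2.0 * _upper_tail((abs(lfc) - t) / se))
    if alt_hypothesis == "lessAbs":
        # maximum of the two one-sided tests beta < t and beta > -t
        p_below_upper = _STD_NORMAL.cdf((lfc - t) / se)
        p_above_lower = _upper_tail((lfc + t) / se)
        return max(p_below_upper, p_above_lower)
    if alt_hypothesis == "greater":
        return _upper_tail((lfc - t) / se)
    if alt_hypothesis == "less":
        # assumption: boundary is -t
        return _STD_NORMAL.cdf((lfc + t) / se)
    raise ValueError(f"unknown alt_hypothesis: {alt_hypothesis!r}")


# Made-up numbers: an estimate of 1.96 with unit standard error.
p_default = wald_pvalue(1.96, 1.0)
p_greater = wald_pvalue(2.0, 1.0, lfc_threshold=1.0, alt_hypothesis="greater")
p_less_abs = wald_pvalue(0.0, 0.5, lfc_threshold=1.0, alt_hypothesis="lessAbs")
```

With a threshold of 0 and "greaterAbs", this reduces to the usual two-tailed test of the Wald statistic lfc / se.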
listValues ((float, float)) – only used if a numerators-denominators list is provided to contrast. The log2 fold changes in the list are multiplied by these values. The first number should be positive and the second one negative. Defaults to (1, -1).
cooksCutoff (float) – threshold on Cook’s distance, such that if one or more samples for a gene have a higher distance, the corresponding p-value is set to nan. The default cutoff is the .99 quantile of the \(F(p, m-p)\) distribution, where \(p\) is the number of coefficients being fitted and \(m\) is the number of samples. Set to inf or False to disable the resetting of p-values to nan. Note: this test excludes the Cook’s distance of samples belonging to experimental groups with only 2 samples.
independentFiltering (bool) – whether independent filtering should be applied automatically
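The numerators-denominators form of contrast and the listValues weights can be pictured as building a numeric contrast vector c, which then enters the contrast statistic c^t β / sqrt(c^t Σ c). The sketch below illustrates that idea under stated assumptions; it is not the package's code. The helper names, the coefficient names and all numbers are made up for the example.

```python
import math


def contrast_vector(results_names, numerators, denominators=(), list_values=(1, -1)):
    """Hypothetical helper: turn name lists into a numeric contrast."""
    c = [0.0] * len(results_names)
    for name in numerators:
        c[results_names.index(name)] = float(list_values[0])
    for name in denominators:
        c[results_names.index(name)] = float(list_values[1])
    return c


def contrast_wald_stat(c, beta, sigma):
    """c^t beta / sqrt(c^t Sigma c) for one gene."""
    k = len(c)
    estimate = sum(c[i] * beta[i] for i in range(k))
    variance = sum(c[i] * sigma[i][j] * c[j] for i in range(k) for j in range(k))
    return estimate / math.sqrt(variance)


# Made-up coefficient names, estimates and covariance matrix.
names = ["Intercept", "condition_B_vs_A", "condition_C_vs_A"]
beta = [5.0, 1.0, 3.0]
sigma = [
    [0.25, 0.0, 0.0],
    [0.0, 0.25, 0.125],
    [0.0, 0.125, 0.25],
]

# Compare level C against level B: numerator C_vs_A, denominator B_vs_A.
c = contrast_vector(names, ["condition_C_vs_A"], ["condition_B_vs_A"])
stat = contrast_wald_stat(c, beta, sigma)
```

Here the contrast estimate is 3 - 1 = 2 and its variance is 0.25 + 0.25 - 2 × 0.125 = 0.25, so the statistic is 2 / 0.5.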
alpha (float) – the significance cutoff used for optimizing the independent filtering (defaults to 0.1). If the adjusted p-value cutoff (FDR) will be a value other than 0.1, alpha should be set to that value.
filter – the vector of filter statistics over which the independent filtering will be optimized. By default, the mean of normalized counts is used.
theta (array-like) – the quantiles at which to assess the number of rejections from independent filtering
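The roles of alpha, filter and theta in independent filtering can be sketched as follows. This is a deliberately simplified illustration and not the package's algorithm: it picks the lowest candidate quantile with the largest number of rejections, instead of fitting a curve and using the "within 1 residual standard deviation" rule described above. The function names and the data are hypothetical, and Benjamini-Hochberg adjustment is written out by hand to keep the example self-contained.

```python
import math


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def independent_filtering(pvalues, filter_stat, alpha=0.1,
                          theta=(0.0, 0.25, 0.5, 0.75)):
    """Simplified sketch: choose the quantile that maximizes rejections."""
    n = len(pvalues)
    by_filter = sorted(range(n), key=lambda i: filter_stat[i])
    best = None
    for q in theta:
        kept = by_filter[int(q * n):]  # drop the lowest fraction q
        adjusted = bh_adjust([pvalues[i] for i in kept])
        rejections = sum(a < alpha for a in adjusted)
        if best is None or rejections > best[0]:
            best = (rejections, q, kept, adjusted)
    _, q, kept, adjusted = best
    padj = [math.nan] * n  # filtered-out genes keep nan
    for i, a in zip(kept, adjusted):
        padj[i] = a
    return q, padj


# Made-up data: low-count genes carry uninformative p-values.
mean_counts = [1, 2, 3, 4, 100, 200, 300, 400]
pvalues = [0.9, 0.8, 0.7, 0.6, 0.04, 0.03, 0.001, 0.06]
chosen_theta, padj = independent_filtering(pvalues, mean_counts)
```

In this toy data set, adjusting all eight p-values together yields a single rejection at alpha = 0.1, while dropping the two genes with the lowest mean counts yields four.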
pAdjustMethod (str) – the method to use to adjust p-values, see p_adjust()
filterRun – an optional custom function for performing independent filtering and p-value adjustment, with arguments res (a DESeqResults object), filter (the quantity for filtering tests), alpha (the target FDR) and pAdjustMethod. This function should return a DESeqResults object with an adj_pvalue column.
saveCols (array-like) – the columns of obj.var to pass into the output results table
test ({ "Wald", "LRT" }) – automatically detected if not provided. The one exception is that after nbinomLRT() has been run, test="Wald" will generate Wald statistics and Wald test p-values.
addMLE (bool) – if betaPrior=True was used (non-default), this argument specifies if the “unshrunken” maximum likelihood estimates (MLE) of log2 fold change should be added as a column to the results table (defaults to False). This argument is preserved for backward compatibility, as betaPrior=True is now non-default and the recommended pipeline is to generate shrunken MAP estimates using lfcShrink(). This functionality is only implemented for contrast specified as a 3-element string list.
tidy (bool) – whether to output the results table with a header
parallel (bool) – unimplemented
minmu (float) – lower bound on the estimated count (used when calculating contrasts)
confint (bool or float) – whether 95% confidence intervals should be output for log2 fold changes. Alternatively, can be a value between 0 and 1 specifying the required confidence level.
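The two accepted forms of confint (a boolean, or a level between 0 and 1) can be illustrated with a small sketch. It assumes a Wald-type interval, i.e. the estimate plus or minus a standard Normal quantile times the standard error; the page does not state how the package computes its intervals, so this is an assumption, and the function name lfc_confint is hypothetical.

```python
from statistics import NormalDist


def lfc_confint(lfc, lfc_se, confint=True):
    """Hypothetical helper: Wald-type interval for a log2 fold change."""
    # True means the default 95% level; a float gives the level directly
    level = 0.95 if confint is True else float(confint)
    if not 0.0 < level < 1.0:
        raise ValueError("confidence level must be strictly between 0 and 1")
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    return lfc - z * lfc_se, lfc + z * lfc_se


# Made-up estimate: log2 fold change 1.0 with standard error 0.5.
low95, high95 = lfc_confint(1.0, 0.5)
low90, high90 = lfc_confint(1.0, 0.5, confint=0.9)
```

A lower confidence level gives a narrower interval around the same estimate.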
- Returns:
a results table (subclass of DEResults), containing the following results columns: "baseMean", "log2FoldChange", "lfcSE", "stat", "pvalue" and "pad". The "lfcSE" gives the standard error of the "log2FoldChange". For the Wald test, "stat" is the Wald statistic: the "log2FoldChange" divided by "lfcSE", which is compared to a standard Normal distribution to generate a two-tailed "pvalue". For the likelihood ratio test (LRT), "stat" is the difference in deviance between the reduced model and the full model, which is compared to a chi-squared distribution to generate a "pvalue".
- Return type: