sleuth_results | R Documentation |
This function extracts Wald or Likelihood Ratio test results from a sleuth object.
sleuth_results(obj, test, test_type = "wt", which_model = "full", rename_cols = TRUE, show_all = TRUE, pval_aggregate = obj$pval_aggregate, ...)
obj |
a |
test |
a character string denoting the test to extract. Possible tests can be found by using |
test_type |
'wt' for Wald test or 'lrt' for Likelihood Ratio test. |
which_model |
a character string denoting the model. If extracting a Wald test, use the model name. Not used if extracting a likelihood ratio test. |
rename_cols |
if |
show_all |
if |
pval_aggregate |
if |
... |
advanced options for sleuth_results. See details. |
The columns returned by this function depend on two factors: whether the test is a Wald test or a
Likelihood Ratio test, and whether pval_aggregate is TRUE.
The sleuth model is a measurement-error-in-the-response model. It attempts to separate the variation due to
the inference procedure by kallisto from the variation due to the covariates: the biological and technical
factors of the experiment (represented by the columns in obj$sample_to_covariates). For the Wald test,
the 'b' column represents the estimate of the selected coefficient. In the default setting, it is analogous to,
but not equivalent to, the fold change. The transformed values are on the natural-log scale, so the
estimated coefficient is also on the natural-log scale. This estimate takes into account the
'inferential variance' estimated from the kallisto bootstraps.
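Because 'b' is on the natural-log scale, it is often rescaled for reporting. The following is a minimal sketch using made-up coefficient values (not output from a real sleuth object); as noted above, the rescaled values are only analogous to a fold change.

```r
# Hypothetical Wald-test coefficients on the natural-log scale
# (illustrative values only, not taken from a sleuth object).
b <- c(0.6931472, -1.0986123, 0)

# Rescale from natural log to log2: log2(x) = ln(x) / ln(2).
b_log2 <- b / log(2)

# Exponentiate to obtain a ratio-like quantity on the untransformed scale.
b_ratio <- exp(b)

print(data.frame(b = b, b_log2 = b_log2, b_ratio = b_ratio))
```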
If the user wishes to get gene-level results from this function, there are two ways of doing so:
p-value aggregation mode: if the pval_aggregate argument is TRUE, this function will
aggregate the transcript-level p-values to the gene level using the Lancaster method. See below for advanced
options related to this mode. This is the recommended way to do gene-level aggregation. See the paper
count aggregation mode: This is the gene-level aggregation method introduced in sleuth version 0.28.1.
This mode is activated if obj$gene_mode is TRUE. In this mode, the modeling and testing were done
using aggregated counts (or TPMs), so the results are the same as the transcript-level results, except that
the target IDs are gene IDs instead of transcript IDs.
An important note if pval_aggregate or the old gene_mode is TRUE: when combining the
gene annotations from obj$target_mapping, all of the columns except for the transcript ID,
obj$target_mapping$target_id, will be included. If there are transcript-level entries for any of the other
columns, this will result in duplicate rows in the results table (usually an undesirable result).
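The duplicate-row effect can be illustrated with a toy mapping table. This is a sketch in base R with hypothetical column names (gene_id, gene_name, tx_biotype); it does not call sleuth, and only mimics the idea of dropping target_id and keeping the distinct rows that remain.

```r
# Hypothetical target mapping with one transcript-level column (tx_biotype)
# next to gene-level columns. All names and values are illustrative.
target_mapping <- data.frame(
  target_id  = c("tx1", "tx2", "tx3"),
  gene_id    = c("geneA", "geneA", "geneB"),
  gene_name  = c("A", "A", "B"),
  tx_biotype = c("protein_coding", "retained_intron", "protein_coding"),
  stringsAsFactors = FALSE
)

# Dropping only target_id leaves two distinct rows for geneA,
# because tx_biotype differs between its transcripts.
with_tx_col <- unique(target_mapping[, c("gene_id", "gene_name", "tx_biotype")])

# Keeping only gene-level columns yields one row per gene.
gene_only <- unique(target_mapping[, c("gene_id", "gene_name")])

print(nrow(with_tx_col))
print(nrow(gene_only))
```

One way to avoid the duplicates, under these assumptions, is to remove transcript-level columns from the mapping before it is attached to the sleuth object.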
Here are advanced options for customizing the p-value aggregation procedure:
weight_func: if pval_aggregate is TRUE, this function is used to weight the p-values for
Lancaster's method. It must take the observed means of the transcripts as its only defined argument.
The default is identity.
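To make the role of the weights concrete, here is a base-R sketch of Lancaster's weighted aggregation. It is not the code used by sleuth or by the aggregation package; lancaster_sketch is a hypothetical helper name, and the p-values and means are invented for illustration.

```r
# Sketch of Lancaster's weighted p-value aggregation (hypothetical helper).
lancaster_sketch <- function(pvalues, weights) {
  # Map each p-value to the upper-tail quantile of a gamma distribution with
  # shape = weight / 2 and scale = 2, i.e. a chi-squared with `weight` df.
  stats <- qgamma(pvalues, shape = weights / 2, scale = 2, lower.tail = FALSE)
  # The summed statistic is chi-squared with sum(weights) degrees of freedom.
  pchisq(sum(stats), df = sum(weights), lower.tail = FALSE)
}

# Invented transcript-level p-values and observed means for a single gene.
pvals    <- c(0.01, 0.20, 0.50)
mean_obs <- c(8.0, 3.0, 1.5)

# Default-style weighting: the observed means are used unchanged.
weight_func <- identity
p_gene <- lancaster_sketch(pvals, weight_func(mean_obs))

# An alternative weight function that also takes the means as its only argument.
p_gene_sqrt <- lancaster_sketch(pvals, sqrt(mean_obs))

print(c(identity = p_gene, sqrt = p_gene_sqrt))
```

With every weight equal to 2, this construction reduces to Fisher's method, which is a convenient sanity check for the sketch.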
If pval_aggregate is FALSE, returns a data.frame with the following columns:
target_id: transcript name, e.g. "ENST#####" (dependent on the transcriptome used in kallisto).
If gene_mode is TRUE, this will instead be the IDs specified by the obj$gene_column from obj$target_mapping.
...: if there is a target mapping data frame, all of the annotation columns are added from
obj$target_mapping before the other columns.
pval: p-value of the chosen model
qval: false discovery rate adjusted p-value, using Benjamini-Hochberg (see p.adjust)
test_stat (LRT only): Chi-squared test statistic (likelihood ratio test).
rss (LRT only): the residual sum of squares under the "null model".
degrees_free (LRT only): the degrees of freedom (equal to the difference between the two models).
b (Wald only): 'beta' value (effect size). Technically a biased estimator of the fold change.
se_b (Wald only): standard error of the beta.
mean_obs: mean of natural log counts of observations
var_obs: variance of observation
tech_var: technical variance of observation from the bootstraps (named 'sigma_q_sq' if rename_cols is FALSE)
sigma_sq: raw estimator of the variance once the technical variance has been removed
smooth_sigma_sq: smooth regression fit for the shrinkage estimation
final_sigma_sq: max(sigma_sq, smooth_sigma_sq); used for covariance estimation of beta
(named 'smooth_sigma_sq_pmax' if rename_cols is FALSE)
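Two of the columns above are simple functions of others, which a small base-R sketch can show. The data frame below is invented for illustration and is not sleuth output; it only demonstrates the Benjamini-Hochberg adjustment via p.adjust and the element-wise maximum of the two variance estimates.

```r
# Invented per-transcript values (not produced by sleuth).
res <- data.frame(
  target_id       = c("tx1", "tx2", "tx3", "tx4"),
  pval            = c(0.001, 0.02, 0.03, 0.8),
  sigma_sq        = c(0.10, 0.02, 0.30, 0.05),
  smooth_sigma_sq = c(0.08, 0.06, 0.25, 0.07),
  stringsAsFactors = FALSE
)

# qval: Benjamini-Hochberg adjustment of the p-values.
res$qval <- p.adjust(res$pval, method = "BH")

# Element-wise maximum of the raw and smoothed variance estimates.
res$final_sigma_sq <- pmax(res$sigma_sq, res$smooth_sigma_sq)

print(res)
```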
If pval_aggregate is TRUE, returns a data.frame with the following columns:
target_id: gene ID specified by obj$gene_column, e.g. "ENSG#####" (dependent on the transcriptome
used in kallisto).
...: all of the additional annotation columns (not 'target_id' or obj$gene_column) are
added from obj$target_mapping before the other columns.
num_aggregated_transcripts: the number of transcripts aggregated for a given gene. These only include
filtered transcripts.
sum_mean_obs_counts: the sum of the mean observations across all filtered transcripts
within a gene. Note that the weighting function is applied before summing.
pval: the aggregated p-value calculated by the Lancaster method. See the aggregation package for details.
qval: adjusted p-values using the Benjamini-Hochberg method.
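As a rough illustration of the two per-gene summary columns, the sketch below groups invented transcript-level means by a hypothetical gene_id column in base R. It assumes the weighting function is applied to each transcript's mean before the per-gene sum, as described above; it is not the sleuth implementation.

```r
# Invented filtered transcripts with their observed means (illustrative only).
tx <- data.frame(
  gene_id  = c("geneA", "geneA", "geneB"),
  mean_obs = c(8, 3, 1.5),
  stringsAsFactors = FALSE
)

# Weighting function taking the observed means as its only argument.
weight_func <- identity

# Apply the weighting first, then summarise per gene.
weighted <- weight_func(tx$mean_obs)
num_aggregated_transcripts <- tapply(weighted, tx$gene_id, length)
sum_mean_obs_counts        <- tapply(weighted, tx$gene_id, sum)

print(num_aggregated_transcripts)
print(sum_mean_obs_counts)
```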
sleuth_wt and sleuth_lrt to compute tests, models to view which models, and tests to view which tests,
were performed (and can be extracted)
models(sleuth_obj)
# for this example, assume the formula is ~condition,
# and a coefficient is IP
results_table <- sleuth_results(sleuth_obj, 'conditionIP')