Calculate divergence metrics and store in TSENATAnalysis
Source: R/s4_functions_divergence.R
Wrapper around .calculate_divergence() that manages the TSENATAnalysis object. Calculates Tsallis divergence between experimental conditions for transcripts across multiple q-values to detect condition-specific transcript remodeling.
Usage
calculate_divergence(
  analysis,
  q = NULL,
  verbose = FALSE,
  nthreads = NULL,
  output_file = NULL,
  control_group = NULL,
  paired = FALSE,
  method = NULL,
  bootstrap = FALSE,
  nboot = NULL,
  progress = FALSE,
  ...
)
Arguments
- analysis
TSENATAnalysis object.
- q
numeric. Q-value(s) for divergence (single value or vector). If NULL, reads from analysis@config$q. If not in config, defaults to seq(0.01, 2, by = 0.05) for full spectrum computation.
- verbose
logical. Print progress messages. Default: FALSE.
- nthreads
numeric or NULL. Number of CPU threads for parallel processing. If NULL, reads from @config$nthreads (or defaults to 1).
- output_file
character or NULL. Optional file path to save results. Supported formats: .rds (for S4 objects), .tsv, .csv, .txt (for tables). Default: NULL (no file output).
- control_group
character or NULL. Control group identifier for divergence comparison. If NULL, reads from @config$control_group if available.
- paired
logical. Whether to use paired design. Default: FALSE. If not specified, reads from @config$paired if available.
- method
character or NULL. Statistical method for divergence calculation. If NULL, reads from @config$method if available.
- bootstrap
logical. Whether to compute bootstrap confidence intervals. Default: FALSE. If not specified, reads from @config$bootstrap if available.
- nboot
numeric or NULL. Number of bootstrap replicates. Default: NULL. If NULL, reads from @config$nboot if available.
- progress
logical. Show progress bar during computation. Default: FALSE.
- ...
Additional arguments passed to the base divergence function.
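As a quick check of the fallback q grid described for the q argument, the seq() call can be run on its own in base R. This is only a sketch of that one expression; the variable name q_grid is illustrative and is not part of TSENAT.

```r
# Fallback grid quoted in the description of `q`
q_grid <- seq(0.01, 2, by = 0.05)

length(q_grid)   # number of q-values in the grid
range(q_grid)    # smallest and largest q-value

# The grid steps from 0.96 to 1.01, so q = 1 itself is not included
any(abs(q_grid - 1) < 1e-9)
```

The grid holds 40 values from 0.01 to 1.96. If a specific value such as q = 1 is of interest, it has to be passed explicitly through q.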
Value
Modified TSENATAnalysis with divergence metrics in
@divergence_results (stored as list of data.frames or matrices).
Details
Key Features:
Multi-q analysis: Divergence computed across full q-spectrum simultaneously
Bootstrap confidence intervals: Quantify uncertainty in divergence estimates
Multiple testing correction: Hochberg, Benjamini-Yekutieli, or no correction
Paired designs: Supports paired/repeated measures via subject_col parameter
Effect size reporting: Log-fold-change and confidence intervals per gene
Flexible control group: Compare any condition vs. any other condition
**Mathematical Background:** Tsallis divergence D_q between two probability distributions:
D_q(P||Q) = (log_2(N) - entropy_q(P) + entropy_q(Q)) / (q - 1)
This measures how much transcript composition changes from control to condition. Values near 0 indicate similar isoform composition; large positive values indicate a major change.
**Example Use Case:**
Control sample: All reads from dominant isoform (low entropy)
Tumor sample: Reads spread across multiple isoforms (high entropy)
Result: Large divergence indicating condition-specific isoform switching.
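The entropy contrast in this use case can be illustrated with the standard Tsallis entropy, S_q(p) = (1 - sum(p^q)) / (q - 1), which reduces to Shannon entropy as q approaches 1. The sketch below is a base R illustration only: tsallis_entropy() is a hypothetical helper, not a TSENAT function, the isoform proportions are made up, and it computes per-sample entropy rather than the divergence itself.

```r
# Hypothetical helper (not part of TSENAT): Tsallis entropy of a
# vector of isoform proportions, using natural logarithms at q = 1.
tsallis_entropy <- function(p, q) {
  p <- p[p > 0]          # drop unexpressed isoforms
  p <- p / sum(p)        # renormalise to proportions
  if (isTRUE(all.equal(q, 1))) {
    return(-sum(p * log(p)))   # Shannon limit as q -> 1
  }
  (1 - sum(p^q)) / (q - 1)
}

# Illustrative proportions for one gene with four isoforms
control <- c(1, 0, 0, 0)              # all reads from the dominant isoform
tumor   <- c(0.25, 0.25, 0.25, 0.25)  # reads spread evenly

q_values <- c(0.5, 1, 2)
entropy_control <- sapply(q_values, function(q) tsallis_entropy(control, q))
entropy_tumor   <- sapply(q_values, function(q) tsallis_entropy(tumor, q))

data.frame(q = q_values, control = entropy_control, tumor = entropy_tumor)
```

The control entropy is 0 at every q, while the tumor entropy is positive at every q (2 at q = 0.5, log(4) at q = 1, 0.75 at q = 2), which is the kind of gap the divergence is meant to flag.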
**Parameter Resolution:** Parameters are resolved in priority order:
1. Explicit arguments passed to the function
2. Values from analysis@config (if present)
3. Function defaults
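The priority order above can be sketched in a few lines of base R. This is an assumption about the general pattern, not the package's actual code: resolve_param() is a hypothetical helper, the config is modelled as a plain list, and the fallback value 100 for nboot is purely illustrative.

```r
# Hypothetical helper (not part of TSENAT): pick the first available
# value among explicit argument, config entry, and function default.
resolve_param <- function(explicit, config, name, default) {
  if (!is.null(explicit)) {
    return(explicit)               # 1. explicit argument wins
  }
  if (!is.null(config[[name]])) {
    return(config[[name]])         # 2. then the stored config value
  }
  default                          # 3. finally the function default
}

# Plain list standing in for analysis@config
config <- list(nthreads = 4)

from_explicit <- resolve_param(8, config, "nthreads", default = 1)
from_config   <- resolve_param(NULL, config, "nthreads", default = 1)
from_default  <- resolve_param(NULL, config, "nboot", default = 100)
```

A NULL check like this only works for arguments whose signature default is NULL. For logical arguments such as paired or bootstrap, a real implementation would need another way to detect that the caller left the argument unset, for example missing().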
Requires diversity results from calculate_diversity() as prerequisite.
Examples
# Load example data (matching TSENAT.Rmd workflow)
data(readcounts)
readcounts <- as.matrix(readcounts)
mode(readcounts) <- 'numeric'
metadata_df <- read.table(
  system.file('extdata', 'metadata.tsv', package = 'TSENAT'),
  header = TRUE, sep = '\t'
)
gff3_dataset <- system.file('extdata', 'annotation.gff3.gz', package = 'TSENAT')
# Build analysis from vignette data and create manageable subset
config <- TSENAT_config(sample_col = 'sample', condition_col = 'condition')
analysis <- build_analysis(
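  readcounts = readcounts,
  tx2gene = gff3_dataset,
  metadata = metadata_df,
  config = config,
  # NOTE: tpm and effective_length are not created explicitly above;
  # this call assumes both objects are already available in the session
  tpm = tpm,
  effective_length = effective_length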
)
# Use 200+ genes to ensure diversity filtering doesn't remove all genes
analysis <- filter_analysis(analysis, min_samples = 1, subset_n_genes = 200)
# Compute diversity first (required for divergence)
analysis <- calculate_diversity(analysis, q = c(0.5, 1.0, 1.5))
# Calculate divergence across q-values
analysis <- calculate_divergence(analysis, q = c(0.5, 1.0, 1.5))
# Check divergence results using unified accessor
head(results(analysis, type = 'divergence'))
#>               q_0.5        q_1      q_1.5
#> DES      1.00000000 0.97550617 0.37328802
#> SYNM     0.95161102 0.86914555 0.28623852
#> HIPK2    0.10360517 0.11224915 0.03518892
#> REG3A    0.29954936 0.20518742 0.04708207
#> SH3PXD2A 0.03982424 0.04235353 0.01285111
#> EFNB2    0.11930832 0.12438627 0.03827434