Wrapper around `.calculate_srh()` that manages a TSENATAnalysis object. It identifies genes with condition-specific, q-dependent entropy patterns by testing whether the effect of q on entropy differs between experimental conditions. Genes with a significant q×condition interaction are candidates for condition-specific (for example, disease-relevant) isoform switching.
Usage
calculate_srh(
analysis,
condition_col,
output_file = NULL,
paired = NULL,
subject_col = NULL,
multicorr = c("hochberg", "benjamini-yekutieli", "westfall-young", "none"),
entropy_col = "diversity",
q_col = "q",
gene_col = "gene",
wy_randomizations = 500,
nperm_mode = c("standard", "conservative", "interactive"),
nthreads = NULL,
alpha = 0.05,
p_threshold = 0.05,
eta2_threshold_moderate = 0.01,
eta2_threshold_strong = 0.1,
min_nperm = 100,
max_nperm = 10000,
verbose = FALSE,
...
)
Arguments
- analysis
TSENATAnalysis object. Must have diversity results from calculate_diversity().
- condition_col
character. Column name for sample grouping/condition (REQUIRED). Specifies the condition/treatment variable for testing q×condition interactions. Example: 'sample_type', 'treatment', 'disease_status'.
- output_file
character or NULL. Optional file path to save results. Supported formats: .rds (for S4 objects). Default: NULL (no file output).
- paired
logical or NULL. If TRUE, uses paired/blocked design (requires subject_col). If NULL, reads from @config$paired.
- subject_col
character or NULL. Column name for subject/block identifiers (required when paired=TRUE). If NULL, reads from @config$subject_col.
- multicorr
character. Multiple testing correction: 'hochberg' (default), 'benjamini-yekutieli', 'westfall-young', or 'none'.
- entropy_col
character. Column name containing entropy/diversity data. Default: 'diversity'.
- q_col
character. Column name containing q-values. Default: 'q'.
- gene_col
character. Column name containing gene identifiers. Default: 'gene'.
- wy_randomizations
numeric or character. Number of permutations for Westfall-Young correction. Use 'auto' to estimate from data. Default: 500.
- nperm_mode
character. Mode for automatic permutation estimation: 'standard' (default), 'conservative', or 'interactive'.
- nthreads
numeric or NULL. Number of parallel threads for computation. If NULL, reads from @config$nthreads.
- alpha
numeric. Significance level for p-value correction methods (default: 0.05). Used by all multiple testing correction methods.
- p_threshold
numeric. P-value threshold for classification of interaction significance (default: 0.05).
- eta2_threshold_moderate
numeric. Effect size boundary for 'moderate' classification (default: 0.01).
- eta2_threshold_strong
numeric. Effect size boundary for 'strong' classification (default: 0.10).
- min_nperm
integer. Minimum permutations for automatic estimation when wy_randomizations='auto' (default: 100).
- max_nperm
integer. Maximum permutations for automatic estimation when wy_randomizations='auto' (default: 10000).
- verbose
logical. If TRUE, prints progress messages. Default: FALSE.
- ...
Additional arguments passed to the base .calculate_srh() function.
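To make the multicorr choices more concrete, the following sketch applies the two closed-form corrections to a made-up vector of p-values using base R's `p.adjust()`. It is an illustration only: the p-values are invented, and the mapping of 'hochberg' and 'benjamini-yekutieli' onto the `p.adjust()` methods "hochberg" and "BY" is an assumption about how these names correspond, not a statement about the package internals. The permutation-based 'westfall-young' option has no `p.adjust()` equivalent and is not shown.

```r
# Invented raw p-values for six hypothetical genes (illustration only)
p_raw <- c(0.0004, 0.003, 0.012, 0.04, 0.20, 0.65)
alpha <- 0.05

adj <- data.frame(
  p_raw               = p_raw,
  # Hochberg step-up procedure (controls the family-wise error rate)
  hochberg            = p.adjust(p_raw, method = "hochberg"),
  # Benjamini-Yekutieli procedure (controls FDR under arbitrary dependence)
  benjamini_yekutieli = p.adjust(p_raw, method = "BY"),
  # No correction: adjusted values equal the raw values
  none                = p.adjust(p_raw, method = "none")
)

# Flag genes that pass the significance level after Hochberg correction
adj$significant_hochberg <- adj$hochberg <= alpha
print(adj)
```

Adjusted p-values are never smaller than the raw ones, so the set of genes passing alpha can only shrink after correction.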
Details
## Key Features
- **Q×Condition Interaction**: Tests if entropy patterns across q-values differ by condition (main discovery goal)
- **Multi-q Analysis**: Combines diversity results for multiple q-values into a single SummarizedExperiment for joint hypothesis testing
- **Rank-Based Statistics**: Scheirer-Ray-Hare test (two-way non-parametric ANOVA on ranked data)
- **Multiple Testing Correction**: Hochberg, Benjamini-Yekutieli, or permutation (Westfall-Young) procedures
- **AR(1) Correlation Handling**: Westfall-Young preserves the correlation between neighbouring q-values (important for ordered q measurements)
- **Effect Sizes**: Eta-squared (\(\eta^2\)) for q×condition interactions
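As a rough illustration of the rank-based idea, the sketch below runs a textbook Scheirer-Ray-Hare test for a single simulated gene in base R. Everything in it is an assumption for illustration: the data are simulated, `srh_test()` is a hypothetical helper, and the classical H statistic (sum of squares on ranks divided by the total mean square, compared with a chi-square distribution) may differ from the statistic the package reports.

```r
set.seed(1)

# Simulated data for ONE gene: 8 samples per condition at 3 q-values.
# Condition A: entropy falls with q; condition B: flat profile.
d <- expand.grid(rep = 1:8, condition = c("A", "B"), q = c(0.5, 1.0, 1.5))
slope <- ifelse(d$condition == "A", -0.6, 0)
d$diversity <- 0.5 + slope * (d$q - 1) + rnorm(nrow(d), sd = 0.05)

# Hypothetical helper: classical Scheirer-Ray-Hare test (balanced design)
srh_test <- function(y, a, b) {
  r   <- rank(y)                       # replace observations by their ranks
  fit <- anova(lm(r ~ a * b))          # two-way ANOVA on the ranks
  ss  <- fit[["Sum Sq"]]
  df  <- fit[["Df"]]
  ms_total <- sum(ss) / sum(df)        # total mean square of the ranks
  h   <- ss[1:3] / ms_total            # H statistic for a, b and a:b
  data.frame(
    term    = c("q", "condition", "q:condition"),
    df      = df[1:3],
    H       = h,
    p_value = pchisq(h, df = df[1:3], lower.tail = FALSE)
  )
}

res <- srh_test(d$diversity, factor(d$q), factor(d$condition))
print(res)
```

The `q:condition` row is the one that corresponds to the interaction of interest here; the main-effect rows are shown only for completeness.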
## Statistical Hypotheses
The function tests the following hypotheses for each gene:
- H0: The effect of q on gene entropy does NOT differ between conditions (no q×condition interaction)
- H1: The q-dependence of gene entropy is CONDITION-SPECIFIC (interaction exists)
A significant interaction indicates condition-specific patterns in how entropy varies across the q-value spectrum, revealing biological processes specific to that condition.
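One conventional way to write these hypotheses down is the two-way layout on ranks shown here. The notation is an assumption introduced for illustration and does not come from the package: \(R_{ijk}\) is the rank of the entropy value for sample \(k\) at q-level \(i\) in condition \(j\), \(\alpha_i\) is the q main effect, \(\beta_j\) the condition main effect, and \((\alpha\beta)_{ij}\) the q×condition interaction.

```latex
\begin{align*}
R_{ijk} &= \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk} \\
H_0 &: (\alpha\beta)_{ij} = 0 \quad \text{for all } i, j \\
H_1 &: (\alpha\beta)_{ij} \neq 0 \quad \text{for at least one pair } (i, j)
\end{align*}
```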
## Biological Example
A gene shows strong isoform switching (q-dependent entropy) in tumor cells but NOT in healthy cells, and is therefore identified as a disease-relevant q-dependent gene.
For condition-specific q-dependent genes:
- **Condition A**: Strong entropy variation across q (q-dependent isoform usage)
- **Condition B**: Flat entropy profile across q (uniform isoform usage)
- **Interaction**: Condition-specific q-dependence pattern reveals disease-associated splicing regulation
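The contrast between the two conditions can be sketched numerically. The example below assumes that the diversity measure is a Tsallis entropy normalized by its maximum (the value for uniform isoform usage). That is an assumption made for illustration, and `tsallis_norm()` is a hypothetical helper, not a package function. The isoform proportions are invented.

```r
# Hypothetical helper: Tsallis entropy of isoform proportions, divided by
# its maximum (attained when all isoforms are used equally).
tsallis_norm <- function(p, q) {
  p  <- p / sum(p)                     # make sure proportions sum to 1
  n  <- length(p)
  pp <- p[p > 0]                       # drop zeros so log() is defined
  if (abs(q - 1) < 1e-8) {
    h     <- -sum(pp * log(pp))        # q -> 1 limit: Shannon entropy
    h_max <- log(n)
  } else {
    h     <- (1 - sum(pp^q)) / (q - 1)
    h_max <- (1 - n^(1 - q)) / (q - 1)
  }
  h / h_max
}

q_grid <- c(0.5, 1.0, 1.5)

# Condition A: one dominant isoform; condition B: uniform isoform usage
cond_a <- c(0.85, 0.10, 0.04, 0.01)
cond_b <- rep(0.25, 4)

profile_a <- sapply(q_grid, function(q) tsallis_norm(cond_a, q))
profile_b <- sapply(q_grid, function(q) tsallis_norm(cond_b, q))

print(data.frame(q = q_grid, condition_A = profile_a, condition_B = profile_b))
```

Under this normalization the uniform profile stays constant across q, while the skewed profile changes with q, which is the kind of difference a q×condition interaction test is meant to pick up.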
The function analyzes, for each gene, how the q×condition interaction behaves across the q-value spectrum using rank-based (Scheirer-Ray-Hare) or parametric (GAM) statistical tests.
**Parameter resolution priority** (explicit > @config > default/auto-detect):
- condition_col: REQUIRED - must be explicitly provided
- q: ALWAYS auto-detected from diversity_results (all q-values tested together)
- paired: explicit arg > @config$paired > FALSE (default)
- subject_col: explicit arg > @config$subject_col
- multicorr: explicit arg > @config$multicorr > 'hochberg'
- nthreads: explicit arg > @config$nthreads > 1 (default)
- test: explicit arg > @config$test > 'auto' (auto-selection)
- nperm_mode: explicit arg > @config$nperm_mode > 'standard'
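The priority rule can be pictured with a small helper. This is a minimal sketch, not the package code: `resolve_param()` is a hypothetical function, and a plain list stands in for the `@config` slot.

```r
# Hypothetical helper illustrating "explicit > config > default"
resolve_param <- function(explicit, config, name, default = NULL) {
  if (!is.null(explicit)) return(explicit)                   # 1. explicit argument
  if (!is.null(config[[name]])) return(config[[name]])       # 2. stored config value
  default                                                    # 3. fallback default
}

# Stand-in for the @config slot (assumed to behave like a named list)
config <- list(paired = TRUE, nthreads = 4)

paired_explicit <- resolve_param(FALSE, config, "paired", default = FALSE)
paired_config   <- resolve_param(NULL,  config, "paired", default = FALSE)
multicorr_value <- resolve_param(NULL,  config, "multicorr", default = "hochberg")
nthreads_value  <- resolve_param(NULL,  config, "nthreads", default = 1)

str(list(paired_explicit = paired_explicit, paired_config = paired_config,
         multicorr = multicorr_value, nthreads = nthreads_value))
```

Note that an explicit `FALSE` still overrides a stored `TRUE`, because only `NULL` is treated as "not provided".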
Examples
# Load example data (matching TSENAT.Rmd workflow)
data(readcounts)
readcounts <- as.matrix(readcounts)
mode(readcounts) <- 'numeric'
metadata_df <- read.table(
  system.file('extdata', 'metadata.tsv', package = 'TSENAT'),
  header = TRUE, sep = '\t'
)
gff3_dataset <- system.file('extdata', 'annotation.gff3.gz', package = 'TSENAT')
# Create config first (required when metadata is provided)
config <- TSENAT_config(sample_col = 'sample', condition_col = 'condition')
# Build analysis from vignette data and create manageable subset
analysis <- build_analysis(
  readcounts = readcounts,
  tx2gene = gff3_dataset,
  metadata = metadata_df,
  config = config,
  tpm = tpm,
  effective_length = effective_length
)
analysis <- filter_analysis(
  analysis,
  min_samples = 1,
  subset_n_genes = 200
)
analysis <- calculate_diversity(analysis, q = c(0.5, 1.0, 1.5))
# Test Q×Condition interaction (condition_col is REQUIRED)
analysis <- calculate_srh(
  analysis,
  condition_col = 'condition',
  multicorr = 'hochberg'
)
# View results using unified accessor
rank_test_res <- results(analysis, type = 'rank_test')
if (!is.null(rank_test_res)) head(rank_test_res)
#> gene n_q_values_tested f_statistic p_value adj_p_value ss_interaction
#> 1 SH3PXD2A 3 0.10283830 0.9024993 1 0.35754547
#> 2 ATG5 3 0.31874647 0.7288031 1 0.41779676
#> 3 FAXDC2 3 0.07408088 0.9287176 1 0.53373469
#> 4 CYBB 3 0.19803026 0.8211066 1 0.33734628
#> 5 PICALM 3 0.07600418 0.9269395 1 0.04742451
#> 6 CCDC149 3 0.09984740 0.9051897 1 0.45310777
#> ss_residual df_interaction df_residual effect_size_eta2 interaction_class
#> 1 0.1850105 2 45 0.6590021 Robust across q
#> 2 0.4288228 2 45 0.4934882 Robust across q
#> 3 0.6214901 2 45 0.4620180 Robust across q
#> 4 0.4670058 2 45 0.4194013 Robust across q
#> 5 0.1261091 2 45 0.2732872 Robust across q
#> 6 1.2613314 2 45 0.2642892 Robust across q
#> test_method heteroscedastic boundary_clustered highly_skewed
#> 1 srh_unpaired FALSE FALSE FALSE
#> 2 srh_unpaired FALSE FALSE FALSE
#> 3 srh_unpaired FALSE FALSE FALSE
#> 4 srh_unpaired FALSE FALSE FALSE
#> 5 srh_unpaired FALSE FALSE FALSE
#> 6 srh_unpaired FALSE FALSE FALSE
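In the printed rows above, the `effect_size_eta2` column appears to equal `ss_interaction / (ss_interaction + ss_residual)`, that is, a partial eta-squared for the interaction. The sketch below checks this against the six printed rows. The relationship is inferred from the displayed numbers only and is not confirmed by the package documentation. The values are copied from the output, so small rounding differences are expected.

```r
# Values copied from the printed example output above
printed <- data.frame(
  gene             = c("SH3PXD2A", "ATG5", "FAXDC2", "CYBB", "PICALM", "CCDC149"),
  ss_interaction   = c(0.35754547, 0.41779676, 0.53373469,
                       0.33734628, 0.04742451, 0.45310777),
  ss_residual      = c(0.1850105, 0.4288228, 0.6214901,
                       0.4670058, 0.1261091, 1.2613314),
  effect_size_eta2 = c(0.6590021, 0.4934882, 0.4620180,
                       0.4194013, 0.2732872, 0.2642892)
)

# Assumed relationship: partial eta-squared for the interaction term
printed$eta2_check <- with(printed,
                           ss_interaction / (ss_interaction + ss_residual))

# Largest absolute deviation between recomputed and printed values
max_abs_diff <- max(abs(printed$eta2_check - printed$effect_size_eta2))
print(printed[, c("gene", "effect_size_eta2", "eta2_check")])
```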