Calculate p-values using label shuffling.

Usage

label_shuffling(
  x,
  samples,
  control,
  method,
  randomizations = 100,
  pcorr = "BH",
  paired = FALSE,
  paired_method = c("swap", "signflip"),
  nthreads = 1
)

Arguments

x

A matrix with the splicing diversity values.

samples

Character vector whose length equals the number of columns of x, specifying the category of each sample.

control

Name of the control sample category, defined in the samples vector, e.g. control = 'Normal' or control = 'WT'.

method

Method to use for calculating the average splicing diversity value in a condition. Can be 'mean' or 'median'.

randomizations

The number of random label shuffles (default: 100).

pcorr

P-value correction method applied to the results; any method accepted by the p.adjust() function (default: 'BH').
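As a brief illustration of what the correction step does, the sketch below calls base R's p.adjust() directly on a made-up vector of raw p-values. The values and variable names are illustrative only and are not produced by label_shuffling().

```r
# Made-up raw p-values, for illustration only.
raw_p <- c(0.01, 0.04, 0.03, 0.20)

# Benjamini-Hochberg correction (the method named by pcorr = "BH").
adj_bh <- p.adjust(raw_p, method = "BH")

# Bonferroni correction, a more conservative alternative.
adj_bonf <- p.adjust(raw_p, method = "bonferroni")

# All method names understood by p.adjust().
p.adjust.methods
```

Any name listed in p.adjust.methods should be usable as the pcorr value, given that the argument is described in terms of p.adjust().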

paired

Logical; if TRUE, use a paired permutation scheme (default: FALSE). Paired permutations should preserve the pairing between samples. Because the function currently permutes sample labels, paired analyses are only meaningful when the caller has arranged the samples accordingly.

paired_method

Character; method for paired permutations. One of 'swap' (randomly swap labels within pairs) or 'signflip' (perform sign-flip permutations; can enumerate all 2^n_pairs combinations for an exact test when randomizations = 0 or randomizations >= 2^n_pairs).
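To make the exact sign-flip case concrete, the sketch below enumerates all 2^n_pairs sign assignments for a single feature. It is an illustration of the general technique, not the package's implementation: the per-pair differences are made up, and the mean difference is used as a stand-in statistic.

```r
# Illustrative only: per-pair differences for one feature (made-up values).
diffs <- c(0.8, -0.1, 0.5)
n_pairs <- length(diffs)

# All 2^n_pairs sign assignments, one per row.
signs <- as.matrix(expand.grid(rep(list(c(-1, 1)), n_pairs)))

# Stand-in statistic (an assumption): mean difference under each assignment.
perm_stats <- as.vector(signs %*% diffs) / n_pairs

# Number of enumerated permutations; equals 2^n_pairs.
n_enumerated <- nrow(signs)
```

With three pairs there are only 8 sign assignments, which is why full enumeration is feasible for small designs and random sampling is only needed as n_pairs grows.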

nthreads

Number of threads for parallel processing (default: 1). Set to > 1 to parallelize per-feature p-value computation.

Value

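A matrix with one row per row of x and two columns: raw_p_values (raw p-values) and adjusted_p_values (p-values corrected using the pcorr method).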

Details

The permutation p-values are computed two-sided as the proportion of permuted log2 fold-changes at least as extreme as the observed value, with a pseudocount added: (count + 1) / (n_perm + 1).
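The formula can be sketched in a few lines of R. The helper empirical_p() below is hypothetical and not part of the package; it assumes that "at least as extreme" means comparing absolute values, and it uses made-up permuted log2 fold-changes.

```r
# Hypothetical helper, not part of the package: two-sided empirical p-value
# with a pseudocount, following (count + 1) / (n_perm + 1).
empirical_p <- function(observed, permuted) {
  # Assumption: "at least as extreme" is judged on absolute values.
  count <- sum(abs(permuted) >= abs(observed))
  (count + 1) / (length(permuted) + 1)
}

# Made-up permuted log2 fold-changes for one feature.
permuted_lfc <- c(-0.2, 0.1, 0.9, -1.4, 0.3)

# No permuted value is as extreme as 2.0, so count = 0.
p_extreme <- empirical_p(2.0, permuted_lfc)

# Three permuted values are at least as extreme as 0.25, so count = 3.
p_typical <- empirical_p(0.25, permuted_lfc)
```

Here p_extreme is 1/6 rather than 0, which shows the effect of the pseudocount: with n_perm permutations the smallest attainable p-value is 1 / (n_perm + 1).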

Note

The permutation test returns two-sided empirical p-values. The pseudocount prevents p-values of exactly zero when the number of permutations is small. See Details.

Examples

set.seed(123)
# Create a matrix of splicing diversity values (2 genes x 4 samples)
mat <- matrix(rnorm(8), nrow = 2)
samples <- c('Normal', 'Normal', 'Tumor', 'Tumor')

# Run label shuffling test (100 permutations)
result <- label_shuffling(mat, samples, control = 'Normal',
                          method = 'mean', randomizations = 100, pcorr = 'BH')
head(result)
#>      raw_p_values adjusted_p_values
#> [1,]            1                 1
#> [2,]            1                 1