
How to speed up DESeq2: setting up multi-core parallel computation

library(BiocParallel)
register(MulticoreParam(4))   # on Windows, use register(SnowParam(4)) instead
dds <- DESeq(dds, parallel = TRUE)

The above steps should take less than 30 seconds for most analyses. For experiments with complex designs and many samples (e.g. dozens of coefficients, ~100s of samples), one may want to have faster computation than provided by the default run of DESeq. We have two recommendations:

By using the argument fitType="glmGamPoi", one can leverage the faster NB GLM engine written by Constantin Ahlmann-Eltze. Note that glmGamPoi’s interface in DESeq2 requires use of test="LRT" and specification of a reduced design.
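As a minimal sketch of the first recommendation, the call below runs a likelihood ratio test with the glmGamPoi engine. It assumes the glmGamPoi package is installed alongside DESeq2; the simulated dataset from makeExampleDESeqDataSet and the intercept-only reduced design (~ 1) are illustrative choices, not part of the original text.

```r
library(DESeq2)

# Simulated counts for illustration only (assumption: 1000 genes, 12 samples).
# The example dataset uses the design ~ condition.
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# glmGamPoi is used as the NB GLM engine; per the text above, this requires
# test = "LRT" and a reduced design. Here the reduced design drops 'condition'.
dds <- DESeq(dds, test = "LRT", fitType = "glmGamPoi", reduced = ~ 1)

# LRT results: p-values compare the full design against the reduced design
res <- results(dds)
```

For a real experiment, the reduced formula would be the full design minus the term or terms being tested.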

One can take advantage of parallelized computation. Parallelizing DESeq, results, and lfcShrink can be easily accomplished by loading the BiocParallel package, and then setting the following arguments: parallel=TRUE and BPPARAM=MulticoreParam(4), for example, splitting the job over 4 cores. However, some words of advice on parallelization: first, it is recommended to filter out genes where all samples have low counts, to avoid sending data unnecessarily to child processes, since those genes have low power and will be independently filtered anyway; second, there are often diminishing returns from adding more cores, due to the overhead of sending data to child processes, so we recommend starting with a small number of additional cores. Note that obtaining results for coefficients or contrasts listed in resultsNames(dds) is fast and will not need parallelization. As an alternative to BPPARAM, one can register cores at the beginning of an analysis, and then just specify parallel=TRUE to the functions when called.
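The advice above can be sketched as follows: pre-filter low-count genes, then pass an explicit BPPARAM with a small number of workers. The simulated dataset, the filtering threshold (at least 10 counts in at least 3 samples), and the choice of 2 workers are assumptions for illustration and should be adapted to the experiment at hand.

```r
library(DESeq2)
library(BiocParallel)

# Simulated counts for illustration only (assumption: 2000 genes, 12 samples)
dds <- makeExampleDESeqDataSet(n = 2000, m = 12)

# Pre-filter: keep genes with at least 10 counts in at least 3 samples
# (hypothetical threshold), so less data is sent to the child processes
keep <- rowSums(counts(dds) >= 10) >= 3
dds <- dds[keep, ]

# Start with a small number of workers; MulticoreParam relies on forking,
# which is unavailable on Windows, so SnowParam is used there instead
param <- if (.Platform$OS.type == "windows") SnowParam(2) else MulticoreParam(2)

# Pass the backend explicitly instead of registering it globally
dds <- DESeq(dds, parallel = TRUE, BPPARAM = param)
res <- results(dds, parallel = TRUE, BPPARAM = param)
```

Timing the run with 2, 4, and 8 workers on the real dataset is a simple way to see where the overhead of sending data to workers starts to outweigh the gain.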

A basic task in the analysis of count data from RNA-seq is the detection of differentially expressed genes. The count data are presented as a table which reports, for each sample, the number of sequence fragments that have been assigned to each gene. Analogous data also arise for other assay types, including comparative ChIP-Seq, HiC, shRNA screening, and mass spectrometry. An important analysis question is the quantification and statistical inference of systematic changes between conditions, as compared to within-condition variability. The package DESeq2 provides methods to test for differential expression by use of negative binomial generalized linear models; the estimates of dispersion and logarithmic fold changes incorporate data-driven prior distributions. This vignette explains the use of the package and demonstrates typical workflows. An RNA-seq workflow on the Bioconductor website covers similar material to this vignette but at a slower pace, including the generation of count matrices from FASTQ files.

DESeq2 package version: 1.36.0

