bcbio-nextgen: Error at RNA-Seq pipeline
Hey @roryk !
I get the following error while running the RNA-Seq pipeline:
[2020-11-16T20:06Z] Resource requests: cufflinks, samtools; memory: 4.00, 4.00; cores: 28, 28
[2020-11-16T20:06Z] Configuring 1 jobs to run, using 28 cores each with 112.1g of memory reserved for each job
[2020-11-16T20:06Z] Timing: disambiguation
[2020-11-16T20:06Z] Timing: transcript assembly
[2020-11-16T20:06Z] Timing: estimate expression (threaded)
[2020-11-16T20:06Z] multiprocessing: generate_transcript_counts
[2020-11-16T20:06Z] multiprocessing: run_salmon_index
[2020-11-16T20:06Z] Transcriptome index for /data/kokyriakidis/rnaparkinson/PRJNA283498/work/inputs/transcriptome/hg38.fa detected, skipping building.
Traceback (most recent call last):
File "/data/software/rnaseq/tools/bin/bcbio_nextgen.py", line 245, in <module>
main(**kwargs)
File "/data/software/rnaseq/tools/bin/bcbio_nextgen.py", line 46, in main
[2020-11-16T20:06Z] multiprocessing: run_salmon_bam
run_main(**kwargs)
File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
fc_dir, run_info_yaml)
File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 257, in rnaseqpipeline
samples = rnaseq.quantitate_expression_parallel(samples, run_parallel)
File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/pipeline/rnaseq.py", line 241, in quantitate_expression_parallel
samples = run_parallel("run_salmon_bam", samples)
File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
return run_multicore(fn, items, config, parallel=parallel)
File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 1048, in __call__
if self.dispatch_one_batch(iterator):
File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 866, in dispatch_one_batch
self._dispatch(tasks)
File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 784, in _dispatch
job = self._backend.apply_async(batch, callback=cb)
File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
result = ImmediateResult(func)
File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 572, in __init__
self.results = batch()
File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 263, in __call__
for func, args, kwargs in self.items]
File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 263, in <listcomp>
for func, args, kwargs in self.items]
File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
return f(*args, **kwargs)
File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 76, in run_salmon_bam
return salmon.run_salmon_bam(*args)
File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/rnaseq/salmon.py", line 30, in run_salmon_bam
data = dd.update_summary_qc(data, "salmon", base=dd.get_salmon_fraglen_file(data))
File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/pipeline/datadict.py", line 393, in update_summary_qc
base = tz.first(files)
File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/toolz/itertoolz.py", line 376, in first
return next(iter(seq))
StopIteration
root@genome-assembly-1:/data/kokyriakidis/rnaparkinson/PRJNA283498/work#
My config file is:
---
details:
  - analysis: RNA-seq
    genome_build: hg38
    algorithm:
      aligner: star
      strandedness: unstranded
      quantify_genome_alignments: true
upload:
  dir: ../final
bcbio version 1.2.4
The last commands from the log are:
[2020-11-16T19:50Z] gffread -g /data/software/rnaseq/genomes/Hsapiens/hg38/seq/hg38.fa -w /data/kokyriakidis/rnaparkinson/PRJNA283498/work/bcbiotx/tmp6ot1mf22/hg38.fa.tmp /data/software/rnaseq/genomes/Hsapiens/hg38/rnaseq/ref-transcripts.gtf
[2020-11-16T19:51Z] /data/software/rnaseq/galaxy/../anaconda/bin/salmon index --keepDuplicates -k 31 -p 28 -i /data/kokyriakidis/rnaparkinson/PRJNA283498/work/bcbiotx/tmpuuzhprs2/hg38 -t /data/kokyriakidis/rnaparkinson/PRJNA283498/work/inputs/transcriptome/hg38.fa
[2020-11-16T19:53Z] /data/software/rnaseq/galaxy/../anaconda/bin/salmon quant -l IU -p 28 -t /data/kokyriakidis/rnaparkinson/PRJNA283498/work/inputs/transcriptome/hg38.fa -o /data/kokyriakidis/rnaparkinson/PRJNA283498/work/bcbiotx/tmpsl_x3h30/C_0024 -a /data/kokyriakidis/rnaparkinson/PRJNA283498/work/align/C_0024/C_0024_star/C_0024.transcriptome.bam --numBootstraps 30
Any thoughts?
About this issue
- State: closed
- Created 4 years ago
- Comments: 15 (3 by maintainers)
Commits related to this issue
- fix stalled RNA-seq runs when there is no flenDist.txt related to #3377 — committed to naumenko-sa/bcbio-nextgen by naumenko-sa 4 years ago
- fix stalled RNA-seq runs when there is no flenDist.txt related to #3377 (#3397) — committed to bcbio/bcbio-nextgen by naumenko-sa 4 years ago
- Add missing r-janitor dependency. Closes bcbio/bcbio-nextgen#3377. — committed to chapmanb/cloudbiolinux by roryk 4 years ago
Hi @kokyriakidis,
Thanks, and sorry for these problems and for the slow reply. I agree it’s probably caused by this directory change; let me see if I can reproduce and fix it. Sorry for not catching this when I made the change.
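For anyone reading the traceback: its last two frames show `tz.first(files)` calling `next(iter(seq))` and raising `StopIteration`, which is what `next` does when the sequence it is given is empty. The related commits mention a missing flenDist.txt, so an empty file list looks like the likely trigger, though that is an inference from the commit message. Below is a minimal sketch of the failure mode and one way to guard against it. The helper names `first`, `first_or_none` and `pick_base_file` are hypothetical and written for illustration only; this is not the actual bcbio fix.

```python
def first(seq):
    """Same expression as the last traceback frame: next(iter(seq))."""
    return next(iter(seq))


def first_or_none(seq):
    """Return the first item, or None when seq is empty.

    next() accepts a default as its second argument, which suppresses
    the StopIteration raised for an exhausted iterator.
    """
    return next(iter(seq), None)


def pick_base_file(files):
    """Hypothetical helper: choose a base QC file if any were found."""
    base = first_or_none(files)
    if base is None:
        # Assumption: a missing file should be skipped rather than crash the run.
        return None
    return base


if __name__ == "__main__":
    try:
        first([])  # an empty list reproduces the error from the traceback
    except StopIteration:
        print("first([]) raised StopIteration")

    print(pick_base_file([]))
    print(pick_base_file(["flenDist.txt"]))
```

Whether to skip the QC entry or fail with a clearer message when the file is absent is a design choice; the sketch only shows that the empty case has to be handled explicitly.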