bcbio-nextgen: Error at RNA-Seq pipeline

Hey @roryk!

I get the following error while running the RNA-Seq pipeline:


[2020-11-16T20:06Z] Resource requests: cufflinks, samtools; memory: 4.00, 4.00; cores: 28, 28
[2020-11-16T20:06Z] Configuring 1 jobs to run, using 28 cores each with 112.1g of memory reserved for each job
[2020-11-16T20:06Z] Timing: disambiguation
[2020-11-16T20:06Z] Timing: transcript assembly
[2020-11-16T20:06Z] Timing: estimate expression (threaded)
[2020-11-16T20:06Z] multiprocessing: generate_transcript_counts
[2020-11-16T20:06Z] multiprocessing: run_salmon_index
[2020-11-16T20:06Z] Transcriptome index for /data/kokyriakidis/rnaparkinson/PRJNA283498/work/inputs/transcriptome/hg38.fa detected, skipping building.
Traceback (most recent call last):
  File "/data/software/rnaseq/tools/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/data/software/rnaseq/tools/bin/bcbio_nextgen.py", line 46, in main
[2020-11-16T20:06Z] multiprocessing: run_salmon_bam
    run_main(**kwargs)
  File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 257, in rnaseqpipeline
    samples = rnaseq.quantitate_expression_parallel(samples, run_parallel)
  File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/pipeline/rnaseq.py", line 241, in quantitate_expression_parallel
    samples = run_parallel("run_salmon_bam", samples)
  File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 1048, in __call__
    if self.dispatch_one_batch(iterator):
  File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 866, in dispatch_one_batch
    self._dispatch(tasks)
  File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 784, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 572, in __init__
    self.results = batch()
  File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 263, in __call__
    for func, args, kwargs in self.items]
  File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 263, in <listcomp>
    for func, args, kwargs in self.items]
  File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 76, in run_salmon_bam
    return salmon.run_salmon_bam(*args)
  File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/rnaseq/salmon.py", line 30, in run_salmon_bam
    data = dd.update_summary_qc(data, "salmon", base=dd.get_salmon_fraglen_file(data))
  File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/bcbio/pipeline/datadict.py", line 393, in update_summary_qc
    base = tz.first(files)
  File "/data/software/rnaseq/anaconda/lib/python3.6/site-packages/toolz/itertoolz.py", line 376, in first
    return next(iter(seq))
StopIteration
root@genome-assembly-1:/data/kokyriakidis/rnaparkinson/PRJNA283498/work#

My config file is:

---
details:
  - analysis: RNA-seq
    genome_build: hg38
    algorithm:
      aligner: star
      strandedness: unstranded
      quantify_genome_alignments: true
upload:
  dir: ../final

bcbio version 1.2.4

The last commands from the log are:

[2020-11-16T19:50Z] gffread -g /data/software/rnaseq/genomes/Hsapiens/hg38/seq/hg38.fa -w /data/kokyriakidis/rnaparkinson/PRJNA283498/work/bcbiotx/tmp6ot1mf22/hg38.fa.tmp /data/software/rnaseq/genomes/Hsapiens/hg38/rnaseq/ref-transcripts.gtf
[2020-11-16T19:51Z] /data/software/rnaseq/galaxy/../anaconda/bin/salmon index --keepDuplicates -k 31 -p 28 -i /data/kokyriakidis/rnaparkinson/PRJNA283498/work/bcbiotx/tmpuuzhprs2/hg38 -t /data/kokyriakidis/rnaparkinson/PRJNA283498/work/inputs/transcriptome/hg38.fa
[2020-11-16T19:53Z] /data/software/rnaseq/galaxy/../anaconda/bin/salmon quant -l IU -p 28 -t /data/kokyriakidis/rnaparkinson/PRJNA283498/work/inputs/transcriptome/hg38.fa -o /data/kokyriakidis/rnaparkinson/PRJNA283498/work/bcbiotx/tmpsl_x3h30/C_0024 -a /data/kokyriakidis/rnaparkinson/PRJNA283498/work/align/C_0024/C_0024_star/C_0024.transcriptome.bam --numBootstraps 30

Any thoughts?

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 15 (3 by maintainers)

Most upvoted comments

Hi @kokyriakidis,

Thanks, and sorry for these problems and for being slow to get back to you. I agree it’s probably caused by this directory change; let me see if I can reproduce and fix it. Sorry for not catching this when I made the directory change.
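
For anyone reading the traceback above: its last frame is `next(iter(seq))` inside `toolz.first`, which raises a bare `StopIteration` when the sequence is empty. That suggests the list of files handed to `update_summary_qc` was empty, presumably because the salmon output was not where bcbio looked for it. The sketch below only reproduces that behaviour with the standard library. It is not bcbio code, and `first_or_none` and the file name `example.txt` are hypothetical, used purely for illustration.

```python
def first(seq):
    """Same expression as the last traceback frame: next(iter(seq))."""
    return next(iter(seq))


def first_or_none(seq):
    """Hypothetical defensive variant: return None for an empty sequence."""
    return next(iter(seq), None)


# Assumption: no output file was found, so the list is empty.
files = []

try:
    first(files)
    outcome = "found"
except StopIteration:
    # This is the exception that ends the traceback.
    outcome = "StopIteration"

print(outcome)
print(first_or_none(files))
print(first_or_none(["example.txt"]))
```

A check of this kind would turn the crash into a clearer "file not found" condition, but the underlying cause (the missing file) would still need fixing.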