bcbio-nextgen: Error running bcbioRNASeq from within bcbio: there is no package called ‘bcbioRNASeq’

Hello!

I’m trying to run a bulk RNA-seq analysis using the following template:

# Template for human RNA-seq using Illumina prepared samples
---
details:
  - analysis: RNA-seq
    genome_build: sacCer3
    algorithm:
## for hg38, change the aligner to hisat2
      aligner: hisat2
      tools_on: bcbiornaseq
      bcbiornaseq:
        organism: saccharomyces cerevisiae
        interesting_groups: panel
upload:
  dir: ../final

However, this ends with the following error:

[2021-11-26T07:15Z] Storing in local filesystem: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-11-26_rna-seq-analysis/tpm/tximport-tpm.csv
[2021-11-26T07:15Z] Storing in local filesystem: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-11-26_rna-seq-analysis/counts/tximport-counts.csv
[2021-11-26T07:15Z] Storing in local filesystem: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-11-26_rna-seq-analysis/tx2gene.csv
[2021-11-26T07:15Z] Storing directory in local filesystem: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-11-26_rna-seq-analysis/transcriptome
[2021-11-26T07:15Z] multiprocessing: upload_samples_project
[2021-11-26T07:15Z] Storing in local filesystem: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-11-26_rna-seq-analysis/bcbio-nextgen.log
[2021-11-26T07:15Z] multiprocessing: upload_samples_project
[2021-11-26T07:15Z] multiprocessing: upload_samples_project
[2021-11-26T07:15Z] Storing in local filesystem: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-11-26_rna-seq-analysis/bcbio-nextgen.log
[2021-11-26T07:15Z] multiprocessing: upload_samples_project
[2021-11-26T07:15Z] multiprocessing: upload_samples_project
[2021-11-26T07:15Z] Storing in local filesystem: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-11-26_rna-seq-analysis/bcbio-nextgen.log
[2021-11-26T07:15Z] Timing: bcbioRNAseq loading
[2021-11-26T07:15Z] multiprocessing: run_bcbiornaseqload
[2021-11-26T07:15Z] Loading bcbioRNASeq object.
[2021-11-26T07:15Z] Error in library(bcbioRNASeq) : there is no package called ‘bcbioRNASeq’
[2021-11-26T07:15Z] Execution halted
[2021-11-26T07:15Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/home/user/bcbio-nextgen/anaconda/bin/Rscript --vanilla /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/bcbioRNASeq/load_bcbioRNAseq.R
Error in library(bcbioRNASeq) : there is no package called ‘bcbioRNASeq’
Execution halted
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/home/user/bcbio-nextgen/anaconda/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/home/user/bcbio-nextgen/anaconda/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/pipeline/main.py", line 290, in rnaseqpipeline
    run_parallel("run_bcbiornaseqload", [sample])
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 1048, in __call__
    if self.dispatch_one_batch(iterator):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 866, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 784, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 572, in __init__
    self.results = batch()
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 263, in __call__
    for func, args, kwargs in self.items]
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 263, in <listcomp>
    for func, args, kwargs in self.items]
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/utils.py", line 59, in wrapper
    return f(*args, **kwargs)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/distributed/multitasks.py", line 92, in run_bcbiornaseqload
    return bcbiornaseq.make_bcbiornaseq_object(*args)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/rnaseq/bcbiornaseq.py", line 33, in make_bcbiornaseq_object
    do.run([rcmd, "--vanilla", r_file], "Loading bcbioRNASeq object.")
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/home/user/bcbio-nextgen/anaconda/bin/Rscript --vanilla /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/bcbioRNASeq/load_bcbioRNAseq.R
Error in library(bcbioRNASeq) : there is no package called ‘bcbioRNASeq’
Execution halted
' returned non-zero exit status 1.

This is strange to see, because the package does seem to be installed in the rbcbiornaseq environment:

$ bcbio_conda list -n rbcbiornaseq r-bcbiornaseq
# packages in environment at /home/user/bcbio-nextgen/anaconda/envs/rbcbiornaseq:
#
# Name                    Version                   Build  Channel
r-bcbiornaseq             0.3.42            r41hdfd78af_0    bioconda
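
One detail in the log above may explain the mismatch: the failing command calls /home/user/bcbio-nextgen/anaconda/bin/Rscript, whereas the package is listed under anaconda/envs/rbcbiornaseq. Here is a minimal Python sketch for spotting this from a log line. The helper names (rscript_in_message, conda_env_of, expected_rscript) are hypothetical and not part of bcbio, and the envs/<name>/bin layout is assumed from the paths shown in this thread.

```python
import re
from pathlib import PurePosixPath
from typing import Optional


def rscript_in_message(message: str) -> Optional[PurePosixPath]:
    """Return the first absolute path ending in /Rscript found in a log message."""
    match = re.search(r"(/\S*/Rscript)\b", message)
    return PurePosixPath(match.group(1)) if match else None


def conda_env_of(binary: PurePosixPath) -> Optional[str]:
    """Return the env name if the binary sits under .../envs/<name>/bin, else None."""
    parts = binary.parts
    if "envs" in parts:
        idx = parts.index("envs")
        if idx + 2 < len(parts) and parts[idx + 2] == "bin":
            return parts[idx + 1]
    return None  # not inside a named env (for example, the base installation)


def expected_rscript(anaconda_dir: str, env: str) -> PurePosixPath:
    """Build the Rscript path inside a named conda env (assumed layout)."""
    return PurePosixPath(anaconda_dir) / "envs" / env / "bin" / "Rscript"


# Log line copied from the traceback above.
log_line = (
    "subprocess.CalledProcessError: Command "
    "'/home/user/bcbio-nextgen/anaconda/bin/Rscript --vanilla "
    "/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/bcbioRNASeq/load_bcbioRNAseq.R"
)
used = rscript_in_message(log_line)
wanted = expected_rscript("/home/user/bcbio-nextgen/anaconda", "rbcbiornaseq")
print("used:  ", used, "env:", conda_env_of(used))
print("wanted:", wanted, "env:", conda_env_of(wanted))
```

If the two paths differ, the R process that runs load_bcbioRNAseq.R would not see packages installed only in the rbcbiornaseq environment.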

About this issue

  • State: open
  • Created 3 years ago
  • Reactions: 2
  • Comments: 73 (72 by maintainers)

Most upvoted comments

@naumenko-sa Hi Sergey, following up on this, I’m working on a code update this week and will ping you back soon.

Hello, I’m getting a similar error when trying to install Trinity with conda: ERROR conda.core.link:_execute(730): An error occurred while installing package ‘bioconda::bioconductor-go.db-3.14.0-r41hdfd78af_0’. I’ve tried many conda versions, but the error persists. What can I do to fix it?

Sure thing, here you go. I changed the extension to .txt so that GitHub would accept it.

ref-transcripts.gtf.txt

Thanks for the reply @mjsteinbaugh

I’m attaching the file you requested: tx2gene.csv

I’m also attaching a file with the commands I used to set up bcbio, to download the data and to set up the bcbio runs.

The relevant lines for this analysis are 115-166 (downloading the data) and 206-235 (setting up and running the analysis).

Hope this helps.

VM-setup.txt

@naumenko-sa @mjsteinbaugh

Thanks again for all your help so far. After upgrading to the latest development version and getting the sacCer3 data, the RNA-seq analysis progressed further for me, but still ended up crashing.

Let me know if you want me to share a script with everything I’m doing here, in case it could help with reproducing and debugging. Here’s the error I’m running into:

[2021-12-18T11:58Z] multiprocessing: upload_samples_project
[2021-12-18T11:58Z] Storing in local filesystem: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/bcbio-nextgen.log
[2021-12-18T11:58Z] multiprocessing: upload_samples_project
[2021-12-18T11:58Z] multiprocessing: upload_samples_project
[2021-12-18T11:58Z] multiprocessing: upload_samples_project
[2021-12-18T11:58Z] Storing in local filesystem: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/bcbio-nextgen.log
[2021-12-18T11:58Z] multiprocessing: upload_samples_project
[2021-12-18T11:58Z] Timing: bcbioRNAseq loading
[2021-12-18T11:58Z] multiprocessing: run_bcbiornaseqload
[2021-12-18T11:58Z] Loading bcbioRNASeq object.
[2021-12-18T11:58Z] Loading required package: basejump
[2021-12-18T11:58Z] Attaching package: ‘basejump’
[2021-12-18T11:58Z] The following objects are masked from ‘package:stats’:
[2021-12-18T11:58Z]     complete.cases, cor, end, median, na.omit, quantile, sd, start, var
[2021-12-18T11:58Z] The following objects are masked from ‘package:utils’:
[2021-12-18T11:58Z]     head, relist, tail
[2021-12-18T11:58Z] The following objects are masked from ‘package:base’:
[2021-12-18T11:58Z]     %in%, anyDuplicated, append, as.factor, as.list, as.matrix,
[2021-12-18T11:58Z]     as.table, basename, cbind, colnames, colnames<-, colSums, dirname,
[2021-12-18T11:58Z]     do.call, duplicated, eval, expand.grid, get, grep, grepl, gsub,
[2021-12-18T11:58Z]     intersect, is.unsorted, lapply, mapply, match, mean, merge, mget,
[2021-12-18T11:58Z]     ncol, nrow, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
[2021-12-18T11:58Z]     rbind, rep.int, rowMeans, rownames, rownames<-, rowSums, sapply,
[2021-12-18T11:58Z]     setdiff, sort, split, sub, subset, summary, t, table, tapply,
[2021-12-18T11:58Z]     union, unique, unsplit, which, which.max, which.min
[2021-12-18T11:58Z] 🧪 # bcbioRNASeq
[2021-12-18T11:58Z] ℹ Importing bcbio-nextgen RNA-seq run.
[2021-12-18T11:58Z] 🧪 ## Run info
[2021-12-18T11:58Z] uploadDir: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final
[2021-12-18T11:58Z] projectDir: 2021-12-18_rna-seq-analysis
[2021-12-18T11:58Z] ℹ 7 samples detected:
[2021-12-18T11:58Z] • AE1
[2021-12-18T11:58Z] • AE2
[2021-12-18T11:58Z] • AE3
[2021-12-18T11:58Z] • bcbioRNASeq
[2021-12-18T11:58Z] • RT1
[2021-12-18T11:58Z] • RT2
[2021-12-18T11:58Z] • RT3
[2021-12-18T11:58Z] → Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/project-summary.yaml' using yaml::`yaml.load_file()`.
[2021-12-18T11:58Z] → Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/data_versions.csv' using data.table::`fread()`.
[2021-12-18T11:58Z] → Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/programs.txt' using data.table::`fread()`.
[2021-12-18T11:58Z] → Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/bcbio-nextgen.log' using base::`readLines()`.
[2021-12-18T11:58Z] → Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/bcbio-nextgen-commands.log' using base::`readLines()`.
[2021-12-18T11:58Z] 🧪 ## Sample metadata
[2021-12-18T11:58Z] → Getting sample metadata from YAML.
[2021-12-18T11:58Z] Loading a subset of samples:
[2021-12-18T11:58Z] • AE1
[2021-12-18T11:58Z] • AE2
[2021-12-18T11:58Z] • AE3
[2021-12-18T11:58Z] • RT1
[2021-12-18T11:58Z] • RT2
[2021-12-18T11:58Z] • RT3
[2021-12-18T11:58Z] → Getting sample quality control metrics from YAML.
[2021-12-18T11:58Z] 🧪 ## Counts
[2021-12-18T11:58Z] 🧪 ### tximport
[2021-12-18T11:58Z] → Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/tx2gene.csv' using data.table::`fread()`.
[2021-12-18T11:58Z] Error in validObject(.Object) :
[2021-12-18T11:58Z]   invalid class “Tx2Gene” object: Some transcript and gene identifiers are identical.
[2021-12-18T11:58Z] Calls: bcbioRNASeq ... .local -> new -> initialize -> initialize -> validObject
[2021-12-18T11:58Z] Execution halted
[2021-12-18T11:58Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/home/user/bcbio-nextgen/anaconda/envs/rbcbiornaseq/bin/Rscript --vanilla /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/bcbioRNASeq/load_bcbioRNAseq.R
Loading required package: basejump
Attaching package: ‘basejump’
The following objects are masked from ‘package:stats’:
    complete.cases, cor, end, median, na.omit, quantile, sd, start, var
The following objects are masked from ‘package:utils’:
    head, relist, tail
The following objects are masked from ‘package:base’:
    %in%, anyDuplicated, append, as.factor, as.list, as.matrix,
    as.table, basename, cbind, colnames, colnames<-, colSums, dirname,
    do.call, duplicated, eval, expand.grid, get, grep, grepl, gsub,
    intersect, is.unsorted, lapply, mapply, match, mean, merge, mget,
    ncol, nrow, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rep.int, rowMeans, rownames, rownames<-, rowSums, sapply,
    setdiff, sort, split, sub, subset, summary, t, table, tapply,
    union, unique, unsplit, which, which.max, which.min
🧪 # bcbioRNASeq
ℹ Importing bcbio-nextgen RNA-seq run.
🧪 ## Run info
uploadDir: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final
projectDir: 2021-12-18_rna-seq-analysis
ℹ 7 samples detected:
• AE1
• AE2
• AE3
• bcbioRNASeq
• RT1
• RT2
• RT3
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/project-summary.yaml' using yaml::`yaml.load_file()`.
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/data_versions.csv' using data.table::`fread()`.
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/programs.txt' using data.table::`fread()`.
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/bcbio-nextgen.log' using base::`readLines()`.
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/bcbio-nextgen-commands.log' using base::`readLines()`.
🧪 ## Sample metadata
→ Getting sample metadata from YAML.
Loading a subset of samples:
• AE1
• AE2
• AE3
• RT1
• RT2
• RT3
→ Getting sample quality control metrics from YAML.
🧪 ## Counts
🧪 ### tximport
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/tx2gene.csv' using data.table::`fread()`.
Error in validObject(.Object) : 
  invalid class “Tx2Gene” object: Some transcript and gene identifiers are identical.
Calls: bcbioRNASeq ... .local -> new -> initialize -> initialize -> validObject
Execution halted
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/home/user/bcbio-nextgen/anaconda/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/home/user/bcbio-nextgen/anaconda/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/pipeline/main.py", line 290, in rnaseqpipeline
    run_parallel("run_bcbiornaseqload", [sample])
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 1048, in __call__
    if self.dispatch_one_batch(iterator):
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 866, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 784, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 572, in __init__
    self.results = batch()
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 263, in __call__
    for func, args, kwargs in self.items]
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 263, in <listcomp>
    for func, args, kwargs in self.items]
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/utils.py", line 59, in wrapper
    return f(*args, **kwargs)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/distributed/multitasks.py", line 92, in run_bcbiornaseqload
    return bcbiornaseq.make_bcbiornaseq_object(*args)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/rnaseq/bcbiornaseq.py", line 31, in make_bcbiornaseq_object
    do.run([rcmd, "--vanilla", r_file], "Loading bcbioRNASeq object.")
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/home/user/bcbio-nextgen/anaconda/envs/rbcbiornaseq/bin/Rscript --vanilla /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/bcbioRNASeq/load_bcbioRNAseq.R
Loading required package: basejump
Attaching package: ‘basejump’
The following objects are masked from ‘package:stats’:
    complete.cases, cor, end, median, na.omit, quantile, sd, start, var
The following objects are masked from ‘package:utils’:
    head, relist, tail
The following objects are masked from ‘package:base’:
    %in%, anyDuplicated, append, as.factor, as.list, as.matrix,
    as.table, basename, cbind, colnames, colnames<-, colSums, dirname,
    do.call, duplicated, eval, expand.grid, get, grep, grepl, gsub,
    intersect, is.unsorted, lapply, mapply, match, mean, merge, mget,
    ncol, nrow, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rep.int, rowMeans, rownames, rownames<-, rowSums, sapply,
    setdiff, sort, split, sub, subset, summary, t, table, tapply,
    union, unique, unsplit, which, which.max, which.min
🧪 # bcbioRNASeq
ℹ Importing bcbio-nextgen RNA-seq run.
🧪 ## Run info
uploadDir: /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final
projectDir: 2021-12-18_rna-seq-analysis
ℹ 7 samples detected:
• AE1
• AE2
• AE3
• bcbioRNASeq
• RT1
• RT2
• RT3
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/project-summary.yaml' using yaml::`yaml.load_file()`.
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/data_versions.csv' using data.table::`fread()`.
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/programs.txt' using data.table::`fread()`.
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/bcbio-nextgen.log' using base::`readLines()`.
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/bcbio-nextgen-commands.log' using base::`readLines()`.
🧪 ## Sample metadata
→ Getting sample metadata from YAML.
Loading a subset of samples:
• AE1
• AE2
• AE3
• RT1
• RT2
• RT3
→ Getting sample quality control metrics from YAML.
🧪 ## Counts
🧪 ### tximport
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2021-12-18_rna-seq-analysis/tx2gene.csv' using data.table::`fread()`.
Error in validObject(.Object) : 
  invalid class “Tx2Gene” object: Some transcript and gene identifiers are identical.
Calls: bcbioRNASeq ... .local -> new -> initialize -> initialize -> validObject
Execution halted
' returned non-zero exit status 1.
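
For anyone who hits the same Tx2Gene validation error, here is a minimal Python sketch for finding the offending identifiers in tx2gene.csv. It assumes a headerless two-column CSV (transcript ID, gene ID). The log does not show the exact rule the validity check applies, so the sketch uses the broader test and reports any identifier that occurs in both columns. shared_ids is a hypothetical helper, and the sample rows are made up for illustration.

```python
import csv
import io
from typing import Iterable, List


def shared_ids(rows: Iterable[List[str]]) -> List[str]:
    """Return identifiers that appear in both the transcript and gene columns."""
    transcripts, genes = set(), set()
    for row in rows:
        if len(row) >= 2:  # skip blank or malformed lines
            transcripts.add(row[0].strip())
            genes.add(row[1].strip())
    return sorted(transcripts & genes)


# Made-up rows for illustration; they are not taken from the attached file.
sample = io.StringIO("tx_A,gene_A\nid_B,id_B\ntx_C,gene_C\n")
overlap = shared_ids(csv.reader(sample))
print(overlap)
```

To run it on a real file, open it with open("tx2gene.csv", newline="") and pass csv.reader(handle) to shared_ids.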

Good job, @mjsteinbaugh. I confirm that it works in the seqc bcbio test: the report is there. @amizeranschi, let us know if that works for you as well!

@naumenko-sa OK these issues should be fixed with r-acidmarkdown v0.1.5, which I’m rolling out onto bioconda shortly.

OK, I’ll look into this, and maybe we can do a minor bug fix in bcbioRNASeq to address it.

The recipe is merged, thanks so much!

@mjsteinbaugh I see you are reverting r-bcbiornaseq back to R 4.0 and Bioconductor 3.13.

On the bcbio side, the conda installation of r-bcbiornaseq=0.3.42 + R 4.1 + Bioconductor 3.14 in a separate env went OK, and it worked OK (apart from the latest small fixes). We have already introduced R 4.1 native pipes in the bcbio code for bcbiornaseq calls (https://github.com/bcbio/bcbio-nextgen/blob/master/bcbio/rnaseq/bcbiornaseq.py#L170), so reverting to Bioconductor 3.13 will break the bcbio code.

It is easy to fix; just let me know whether Bioconductor 3.13 is the final choice for r-bcbiornaseq=0.3.44.

Sorry, I am releasing today, so I need a freeze of the bcbio code.
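
For context on why the R version matters here: the native pipe operator |> was added in R 4.1.0, so R code that uses it does not parse on R 4.0. Here is a minimal sketch of a version guard, written in Python because bcbio itself is Python. supports_native_pipe is a hypothetical helper, not a bcbio function.

```python
def supports_native_pipe(r_version: str) -> bool:
    """Return True if the given R version string is 4.1 or newer."""
    major, minor = (int(part) for part in r_version.split(".")[:2])
    return (major, minor) >= (4, 1)


for version in ("4.0.5", "4.1.2"):
    print(version, supports_native_pipe(version))
```

A check like this could decide whether generated R code may use |> or needs a form that also parses on older R.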

@naumenko-sa OK, I’m working on the bioconda build this morning: https://github.com/bioconda/bioconda-recipes/pull/31978/

I’ve pushed the Rmd change, created a new tag, and submitted a PR to bioconda: https://github.com/bioconda/bioconda-recipes/pull/31985

If you could facilitate merging it, that would be really appreciated. I would then be able to go ahead with a bcbio release.

Thanks Michael for the quick fix, we are almost there!

I confirm that after the manual update in anaconda/envs/rbcbiornaseq/bin/R with

if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}
install.packages(
    pkgs = "bcbioRNASeq",
    repos = c(
        "https://r.acidgenomics.com",
        BiocManager::repositories()
    )
)

it gets past the previous break point. It then fails at https://github.com/bcbio/bcbio-nextgen/blob/master/bcbio/rnaseq/bcbiornaseq.py#L110 with:

subprocess.CalledProcessError: Command '/n/data1/cores/bcbio/naumenko/bcbio_devel/anaconda/envs/rbcbiornaseq/bin/Rscript --vanilla -e rmarkdown::draft("/n/data1/cores/bcbio/naumenko/_example_bcbio_runs/3_bulk_rnaseq_6samples_chr22_fast/seqc/final/bcbioRNASeq/quality_control.Rmd", template="quality_control", package="bcbioRNASeq", edit=FALSE)
Error in rmarkdown::draft("/n/data1/cores/bcbio/naumenko/_example_bcbio_runs/3_bulk_rnaseq_6samples_chr22_fast/seqc/final/bcbioRNASeq/quality_control.Rmd",  : 
  The template 'quality_control' was not found in the bcbioRNASeq package
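
A quick way to see which templates an installed R package actually ships is to look inside its rmarkdown/templates directory, where each template is a folder containing a template.yaml file. Here is a minimal Python sketch. list_rmarkdown_templates is a hypothetical helper, and the demo runs against a throwaway directory rather than a real installation.

```python
import tempfile
from pathlib import Path
from typing import List


def list_rmarkdown_templates(package_dir: Path) -> List[str]:
    """List template folders (those containing template.yaml) in an installed R package."""
    templates_dir = package_dir / "rmarkdown" / "templates"
    if not templates_dir.is_dir():
        return []
    return sorted(
        entry.name
        for entry in templates_dir.iterdir()
        if (entry / "template.yaml").is_file()
    )


# Demo on a fake package layout created in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    fake_pkg = Path(tmp) / "bcbioRNASeq"
    template = fake_pkg / "rmarkdown" / "templates" / "quality_control"
    template.mkdir(parents=True)
    (template / "template.yaml").write_text("name: demo\n")
    found = list_rmarkdown_templates(fake_pkg)

print(found)
```

For a real check, point package_dir at the package folder inside the R library of the environment in question. An empty result would be consistent with the "template not found" error above.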

Could you please take a look? Sergey

Thanks @mjsteinbaugh ! Almost there!

With these changes plus tools_on: keep_gene_version, which keeps transcript versions in tx2gene (https://github.com/bcbio/bcbio-nextgen/pull/3568):

ENST00000456328.2,ENSG00000223972
ENST00000450305.2,ENSG00000223972
ENST00000488147.1,ENSG00000227232
ENST00000619216.1,ENSG00000278267
ENST00000473358.1,ENSG00000243485
ENST00000469289.1,ENSG00000243485
ENST00000607096.1,ENSG00000284332
ENST00000417324.1,ENSG00000237613
ENST00000461467.1,ENSG00000237613
ENST00000606857.1,ENSG00000268020

I am getting:

→ Importing '/n/data1/cores/bcbio/naumenko/_example_bcbio_runs/3_bulk_rnaseq_6samples_chr22_fast/seqc/final/2021-12-02_seqc/bcbio-nextgen-commands.log' using base::`readLines()`.
🧪 ## Sample metadata
→ Getting sample metadata from YAML.
Loading a subset of samples:
• HBRR_rep1
• HBRR_rep2
• HBRR_rep3
• UHRR_rep1
• UHRR_rep2
• UHRR_rep3
→ Getting sample quality control metrics from YAML.
🧪 ## Counts
🧪 ### tximport
→ Importing '/n/data1/cores/bcbio/naumenko/_example_bcbio_runs/3_bulk_rnaseq_6samples_chr22_fast/seqc/final/2021-12-02_seqc/tx2gene.csv' using data.table::`fread()`.
→ Importing salmon transcript-level counts from 'quant.sf' files using tximport 1.22.0.
countsFromAbundance: lengthScaledTPM
txOut: TRUE
reading in files with read_tsv
1 2 3 4 5 6 
Error in .isTximportReturn(txi) : Assert failure.
[2] identical(rownames(infReps[[1L]]), rownames(abundance)) is not TRUE.
Calls: bcbioRNASeq -> .tximport -> assert -> .isTximportReturn -> assert
Execution halted
' returned non-zero exit status 1.
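
As background on the versioned identifiers in the tx2gene rows above (this does not diagnose the assert failure itself): with keep_gene_version, transcript IDs carry a ".N" version suffix while the gene IDs shown do not. Here is a minimal Python sketch of separating the base ID from the version. split_version is a hypothetical helper.

```python
from typing import Optional, Tuple


def split_version(identifier: str) -> Tuple[str, Optional[int]]:
    """Split 'ENST00000456328.2' into ('ENST00000456328', 2); unversioned IDs get None."""
    base, dot, version = identifier.rpartition(".")
    if dot and version.isdigit():
        return base, int(version)
    return identifier, None


# One row copied from the tx2gene excerpt above.
transcript_id, gene_id = "ENST00000456328.2,ENSG00000223972".split(",")
print(split_version(transcript_id))
print(split_version(gene_id))
```

A helper like this can be handy when comparing identifiers across files that differ only in whether versions are kept.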

The sample sheet:

samplename,description,category
UHRR_rep1,UHRR_rep1,UHRR
HBRR_rep1,HBRR_rep1,HBRR
UHRR_rep2,UHRR_rep2,UHRR
HBRR_rep2,HBRR_rep2,HBRR
UHRR_rep3,UHRR_rep3,UHRR
HBRR_rep3,HBRR_rep3,HBRR

The yaml template:

details:
  - analysis: RNA-seq
    genome_build: hg38
    algorithm:
      quality_format: standard
      aligner: false
      strandedness: unstranded
      tools_on:
      - bcbiornaseq
      - keep_gene_version
      bcbiornaseq:
        organism: homo sapiens
        interesting_groups: category
upload:
  dir: ../final
resources:
  star:
    cores: 10
    memory: 10G

The basic tximport companion works ok: https://github.com/bcbio/bcbio-nextgen/blob/master/bcbio/scripts/R/bcbio2se.R

Sergey

Hi @amizeranschi !

Thanks for testing and reporting! I’ve fixed the paths to Rscript: https://github.com/bcbio/bcbio-nextgen/pull/3567

I am now getting the error below:

> library(bcbioRNASeq)
Loading required package: basejump
Error: package or namespace load failed for ‘basejump’:
 object ‘metadataBlacklist’ is not exported by 'namespace:AcidBase'
Error: package ‘basejump’ could not be loaded
> library(basejump)
Error: package or namespace load failed for ‘basejump’:
 object ‘metadataBlacklist’ is not exported by 'namespace:AcidBase'

@mjsteinbaugh could you please help us with this error?

Sergey