bcbio-nextgen: Encoding errors with CIVIC svprioritize and IPython distributed tasks

A typical error, raised inside pybedtools while iterating over a BED file (see the last frames of the traceback), looks like this:

Traceback (most recent call last):
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/ipythontasks.py", line 52, in _setup_logging
    yield config
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/ipythontasks.py", line 447, in detect_sv
    return ipython.zip_args(apply(structural.detect_sv, *args))
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/ipythontasks.py", line 80, in apply
    return object(*args, **kwargs)
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/structural/__init__.py", line 205, in detect_sv
    for svdata in caller_fn(items):
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/structural/cnvkit.py", line 57, in run
    return _cnvkit_by_type(items, background)
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/structural/cnvkit.py", line 67, in _cnvkit_by_type
    return _run_cnvkit_single(items[0])
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/structural/cnvkit.py", line 110, in _run_cnvkit_single
    return _associate_cnvkit_out(ckouts, [data])
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/structural/cnvkit.py", line 90, in _associate_cnvkit_out
    ckout = _add_plots_to_output(ckout, data)
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/structural/cnvkit.py", line 640, in _add_plots_to_output
    scatter = _add_scatter_plot(out, data)
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/structural/cnvkit.py", line 710, in _add_scatter_plot
    priority_bed = plot._prioritize_plot_regions(pybedtools.BedTool(priority_bed), data, os.path.dirname(out_file))
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/structural/plot.py", line 91, in _prioritize_plot_regions
    for r in region_bt:
  File "pybedtools/cbedtools.pyx", line 754, in pybedtools.cbedtools.IntervalIterator.__next__
  File "/opt/bcbio/anaconda/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 749: ordinal not in range(128)

Note that this only occurs when running distributed. Running with -t local, even with multiprocessing, does not trigger the error, and testing locally does not reproduce the problem either.
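The final frame of the traceback can be reproduced in isolation. The sketch below is only an illustration of the failure mode, not bcbio or pybedtools code: the BED-like line and the `try_decode` helper are made up, and it assumes the offending file contains a UTF-8 character such as an en dash, whose first byte is 0xe2. Under that assumption, decoding with the ASCII codec (which Python 3.6 falls back to in a C/POSIX locale) fails, while decoding as UTF-8 succeeds.

```python
import locale

# Hypothetical BED-like line containing a UTF-8 en dash (bytes 0xe2 0x80 0x93).
raw = "chr1\t100\t200\tGENE\u2013A\n".encode("utf-8")


def try_decode(data, encoding):
    """Return the decoded text, or the error message if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "error: {}".format(exc)


ascii_result = try_decode(raw, "ascii")
utf8_result = try_decode(raw, "utf-8")

# The preferred encoding depends on the process locale, which may differ
# between a login shell and a worker started by a cluster scheduler.
print("preferred encoding:", locale.getpreferredencoding(False))
print("ascii:", ascii_result)
print("utf-8:", utf8_result.strip())
```

Running the same check inside a distributed worker and in a local shell would show whether the two environments report a different preferred encoding, which would be consistent with the error appearing only in distributed runs.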

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 17 (8 by maintainers)

Most upvoted comments

No reason other than being scared of breaking stuff. A lot of the locale setting is pretty system specific, so I try to do as little of it, as non-invasively, as possible to make stuff work. I’ve always ended up breaking things whenever I changed these, so I was afraid to try something more global. You might be tougher than me, so it's totally up to you.
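As a rough illustration of the "minimally invasive" idea in this comment, the sketch below fills in a UTF-8 locale only when the effective locale of an environment is not already UTF-8, and leaves everything else alone. This is not bcbio's actual code: the function name `with_utf8_locale` and the `C.UTF-8` fallback are assumptions, and that locale is not available on every system, which is exactly the kind of system-specific detail the comment warns about.

```python
import os


def with_utf8_locale(env=None, fallback="C.UTF-8"):
    """Return a copy of env, overriding the locale only if it is not UTF-8.

    `fallback` is an assumed locale name; check `locale -a` on the target
    system before relying on it.
    """
    new_env = dict(os.environ if env is None else env)
    # Effective character-encoding locale: LC_ALL wins, then LC_CTYPE, then LANG.
    current = (new_env.get("LC_ALL") or new_env.get("LC_CTYPE")
               or new_env.get("LANG") or "")
    normalized = current.lower().replace("-", "")
    if "utf8" not in normalized:
        new_env["LC_ALL"] = fallback
        new_env["LANG"] = fallback
    return new_env
```

The returned dictionary could be passed as `env=` to `subprocess.run` for a single command, so the change stays scoped to that call rather than altering the locale globally.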