bcbio-nextgen: Encoding errors with CIVIC svprioritize and IPython distributed tasks
A typical error, which appears to be raised inside pybedtools while iterating over a BED file, looks like this:
Traceback (most recent call last):
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/ipythontasks.py", line 52, in _setup_logging
    yield config
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/ipythontasks.py", line 447, in detect_sv
    return ipython.zip_args(apply(structural.detect_sv, *args))
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/ipythontasks.py", line 80, in apply
    return object(*args, **kwargs)
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/structural/__init__.py", line 205, in detect_sv
    for svdata in caller_fn(items):
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/structural/cnvkit.py", line 57, in run
    return _cnvkit_by_type(items, background)
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/structural/cnvkit.py", line 67, in _cnvkit_by_type
    return _run_cnvkit_single(items[0])
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/structural/cnvkit.py", line 110, in _run_cnvkit_single
    return _associate_cnvkit_out(ckouts, [data])
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/structural/cnvkit.py", line 90, in _associate_cnvkit_out
    ckout = _add_plots_to_output(ckout, data)
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/structural/cnvkit.py", line 640, in _add_plots_to_output
    scatter = _add_scatter_plot(out, data)
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/structural/cnvkit.py", line 710, in _add_scatter_plot
    priority_bed = plot._prioritize_plot_regions(pybedtools.BedTool(priority_bed), data, os.path.dirname(out_file))
  File "/opt/bcbio/anaconda/lib/python3.6/site-packages/bcbio/structural/plot.py", line 91, in _prioritize_plot_regions
    for r in region_bt:
  File "pybedtools/cbedtools.pyx", line 754, in pybedtools.cbedtools.IntervalIterator.__next__
  File "/opt/bcbio/anaconda/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 749: ordinal not in range(128)
Note that this only occurs when running distributed. Running locally (-t local), even with multiprocessing, does not trigger the error, and testing the same step locally doesn’t yield problems either.
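For anyone trying to narrow this down, here is a minimal sketch of the failure mode. It assumes (not confirmed in this report) that the distributed engines start with a locale whose default text encoding is ASCII, while the local shell uses UTF-8. The BED line and the helper names `describe_text_encoding` and `try_decode` are made up for illustration; byte 0xe2 is the first byte of several UTF-8 punctuation characters such as an en dash.

```python
import locale
import os


def describe_text_encoding():
    """Collect settings that influence Python's default text decoding.

    Running this both locally and inside a distributed task would show
    whether the two environments actually differ.
    """
    return {
        "preferred_encoding": locale.getpreferredencoding(False),
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
    }


def try_decode(raw, encoding):
    """Decode bytes, returning the error text instead of raising."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return "failed: {}".format(exc)


# Hypothetical BED line whose name column contains an en dash (U+2013),
# which UTF-8 encodes as the bytes e2 80 93.
sample = "chr1\t100\t200\tgene\u2013name\n".encode("utf-8")

ascii_result = try_decode(sample, "ascii")   # fails on byte 0xe2
utf8_result = try_decode(sample, "utf-8")    # decodes cleanly

print(describe_text_encoding())
print(ascii_result)
```

If the preferred encoding reported on the engines is ASCII (for example ANSI_X3.4-1968) and the local one is UTF-8, that would be consistent with the error appearing only in distributed runs.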
About this issue
- State: closed
- Created 5 years ago
- Comments: 17 (8 by maintainers)
No reason other than being scared of breaking stuff. A lot of the locale setting is pretty system specific, so I’ve been trying to do as little of it, and as non-invasively, as possible to make stuff work. I’ve always ended up breaking things whenever I changed these settings, so I was afraid to try something more global. You might be tougher than me, so it’s totally up to you.
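To make the "minimally invasive" idea concrete, here is a rough sketch of a scoped override rather than a global one. It is not what bcbio does; `scoped_locale` is a hypothetical helper, and the default "C.UTF-8" is an assumption that may not exist on every system (some only ship en_US.UTF-8).

```python
import contextlib
import os


@contextlib.contextmanager
def scoped_locale(lang="C.UTF-8"):
    """Temporarily set LC_ALL and LANG, restoring the old values on exit.

    Hypothetical helper: the override is visible to child processes
    started inside the block, and the surrounding environment is left
    as it was found once the block ends.
    """
    keys = ("LC_ALL", "LANG")
    saved = {key: os.environ.get(key) for key in keys}
    os.environ.update({key: lang for key in keys})
    try:
        yield
    finally:
        for key, value in saved.items():
            if value is None:
                # The variable was unset before, so unset it again.
                os.environ.pop(key, None)
            else:
                os.environ[key] = value


# Usage: only the wrapped step sees the UTF-8 locale.
with scoped_locale():
    print(os.environ["LC_ALL"])
```

The trade-off is the one described above: the narrower the scope, the less likely it is to break other system-specific setups, but every step that reads non-ASCII text needs to be wrapped individually.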