LongQC: pandas.errors.EmptyDataError: No columns to parse from file

Hi,

Thanks for the great work.

I am running into an issue similar to the ones described in #28 and #34.

longQC:2021-10-27 08:06:14,443:598:INFO:Generating coverage related plots...
Traceback (most recent call last):
  File "/storage/home/hcoda1/3/apfennig3/LongQC/longQC.py", line 956, in <module>
    main(args)
  File "/storage/home/hcoda1/3/apfennig3/LongQC/longQC.py", line 62, in main
    args.handler(args)
  File "/storage/home/hcoda1/3/apfennig3/LongQC/longQC.py", line 602, in command_sample
    lc = LqCoverage(cov_path, isTranscript=args.transcript, control_filtering=pb_control)
  File "/storage/home/hcoda1/3/apfennig3/LongQC/lq_coverage.py", line 88, in __init__
    self.df = pd.read_table(table_path, sep='\t', header=None, dtype={3: str, 4: str})
  File "/storage/home/hcoda1/3/apfennig3/.conda/envs/GBL/lib/python3.9/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/storage/home/hcoda1/3/apfennig3/.conda/envs/GBL/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 683, in read_table
    return _read(filepath_or_buffer, kwds)
  File "/storage/home/hcoda1/3/apfennig3/.conda/envs/GBL/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 482, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/storage/home/hcoda1/3/apfennig3/.conda/envs/GBL/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 811, in __init__
    self._engine = self._make_engine(self.engine)
  File "/storage/home/hcoda1/3/apfennig3/.conda/envs/GBL/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1040, in _make_engine
    return mapping[engine](self.f, **self.options)  # type: ignore[call-arg]
  File "/storage/home/hcoda1/3/apfennig3/.conda/envs/GBL/lib/python3.9/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 69, in __init__
    self._reader = parsers.TextReader(self.handles.handle, **kwds)
  File "pandas/_libs/parsers.pyx", line 549, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file

However, I don’t think it’s a memory issue. I already reduced the index size to 100M. The peak RSS is 6.7G and 22.7G during the spiked-in control, which seems to run through normally. I requested 64G of RAM, so memory should not be the limiting factor here. This is the command I used to execute the pipeline:

python ${home_dir}LongQC/longQC.py sampleqc -o ${home_dir}scratch/QC/ -i 100M -x pb-sequel --sample_name gbl -m 1 -p 64 ${home_dir}scratch/gbl.subreads.bam

The coverage_out.txt file is empty, causing the error. I attached the coverage_err.txt file, the log file, and the files corresponding to the spiked-in control:

coverage_err_gbl.txt qc.log spiked_in_control_gbl.txt spiked_in_control_gbl_stderr.txt
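Since the traceback only surfaces once pandas tries to parse the empty coverage_out.txt, a pre-check before the `pd.read_table` call could give a more useful message. The snippet below is a minimal sketch, not part of LongQC: the helper name `check_coverage_output` and its parameters are hypothetical, and it assumes the coverage step writes its stdout and stderr to two separate files, as the attached coverage_err file suggests. It only detects a missing or zero-byte file.

```python
import os


def check_coverage_output(out_path, err_path=None, tail_lines=20):
    """Hypothetical helper: fail early if the coverage output is missing or empty.

    out_path: file the coverage step was expected to fill (e.g. coverage_out.txt).
    err_path: optional stderr file from the same step, quoted in the error message.
    """
    # A non-empty file is considered fine; parsing is left to the caller.
    if os.path.exists(out_path) and os.path.getsize(out_path) > 0:
        return

    # Collect the last few stderr lines, if available, to hint at the real cause.
    detail = ""
    if err_path and os.path.exists(err_path):
        with open(err_path, errors="replace") as fh:
            detail = "".join(fh.readlines()[-tail_lines:])

    raise RuntimeError(
        f"Coverage output '{out_path}' is missing or empty; "
        f"the upstream coverage step probably failed.\n{detail}"
    )
```

Calling this right before the table is loaded would turn the `EmptyDataError` into an error that points at the coverage step and quotes the end of its stderr file.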

Any thoughts on this?

Thanks, Aaron

About this issue

  • State: open
  • Created 3 years ago
  • Reactions: 2
  • Comments: 15 (2 by maintainers)

Most upvoted comments

Same issue here, on a 64-core server with 700 GB of RAM. Edit: same error with version 1.2.0c and with the git clone version.