mne-python: BUG: import of EDF files with accented annotations

Hey! When importing an EDF file, I get a crash:

ValueError: byte must be in range(0, 256)

occurring at this line: https://github.com/mne-tools/mne-python/blob/f04d1460d10ebe3b9466ccad9c077040d42a6d5f/mne/io/edf/edf.py#L258

Digging into the code, I realized that the (French) people who recorded the EEG may have used accented characters in the annotations.

By changing that part of the code to:

try:
    tals.extend([i % 256, i // 256])
except ValueError:
    # Debug aid: print the offending value and the annotation text read so far
    print(i, tals[10:].decode('ascii', 'ignore'))
    raise

I obtained this:

-5779 0000+44+1166PHOTO 0.5Hz+45+1172+1172.200000+46+1172Yeux ouverts+47+1178+1178.840000+48+1178Yeux fer

It’s indeed French, and the error occurs exactly at the annotation “Yeux fer”: with i = -5779, i % 256 gives 109 (‘m’), so I assume the second character of the pair, the one causing the bug, was ‘é’.
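The arithmetic can be checked with a short sketch. It assumes the samples are signed 16-bit little-endian integers and that the annotation text was written in Latin-1; neither is confirmed here, but the numbers are consistent with both. The masking at the end is only one possible way to get unsigned bytes, not necessarily the fix that was adopted.

```python
import struct

i = -5779  # value printed by the debug code above

low, high = i % 256, i // 256
print(low, repr(chr(low)))  # 109 'm'
print(high)                 # -23: negative, so not a valid byte

# bytearray.extend rejects the negative high byte
try:
    bytearray().extend([low, high])
except ValueError as err:
    print("ValueError:", err)

# Reinterpreting the same 16 bits as two unsigned bytes
raw = struct.pack('<h', i)
print(list(raw))             # [109, 233]
print(raw.decode('latin-1')) # 'mé' under the Latin-1 assumption

# One possible way to always get bytes in range(0, 256)
fixed = [i & 0xFF, (i >> 8) & 0xFF]
print(fixed)                 # [109, 233]
```

So the high byte is 0xE9, which is ‘é’ in Latin-1, and the failure comes from treating it as a signed value.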

So, what would be the best way to handle this?

Thanks! Guillaume

About this issue

  • State: closed
  • Created 8 years ago
  • Comments: 20 (20 by maintainers)

Most upvoted comments

Shall we just emit a warning and replace the character with a neutral one?
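A minimal sketch of that warn-and-replace idea, for illustration only. The helper name `decode_annotation_lossy` and the sample bytes are hypothetical and not part of mne-python.

```python
import warnings


def decode_annotation_lossy(raw):
    """Hypothetical helper: decode as ASCII, replacing anything else."""
    text = raw.decode('ascii', errors='replace')
    if '\ufffd' in text:  # U+FFFD marks every byte that was replaced
        warnings.warn('Non-ASCII bytes in annotation were replaced: %r' % raw)
    return text


# Sample bytes are an assumption: Latin-1 text containing 0xE9 ('é')
print(decode_annotation_lossy(b'Yeux ferm\xe9s'))
```

This keeps the import from crashing, at the cost of losing the accented characters.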

It would be better to properly handle UTF-8 encoding. We do it everywhere else (that we know of), and if the file format allows it, there should be some reasonable way to do it.
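One way this could look, as a rough sketch rather than the actual fix: try UTF-8 first and fall back to Latin-1 when the bytes are not valid UTF-8. The fallback encoding is an assumption based on the 0xE9 byte seen in this file, and `decode_annotation` is a hypothetical name.

```python
def decode_annotation(raw):
    """Hypothetical helper: prefer UTF-8, fall back to Latin-1."""
    try:
        return raw.decode('utf-8')
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this cannot fail
        return raw.decode('latin-1')


# Both sample inputs are assumptions used only for illustration
print(decode_annotation('Yeux fermés'.encode('utf-8')))
print(decode_annotation(b'Yeux ferm\xe9s'))
```

Because Latin-1 accepts any byte sequence, the fallback never raises, but it can mis-decode text written in some other legacy encoding.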