h5py: Encoding error on Windows I/O with Python 3.6
I use h5py 2.7.0rc2 with Python 3.6.0 (64-bit) on Windows 7 (64-bit).
In this example, I try to write and read two files named test_e.h5 and test_é.h5.
These problems are new: they appeared only after I updated to Python 3.6.
They may be linked to the PEP 529 change, which switched the Windows filesystem encoding to UTF-8 in Python 3.6.
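To see which situation a given interpreter is in, a small check like the following may help. It is only a sketch: the function name `describe_fs_encoding` is made up for illustration. It relies on `sys.getfilesystemencoding()` and on the presence of `sys._enablelegacywindowsfsencoding`, the switch PEP 529 added on Windows to go back to the old 'mbcs' behaviour (the `PYTHONLEGACYWINDOWSFSENCODING` environment variable does the same).

```python
import sys


def describe_fs_encoding():
    """Collect the settings that decide how str file names become bytes.

    Hypothetical helper, not part of h5py or the standard library.
    """
    return {
        "platform": sys.platform,
        # 'utf-8' on Windows with Python 3.6+, 'mbcs' before that
        # or when the legacy mode is enabled.
        "filesystem_encoding": sys.getfilesystemencoding(),
        # The legacy switch only exists on Windows builds of Python 3.6+.
        "legacy_switch_available": hasattr(sys, "_enablelegacywindowsfsencoding"),
    }


info = describe_fs_encoding()
print(info)
```

On the setup described in this report, the expectation would be a `filesystem_encoding` of 'utf-8', while the HDF5 library underneath still treats the bytes it receives as ANSI.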
Write test:
import os
import h5py

# Create one file with an ASCII name and one with a non-ASCII name.
f = h5py.File('test_e.h5', 'w')
f.close()
f = h5py.File('test_é.h5', 'w')
f.close()
print('Directory content after test:', os.listdir())
Result:
Directory content after test: ['test_e.h5', 'test_é.h5']
Bad file name: the second file is created under a mis-encoded name instead of test_é.h5.
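One plausible explanation for the bad name is that the file name reaches HDF5 as UTF-8 bytes, which Windows then interprets in the ANSI code page. The sketch below reproduces that effect in pure Python. It assumes cp1252 as the ANSI code page (the 'mbcs' codec itself only exists on Windows, and the actual code page of the reporter's machine is not stated).

```python
name = 'test_é.h5'

# What Python 3.6 hands over when the filesystem encoding is UTF-8.
utf8_bytes = name.encode('utf-8')

# What an ANSI-based API would need to receive for the same name
# (cp1252 is an assumed stand-in for the machine's 'mbcs' code page).
ansi_bytes = name.encode('cp1252')

# What an ANSI-based API makes of the UTF-8 bytes.
misread = utf8_bytes.decode('cp1252')

print(utf8_bytes)   # b'test_\xc3\xa9.h5'
print(ansi_bytes)   # b'test_\xe9.h5'
print(misread)      # test_Ã©.h5
```

Under this assumption, the two-byte UTF-8 sequence for é shows up as two separate characters in the created file name.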
Read test:
print('Directory content before test: ', os.listdir())
f = h5py.File('test_e.h5', 'r')
f.close()
f = h5py.File('test_é.h5', 'r')
f.close()
Result:
Directory content before test: ['test_e.h5', 'test_é.h5']
Traceback (most recent call last):
  File "D:/Dev/format_Hdf5.py", line 16, in <module>
    f = h5py.File('test_é.h5', 'r')
  File "d:\app\python36\lib\site-packages\h5py\_hl\files.py", line 271, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
  File "d:\app\python36\lib\site-packages\h5py\_hl\files.py", line 101, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (D:\Build\h5py\h5py-2.7.0rc2\h5py\_objects.c:2853)
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (D:\Build\h5py\h5py-2.7.0rc2\h5py\_objects.c:2811)
  File "h5py\h5f.pyx", line 78, in h5py.h5f.open (D:\Build\h5py\h5py-2.7.0rc2\h5py\h5f.c:2130)
OSError: Unable to open file (Unable to open file: name = 'test_é.h5', errno = 2, error message = 'no such file or directory', flags = 0, o_flags = 0)
Existing file test_é.h5 not found.
About this issue
- State: closed
- Created 7 years ago
- Comments: 23 (11 by maintainers)
Just to let you know that HDF5 1.10.6 supports UTF-8 filenames on Windows!
I have been in touch with HDF support and they assured me that they will start using UTF-8 filenames on Windows starting with 1.10.3, which is due for release at the end of this year.
That should settle this issue once and for all…
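For code that has to run against several HDF5 builds, the version reported in the comments can be turned into a simple check. This is a sketch only: `hdf5_accepts_utf8_names` is a made-up name, and the 1.10.6 threshold is taken from the comment above rather than from the HDF5 release notes, so adjust it if your build behaves differently.

```python
def hdf5_accepts_utf8_names(hdf5_version, first_utf8_release=(1, 10, 6)):
    """Return True if this HDF5 version is assumed to take UTF-8 names on Windows.

    hdf5_version is a (major, minor, release) tuple. With h5py installed,
    it can be read from h5py.version.hdf5_version_tuple.
    The default threshold is an assumption based on the thread.
    """
    return tuple(hdf5_version) >= tuple(first_utf8_release)


print(hdf5_accepts_utf8_names((1, 10, 2)))
print(hdf5_accepts_utf8_names((1, 10, 6)))
```

Builds below the threshold would still need a workaround for non-ASCII file names on Windows.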
@aragilar using filename.encode('mbcs') works well as a temporary fix for my code. Thanks.
Strangely, using os.fsencode(filename), even with the filesystem encoding forced to mbcs, doesn't work:
Result:
The filesystem encoding is mbcs, but the encoded bytes are different (the first is in UTF-8).
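The temporary fix described here could be wrapped in a small helper so that the rest of the code keeps passing ordinary strings. This is a minimal sketch, not h5py API: `hdf5_filename` and its `platform` and `encoding` parameters are hypothetical, and they exist mainly so the Windows branch can be exercised on other systems.

```python
import sys


def hdf5_filename(name, platform=sys.platform, encoding='mbcs'):
    """Return the file name in the form an ANSI-based HDF5 build expects.

    On Windows the name is encoded to bytes with the ANSI code page
    ('mbcs'); elsewhere it is returned unchanged. Hypothetical helper.
    """
    if platform.startswith('win'):
        return name.encode(encoding)
    return name


# Intended use with h5py (not executed here):
#     f = h5py.File(hdf5_filename('test_é.h5'), 'r')

# cp1252 stands in for 'mbcs', which is only available on Windows.
print(hdf5_filename('test_é.h5', platform='win32', encoding='cp1252'))
print(hdf5_filename('test_é.h5', platform='linux'))
```

Note that this only helps for names whose characters exist in the active ANSI code page; other names cannot be represented this way.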