adm-zip: filenames with Unicode characters are corrupt
Similar to another issue, filenames with Unicode characters are unusable: 7-zip can neither read nor extract them.
For example:
Tal/A L'infini/Le Passé.txt -> Tal/A L'infini/Le Pass├⌐.tx
Snøfall.txt -> Sn├╕fall.tx
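The garbled names above are consistent with UTF-8 bytes being displayed through the legacy CP437 code page, which many ZIP readers fall back to when an entry is not marked as UTF-8. The sketch below reproduces that pattern. The lookup table and both function names are hypothetical helpers, not part of adm-zip, and the idea that the name length is recorded in characters rather than bytes is only an assumption that happens to match the missing final letter.

```javascript
// Partial CP437 table: only the high bytes that occur in the examples above.
const CP437_SAMPLE = { 0xc3: "\u251c", 0xa9: "\u2310", 0xb8: "\u2555" }; // ├ ⌐ ╕

// Render raw bytes the way a reader that assumes CP437 would display them.
function showAsCp437(bytes) {
  let out = "";
  for (const b of bytes) {
    out += b < 0x80 ? String.fromCharCode(b) : (CP437_SAMPLE[b] ?? "?");
  }
  return out;
}

// Assumption: the writer stores the name length in UTF-16 code units
// instead of UTF-8 bytes, so the reader sees a truncated byte sequence.
function simulateListing(name) {
  const bytes = Buffer.from(name, "utf8");
  return showAsCp437(bytes.subarray(0, name.length));
}

console.log(simulateListing("Le Pass\u00e9.txt"));
console.log(simulateListing("Sn\u00f8fall.txt"));
```

Each non-ASCII character here takes two bytes in UTF-8, so a name with one such character loses exactly one trailing byte under that assumption.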
About this issue
- State: open
- Created 6 years ago
- Comments: 21
I just tested this with a slightly modified version of the example code, and to my surprise
As for GUI tools, I only used GNOME Archive Manager. I also tested with Google Drive, which detected the correct names on every try.
I have the same problem. Have you solved this?
This is not directly related to the OP, but it is along the same lines and is more of a heads-up. In a work project I am on, one of our developers used a method suggested by the editor's auto-complete, addLocalFolderPromise. I was not able to find that method documented, so it may be intended as a private method or may have been deprecated at some point; I am not sure. As we were using the non-promisified version, the developer thought it would be an improvement to use the promisified one. This is fine as long as the filenames contain only ASCII characters. If a non-ASCII character such as ® is present in a filename, the folder still gets created, but the affected files are saved with those characters completely removed from their names. That broke our app, which expected the linked filenames to be present in the zip directory. So a file named poster®.png is zipped as poster.png. That comes from lines 357-360 of the adm-zip.js file.
If this should be filed as a separate bug or issue, let me know and I'll be happy to provide further details and open the correct issue/bug with a minimal reproduction.
I see there’s movement on this, fingers crossed you guys can fix it. Thank you for your efforts so far!
Good to see you guys are talking about Bit 11 as I was thinking along the same lines (see the comment under this article): https://lwn.net/Articles/729835/#:~:text=There are no specs for,the box%2C but for ZIP.
Nice hack, btw, but I am not sure it generates correct ZIP files.
That is because adm-zip expects file names to be encoded as UTF-8 and writes them encoded as UTF-8, but forgets to set that bit in the flags. So when other apps read zip files created by adm-zip, they do not know the file names are UTF-8 and produce garbled results. This is usually not a problem as long as you stick to plain US-ASCII names.
Maybe you could try setting this bit in the flags instead.
It should work.
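For reference, here is a minimal sketch of what "that bit" refers to. In the ZIP format, bit 11 (0x0800) of the general purpose bit flag marks the file name and comment as UTF-8, and in a local file header the flag field is the two little-endian bytes at offset 6. The function names are hypothetical and the code works on raw values only; it does not show how adm-zip itself would set the flag.

```javascript
// Bit 11 of the general purpose bit flag: names and comments are UTF-8.
const UTF8_FLAG = 0x0800;

function hasUtf8Flag(flags) {
  return (flags & UTF8_FLAG) !== 0;
}

function withUtf8Flag(flags) {
  return (flags | UTF8_FLAG) & 0xffff; // the field is 16 bits wide
}

// Read the flag field from a raw local file header ("PK\x03\x04").
function readLocalHeaderFlags(buf) {
  if (buf.readUInt32LE(0) !== 0x04034b50) {
    throw new Error("not a local file header");
  }
  return buf.readUInt16LE(6);
}

// Build a bare 30-byte local file header with no flags set, then mark it.
const header = Buffer.alloc(30);
header.writeUInt32LE(0x04034b50, 0);
console.log(hasUtf8Flag(readLocalHeaderFlags(header)));
header.writeUInt16LE(withUtf8Flag(readLocalHeaderFlags(header)), 6);
console.log(hasUtf8Flag(readLocalHeaderFlags(header)));
```

The same flag also appears in each central directory record, so a writer would need to set it in both places for readers to agree.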
Setting entry.header.made to 788 before writing the zip worked for me. With this, the Created OS is set to 03 'Unix' for the corresponding CENTRAL HEADER. This workaround might help:
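A rough sketch of that idea follows. The value 788 is 0x0314: in the "version made by" field the upper byte is the host system (3 means Unix) and the lower byte is the spec version (20, that is, 2.0). The helper names are hypothetical, and the code runs on plain objects shaped like adm-zip entries; applying it to a real archive via zip.getEntries() before zip.writeZip() is an assumption based on the comment above.

```javascript
// Split the 16-bit "version made by" field into its two bytes.
function decodeVersionMadeBy(made) {
  return { hostOs: (made >> 8) & 0xff, specVersion: made & 0xff };
}

const UNIX_V20 = 788; // 0x0314: host OS 3 (Unix), spec version 20 (2.0)

// Mark every entry as created on Unix. `entries` is any array of objects
// with a writable `header.made`, such as the result of zip.getEntries().
function markEntriesAsUnix(entries) {
  for (const entry of entries) {
    entry.header.made = UNIX_V20;
  }
  return entries;
}

// Stand-in entries for illustration; a real run would use adm-zip's own.
const fakeEntries = [
  { entryName: "Sn\u00f8fall.txt", header: { made: 20 } },
  { entryName: "poster\u00ae.png", header: { made: 20 } },
];
markEntriesAsUnix(fakeEntries);
console.log(decodeVersionMadeBy(fakeEntries[0].header.made));
```

This presumably works because some readers treat names in Unix-made archives as UTF-8 even when bit 11 is not set, so it is a workaround rather than a fix for the missing flag.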