tomcat-jakartaee-migration: "java.util.zip.ZipException: invalid entry size (expected 0 but got 913 bytes)" when opening migrated jar with `java.util.zip.ZipInputStream`

Context

I’m investigating https://youtrack.jetbrains.com/issue/KT-57767, where the build fails when reading tomcat-embed-core-10.1.7-jakartaee.jar.

(Note: If you download the above file, you’ll get a .zip file because GitHub doesn’t accept uploading .jars so I had to rename it to .zip.)

That jar was converted from tomcat-embed-core-10.1.7.jar by the Gradle Jakarta EE Migration plugin, which uses the Apache Tomcat migration tool for Jakarta EE.

If I open the migrated jar using java.util.zip.ZipInputStream, I’ll get:

java.util.zip.ZipException: invalid entry size (expected 0 but got 913 bytes)

If I open it using java.util.zip.ZipFile, I’ll get:

java.util.zip.ZipException: invalid CEN header (bad signature)

This suggests that there is a bug in either ZipInputStream (e.g., https://bugs.openjdk.org/browse/JDK-8298530), ZipFile, the tomcat-jakartaee-migration tool, or a combination of them.
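
The two failures above can be reproduced with a small probe along the following lines. This is only a sketch: the class name `JarProbe`, the method names, and the default file path are illustrative assumptions, not part of the original report. It reads every entry through each API and returns either "ok" or the text of the exception.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

public class JarProbe {

    /** Streams every entry with ZipInputStream; returns "ok" or the exception text. */
    public static String probeStream(Path jar) {
        byte[] buf = new byte[8192];
        try (InputStream in = Files.newInputStream(jar);
             ZipInputStream zis = new ZipInputStream(in)) {
            while (zis.getNextEntry() != null) {
                // Drain the entry so the stream has to parse what follows the data.
                while (zis.read(buf) != -1) { }
            }
            return "ok";
        } catch (IOException e) { // ZipException is a subclass of IOException
            return e.toString();
        }
    }

    /** Reads every entry through ZipFile (central directory); returns "ok" or the exception text. */
    public static String probeZipFile(Path jar) {
        byte[] buf = new byte[8192];
        try (ZipFile zf = new ZipFile(jar.toFile())) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                try (InputStream in = zf.getInputStream(entries.nextElement())) {
                    while (in.read(buf) != -1) { }
                }
            }
            return "ok";
        } catch (IOException e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        // Hypothetical default path; pass the real location as the first argument.
        Path jar = Path.of(args.length > 0 ? args[0] : "tomcat-embed-core-10.1.7-jakartaee.jar");
        System.out.println("ZipInputStream: " + probeStream(jar));
        System.out.println("ZipFile:        " + probeZipFile(jar));
    }
}
```

Running it against both the original and the migrated jar should show whether the migration step is what makes the file unreadable.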

Request

Please investigate whether this is a bug in the tomcat-jakartaee-migration tool.

Even if it isn’t, it would still be nice to change the way the tool generates migrated jars so that they can be read by the ZipInputStream and ZipFile APIs.

About this issue

  • State: closed
  • Created a year ago
  • Comments: 35 (24 by maintainers)

Most upvoted comments

I was also working on a fix. I’ve combined your fix and mine and committed it. My local test and the unit tests pass.

Could this be summarized as Java’s inability to read zip files that use ZIP64 fields together with streaming data headers?

That sounds right. java.util.zip.ZipInputStream can currently read 8-byte size fields from the data descriptor record, but only does so when the actual number of compressed or uncompressed bytes is larger than Integer.MAX_VALUE. My OpenJDK PR adds a check for the presence of a ZIP64 extended information field in the LOC header. If it is present, the data descriptor is read with 8-byte fields.
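
As a rough illustration of the condition described above, the sketch below inspects a local file (LOC) header for the combination of a data descriptor (general purpose flag bit 3) and a ZIP64 extended information extra field (header ID 0x0001, per the ZIP APPNOTE). The class `Zip64LocCheck` and its method names are hypothetical. This is neither the code from the OpenJDK PR nor from the migration tool, only a minimal sketch of the check.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class Zip64LocCheck {
    static final int LOC_SIGNATURE = 0x04034b50;   // "PK\3\4", read little-endian
    static final int FLAG_DATA_DESCRIPTOR = 0x0008; // general purpose bit 3
    static final int ZIP64_EXTRA_ID = 0x0001;       // ZIP64 extended information

    /** Walks the (id, size, data) blocks of an extra field looking for header ID 0x0001. */
    public static boolean hasZip64Extra(byte[] extra) {
        if (extra == null) {
            return false;
        }
        ByteBuffer buf = ByteBuffer.wrap(extra).order(ByteOrder.LITTLE_ENDIAN);
        while (buf.remaining() >= 4) {
            int id = buf.getShort() & 0xFFFF;
            int size = buf.getShort() & 0xFFFF;
            if (id == ZIP64_EXTRA_ID) {
                return true;
            }
            if (size > buf.remaining()) {
                return false; // truncated block, stop scanning
            }
            buf.position(buf.position() + size);
        }
        return false;
    }

    /** True if the LOC header at offset sets the data descriptor flag and carries a ZIP64 extra field. */
    public static boolean isStreamedZip64(byte[] zip, int offset) {
        if (offset < 0 || zip.length - offset < 30) {
            return false; // a LOC header has 30 fixed bytes
        }
        ByteBuffer buf = ByteBuffer.wrap(zip).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getInt(offset) != LOC_SIGNATURE) {
            return false;
        }
        int flags = buf.getShort(offset + 6) & 0xFFFF;
        int nameLen = buf.getShort(offset + 26) & 0xFFFF;
        int extraLen = buf.getShort(offset + 28) & 0xFFFF;
        int extraStart = offset + 30 + nameLen;
        if (extraStart + extraLen > zip.length) {
            return false;
        }
        byte[] extra = Arrays.copyOfRange(zip, extraStart, extraStart + extraLen);
        return (flags & FLAG_DATA_DESCRIPTOR) != 0 && hasZip64Extra(extra);
    }
}
```

For example, `isStreamedZip64(Files.readAllBytes(path), 0)` would examine the first entry of an archive. Checking later entries requires knowing their offsets, which this sketch does not compute.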

If so, this means tomcat-jakartaee-migration could simply detect the ZIP64 extra field and either stop with an error or enable in-memory processing automatically.

You are probably reading from files on disk anyway, so perhaps using java.util.zip.ZipFile would work better. That API does not need to parse the Data Descriptor record, since it has all that info in the CEN headers.

The file is unreadable using a ZipInputStream with OpenJDK 11.0.18 and 17.0.6 (tested with ZuluJDK on Windows). Are you sure it works with Java 11?

I agree this is a JDK bug and we could ignore it if this was fixed in a recent JDK. But until a fix is available the tool is altering a working jar such that no JRE can use it, so I think the burden is upon us to find a solution. At least, if the conditions leading to this issue can be identified, the tool could stop and display a warning, suggesting to use -zipInMemory.

I won’t reopen the issue because I don’t have the time to investigate further, but if someone is willing to contribute a fix I’ll certainly review it.

I have been able to reproduce the issue. Even if Tomcat 10.1 doesn’t need to be migrated, the use of a zip64 entry is questionable. The same issue could affect another jar unrelated to Tomcat.