liquibase: Too many open files error on application server: LiquibaseEntityResolver opens too many file descriptors and never closes them

Environment

Java version: OpenJDK 11
Liquibase Version: newest (4.3.3)
Liquibase Integration & Version: Java API (we’re using new new Liquibase().listUnrunChangeSets manually). It’s called from within a webapp with Jetty 9 / Tomcat 8 / Tomcat 9 / JSF 2.3 / Weld 3.1.7.
Database Vendor & Version: tested with MySQL 8 and PostgreSQL 12, but probably independent of DBMS
Operating System Type & Version: Ubuntu (tested on 18.04 and 20.10)

Description

Symptoms are the same as #1449, but not sure if it’s the same cause (we’re not using Spring). When running Liquibase, it opens many file decriptors and never closes them. They only get closed when the Garbabe Collector runs, which may be a long in the future. This results in our application server crashing.

I first observed this on a test environment server with Tomcat 8 after upgrading vom liquibase 3 to liquibase 4. The webapp failed to start, and the log contained many too many open files messages. Checking via /proc/pid/fd indeed revealed that the number of file descriptors had reached the limit. At first I could not reproduce this locally on my machine, but now I can and I think I know why it happens:

The problematic part is in LiquibaseEntitiyResolve, which is called when Liquibase parses the database changelogs (we’ve got 11 XML changelog files) after we call `Liquibase#listUnrunChangeSets´.

When the LiquibaseEntityResolver is constructed, it also creates a fallbackResourceAccessor of type ClassLoaderResourceAccessor. The ClassLoaderResourceAccessor loads all JARs of the application, but does so only lazily only when its #openStreams method is called.

Next the LiquibaseEntityResolver creates a CompositeResourceAccessor with the resource loader from the current scope and the fallback resource loader. Now, and that’s where the error starts happening, the composite resource accessor always calls openStreams on all accessors, regardless of whether an earlier found something or not.

This means that init method of the fallback ClassLoaderResourceAccessor is always called, loading all JARs.

In our case the LiquibaseEntityResolver is created about ~20 times, and each time it loads ~300 JARs. The fiel descriptor limit on the server is 4000, so it crashes before liquibase is done.

What I had been wondering is why I cannot observe this behavior locally, but now I’ve got an explanation for that. The same process I described above actually does happen on my machine, but: the file descriptors are closed implicitly by the finalize of the ZipFileSystem. When I debug into the application, I can observe the file descriptors growing on my machine as well. It’s just that I seem to get lucky that the garbage collector kicks in early enough before even more file descriptors are opened. When I increased the file descriptor limit on the test server, it finish running liquibase, but it only ran the finalizer and closed the file descriptor several minutes later (peak was ~6500 open file descriptors)

Is that enough information to reproduce? Not sure what the best solution here might be, but I guess some things that could be done are (i) closing the created file systems explicitly, (ii) not creating a new fallback accessor each time, (iii) not using the fallback accessor when the resource accessor from the current scope already returns a result.

About this issue

Original URL
State: closed
Created 3 years ago
Comments: 20 (4 by maintainers)

Commits related to this issue

Too many open files error on application server: LiquibaseEntityResolver opens too many file descriptors and never closes them #1815 LB-1547 — committed to liquibase/liquibase by nvoxland 3 years ago
Rolled back to 3.10.3 liquibase https://github.com/liquibase/liquibase/issues/1815. Liquibase have a PR open that will be fixed in the next release and apparently is affecting 4.X versions ( https:... — committed to ORCID/ORCID-Source by Camelia-Orcid 3 years ago

Most upvoted comments

Hi all, when can we expect this fix to be released?

vesin on Aug 3, 2021

We have experienced similar performance issues as described in #1754 and the changes seem to have fixed it. I tested with the artifact linked by @nvoxland above.

Florianisme on May 25, 2021

You can get a build with the changes off https://github.com/liquibase/liquibase/actions?query=branch%3ALB-1547

Can you try from that?

nvoxland on May 25, 2021