testng: 7.0.0-beta6 memory issues (regression)

A test task runs over 310,000 independent test executions. In 6.14.3 this completes in about 2.5 minutes. In 7.0.0-beta6 this slows down to a crawl at 200k tests, appearing to GC thrash as small batches of tests complete.

In the past, I’ve seen this occur due to reporting keeping parameters in-memory. The project uses a workaround to replace test parameters with their toString version, as these are too heavy for a large run. I suspect there is some similar per-test memory grab by the reporter.

The next step is probably to take a heap dump and determine what is held. Before running too far into this, do you know what changes were made that might hint at what to look for?

./gradlew :caffeine:strongKeysAndStrongValuesAsyncCaffeineTest

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 25 (25 by maintainers)

Commits related to this issue

Most upvoted comments

In today’s round of optimizations, I was able to drop a test suite task from 315MB (8.6M objects) down to 22MB (0.5M objects). The noticeable pauses from a full GC also disappeared.

There are roughly 30 test suites of equal size (140k tests per suite) in order to reduce the report size, as a single test task failed due to the memory overhead. Previously I had already optimized by aggressively removing test parameters and expected exceptions if successful and shortening + dedup’ing test names in the report. The GC is also tuned.

4.2M tests executing in parallel on TravisCI with 512mb heap now takes ~1 hour.

@krmahadevan It did seem a little better by performing more tests in the stresser. Since there are still places that retain all of them, e.g. ResultMap, unfortunately you have to fix all before we feel the impact.

If you feel that you have to wait on me and get blocked, you can perform my steps:

  1. Build the local jar (gradlew jar)
  2. Download testng-stresser and run the project.
    • Update the build to point to your local jar
    • I added maxHeapSize = '256m' locally to fail faster.
  3. Run gradlew cleanTest test until it slows down
  4. Create a heap dump, e.g. jmap -dump:live,format=b,file=/Users/ben/Downloads/1.hprof "$(pgrep -f GradleWorkerMain | head -n 1)"
  5. Open in your favorite tool (JVisualVM, YourKit, JProfiler). The latter two are free for OSS and are great.
Screen Shot 2020-10-14 at 11 51 52 AM

@ben-manes - Based on the details that you shared here as observations, I went ahead and created a PR https://github.com/cbeust/testng/pull/2390 that attempts at reducing the footprint.

Would it be possible for you to checkout this branch, build TestNG locally and try it out without the tweaks you added to caffeine ? That way I can know for sure if the changes are taking us forward in the right direction.

I tried to reflectively clear the queues inside of this useless TestListenerAdapter instance. That dropped the CLQ#Node instance count by half, but did not reduce the memory usage.

I then reflectively captured TestRunner#m_invokedMethods and clear it. This seems to only be utilized by SuiteRunner#getExcludedMethods() for JUnitReportReporter, SuiteHTMLReporter, and IgnoredMethodsPanel. None of these are used in my build. That halved the CLQ#Node instances again, but did not drop memory.

This lead to SuiteRunner#invokedMethods which is similarly unused. By reflectively clearing out those results, the CLQ#Node dropped from my heap dump and the execution ran successfully.

While it dropped memory, it didn’t decrease as much as desired because the TestResults are still being held by a ConcurrentHashMap. This is from ResultMap inside of TestRunner. Again these are only useful for some reporters and I can clean the results manually.

This finally dropped the memory usage and didn’t affect the test reports from Gradle or Eclipse. While certainly ugly, it does prove that TestNG is holding onto the test results unnecessarily for builds, and could avoid that unless explicitly needed by a reporter. Those reporters could also be much more efficient by not requiring the full results stay in-memory, but rather capture only the details they care about (and stream them to disk if possible).

It would be really great if this memory behavior could be improved so that I could remove my narly hacks 😉. (See CacheValidationListener for all of these hacks…)

When optimizing memory usage again, I noticed that Gradle uses a custom testng reporter. This does not extend TestListenerAdapter and instead is much less memory hungry thanks to capturing the minimal fields it needs and using synchronized for cheaper data structures. The performance seems fine, so perhaps CLQ is overkill?

The only TestListenerAdapter that I see in a heap dump comes SuiteRunner and appears to be unused. This is forcibly added during initialization, but that listener is never inspected afterwards. It appears to be entirely unnecessary?

https://github.com/cbeust/testng/blob/518bd75212805eb5f6f6a9518616abb5fd454159/src/main/java/org/testng/SuiteRunner.java#L37

https://github.com/cbeust/testng/blob/518bd75212805eb5f6f6a9518616abb5fd454159/src/main/java/org/testng/SuiteRunner.java#L193-L196

The actual TextReporter is may be added later, but is not done so by Gradle as disabled. For those using it and others, I think you could rewrite them to not extend TestListenerAdapter and instead incrementally build the output like TestNGTestResultProcessorAdapter does, thereby allowing the heavy-weight objects to be discarded. Ideally those that write textual data could stream to disk, e.g. TextReporter prints the buffer at the end and could simply display a temp file instead.

My latest build optimizations to reduce TestNG’s heavy-weight reporting seems to save 20 minutes due to not choking on full garbage collections (a 22% speedup). I believe all of my low-level hacks could be removed if TestNG would more aggressively discard its heavy result objects instead of unnecessarily keeping them throughout the execution. This would improve performance for large builds significantly, and shouldn’t negatively impact small suites either.