quarkus: Startup RSS regression introduced by #35246
Describe the bug
PR #35246 has introduced a startup RSS regression of ~14-17% (depending on application and environment) in both JVM and Native modes.
It is possible to see the effect with a simple quickstart (getting-started or config-quickstart).
config-quickstart Startup RSS

| Commit | Run Mode | RSS (MB) | Increase |
|---|---|---|---|
| 6f55d65 | JVM | 117.256 | |
| 3875d03 | JVM | 133.572 | +13.9% |
| 6f55d65 | Native | 76.744 | |
| 3875d03 | Native | 89.892 | +17.1% |
Expected behavior
No response
Actual behavior
No response
How to Reproduce?
Steps to reproduce:
- Build Quarkus commit 3875d03
- Build config-quickstart
- Start config-quickstart and measure RSS as described in the guide: https://quarkus.io/guides/performance-measure#how-do-we-measure-memory-usage
- Build Quarkus commit 6f55d65
- Rebuild config-quickstart
- Restart config-quickstart and measure RSS again as described in the guide: https://quarkus.io/guides/performance-measure#how-do-we-measure-memory-usage (a rough cross-check is sketched below)
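The linked guide remains the authoritative measurement procedure; as a rough cross-check, a small Linux-only helper along these lines (the `RssCheck` class below is hypothetical, not part of the quickstart or the guide) reads the resident set size of a running process from /proc:

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: prints VmRSS for a given PID by reading /proc/<pid>/status (Linux only).
// The linked performance guide describes the canonical, external measurement.
public class RssCheck {
    public static void main(String[] args) throws Exception {
        String pid = args[0]; // PID of the running config-quickstart process
        for (String line : Files.readAllLines(Path.of("/proc", pid, "status"))) {
            if (line.startsWith("VmRSS:")) {
                System.out.println(line.trim()); // e.g. "VmRSS:  133572 kB"
            }
        }
    }
}
```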
Output of uname -a or ver
No response
Output of java -version
No response
GraalVM version (if different from Java)
No response
Quarkus version or git rev
No response
Build tool (ie. output of mvnw --version or gradlew --version)
No response
Additional information
Tests were performed in Docker with 16 cores; however, the increase is also measurable with far fewer cores, down to 2 vCPUs.
About this issue
- Original URL
- State: closed
- Created 10 months ago
- Comments: 37 (37 by maintainers)
To add more information:
The issue is caused by the special rules used to match environment variables against regular properties. The matching is applied to every available environment variable against every property available in the mappings. We hardly noticed this before because only a few extensions had been migrated; with the increase in mappings, the issue became more visible.
@franz1981's PR in https://github.com/smallrye/smallrye-config/pull/984 addresses the allocation issues, which should help when we do need to match names, but adding a filter on which environment variables to consider drastically reduces the number of matches we need to perform (and the allocations).
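Roughly, the cost looks like this (a minimal sketch, not the actual SmallRye Config code; `toEnvName` and the `QUARKUS_` prefix filter are simplified assumptions):

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Minimal sketch of the matching cost described above; NOT the real SmallRye Config implementation.
public class EnvMatchSketch {

    // Simplified name conversion: "quarkus.http.port" -> "QUARKUS_HTTP_PORT".
    static String toEnvName(String property) {
        return property.replaceAll("[^a-zA-Z0-9]", "_").toUpperCase(Locale.ROOT);
    }

    // Naive approach: every environment variable is checked against every mapped property,
    // allocating a converted name per comparison -> O(envVars * mappedProperties).
    static void naiveMatch(Map<String, String> env, List<String> mappedProperties) {
        for (String envName : env.keySet()) {
            for (String property : mappedProperties) {
                if (envName.equals(toEnvName(property))) {
                    // record the match ...
                }
            }
        }
    }

    // With a filter on which environment variables to consider, most are discarded up front,
    // so far fewer conversions and comparisons (and allocations) are performed.
    static void filteredMatch(Map<String, String> env, List<String> mappedProperties) {
        for (String envName : env.keySet()) {
            if (!envName.startsWith("QUARKUS_")) { // assumed filter derived from the mapping prefixes
                continue;
            }
            for (String property : mappedProperties) {
                if (envName.equals(toEnvName(property))) {
                    // record the match ...
                }
            }
        }
    }
}
```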
Currently, I’m seeing the following numbers on my machine:
The PR seems to bring the number back to the previously expected RSS. There may be a very slight increase left, but I’ll keep looking.
Using the methodology @franz1981 described in method 1 above, with

`java -agentpath:/path/to/libasyncProfiler.so=start,event=alloc,file=alloc.jfr -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -XX:-UseTLAB -Xmx512m -Xms512m -XX:+AlwaysPreTouch -jar quarkus-app/quarkus-run.jar`

and

`java -cp /path/to/converter.jar jfr2flame ./alloc.jfr --alloc --total alloc.html`

there is a large increase in allocations coming from `io.smallrye.config.ConfigMappings.mapConfiguration()`:

https://github.com/quarkusio/quarkus/commit/6f55d65b8509e62a15d4a6e13d6a0da656db227e
https://github.com/quarkusio/quarkus/commit/3875d031d71994121e97eaad413e915db0cda539
Sure. Thanks!
The zipped flame graphs in https://github.com/quarkusio/quarkus/issues/35406#issuecomment-1684168402 were allocation samples.
We updated the methodology to capture the allocation sizes and give a more accurate picture of what is causing the extra memory pressure; this methodology produced the screenshots in https://github.com/quarkusio/quarkus/issues/35406#issuecomment-1692922194
Attached are the second set of allocation-size profiles:
alloc-sizes-6f55d65.zip
alloc-sizes-3875d03.zip
That’s a fair point indeed, @geoand, or we would end up writing ugly code for every single line of code!
Correct. I’ll start the work right away when I get back. Meanwhile, feel free to revert the original commit to avoid affecting the performance further.
We now have a baseline and we can work and improve on it. Thanks!
Right, this is how things are meant to work 😃 We actually put effort into optimizing things when they start causing a problem.
@gsmet right.
So according to the profile data, the ConfigMapping is the problem.
This is serious and we need to address it ASAP.
One way, as @gsmet and I have mentioned in the past, would be to move the class generation to extension build time.
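For context, this is the kind of mapping interface those generated classes back (a minimal sketch using the standard SmallRye Config annotations; the `greeting` prefix and property are just examples). Generating the implementation at extension build time, as suggested above, would take that work out of application startup:

```java
import io.smallrye.config.ConfigMapping;
import io.smallrye.config.WithDefault;

// Example mapping interface; SmallRye Config generates an implementation class for it,
// and that generation is the work being discussed here.
@ConfigMapping(prefix = "greeting")
public interface GreetingConfig {

    // Bound from greeting.message (or the GREETING_MESSAGE environment variable).
    @WithDefault("hello")
    String message();
}
```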
@franz1981 given it’s a major regression, I think it would be very helpful if you could help pinpoint the problem we have.