quarkus: Startup RSS regression introduced by #35246
Describe the bug
PR #35246 has introduced a startup RSS regression of ~14-17% (depending on application and environment) in both JVM and Native modes.
It is possible to see the effect with a simple quickstart (getting-started or config-quickstart).
config-quickstart Startup RSS

| Commit | Run Mode | RSS (MB) | Increase |
|---|---|---|---|
| 6f55d65 | JVM | 117.256 | |
| 3875d03 | JVM | 133.572 | +13.9% |
| 6f55d65 | Native | 76.744 | |
| 3875d03 | Native | 89.892 | +17.1% |
Expected behavior
No response
Actual behavior
No response
How to Reproduce?
Steps to reproduce:
- Build Quarkus commit 3875d03
- Build config-quickstart
- Start config-quickstart and measure RSS as described in the guide: https://quarkus.io/guides/performance-measure#how-do-we-measure-memory-usage
- Build Quarkus commit 6f55d65
- Rebuild config-quickstart
- Restart config-quickstart and measure RSS again as described in the guide: https://quarkus.io/guides/performance-measure#how-do-we-measure-memory-usage (a rough cross-check is sketched below)
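The linked guide remains the authoritative measurement procedure; as a rough cross-check, a small Linux-only helper along these lines (the `RssCheck` class below is hypothetical, not part of the quickstart or the guide) reads the resident set size of a running process from /proc:

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: prints VmRSS for a given PID by reading /proc/<pid>/status (Linux only).
// The linked performance guide describes the canonical, external measurement.
public class RssCheck {
    public static void main(String[] args) throws Exception {
        String pid = args[0]; // PID of the running config-quickstart process
        for (String line : Files.readAllLines(Path.of("/proc", pid, "status"))) {
            if (line.startsWith("VmRSS:")) {
                System.out.println(line.trim()); // e.g. "VmRSS:  133572 kB"
            }
        }
    }
}
```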
Output of uname -a or ver
No response
Output of java -version
No response
GraalVM version (if different from Java)
No response
Quarkus version or git rev
No response
Build tool (ie. output of mvnw --version or gradlew --version)
No response
Additional information
Tests were performed in Docker with 16 cores; however, the increase is also measurable with far fewer cores, down to 2 vCPUs.
About this issue
- Original URL
- State: closed
- Created 10 months ago
- Comments: 37 (37 by maintainers)
To add more information:
The issue is caused by the special rules used to match environment variables against regular properties. The matching is applied to every available environment variable against every property available in the mappings. We hardly noticed this before because only a few extensions had been migrated; with the increase in mappings, the issue became more visible.
@franz1981's PR in https://github.com/smallrye/smallrye-config/pull/984 addresses the allocation issues, which should help when we do need to match names, but adding a filter on which environment variables to consider drastically reduces the number of matches we need to perform (and the allocations).
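Roughly, the cost looks like this (a minimal sketch, not the actual SmallRye Config code; `toEnvName` and the `QUARKUS_` prefix filter are simplified assumptions):

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Minimal sketch of the matching cost described above; NOT the real SmallRye Config implementation.
public class EnvMatchSketch {

    // Simplified name conversion: "quarkus.http.port" -> "QUARKUS_HTTP_PORT".
    static String toEnvName(String property) {
        return property.replaceAll("[^a-zA-Z0-9]", "_").toUpperCase(Locale.ROOT);
    }

    // Naive approach: every environment variable is checked against every mapped property,
    // allocating a converted name per comparison -> O(envVars * mappedProperties).
    static void naiveMatch(Map<String, String> env, List<String> mappedProperties) {
        for (String envName : env.keySet()) {
            for (String property : mappedProperties) {
                if (envName.equals(toEnvName(property))) {
                    // record the match ...
                }
            }
        }
    }

    // With a filter on which environment variables to consider, most are discarded up front,
    // so far fewer conversions and comparisons (and allocations) are performed.
    static void filteredMatch(Map<String, String> env, List<String> mappedProperties) {
        for (String envName : env.keySet()) {
            if (!envName.startsWith("QUARKUS_")) { // assumed filter derived from the mapping prefixes
                continue;
            }
            for (String property : mappedProperties) {
                if (envName.equals(toEnvName(property))) {
                    // record the match ...
                }
            }
        }
    }
}
```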
Currently, I’m seeing the following numbers on my machine:
The PR seems to bring the number back to the previously expected RSS. There may be a very slight increase left, but I’ll keep looking.
Using the methodology @franz1981 described in method 1 above, with

`java -agentpath:/path/to/libasyncProfiler.so=start,event=alloc,file=alloc.jfr -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -XX:-UseTLAB -Xmx512m -Xms512m -XX:+AlwaysPreTouch -jar quarkus-app/quarkus-run.jar`

and

`java -cp /path/to/converter.jar jfr2flame ./alloc.jfr --alloc --total alloc.html`

there is a large increase in allocations coming from `io.smallrye.config.ConfigMappings.mapConfiguration()`:

https://github.com/quarkusio/quarkus/commit/6f55d65b8509e62a15d4a6e13d6a0da656db227e
https://github.com/quarkusio/quarkus/commit/3875d031d71994121e97eaad413e915db0cda539
Sure. Thanks!
The zipped flame graphs in https://github.com/quarkusio/quarkus/issues/35406#issuecomment-1684168402 were allocation samples.
We updated the methodology to capture the allocation sizes and give a more accurate picture of what is causing the extra memory pressure; this methodology produced the screenshots in https://github.com/quarkusio/quarkus/issues/35406#issuecomment-1692922194
Attached are the second set of allocation-size profiles:
alloc-sizes-6f55d65.zip
alloc-sizes-3875d03.zip
That’s a fair point indeed, @geoand, or we would end up writing ugly code for every single line of code!
Correct. I’ll start the work right away when I get back. Meanwhile, feel free to revert the original commit to avoid affecting the performance further.
We now have a baseline and we can work and improve on it. Thanks!
Right, this is how things are meant to work 😃 We actually put effort into optimizing things when they start causing a problem.
@gsmet right.
So according to the profile data, the ConfigMapping is the problem.
This is serious and we need to address it ASAP.
One way, as @gsmet and I have mentioned in the past, would be to move the class generation to extension build time.
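For context, this is the kind of mapping interface those generated classes back (a minimal sketch using the standard SmallRye Config annotations; the `greeting` prefix and property are just examples). Generating the implementation at extension build time, as suggested above, would take that work out of application startup:

```java
import io.smallrye.config.ConfigMapping;
import io.smallrye.config.WithDefault;

// Example mapping interface; SmallRye Config generates an implementation class for it,
// and that generation is the work being discussed here.
@ConfigMapping(prefix = "greeting")
public interface GreetingConfig {

    // Bound from greeting.message (or the GREETING_MESSAGE environment variable).
    @WithDefault("hello")
    String message();
}
```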
@franz1981 given it’s a major regression, I think it would be very helpful if you could help pinpoint the problem we have.