java-buildpack: JVM Kill Agent not printing summary of memory spaces, Internal JVM Error instead

Environment:

  • Java Buildpack 4.5.1
  • IaaS: vSphere
  • garden-runc: 1.9.0
  • diego: 1.23.2
  • cf: 259
  • capi: 1.28.0

Problem:

Our application crashes at regular intervals because of resource exhaustion. In the logs, we see the following line:

2017-10-11T17:05:02.25+0200 [APP/PROC/WEB/3]ERR ResourceExhausted! (1/0)

As expected, the memory histogram follows right afterwards. After that, there should be the memory usage summary, correct? (At least in my testing on PWS, this works beautifully.) In our case, we see the following error message instead of the memory summary (order not guaranteed):

2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT #
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT # A fatal error has been detected by the Java Runtime Environment:
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT #
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT #  Internal Error (javaCalls.cpp:53), pid=14, tid=0x00007fceab216700
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT # JRE version: OpenJDK Runtime Environment (8.0_144-b01) (build 1.8.0_144-b01)
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT # Java VM: OpenJDK 64-Bit Server VM (25.144-b01 mixed mode linux-amd64 compressed oops)
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT #
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT # An error report file with more information is saved as:
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT # /home/vcap/app/hs_err_pid14.log
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT #  guarantee(!thread->is_Compiler_thread()) failed: cannot make java calls from the compiler
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT #
2017-10-11T17:05:31.80+0200 [APP/PROC/WEB/3]OUT [thread 140525006722816 also had an error]
2017-10-11T17:05:31.85+0200 [APP/PROC/WEB/3]OUT # Compiler replay data is saved as:
2017-10-11T17:05:31.85+0200 [APP/PROC/WEB/3]OUT # /home/vcap/app/replay_pid14.log
2017-10-11T17:05:31.85+0200 [APP/PROC/WEB/3]OUT #
2017-10-11T17:05:31.85+0200 [APP/PROC/WEB/3]OUT #
2017-10-11T17:05:31.85+0200 [APP/PROC/WEB/3]OUT # If you would like to submit a bug report, please visit:
2017-10-11T17:05:31.85+0200 [APP/PROC/WEB/3]OUT #   http://bugreport.java.com/bugreport/crash.jsp
2017-10-11T17:05:31.85+0200 [APP/PROC/WEB/3]OUT #
2017-10-11T17:05:31.91+0200 [APP/PROC/WEB/3]OUT Exit status 134

After this, the application restarts and goes back to working fine.

Expected outcome

In order to diagnose the memory leak, it would be great to get the memory summary.

Notes

At this point, I assume this might be caused by the jvmkill agent trying to gather the memory data, but this is really just an unfounded suspicion.

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 21 (10 by maintainers)

Most upvoted comments

P.S. The short version of why this happens: the JIT compiler threads occasionally allocate memory from Metaspace. That has nothing to do with CodeCacheSize etc. If we hit an OOM in Metaspace at that point, ResourceExhausted is posted from inside the CompilerThread. There is no real way around this. One can reduce the likelihood of this bug by increasing MaxMetaspaceSize.
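To judge how much headroom Metaspace has before raising MaxMetaspaceSize, the pool can be inspected from inside the application. The following is only a sketch, not part of the buildpack or jvmkill: the class name `MetaspaceHeadroom` and the helper `usedFraction` are made up for illustration, and it assumes a HotSpot JVM on Java 8 or later, where the pool is reported under the name "Metaspace".

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper class, not part of the buildpack or jvmkill.
public class MetaspaceHeadroom {

    // Returns used/max, or -1.0 when the pool reports no maximum
    // (MemoryUsage.getMax() is -1 if the maximum is undefined).
    static double usedFraction(long used, long max) {
        if (max <= 0) {
            return -1.0;
        }
        return (double) used / (double) max;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // Assumption: HotSpot names this pool "Metaspace".
            if (!"Metaspace".equals(pool.getName())) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            double fraction = usedFraction(usage.getUsed(), usage.getMax());
            if (fraction < 0) {
                System.out.println("Metaspace: used=" + usage.getUsed() + " bytes, no maximum set");
            } else {
                System.out.println("Metaspace: used=" + usage.getUsed()
                        + " bytes, max=" + usage.getMax()
                        + " bytes, fraction=" + fraction);
            }
        }
    }
}
```

Running this with a limit such as `-XX:MaxMetaspaceSize=256m` (an arbitrary example value) shows how close the application gets to the limit under load.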

Hi, as I understand it, the fix solves the “no histogram printed” problem but doesn’t solve the “crash related to Metaspace” problem. Just to be clear, did I understand that right? Thanks in advance.

AFAIU (this issue is four years old) it should fix the issue described here, which is a crash due to JVMTI posting the ResourceExhausted in a JIT thread, which causes jvmkill to be invoked, which calls back into the JVM, which is not allowed.

Yeah sorry, the JBS is writable by OpenJDK authors only.

Since there’s been no response on this issue in a couple of weeks, I’m going to close it. If you’d like to see it re-opened, please comment on the issue and I’ll reopen it.

The memory summary is produced by invoking MXBeans using JNI. It seems that the problem here is that jvmkill is being driven under a (JIT) compiler thread and is then unable to make such JNI calls due to a restriction in HotSpot. Unfortunately, unless this restriction is lifted, jvmkill will not be able to produce a memory summary in these circumstances.
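For readers unfamiliar with the MXBeans involved, here is roughly what the equivalent query looks like from plain Java. This is a sketch under assumptions: jvmkill makes these calls natively through JNI, and the class name `MemorySummary` and the output format below are invented for illustration rather than copied from the agent.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; the real agent performs these calls through JNI.
public class MemorySummary {

    // Formats one pool as a single line. The layout is an assumption,
    // not the format that jvmkill prints.
    static String formatPool(String name, long init, long used, long committed, long max) {
        return String.format("%s: init=%d, used=%d, committed=%d, max=%d",
                name, init, used, committed, max);
    }

    // Collects one line per memory pool known to the running JVM.
    static List<String> summarize() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // getUsage() returns null for an invalid pool
            }
            lines.add(formatPool(pool.getName(), usage.getInit(), usage.getUsed(),
                    usage.getCommitted(), usage.getMax()));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : summarize()) {
            System.out.println(line);
        }
    }
}
```

These are ordinary Java method calls, which is why they cannot be made from a compiler thread: the `guarantee(!thread->is_Compiler_thread())` failure in the log above is HotSpot rejecting exactly this kind of call.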

I think the best we can do is attempt to diagnose the case where we are driven under a compiler thread and, in that case, print a suitable warning (which may indicate something about the kind of memory pool that is being exceeded, i.e. something to do with compilation) and bypass any actions which involve JNI calls, including producing a memory summary.