deeplearning4j: ND4J: threads failing to stop

Issue Description

We use ND4J to run predictions with a pretrained Keras model. We do this from within Tomcat (via Spring Boot). This works fine, but the Tomcat shutdown takes quite a while because of threads failing to stop. These are the messages we get while Tomcat shuts down:

May 15, 2019 7:08:07 AM org.apache.coyote.AbstractProtocol pause
INFO: Pausing ProtocolHandler ["http-nio-8080"]
May 15, 2019 7:08:07 AM org.apache.catalina.core.StandardService stopInternal
INFO: Stopping service [Tomcat]
May 15, 2019 7:08:07 AM org.apache.catalina.loader.WebappClassLoaderBase clearReferencesThreads
WARNING: The web application [ROOT] appears to have started a thread named [JavaCPP Deallocator] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
 java.base@11.0.2/java.lang.Object.wait(Native Method)
 java.base@11.0.2/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:155)
 java.base@11.0.2/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:176)
 app//org.bytedeco.javacpp.Pointer$DeallocatorThread.run(Pointer.java:302)
May 15, 2019 7:08:07 AM org.apache.catalina.loader.WebappClassLoaderBase clearReferencesThreads
WARNING: The web application [ROOT] appears to have started a thread named [Workspace deallocator thread] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
 java.base@11.0.2/java.lang.Object.wait(Native Method)
 java.base@11.0.2/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:155)
 java.base@11.0.2/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:176)
 app//org.nd4j.linalg.memory.provider.BasicWorkspaceManager$WorkspaceDeallocatorThread.run(BasicWorkspaceManager.java:292)
May 15, 2019 7:08:07 AM org.apache.catalina.loader.WebappClassLoaderBase clearReferencesThreads
WARNING: The web application [ROOT] appears to have started a thread named [NativeRandomDeallocator thread 0] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
 java.base@11.0.2/java.lang.Object.wait(Native Method)
 java.base@11.0.2/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:155)
 java.base@11.0.2/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:176)
 app//org.nd4j.rng.deallocator.NativeRandomDeallocator$DeallocatorThread.run(NativeRandomDeallocator.java:96)
May 15, 2019 7:08:07 AM org.apache.coyote.AbstractProtocol stop
INFO: Stopping ProtocolHandler ["http-nio-8080"]
May 15, 2019 7:08:07 AM org.apache.coyote.AbstractProtocol destroy
INFO: Destroying ProtocolHandler ["http-nio-8080"]

As you can see, three different threads are causing issues:

ND4J: “Workspace deallocator thread”
ND4J: “NativeRandomDeallocator thread 0”
JavaCPP: “JavaCPP Deallocator”

I understand this might not be an issue in many cases, but it really hurts during local development and does not inspire confidence in resource handling in general.

If this cannot be fixed easily, please provide a way to shut down these threads programmatically. That would allow the application itself to solve the issue with an application lifecycle handler.
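Until such an API exists, a lifecycle handler could try something like the following. This is only a minimal sketch using plain JDK calls, not an ND4J or JavaCPP API: the class `DeallocatorThreadStopper` and its method are hypothetical names, and the thread names are copied from the warnings above. It assumes that each deallocator thread exits when it receives an `InterruptedException` while blocked in `ReferenceQueue.remove()`, which I have not verified for any of the three threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of ND4J or JavaCPP.
public class DeallocatorThreadStopper {

    // Thread names taken from the Tomcat warnings in this issue.
    public static final Set<String> TARGET_NAMES = Set.of(
            "JavaCPP Deallocator",
            "Workspace deallocator thread",
            "NativeRandomDeallocator thread 0");

    /**
     * Interrupts every live thread whose name is in {@code names} and waits
     * up to {@code timeoutMillis} for each one to finish.
     * Returns the names of the threads that are still alive afterwards.
     */
    public static List<String> interruptAndJoin(Set<String> names, long timeoutMillis) {
        List<Thread> matched = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t != Thread.currentThread() && names.contains(t.getName())) {
                // Assumption: the target thread treats an interrupt as a stop signal.
                t.interrupt();
                matched.add(t);
            }
        }

        List<String> stillAlive = new ArrayList<>();
        for (Thread t : matched) {
            try {
                t.join(timeoutMillis);
            } catch (InterruptedException e) {
                // Preserve the caller's interrupt status and stop waiting.
                Thread.currentThread().interrupt();
                break;
            }
        }
        for (Thread t : matched) {
            if (t.isAlive()) {
                stillAlive.add(t.getName());
            }
        }
        return stillAlive;
    }
}
```

In a Spring Boot application this could be called from a `@PreDestroy` method, or from `ServletContextListener.contextDestroyed` in a plain servlet setup, for example `DeallocatorThreadStopper.interruptAndJoin(DeallocatorThreadStopper.TARGET_NAMES, 1000)`. Any name it returns belongs to a thread that ignored the interrupt, so the sketch would not help for that thread.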

Version Information

Please indicate relevant versions, including, if relevant:

Deeplearning4j version: 1.0.0-beta3
platform information (OS, etc): macOS

Contributing

If you have an idea how to tackle this, please let me know - would love to fix it the way you want it.

About this issue

State: closed
Created 5 years ago
Comments: 19 (11 by maintainers)

Commits related to this issue

* Start `Pointer.DeallocatorThread` with `setContextClassLoader(null)` as required by containers (issue deeplearning4j/deeplearning4j#7737) — committed to bytedeco/javacpp by saudet 5 years ago
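To illustrate what that commit message describes, here is a rough sketch of a background deallocator-style thread started with its context class loader cleared. It is not the actual JavaCPP code: `ContainerFriendlyThread` and `startDetached` are made-up names, and the cleanup step is left as a comment. The assumption is that a container attributes a thread to a web application by looking at the thread's context class loader, so a thread whose context class loader is `null` is no longer reported as started by that application. The thread itself keeps running either way.

```java
import java.lang.ref.ReferenceQueue;

// Hypothetical illustration, not the JavaCPP implementation.
public class ContainerFriendlyThread {

    /**
     * Starts a daemon thread that blocks on the given reference queue and
     * is detached from the creating thread's context class loader.
     */
    public static Thread startDetached(String name, ReferenceQueue<Object> queue) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    // Blocks until a reference is enqueued by the garbage collector.
                    queue.remove();
                    // Native resource cleanup would go here.
                }
            } catch (InterruptedException e) {
                // Treat an interrupt as a request to stop.
                Thread.currentThread().interrupt();
            }
        }, name);
        t.setDaemon(true);               // do not keep the JVM alive on exit
        t.setContextClassLoader(null);   // do not pin the web application's class loader
        t.start();
        return t;
    }
}
```

A new thread normally inherits the context class loader of the thread that creates it, which inside Tomcat is the web application's class loader. Clearing it before `start()` is what separates the thread from the application in this sketch.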

Most upvoted comments

I’m off now for a day, but I will certainly try on Friday

imod on May 15, 2019