deeplearning4j: Using Workspaces and disabling GC calls ends in java.lang.OutOfMemoryError

Issue Description

When I switch my neural net from JVM garbage collection to workspaces by adding .trainingWorkspaceMode(WorkspaceMode.SEPARATE) to my model configuration and disabling periodic GC calls with Nd4j.getMemoryManager().togglePeriodicGc(false);, memory consumption explodes during training, and after a few epochs I get:

java.lang.OutOfMemoryError: Cannot allocate new FloatPointer(1): totalBytes = -3051750267, physicalBytes = 6G

	at org.bytedeco.javacpp.FloatPointer.<init>(FloatPointer.java:76)
	at org.bytedeco.javacpp.FloatPointer.<init>(FloatPointer.java:41)
	at org.nd4j.linalg.jcublas.blas.JcublasLevel3.sgemm(JcublasLevel3.java:107)
	at org.nd4j.linalg.api.blas.impl.BaseLevel3.gemm(BaseLevel3.java:57)
	at org.nd4j.linalg.api.ndarray.BaseNDArray.mmuli(BaseNDArray.java:3011)
	at org.nd4j.linalg.api.ndarray.BaseNDArray.mmul(BaseNDArray.java:2812)
	at org.deeplearning4j.nn.layers.BaseLayer.preOutput(BaseLayer.java:317)
	at org.deeplearning4j.nn.layers.BaseLayer.activate(BaseLayer.java:328)
	at org.deeplearning4j.nn.layers.recurrent.RnnOutputLayer.output(RnnOutputLayer.java:149)
	at org.deeplearning4j.nn.layers.BaseOutputLayer.activate(BaseOutputLayer.java:189)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.activationFromPrevLayer(MultiLayerNetwork.java:789)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.feedForwardToLayer(MultiLayerNetwork.java:929)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.feedForward(MultiLayerNetwork.java:870)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.feedForward(MultiLayerNetwork.java:861)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.silentOutput(MultiLayerNetwork.java:1906)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.silentOutput(MultiLayerNetwork.java:1936)
	at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.doEvaluation(MultiLayerNetwork.java:2892)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
	at com.intellij.junit4.JUnit45ClassesRequestBuilder$1$1$2$2.runChild(JUnit45ClassesRequestBuilder.java:82)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:49)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:157)
	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
	at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
	at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
	at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: java.lang.OutOfMemoryError: Physical memory usage is too high: physicalBytes = 6G > maxPhysicalBytes = 6G
	at org.bytedeco.javacpp.Pointer.deallocator(Pointer.java:576)
	at org.bytedeco.javacpp.Pointer.init(Pointer.java:121)
	at org.bytedeco.javacpp.FloatPointer.allocateArray(Native Method)
	at org.bytedeco.javacpp.FloatPointer.<init>(FloatPointer.java:68)
	... 38 more
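
Note that the Caused by line shows JavaCPP's off-heap allocation tracker hitting its limit (physicalBytes > maxPhysicalBytes), not the Java heap. As a minimal sketch (not part of the original report), those counters can be inspected at runtime via the standard org.bytedeco.javacpp.Pointer accessors:

    import org.bytedeco.javacpp.Pointer;

    public class JavaCppMemoryReport {
        public static void main(String[] args) {
            // JavaCPP tracks native (off-heap) allocations separately from the JVM heap.
            // The OutOfMemoryError above is thrown once physicalBytes exceeds maxPhysicalBytes.
            System.out.println("maxBytes         = " + Pointer.maxBytes());
            System.out.println("maxPhysicalBytes = " + Pointer.maxPhysicalBytes());
            System.out.println("totalBytes       = " + Pointer.totalBytes());
            System.out.println("physicalBytes    = " + Pointer.physicalBytes());
        }
    }

The limits can be raised with JVM system properties, e.g. -Dorg.bytedeco.javacpp.maxbytes=10G -Dorg.bytedeco.javacpp.maxphysicalbytes=12G (values here are illustrative, not the reporter's settings). Raising the ceiling only delays the error, though; it does not explain why memory grows without bound.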

With normal JVM garbage collection, only a fraction of the memory is used, and training is even a little faster, which should not be the case. I also tried disabling the evaluation after each epoch, but the issue remains the same. Here is the Java code of the neural net configuration and the iterator I am using (a minimal sketch of the relevant calls follows the link):

https://gist.github.com/Tschigger/e451fdc68b13d19157478b7b4084ec62
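
For reference, a sketch of how the workspace and GC settings enter the configuration. The layer sizes and architecture below are placeholders, not the actual values from the gist; the stack trace only tells us that an RnnOutputLayer is involved:

    import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
    import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
    import org.deeplearning4j.nn.conf.WorkspaceMode;
    import org.deeplearning4j.nn.conf.layers.GravesLSTM;
    import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
    import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
    import org.nd4j.linalg.activations.Activation;
    import org.nd4j.linalg.factory.Nd4j;
    import org.nd4j.linalg.lossfunctions.LossFunctions;

    public class WorkspaceSetupSketch {
        public static void main(String[] args) {
            // Disable ND4J's periodic GC, relying on workspaces for memory reuse.
            Nd4j.getMemoryManager().togglePeriodicGc(false);

            MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                    .trainingWorkspaceMode(WorkspaceMode.SEPARATE)  // the setting from the report
                    .inferenceWorkspaceMode(WorkspaceMode.SEPARATE)
                    .list()
                    .layer(0, new GravesLSTM.Builder().nIn(10).nOut(50)
                            .activation(Activation.TANH).build())
                    .layer(1, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                            .activation(Activation.SOFTMAX).nIn(50).nOut(5).build())
                    .pretrain(false).backprop(true)
                    .build();

            MultiLayerNetwork net = new MultiLayerNetwork(conf);
            net.init();
        }
    }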

Version Information

Ubuntu; system RAM: 16 GB; GPU RAM: 8 GB (GTX 1070)

  • deeplearning4j-cuda-8.0, version 0.9.1
  • nd4j-cuda-8.0-platform, version 0.9.1
  • datavec-api, version 0.9.1

Contributing

If you’d like to help us fix the issue by contributing some code, but would like guidance or help in doing so, please mention it!

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Comments: 30 (16 by maintainers)

Most upvoted comments

It doesn't matter whether I set the training or inference workspaces to SINGLE or SEPARATE, or comment the parameters out of the network-building code entirely. The result is always

[training: NONE; inference: SEPARATE]

no matter what I do or set.
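
A sketch of one way to check which modes the built network actually ended up with, and to force them afterwards, assuming the 0.9.1 MultiLayerConfiguration getters/setters (a diagnostic idea, not a confirmed fix):

    import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
    import org.deeplearning4j.nn.conf.WorkspaceMode;
    import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;

    public class WorkspaceModeCheck {
        static void logAndForceModes(MultiLayerNetwork net) {
            MultiLayerConfiguration conf = net.getLayerWiseConfigurations();

            // Compare against the modes passed to the builder; per the comment
            // above, these come back as [training: NONE; inference: SEPARATE].
            System.out.println("training:  " + conf.getTrainingWorkspaceMode());
            System.out.println("inference: " + conf.getInferenceWorkspaceMode());

            // Override the modes on the already-built configuration.
            conf.setTrainingWorkspaceMode(WorkspaceMode.SEPARATE);
            conf.setInferenceWorkspaceMode(WorkspaceMode.SEPARATE);
        }
    }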