tensorflow: gpu_options doesn't work
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04, Ubuntu 18.04
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
- TensorFlow installed from (source or binary): binary (pip)
- TensorFlow version: tensorflow-gpu==1.12.* and tensorflow-gpu==1.14.* (installed via pip)
- Python version: python 3.6
- Bazel version (if compiling from source):
- GCC/Compiler version (if compiling from source):
- CUDA/cuDNN version: CUDA 9 with TF 1.12, CUDA 10 with TF 1.14
- GPU model and memory: 12 GB on the Ubuntu 16.04 machine, 24 GB on the Ubuntu 18.04 machine
Describe the current behavior
I used the following code in AdaNet:
import tensorflow as tf

GPU_OPTIONS = tf.GPUOptions(allow_growth=True)  # allocate GPU memory on demand
CONFIG = tf.ConfigProto(gpu_options=GPU_OPTIONS)
sess = tf.Session(config=CONFIG)
But it still occupies the whole GPU memory when I run AdaNet. For example: (1) if 4 GB is already occupied by other processes and only 6 GB is left, it occupies the entire remaining 6 GB; (2) if no other process is occupying the GPU (i.e., 10 GB is free), it occupies the whole 10 GB when I run AdaNet. So I don't think it needs 10 GB to run, but it takes that much anyway.
Describe the expected behavior
I expect it to take only the memory it needs instead of all of it, so that I can make better use of the 10 GB of GPU memory. Right now I have no idea how to fix this issue. Could anybody please give me some suggestions? Thanks a lot.
Code to reproduce the issue
I used the code from AdaNet and changed every “tf.Session()” to “tf.Session(config=CONFIG)”, with CONFIG defined as shown above.
Other info / logs
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 15 (4 by maintainers)
Well, it seems this issue has gone away, even though I'm not sure whether the reason I'm guessing is the right one.
Previously, in the middle of the model, I initialized some variables manually instead of using tf.global_variables_initializer. Now I only initialize the variables that have not been initialized yet at that point, and suddenly it works: the model no longer takes the whole GPU memory, at least for now.
@aaroey @ymodak You may close this issue now. Thank you very much for helping me!
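For reference, a minimal TF 1.x sketch of the “initialize only the uninitialized variables” pattern described above (the helper name initialize_uninitialized is hypothetical; the original code is not shown in the thread):

import tensorflow as tf

def initialize_uninitialized(sess):
    # Initialize only the variables that have not been initialized yet,
    # instead of calling tf.global_variables_initializer() on everything.
    global_vars = tf.global_variables()
    is_initialized = sess.run([tf.is_variable_initialized(v) for v in global_vars])
    not_initialized = [v for v, flag in zip(global_vars, is_initialized) if not flag]
    if not_initialized:
        sess.run(tf.variables_initializer(not_initialized))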
The code I used is AdaNet, specifically the code I got with “pip install adanet==0.5.0”. Since AdaNet has been updated since then, version 0.5.0 differs somewhat from the current AdaNet.
Besides, I changed all of the original “sess = tf.Session()” calls inside to “config, … , sess = tf.Session()”. The files involving “tf.Session()” are “core/ensemble.py”, “core/estimator.py”, and “subnetwork/generator.py”.
I put code like the following at the beginning of each file:
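(The snippet itself was not preserved in this copy of the thread; based on the issue description it was presumably the same ConfigProto setup, along these lines:)

import tensorflow as tf

GPU_OPTIONS = tf.GPUOptions(allow_growth=True)
CONFIG = tf.ConfigProto(gpu_options=GPU_OPTIONS)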
Another way is:
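(This snippet was also not preserved. A common TF 1.x variant, possibly what was meant here, sets the option on the ConfigProto directly and/or caps the per-process memory fraction:)

import tensorflow as tf

CONFIG = tf.ConfigProto()
CONFIG.gpu_options.allow_growth = True
# Optionally cap the share of GPU memory the process may take:
# CONFIG.gpu_options.per_process_gpu_memory_fraction = 0.4
sess = tf.Session(config=CONFIG)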
I tried both and neither of them worked. If there is still any ambiguity, please let me know and I will be more specific. Thank you so much for your help!
Yes, that’s exactly what I did. Didn’t work
You need to put the GPU configuration at the top of your code.
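For example (a sketch, not the actual AdaNet code), the config would be created before any graph or session is built and passed to every Session:

import tensorflow as tf

# Build the session config first, before any model/graph code runs.
CONFIG = tf.ConfigProto()
CONFIG.gpu_options.allow_growth = True  # allocate GPU memory on demand

# ... model definition goes here ...

with tf.Session(config=CONFIG) as sess:
    sess.run(tf.global_variables_initializer())
    # ... training / evaluation ...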
Can you please confirm whether this was the case? Also, can you please point me to the example you tried? Thanks!