jack: ResourceExhausted error when training BiDAF model on GPU with longer-context data added to SQuAD
Basically, when training on a 12GB Tesla K80 GPU on Google Cloud (you have to manually install tensorflow-gpu, otherwise it defaults to the CPU), I get a ResourceExhausted error, seemingly around this warning:
UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
For the regular SQuAD data, the GPU only consumes 5GB of memory, but on the added data (with longer context lengths) it sits at maximum memory until it is exhausted. The FastQA model trains fine on the same data.
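For context, here is a minimal, purely illustrative TensorFlow 1.x sketch (not jack's actual BiDAF graph; all names and shapes are made up) of the kind of pattern that emits this warning: backpropagating through a `tf.gather` on an intermediate tensor whose leading dimension is only known at run time forces the sparse IndexedSlices gradient to be densified, and the materialised dense tensor grows with the sequence length.

```python
import tensorflow as tf

# Illustrative only: not jack's BiDAF code. Names and shapes are hypothetical.
x = tf.placeholder(tf.float32, [None, 300])   # leading dim unknown at graph-build time
h = tf.layers.dense(x, 300)                   # intermediate (non-variable) tensor
picked = tf.gather(h, [0, 1])                 # gradient w.r.t. h is a sparse IndexedSlices

loss = tf.reduce_sum(picked)

# Backprop through the dense layer has to convert the IndexedSlices into a dense
# tensor whose shape is only known at run time -> the UserWarning above.
# The densified tensor scales with the unknown leading dimension, so longer
# contexts mean proportionally more GPU memory.
grads = tf.gradients(loss, [x])
```

When the densified tensor no longer fits on the card, the usual workarounds are reducing the batch size or capping the maximum context length.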
About this issue
- State: closed
- Created 7 years ago
- Comments: 29 (4 by maintainers)
Btw, I am working on a modular QA model where you can stick your model together in your YAML config. It will support many of the current SotA models out of the box, without having to write new code, and it will also make it easy to experiment with, for instance, convolutional encoders vs. RNNs. I will create a PR tomorrow.
Agreed!