fast-bert: notebook not working out of the box

I’m just trying to get the included toxicity notebook to work from a fresh clone, and I’m running into some issues:

  1. Out of the box, the data & labels directories point to the wrong place, and the DataBunch uses filenames that are not part of the repo. These are easy enough to fix.

  2. It would help if there were a pointer to where to get the PyTorch pretrained model uncased_L-12_H-768_A-12. There is a Google download, but it will not work with the from_pretrained_model cell:

FileNotFoundError: [Errno 2] No such file or directory: '../../bert/bert-models/uncased_L-12_H-768_A-12/pytorch_model.bin'
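
One way past this is to convert Google’s TensorFlow checkpoint into the pytorch_model.bin the notebook expects. A minimal sketch, assuming the pytorch-pretrained-bert package (which fast-bert built on at the time) plus TensorFlow are installed, and that the checkpoint was unzipped to the path below:

    from pytorch_pretrained_bert.convert_tf_checkpoint_to_pytorch import (
        convert_tf_checkpoint_to_pytorch,
    )

    BERT_DIR = '../../bert/bert-models/uncased_L-12_H-768_A-12'

    # Reads the TF checkpoint + config and writes the pytorch_model.bin
    # that from_pretrained_model is looking for.
    convert_tf_checkpoint_to_pytorch(
        BERT_DIR + '/bert_model.ckpt',
        BERT_DIR + '/bert_config.json',
        BERT_DIR + '/pytorch_model.bin',
    )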

I have been able to get past this step by using ‘bert-base-uncased’ instead of BERT_PRETRAINED_PATH as the model spec in the tokenizer and from_pretrained_model steps.
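
Roughly, that workaround looks like the sketch below; databunch, metrics, device, logger, args and multi_gpu are assumed to be set up in earlier notebook cells, and the exact from_pretrained_model keyword arguments vary across fast-bert versions:

    from pytorch_pretrained_bert import BertTokenizer
    from fast_bert.learner import BertLearner

    # Passing the model name makes pytorch-pretrained-bert download the weights
    # itself instead of reading pytorch_model.bin from BERT_PRETRAINED_PATH.
    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased', do_lower_case=True)

    learner = BertLearner.from_pretrained_model(
        databunch, 'bert-base-uncased', metrics, device, logger,
        is_fp16=args.fp16, multi_gpu=multi_gpu, multi_label=True)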

  3. Once I get everything loaded, training fails with:

RuntimeError: CUDA out of memory. Tried to allocate 96.00 MiB (GPU 0; 7.43 GiB total capacity; 6.91 GiB already allocated; 10.94 MiB free; 24.36 MiB cached)

This is a standard 8 GB GPU Compute Engine instance on GCP. Advice on how to avoid running out of memory would help the tutorial a lot.

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 17 (5 by maintainers)

Most upvoted comments

I managed to solve this issue by restarting my Jupyter kernel before running the model. I also used gradient_accumulation = 8 together with batch_size = 8 to get an effective batch size of 64 that fits into my GPU memory. I was able to keep the sequence length at 256 with these settings.
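
In the sample notebook those knobs live in the training-args dict; a sketch of the equivalent settings (the key names follow the fast-bert examples of that era and may differ by version):

    args.update({
        'max_seq_length': 256,
        'train_batch_size': 8,             # what actually fits on the GPU per step
        'gradient_accumulation_steps': 8,  # 8 steps x batch 8 = effective batch of 64
    })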

Anyway, I solved it, but others might encounter the same issue. The point was that PyTorch v1.0.1 on Windows doesn’t ship the distributed package (torch.distributed). I moved to Ubuntu 18.04 and it works.
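
A quick way to check whether a given PyTorch build is affected (a sketch; on builds without distributed support the import fails or is_available() reports False, depending on the version):

    import torch

    try:
        import torch.distributed as dist
        has_dist = dist.is_available() if hasattr(dist, 'is_available') else True
    except ImportError:
        has_dist = False

    print('torch %s - distributed support: %s' % (torch.__version__, has_dist))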

    # Build NVIDIA apex with its CUDA and C++ extensions (used for fp16 training)
    !git clone https://github.com/NVIDIA/apex.git
    %cd apex
    !python setup.py install --cuda_ext --cpp_ext

Regarding “out of memory”: I set max_seq_length = 64 and successfully ran on an 8 GB GTX 1070.
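
That is the same knob as in the settings above; depending on the fast-bert version it is set either as args['max_seq_length'] in the notebook or via the BertDataBunch maxlen argument, e.g.:

    args['max_seq_length'] = 64  # attention memory grows with the square of sequence length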