simpletransformers: model trained on Colab doesn't work locally

Hello! First of all, thank you for this library! It is awesome! I ran into the following problem while working with simpletransformers: I followed the instructions in this link and trained a BERT model on Google Colab. I then downloaded it and tried to use it locally on my computer, but something went wrong.

Here is my code for training the model (on Google Colab):

from simpletransformers.classification import ClassificationModel
train_args = {"reprocess_input_data": True,
              "fp16": False,
              "num_train_epochs": 1,
              "save_steps": 100_000,
              "logging_steps": 100}

model = ClassificationModel(
    "bert", "bert-base-german-cased",
    num_labels=21,
    args=train_args
)
model.train_model(train_df)

It worked completely fine on Colab, but I need the model locally, so I created an archive with the following code and downloaded it:

import os
import tarfile

def pack_model(model_path='', file_name=''):
  # os.walk yields (root, dirs, files) tuples; the first tuple
  # contains the files in the top level of model_path.
  files = [files for root, dirs, files in os.walk(model_path)][0]
  with tarfile.open(file_name + '.tar.gz', 'w:gz') as f:
    for file in files:
      f.add(f'{model_path}/{file}')

pack_model('outputs', 'model')
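
(For reference, unpacking on the local machine is just the reverse; a minimal sketch, assuming the archive is named model.tar.gz as above:)

import tarfile

# The archive members are stored as 'outputs/<file>', so extracting
# into the current directory recreates the outputs/ folder.
with tarfile.open('model.tar.gz', 'r:gz') as f:
    f.extractall('.')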

I then unpacked it and placed the outputs/ directory next to the following test script:

from simpletransformers.classification import ClassificationModel

train_args = {"reprocess_input_data": True,
              "fp16": False,
              "num_train_epochs": 1,
              "save_steps": 100_000,
              "logging_steps": 100}

model = ClassificationModel(
    "bert", "outputs/",
    num_labels=21,
    args=train_args,
    use_cuda=False
)

test = "Boris Johnson hält eine Pressekonferenz zum Coronavirus-Aktionsplan der Regierung"

predictions, raw_outputs = model.predict([test])
print(predictions[0])

So the only difference in how I create the model is use_cuda=False (because I don’t have a GPU locally).

When I executed the code, the following output appeared:

I0526 17:16:12.761774 14508 file_utils.py:39] PyTorch version 1.5.0+cpu available.
I0526 17:16:32.703790 14508 configuration_utils.py:283] loading configuration file outputs/config.json
I0526 17:16:32.706786 14508 configuration_utils.py:321] Model config BertConfig {
  "architectures": [
    "BertForSequenceClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
    "2": "LABEL_2",
    "3": "LABEL_3",
    "4": "LABEL_4",
    "5": "LABEL_5",
    "6": "LABEL_6",
    "7": "LABEL_7",
    "8": "LABEL_8",
    "9": "LABEL_9",
    "10": "LABEL_10",
    "11": "LABEL_11",
    "12": "LABEL_12",
    "13": "LABEL_13",
    "14": "LABEL_14",
    "15": "LABEL_15",
    "16": "LABEL_16",
    "17": "LABEL_17",
    "18": "LABEL_18",
    "19": "LABEL_19",
    "20": "LABEL_20"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1,
    "LABEL_10": 10,
    "LABEL_11": 11,
    "LABEL_12": 12,
    "LABEL_13": 13,
    "LABEL_14": 14,
    "LABEL_15": 15,
    "LABEL_16": 16,
    "LABEL_17": 17,
    "LABEL_18": 18,
    "LABEL_19": 19,
    "LABEL_2": 2,
    "LABEL_20": 20,
    "LABEL_3": 3,
    "LABEL_4": 4,
    "LABEL_5": 5,
    "LABEL_6": 6,
    "LABEL_7": 7,
    "LABEL_8": 8,
    "LABEL_9": 9
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "type_vocab_size": 2,
  "vocab_size": 30000
}

I0526 17:16:32.723798 14508 modeling_utils.py:648] loading weights file outputs/pytorch_model.bin
I0526 17:16:41.755029 14508 tokenization_utils.py:929] Model name 'outputs/' not found in model shortcut name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased, bert-base-finnish-cased-v1, bert-base-finnish-uncased-v1, bert-base-dutch-cased). Assuming 'outputs/' is a path, a model identifier, or url to a directory containing tokenizer files.
I0526 17:16:41.770021 14508 tokenization_utils.py:958] Didn't find file outputs/added_tokens.json. We won't load it.
I0526 17:16:41.779017 14508 tokenization_utils.py:1013] loading file outputs/vocab.txt
I0526 17:16:41.788011 14508 tokenization_utils.py:1013] loading file None
I0526 17:16:41.794008 14508 tokenization_utils.py:1013] loading file outputs/special_tokens_map.json
I0526 17:16:41.806000 14508 tokenization_utils.py:1013] loading file outputs/tokenizer_config.json
I0526 17:16:41.986897 14508 classification_model.py:801]  Converting to features started. Cache is not used.

  0%|          | 0/1 [00:00<?, ?it/s]

And it freezes at this point: the progress bar never advances. What should I do to make it work locally? Or is there another way?

My system:

  • Windows 10 Home
  • Python: 3.7.2
  • simpletransformers: 0.29.1

Thanks in advance for any help!

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 15 (5 by maintainers)

Most upvoted comments

Problem solved, thank you! Turning off multiprocessing worked.

I have an Intel Core i5-4200U CPU (1.60 GHz, running at 2.30 GHz) and 16 GB of RAM.

Which file should I check for model weights? training_args.bin or pytorch_model.bin?

pytorch_model.bin contains the weights.
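
If you want a quick sanity check that the file isn't corrupted, you can load it on CPU and list a few tensors (a minimal sketch, assuming a local PyTorch install and the outputs/ path from above):

import torch

# Load the checkpoint on CPU and print the first few parameter
# names and shapes as a quick integrity check.
state_dict = torch.load('outputs/pytorch_model.bin', map_location='cpu')
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))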

I’m experiencing the same problem when trying to deploy a BERT model to an Amazon SageMaker endpoint. Training worked fine on SageMaker, but when deployed to an endpoint, the request hangs at “Converting to features started. Cache is not used.” I have tried various instance types (with more CPUs or more powerful CPUs, all the way up to the largest instances available), but that does not solve the problem.

Oddly, the problem only occurs when deploying to an endpoint. Loading and using the model works fine when run locally in a Jupyter notebook on a relatively small instance type. I’m stumped as to what the problem could be, because there are no additional error messages.

Can you try turning off multiprocessing? (use_multiprocessing: False)
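
Concretely, that flag goes into the args dict when the model is created; a minimal sketch based on the test script above:

from simpletransformers.classification import ClassificationModel

# Disabling multiprocessing avoids the hang at
# "Converting to features started."
train_args = {"reprocess_input_data": True,
              "fp16": False,
              "use_multiprocessing": False}

model = ClassificationModel(
    "bert", "outputs/",
    num_labels=21,
    args=train_args,
    use_cuda=False
)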