transformers: RuntimeError: unexpected EOF, expected 7491165 more bytes. The file might be corrupted.

❓ Questions & Help

I tried a small chunk of code from the README.md:

import torch
from transformers import *
MODELS = [(BertModel,       BertTokenizer,       'bert-base-uncased')]
for model_class, tokenizer_class, pretrained_weights in MODELS:
    # Load pretrained model/tokenizer
    tokenizer = tokenizer_class.from_pretrained(pretrained_weights)
    model = model_class.from_pretrained(pretrained_weights)
    input_ids = torch.tensor([tokenizer.encode("Here is some text to encode", add_special_tokens=True)])  # Add special tokens takes care of adding [CLS], [SEP], <s>... tokens in the right way for each model.
    with torch.no_grad():
        last_hidden_states = model(input_ids)[0]

It is giving me the following error:

RuntimeError                              Traceback (most recent call last)
<ipython-input-3-6528fe9b0472> in <module>
      3     tokenizer = tokenizer_class.from_pretrained(pretrained_weights)
----> 4     model = model_class.from_pretrained(pretrained_weights)

~/.conda/envs/transformers/lib/python3.7/site-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    343 
    344         if state_dict is None and not from_tf:
--> 345             state_dict = torch.load(resolved_archive_file, map_location='cpu')
    346 
    347         missing_keys = []

~/.conda/envs/transformers/lib/python3.7/site-packages/torch/serialization.py in load(f, map_location, pickle_module, **pickle_load_args)
    424         if sys.version_info >= (3, 0) and 'encoding' not in pickle_load_args.keys():
    425             pickle_load_args['encoding'] = 'utf-8'
--> 426         return _load(f, map_location, pickle_module, **pickle_load_args)
    427     finally:
    428         if new_fd:

~/.conda/envs/transformers/lib/python3.7/site-packages/torch/serialization.py in _load(f, map_location, pickle_module, **pickle_load_args)
    618     for key in deserialized_storage_keys:
    619         assert key in deserialized_objects
--> 620         deserialized_objects[key]._set_from_file(f, offset, f_should_read_directly)
    621         if offset is not None:
    622             offset = f.tell()

RuntimeError: unexpected EOF, expected 7491165 more bytes. The file might be corrupted.

I haven’t modified anything in the library.

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 16 (3 by maintainers)

Most upvoted comments

Hi! It looks like the downloaded file was corrupted, probably because of insufficient disk space or a network error. Could you try calling from_pretrained with the force_download=True option?
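For anyone landing here, the suggestion above could be wrapped roughly like this. It is only a sketch: load_with_redownload is a hypothetical helper name, not part of transformers, and the choice of exceptions to catch is an assumption (the traceback in this issue shows RuntimeError; other versions may raise something else, such as OSError).

```python
def load_with_redownload(model_class, pretrained_weights, **kwargs):
    """Try the cached weights first; if loading fails, re-download and retry once.

    Hypothetical helper: model_class is anything exposing from_pretrained,
    e.g. BertModel or BertTokenizer.
    """
    try:
        return model_class.from_pretrained(pretrained_weights, **kwargs)
    except (RuntimeError, OSError):
        # A truncated cache file surfaces as "unexpected EOF" from torch.load.
        # force_download=True asks from_pretrained to fetch the file again
        # instead of reusing the cached copy.
        retry_kwargs = dict(kwargs, force_download=True)
        return model_class.from_pretrained(pretrained_weights, **retry_kwargs)


# Usage (requires transformers to be installed):
# from transformers import BertModel
# model = load_with_redownload(BertModel, 'bert-base-uncased')
```

If disk space was the cause, the retry will fail the same way, so it may be worth checking free space before running it.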

If you are using a Windows 10 machine, deleting the vgg16-something file in the folder C:\Users\UserName\.cache\torch\checkpoints should solve the problem.

This worked for me
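
The manual cleanup described above can also be scripted. This is a minimal sketch, not an official tool: remove_cached_files is a hypothetical helper, and the cache directory and file pattern in the usage comment are assumptions taken from the comment above, so they may differ on your machine or for other libraries.

```python
from pathlib import Path


def remove_cached_files(cache_dir, pattern):
    """Delete files in cache_dir whose names match a glob pattern.

    Returns the names of the deleted files. Hypothetical helper; the caller
    supplies both the directory and the pattern.
    """
    removed = []
    cache_path = Path(cache_dir).expanduser()
    if not cache_path.is_dir():
        # Nothing to clean up if the cache directory does not exist.
        return removed
    for path in sorted(cache_path.glob(pattern)):
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    return removed


# Usage (directory and pattern are assumptions, adjust to your setup):
# print(remove_cached_files("~/.cache/torch/checkpoints", "vgg16-*"))
```

After the corrupted file is gone, the next load should download a fresh copy.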