transformers: [Bug] KeyError: 'nllb-moe' when trying to load `nllb-moe-54b` model

System Info

  • transformers version: 4.28.0.dev0
  • Platform: Linux-5.4.0-74-generic-x86_64-with-glibc2.10
  • Python version: 3.8.8
  • Huggingface_hub version: 0.13.3
  • PyTorch version (GPU?): 2.0.0+cu117 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No

Who can help?

@ArthurZucker from https://github.com/huggingface/transformers/pull/22024

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, …)
  • My own task or dataset (give details below)

Reproduction

Following the example script at https://huggingface.co/facebook/nllb-moe-54b (but pointing to a local git copy of the checkpoint):

  1. pip install git+https://github.com/huggingface/transformers.git
  2. python
>>> from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("../hub/nllb-moe-54b")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("../hub/nllb-moe-54b")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/conda/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 441, in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
  File "/opt/conda/lib/python3.8/site-packages/transformers/models/auto/configuration_auto.py", line 920, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
  File "/opt/conda/lib/python3.8/site-packages/transformers/models/auto/configuration_auto.py", line 626, in __getitem__
    raise KeyError(key)
KeyError: 'nllb_moe'

Note: The system might not have enough RAM for this model, but the error was raised immediately, before any weights were loaded, so it does not appear to be an OOM issue.
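For context, a minimal sketch (not the actual transformers source) of why this KeyError occurs: `AutoConfig.from_pretrained` reads `model_type` from the checkpoint's `config.json` and looks it up in a registry of known model types. An install that predates PR #22024 has no `nllb_moe` entry, so the lookup fails before any weights are touched. The registry contents below are hypothetical:

```python
# Simplified model of AutoConfig's model_type lookup. An outdated
# transformers install is missing the "nllb_moe" registry entry.
config_mapping = {"nllb": "NllbConfig"}   # hypothetical pre-#22024 registry
config_dict = {"model_type": "nllb_moe"}  # as read from the checkpoint's config.json

try:
    config_class = config_mapping[config_dict["model_type"]]
except KeyError as key:
    # Reproduces the reported failure mode
    print(f"KeyError: {key}")
```

This is why the error is independent of available RAM: the lookup happens before any download or weight loading begins.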

Expected behavior

The model loads successfully.

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 24

Most upvoted comments

are you sure that you are on the latest release of transformers? pip install --upgrade transformers

Wow, I had forgotten about this, but after upgrading I ran it again and it works fine. Thank you very much.

are you sure that you are on the latest release of transformers? pip install --upgrade transformers

So, running `model = AutoModelForSeq2SeqLM.from_pretrained("hf-internal-testing/random-nllb-moe-2-experts")` definitely worked for me.

In [3]: model = AutoModelForSeq2SeqLM.from_pretrained("hf-internal-testing/random-nllb-moe-2-experts")
Downloading (…)lve/main/config.json: 100%| 1.40k/1.40k [00:00<00:00, 272kB/s]
Downloading (…)model.bin.index.json: 100%| 91.5k/91.5k [00:00<00:00, 992kB/s]
Downloading (…)00001-of-00002.bin";: 100%| 7.75G/7.75G [02:04<00:00, 62.0MB/s]
Downloading (…)00002-of-00002.bin";: 100%| 9.36G/9.36G [02:17<00:00, 68.0MB/s]
Downloading shards: 100%| 2/2 [04:23<00:00, 131.96s/it]
Loading checkpoint shards: 100%| 2/2 [00:11<00:00, 5.82s/it]

In [4]: 

The issue is most probably related to the config or the cache, but I will still look into it. In the meantime, use the model directly 😉
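Since the accepted fix was simply upgrading transformers, a quick programmatic sanity check can save a round trip. This is a hedged sketch assuming NLLB-MoE support first shipped in the v4.28.0 release (PR #22024); note a stale editable/dev install can still report a new-looking version string, which may be what happened here:

```python
# Check whether the installed transformers release is recent enough
# to register the "nllb_moe" model type (assumed: support landed in 4.28.0).
from importlib.metadata import version, PackageNotFoundError


def supports_nllb_moe(pkg: str = "transformers") -> bool:
    """Return True if the installed release of `pkg` is >= 4.28.0."""
    try:
        ver = version(pkg)
    except PackageNotFoundError:
        return False
    # Tolerate suffixes like "4.28.0.dev0" by keeping only leading digits
    parts = []
    for piece in ver.split(".")[:3]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts) >= (4, 28, 0)


print(supports_nllb_moe())
```

If this returns False, `pip install --upgrade transformers` (or a fresh install from git) should resolve the KeyError before you attempt the 54B download.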