fairseq: [Wav2Vec2] Cannot load newly added Wav2Vec2 checkpoints

πŸ› Bug

A recent commit (https://github.com/pytorch/fairseq/commit/2513524a1604dbafcc4ea9cc5a99ae0aa4f19694) added two new fine-tuned Wav2Vec2 checkpoints, but there appears to be a problem with the saved config: the checkpoints cannot be loaded. For example, the following code fails:

import fairseq
model, _, _ = fairseq.checkpoint_utils.load_model_ensemble_and_task([checkpoint_path], arg_overrides={"data": "path/to/dict"})

To Reproduce

The following colab reproduces the error (one just has to run all cells): https://colab.research.google.com/drive/13hJI4w8pOD33hxOJ_qwKkN9QqdKVH5IM?usp=sharing

Kindly pinging @alexeib here πŸ˜ƒ

About this issue

  • Original URL
  • State: open
  • Created 3 years ago
  • Comments: 15 (4 by maintainers)

Most upvoted comments

@patrickvonplaten Hi, I'm hitting the same problem. Do you have a solution? Thank you. I ran this code:

model, _, _ = fairseq.checkpoint_utils.load_model_ensemble_and_task([cp_path])

I got the error:

ConfigKeyError: Key 'target_dict' not in 'AudioPretrainingConfig'
	full_key: target_dict
	reference_type=Optional[AudioPretrainingConfig]
	object_type=AudioPretrainingConfig

I am still getting this error:

ConfigKeyError: Key 'eval_wer' not in 'AudioPretrainingConfig'
	full_key: eval_wer
	reference_type=Optional[AudioPretrainingConfig]
	object_type=AudioPretrainingConfig

while running the following code:

import torch
import fairseq
cp_path = '../w2v_large_lv_fsh_swbd_cv_ftsb300_updated.pt'
model, cfg, task = fairseq.checkpoint_utils.load_model_ensemble_and_task([cp_path])
model = model[0]
model.eval()
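For context, a ConfigKeyError with this full_key / reference_type / object_type format is raised by omegaconf when a saved config carries keys that the target schema (here AudioPretrainingConfig) does not define. A minimal stdlib-only mimic of that strict-merge behavior (hypothetical names; omegaconf itself not required):

```python
from dataclasses import dataclass, fields

@dataclass
class AudioPretrainingConfigSketch:
    # Like the real pretraining config, this deliberately lacks
    # the fine-tuning-only fields (eval_wer, target_dict, ...).
    data: str = ""

def strict_merge(schema_cls, saved):
    """Reject keys the schema does not define -- roughly what
    omegaconf's struct mode does when fairseq restores a
    checkpoint's task config."""
    allowed = {f.name for f in fields(schema_cls)}
    for key in saved:
        if key not in allowed:
            raise KeyError(f"Key '{key}' not in '{schema_cls.__name__}'")
    return schema_cls(**saved)

# The fine-tuned checkpoints carry eval_wer, so the merge fails:
try:
    strict_merge(AudioPretrainingConfigSketch, {"data": "/path", "eval_wer": True})
except KeyError as err:
    print(err)
```

This would explain why only the newly added fine-tuned checkpoints fail to load: only they carry the extra fine-tuning keys in their saved config.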
My environment (pip list):

Package                Version         Location
---------------------- --------------- --------------------------------------------
antlr4-python3-runtime 4.8
backcall               0.2.0
bitarray               2.3.7
certifi                2021.10.8
cffi                   1.15.0
colorama               0.4.4
Cython                 0.29.28
debugpy                1.5.1
decorator              5.1.1
entrypoints            0.3
fairseq                1.0.0a0+5175fd5 
hydra-core             1.0.7
ipykernel              6.4.1
ipython                7.31.1
ipython-genutils       0.2.0
jedi                   0.18.1
jupyter-client         7.1.2
jupyter-core           4.9.1
matplotlib-inline      0.1.2
nest-asyncio           1.5.1
numpy                  1.22.2
omegaconf              2.0.6
parso                  0.8.3
pexpect                4.8.0
pickleshare            0.7.5
pip                    21.2.4
portalocker            2.4.0
prompt-toolkit         3.0.20
protobuf               3.19.4
ptyprocess             0.7.0
pycparser              2.21
Pygments               2.11.2
python-dateutil        2.8.2
PyYAML                 6.0
pyzmq                  22.3.0
regex                  2022.1.18
sacrebleu              2.0.0
setuptools             58.0.4
six                    1.16.0
tabulate               0.8.9
tensorboardX           2.5
torch                  1.10.2
torchaudio             0.10.2
tornado                6.1
tqdm                   4.62.3
traitlets              5.1.1
typing_extensions      4.1.1
wcwidth                0.2.5
wheel                  0.37.1

Same problem here. Have you solved it?

You can work around this by cloning the repo and copying the missing parameters from the audio fine-tuning config into the audio pretraining config.
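In code terms, that workaround amounts to adding the fine-tuning-only fields to the pretraining config dataclass. A rough sketch (the field names come from the ConfigKeyErrors reported in this thread; the actual types and defaults live in fairseq's audio fine-tuning config and may differ):

```python
from dataclasses import dataclass
from typing import Any, Optional

# Hypothetical patched version of fairseq's AudioPretrainingConfig
# (fairseq/tasks/audio_pretraining.py) -- NOT the real class.
@dataclass
class AudioPretrainingConfigPatched:
    data: str = ""
    # Fields the fine-tuned checkpoints expect but the stock
    # pretraining config lacks, per the errors reported above:
    eval_wer: bool = False
    target_dict: Optional[Any] = None
```

With the extra fields defined on the schema, the strict config merge no longer rejects the saved checkpoint config.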

If you're interested, the robust models are also available on the HF Hub: https://huggingface.co/models?arxiv=arxiv:2104.01027