OpenNMT-tf: Error while running domain adaptation (fine tuning) in distributed mode
Hi,
I have created new vocabulary files (source and target) from the domain data set and have updated the base model checkpoint using the statement below:
onmt-update-vocab \
  --model_dir /home/ubuntu/mayub/datasets/in_use/euro/run1/en_es_transformer_b/ \
  --output_dir /home/ubuntu/mayub/datasets/in_use/euro/run1/en_es_transformer_b/added_vocab/ \
  --src_vocab /home/ubuntu/mayub/datasets/in_use/euro/train_vocab/src_vocab_50k.txt \
  --tgt_vocab /home/ubuntu/mayub/datasets/in_use/euro/train_vocab/trg_vocab_50k.txt \
  --new_src_vocab /home/ubuntu/mayub/datasets/in_use/euro/train_vocab/src_vocab_nfpa_50k.txt \
  --new_tgt_vocab /home/ubuntu/mayub/datasets/in_use/euro/train_vocab/trg_vocab_nfpa_50k.txt
This generates the new checkpoint file which I pass to the fine tuning train_and_eval command:
onmt-main train_and_eval \
  --model_type Transformer \
  --checkpoint_path /home/ubuntu/mayub/datasets/in_use/euro/run1/en_es_transformer_b/added_vocab/ \
  --config /home/ubuntu/mayub/datasets/in_use/euro/run1/config_run_da_nfpa.yml \
  --auto_config --num_gpus 8
The only changes I made to the config file were to the train and eval features and labels files (the source and target vocabulary entries are the same as before):
data:
  train_features_file: /home/ubuntu/mayub/datasets/in_use/euro/run1/nfpa_train_tokenized_bpe_applied.en
  train_labels_file: /home/ubuntu/mayub/datasets/in_use/euro/run1/nfpa_train_tokenized_bpe_applied.es
  eval_features_file: /home/ubuntu/mayub/datasets/in_use/euro/run1/nfpa_dev_tokenized_bpe_applied.en
  eval_labels_file: /home/ubuntu/mayub/datasets/in_use/euro/run1/nfpa_dev_tokenized_bpe_applied.es
  source_words_vocabulary: /home/ubuntu/mayub/datasets/in_use/euro/train_vocab/src_vocab_50k.txt
  target_words_vocabulary: /home/ubuntu/mayub/datasets/in_use/euro/train_vocab/trg_vocab_50k.txt
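The config above lists the original 50k vocabulary files, while the checkpoint was updated with the `nfpa` vocabularies. To see how the two sets of files differ, a small check along these lines could help. This is only a sketch and not part of OpenNMT-tf: it assumes each vocabulary file holds one token per line, and `read_vocab` and `vocab_diff` are hypothetical helper names introduced here for illustration.

```python
def read_vocab(path):
    """Read a vocabulary file, assuming one token per line; order is kept."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f if line.strip()]


def vocab_diff(old_path, new_path):
    """Return (tokens only in the new vocab, tokens only in the old vocab)."""
    old_tokens = read_vocab(old_path)
    new_tokens = read_vocab(new_path)
    old_set, new_set = set(old_tokens), set(new_tokens)
    # Keep file order so the output is easy to compare against the files.
    added = [t for t in new_tokens if t not in old_set]
    removed = [t for t in old_tokens if t not in new_set]
    return added, removed


if __name__ == "__main__":
    import sys

    # Usage: python vocab_diff.py OLD_VOCAB NEW_VOCAB
    added, removed = vocab_diff(sys.argv[1], sys.argv[2])
    print(f"{len(added)} tokens only in the new vocabulary")
    print(f"{len(removed)} tokens only in the old vocabulary")
```

Running it on `src_vocab_50k.txt` and `src_vocab_nfpa_50k.txt` (and likewise on the target files) shows how many entries differ between the vocabulary named in the config and the one used to update the checkpoint.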
Below is the error I’m getting:

Not sure where I’m going wrong. Any help appreciated.
Thanks!
Mohammed Ayub
About this issue
- State: closed
- Created 6 years ago
- Comments: 22 (8 by maintainers)
You highlighted another issue here, thanks! Models trained with gradient accumulation had some variable names that differed from those of models trained without it. Fixed in https://github.com/OpenNMT/OpenNMT-tf/commit/ff38e89119a43a18a10b675b8c4c80c40c2ef27a.
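For anyone who wants to check whether two checkpoints disagree on variable names, the comparison could be sketched as below. This is an illustration, not the code from the fix: `diff_variable_names` is a hypothetical helper, and the variable names in the demo are made up. With TensorFlow installed, the real names can be collected with `tf.train.list_variables(checkpoint_dir)`, which returns `(name, shape)` pairs.

```python
def diff_variable_names(names_a, names_b):
    """Return (names only in A, names only in B), each sorted alphabetically."""
    set_a, set_b = set(names_a), set(names_b)
    return sorted(set_a - set_b), sorted(set_b - set_a)


if __name__ == "__main__":
    # Made-up names for illustration only. With TensorFlow available, use e.g.
    #   names = [name for name, _ in tf.train.list_variables(checkpoint_dir)]
    checkpoint_a = ["encoder/w", "decoder/w", "global_step"]
    checkpoint_b = ["encoder/w", "decoder/w", "accum/decoder/w", "global_step"]

    only_a, only_b = diff_variable_names(checkpoint_a, checkpoint_b)
    print("only in A:", only_a)
    print("only in B:", only_b)
```

Any name that appears on one side only is a candidate for the kind of mismatch described above.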