transformers: MT5-large model on hub has wrong config

The MT5-large model config on the hub has wrong values for the "architectures" and "tokenizer_class" parameters:

{
  "architectures": [
    "T5ForConditionalGeneration"
  ],
  "tokenizer_class": "T5Tokenizer"
}

whereas it should specify the MT5 architecture and tokenizer:

{
  "architectures": [
    "MT5ForConditionalGeneration"
  ],
  "tokenizer_class": "MT5Tokenizer"
}
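A minimal sketch of how the two config snippets above could be compared programmatically. The helper name `find_config_mismatches` is hypothetical, and the "expected" values are simply the ones proposed in this issue, not a confirmed final fix. It works on plain dictionaries, so it does not download anything from the hub.

```python
import json


def find_config_mismatches(config, expected):
    """Return {key: (actual, wanted)} for every key whose value differs."""
    return {
        key: (config.get(key), wanted)
        for key, wanted in expected.items()
        if config.get(key) != wanted
    }


# The values currently reported in this issue.
current = json.loads("""
{
  "architectures": ["T5ForConditionalGeneration"],
  "tokenizer_class": "T5Tokenizer"
}
""")

# The values proposed in this issue (an assumption, not a verified fix).
expected = {
    "architectures": ["MT5ForConditionalGeneration"],
    "tokenizer_class": "MT5Tokenizer",
}

mismatches = find_config_mismatches(current, expected)
for key, (actual, wanted) in mismatches.items():
    print(f"{key}: found {actual!r}, proposed {wanted!r}")
```

The same helper could be pointed at a locally saved config.json by loading it with `json.load` first.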

@patrickvonplaten

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 16 (11 by maintainers)

Most upvoted comments

@fcakyon I see in the history that not only tokenizer_class was changed 👍 OK, so just the tokenizer_class needs to be updated in config.json.

@devrimcavusoglu It is, but the class is not called MT5Tokenizer. It would have been if the authors had done something like this:

class MT5Tokenizer(T5Tokenizer):
    pass

instead of

MT5Tokenizer = T5Tokenizer

because tokenizer_class_from_name matches the tokenizer by its class name, i.e. c.__name__. With the alias, c.__name__ is still "T5Tokenizer", so a lookup for "MT5Tokenizer" finds nothing.
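The difference between an alias and a subclass can be sketched without importing transformers at all. The classes below are stand-ins, and `find_class_by_name` is a hypothetical simplification of a lookup that compares against `c.__name__`; it is not the library's actual implementation.

```python
class T5Tokenizer:
    """Stand-in for the real tokenizer class (assumption for illustration)."""


# Alias: a second name bound to the very same class object.
MT5TokenizerAlias = T5Tokenizer


# Subclass: a distinct class object that carries its own __name__.
class MT5Tokenizer(T5Tokenizer):
    pass


def find_class_by_name(name, candidates):
    """Hypothetical lookup that matches on c.__name__, as described above."""
    for c in candidates:
        if c.__name__ == name:
            return c
    return None


alias_name = MT5TokenizerAlias.__name__
subclass_name = MT5Tokenizer.__name__

# With only the alias available, no class is named "MT5Tokenizer".
alias_lookup = find_class_by_name("MT5Tokenizer", [T5Tokenizer, MT5TokenizerAlias])

# With a real subclass, the name-based lookup succeeds.
subclass_lookup = find_class_by_name("MT5Tokenizer", [T5Tokenizer, MT5Tokenizer])

print(alias_name, subclass_name, alias_lookup, subclass_lookup)
```

Assigning a class to a new name does not create a new class, which is why a name-based lookup cannot tell the alias apart from the original.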

@dkajtoch that's right 👍

Great catch! Correcting it now.