transformers: Add missing type hints

This issue is part of our Great Code Cleanup 2022. If you’re interested in helping out, take a look at this thread, or come join us on Discord and talk with other contributors!

🚀 Add missing type hints

Type hints are used inconsistently in the transformers repo across both TF and PT models, and it’d be nice to make them a complete, consistent thing for the core models, especially because we want to develop features that depend on them!

Guide to contributing:

  1. Ensure you’ve read our contributing guidelines 📜
  2. Claim your architecture(s) in this thread (ensure no one is working on it). It’s 100% okay to only take the TensorFlow or PyTorch version of a model, if you’re not familiar with both frameworks! It’s also okay to claim multiple models and group those changes into a single PR! 🎯
  3. Implement the changes as in the example PRs linked in the original issue (see the diff on the model architectures for a few examples) 💪
  4. Open the PR and tag me in it. You should run make fixup at the end to do a code quality check before your final commit!

Tips for making your PR

  1. The files you need to edit will be in src/transformers/models/[model_name]/
  2. For TensorFlow, you want the modeling_tf_[model_name].py file. For PyTorch, you want the modeling_[model_name].py file.
  3. Remember, you do not have to cover every class in that file! The main thing we want to cover is the call (for TF) or forward (for PT) method for user-facing classes like TFRobertaForMaskedLM or RobertaForSequenceClassification. It’s not necessary to add type hints to layers or base classes like RobertaModel or TFRobertaPreTrainedModel - these are trickier to write, and generally people do not use those classes as standalone models.
  4. If you’re unfamiliar with how type hints work, you can read the Python library documentation on them, but it’s probably even easier to just look at another PR that added them. Take a look at the list of changes in the pull requests linked above!
  5. The types will usually be obvious - most inputs are Optional[Union[np.ndarray, tf.Tensor]] for TF models and Optional[torch.Tensor] for PyTorch models, and boolean inputs are Optional[bool]. Pay attention to the first input of TF models, though, which is usually TFModelInputType - this is because Keras handles that first input in a special way! Other inputs to pay attention to are past_key_values, which can vary between models, and also the model output type. For the base model classes like RobertaModel, you may have to look at the corresponding MainLayer to figure out the right output type! Also, note that the output type may be a tuple if return_dict is False, in which case you should specify Union[Tuple, ...]. Finally, note that in TF models, training is never None, so it should be training: bool and not training: Optional[bool].
  6. Note that some code is copied across our codebase. If you see a line like # Copied from transformers.models.bert..., this means that the code is copied from that source, and our scripts will automatically keep that in sync. If you see that, you should not edit the copied method! Instead, edit the original method it’s copied from, and run make fixup to synchronize that across all the copies. Be sure you installed the development dependencies with pip install -e ".[dev]", as described in the contributor guidelines above, to ensure that the code quality tools in make fixup can run.
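
As a rough illustration of tip 5, here is a minimal sketch of what the annotated signatures might look like. ToyOutput, ToyForSequenceClassification and TFToyForSequenceClassification are hypothetical names, not classes in the library, and the import location shown for TFModelInputType is an assumption. The annotations are quoted only so the snippet runs without torch or TensorFlow installed; in a real PR they would be written unquoted.

```python
from typing import TYPE_CHECKING, Optional, Tuple, Union

if TYPE_CHECKING:  # seen by type checkers only, never executed at runtime
    import numpy as np
    import tensorflow as tf
    import torch
    from transformers.modeling_tf_utils import TFModelInputType  # assumed import location


class ToyOutput:
    """Hypothetical stand-in for a model output class."""

    def __init__(self, logits=None):
        self.logits = logits


class ToyForSequenceClassification:
    """Hypothetical PyTorch-style user-facing class."""

    def forward(
        self,
        input_ids: "Optional[torch.Tensor]" = None,
        attention_mask: "Optional[torch.Tensor]" = None,
        labels: "Optional[torch.Tensor]" = None,
        output_attentions: "Optional[bool]" = None,
        return_dict: "Optional[bool]" = None,
    ) -> "Union[Tuple, ToyOutput]":
        # A plain tuple comes back when return_dict is False, hence the Union.
        if return_dict is False:
            return (None,)
        return ToyOutput()


class TFToyForSequenceClassification:
    """Hypothetical TensorFlow-style user-facing class."""

    def call(
        self,
        input_ids: "Optional[TFModelInputType]" = None,  # first input uses the special type
        attention_mask: "Optional[Union[np.ndarray, tf.Tensor]]" = None,
        return_dict: "Optional[bool]" = None,
        training: bool = False,  # never None in TF models, so not Optional
    ) -> "Union[Tuple, ToyOutput]":
        if return_dict is False:
            return (None,)
        return ToyOutput()


# The annotations are stored on the function and can be inspected directly.
print(ToyForSequenceClassification.forward.__annotations__["return"])
print(TFToyForSequenceClassification.call.__annotations__["training"])
```

The real models take many more arguments (and past_key_values in particular varies between architectures), so an existing merged PR remains the better template for the exact types.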

How can I find models that need type hints?

I used to maintain a list here, but it got out of date, I’m sorry. Instead, you can use this Colab notebook. If you run this, it will show you models in PyTorch or TF that are still missing type hints. Unlike my manually curated lists, it’s guaranteed to be up to date - but do double-check that someone else in the thread hasn’t claimed a model before you start, because the Colab code will only register type hints after the PR containing them is merged!
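
The notebook itself is not reproduced in this thread, but the kind of check it performs can be approximated with the standard inspect module. The sketch below is an assumption about the approach, not the notebook’s actual code; missing_type_hints and the two Toy classes are hypothetical names, and skipping *args/**kwargs is a choice made here for illustration.

```python
import inspect
from typing import List, Optional


def missing_type_hints(cls: type, method_name: str) -> List[str]:
    """List unannotated parameters of cls.method_name, plus "return"
    when the return annotation is absent."""
    signature = inspect.signature(getattr(cls, method_name))
    variadic = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
    missing = [
        name
        for name, param in signature.parameters.items()
        if name != "self"
        and param.kind not in variadic  # *args / **kwargs are skipped in this sketch
        and param.annotation is inspect.Parameter.empty
    ]
    if signature.return_annotation is inspect.Signature.empty:
        missing.append("return")
    return missing


class ToyHinted:
    def forward(self, input_ids: Optional[int] = None) -> tuple:
        return (input_ids,)


class ToyPartlyHinted:
    # Both the labels annotation and the return annotation are left out on purpose.
    def forward(self, input_ids: Optional[int] = None, labels=None, **kwargs):
        return (input_ids, labels)


report = {
    cls.__name__: missing_type_hints(cls, "forward")
    for cls in (ToyHinted, ToyPartlyHinted)
}
print(report)
```

For a TF class the same helper would be called with "call" instead of "forward". A class counts as done only when the returned list is empty, which includes the return annotation.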

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 7
  • Comments: 146 (125 by maintainers)

Most upvoted comments

Hi all, with the end of the sprint I probably won’t be actively maintaining this issue, but there are still models outstanding! If you’re curious about what’s left to do, I’ve added this Colab notebook that you can use to find model classes without type hints. We’d appreciate type hints for any and all of them!

So these ☝️ are the last ones for the pytorch models.

By looking at the git history I appreciate that you are pretty busy these days so no rush on this. Here you have a quick recap on what we have so far:

In the meantime I will start getting an intuition of how the TF ones work, I can’t say that I’m going to address all of them (as there are a lot) but let’s see how far we can get 🐌

Also, will work on BeiT, Deit and ViT (Pytorch)

Hi Matt, at this point I’m keen to finish all the remaining type hints for the pytorch models (17). I will open a couple of PRs so the review won’t be a pain (unless you tell me otherwise)


@Rocketknight1 Hi Matt 👋 , i would like to take Graphormer pytorch version

Hi @sirmammingtonham thank you so much for pointing that out! I just realized I’d been failing to keep the list of models up to date - I made a Colab to check our codebase instead, which is linked at the bottom of the initial post and will always be up to date.

Hey, I would like to work on the Swin model files!

Hi, I’d like to work on MT5 for Pytorch and Tensorflow 🙂

Hi!, I can create a PR for CTRL for Pytorch and Tensorflow.

Hi, I would like to work on Tensorflow XLNet

Hi, I would like to work on Tensorflow LED

Good shout @F02934 !

@Rocketknight1 I’ve checked the above Colab notebook and looks like MegatronBERT is done. I will add missing type hints for QDQBertModel

Looks like ImageGPT was done. I can take Luke in PyTorch.

This project is now officially complete! Thank you to everyone in this thread, and to other people who filed PRs, and congratulations to @nablabits who filed the final PR to finish it all!

@nablabits Ah, of course! I should have realized that the base PreTrainedModel classes do not have an overridden call(), because they’re always subclassed before being used. Good catch.
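
A minimal sketch of how one might tell whether a class defines its own call() or merely inherits it, assuming a plain Python check on the class namespace is enough. All class names and the defines_own helper are hypothetical.

```python
class ToyBaseLayer:
    def call(self, inputs):
        raise NotImplementedError


class ToyPreTrainedModel(ToyBaseLayer):
    """Base class that is always subclassed; it does not override call()."""


class ToyModel(ToyPreTrainedModel):
    def call(self, inputs: int, training: bool = False) -> int:
        return inputs


def defines_own(cls: type, method_name: str) -> bool:
    # vars(cls) holds only what the class body itself defines, not inherited members.
    return method_name in vars(cls)


print(defines_own(ToyPreTrainedModel, "call"))
print(defines_own(ToyModel, "call"))
```

A class that only inherits the method has no signature of its own to annotate, which is why such base classes can be left out.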

Hey Matt, I will pick now these guys, let’s crush that list 📋:

  • CpmAntModel
  • DecisionTransformerModel
  • DPR family: DPRContextEncoder, DPRQuestionEncoder, DPRReader
  • Deformable Detr family: DeformableDetrForObjectDetection, DeformableDetrModel
  • Deta family: DetaForObjectDetection, DetaModel
  • Detr family: DetrForObjectDetection, DetrForSegmentation, DetrModel


That happens when you link an issue in the PR - Github assumes the PR resolves the issue!

Hey Matt, that makes sense, probably it was picking the 16059 in the title of the PR or in the commit or even the bit fixes 16059 as a whole. I can see some other PRs that didn’t close the issue and still keep a reference in the first comment, eg,

notebook updated based on your suggestion, thank you!

Brilliant, always happy to be useful 🤗

I’m keen to further reduce that list so I will pick some of the first entries:

These ones have not been picked yet:

  • Blip2QFormerModel
  • ConditionalDetrForObjectDetection
  • ConditionalDetrForSegmentation
  • ConditionalDetrModel

Is it ok to open a PR with all of them?


@nablabits If the Colab doc says type hints are missing, then at least some are still missing, so you can totally take it!

(People often forget the return type, and the Colab will still mark a model as missing type hints if it isn’t there)

Hey @Rocketknight1 hope you are doing great, I see that there is still work to do here. ASTModel feels like a great opportunity of learning so I will work on it if no one else has picked it up, (if that’s the case feel free to yell at me 🙃 )

Colab lists TFXLMRobertaPreTrainedModel as one of the models. Can I take that?

If ALBERT or XLMRoBERTa (Tensorflow) are still available then I would like to work on any one of them.

Hi 👋 @Rocketknight1, thanks for the helpful feedback. I closed the pull request that I was facing issues with and opened a new pull request with proper code formatting and the make fixup command.

new pull request 👉 New Pull Request

@Rocketknight1 Hi Matt, can I work on TFBlenderbot for this issue?

@Rocketknight1 I would love to take TFDeiTModel if that’s alright.

Great! In that case I’d like to claim the TFPegasusPreTrainedModel. I’ll make sure to sync before making any commits. Thanks!

Good Morning @Rocketknight1, I would love to help you with adding types to the following models:

  1. TFAlbertPreTrainedModel
  2. TFBartPretrainedModel
  3. TFBertPreTrainedModel

Hi @Rocketknight1 I’d love to take TFMarianModel with @Batese2001 and @mollerup23!

Hello @Rocketknight1 I would like to tackle TFVisionEncoderDecoderModel if possible with @miyu386 and @AdiaWu.

Hello @Rocketknight1 I’d love to tackle TFEncoderDecoderModel if it is available

Hi folks,

We are almost done with this task! From the notebook created by RK (Rocketknight1), I’m sharing here the remaining models to gain some visibility and encourage one last push.

  1. PyTorch models:

    • ASTModel
    • EsmForProteinFolding
    • TimesformerModel
  2. TF models (excluding pre-trained classes):

    • TFDeiTModel
    • TFEncoderDecoderModel (WIP by Batese2001)
    • TFLEDForConditionalGeneration
    • TFLEDModel
    • TFLxmertForPreTraining
    • TFMarianMTModel
    • TFMarianModel
    • TFRagModel
    • TFRagTokenForGeneration
    • TFSegformerDecodeHead
    • TFTransfoXLLMHeadModel
    • TFTransfoXLModel
    • TFVisionEncoderDecoderModel

Important: Remember to run the notebook to check which models are available (the ones printed in each cell are the models without type hints) and check whether anyone has already claimed a model before you start working on it.

I would like to work on EsmForProteinFolding

Hi @Rocketknight1 , ~I will take TFViTPreTrainedModel :)~ change of plans, I didn’t notice that PreTrained classes were not priority. I will take TFPegasusModel instead

@Rocketknight1 I will work on ViTModel

Hi, I was going to work on MT5 for Pytorch and TF, but looking at src/transformers/models/mt5/ and src/transformers/models/mt5/modeling_tf_mt5, it looks like each of the classes in these files overrides other models (T5Model, T5ForConditionalGeneration, T5EncoderModel, TFT5Model, TFT5ForConditionalGeneration, TFT5EncoderModel). Each of these already has type hints, so I guess there isn’t anything to do for MT5 now?

Edit: Sorry, just seen that the colab is where to look for models that still need work… In that case, I’ll pick up MCTCTForCTC and MCTCTModel!

Hi, I would love to work on FLAVA and LEVIT Pytorch , can you assign this to me?

This applies to other people who recently started, cc @mariagrandury @arcAman07 @RamitPahwa @rchan26 @WhiteWolf47. If you found the model you chose already has type hints, this is why! Please double-check in the Colab to see where type hints are still needed.

Hey, just checked out the colab notebook, i have decided to work on the YolosModel

Thanks @Rocketknight1 the notebook is super helpful! I can work on adding return types to tf xlm and tf GPTJ.

Hey just saw the codebase and realised a lot of models already have type hints. I did notice dpt list of models didn’t have type hints and I would love to work on that issue.

Hi I’d love to work on Reformer, Data2Vec, and RoFormer pytorch 😊

Seems like a lot of these have been merged already but not marked off the list for pytorch (CTRL, OpenAIGPT, BigBirdPegasus, M2M, reformer, data2vec, roformer, visualbert). If there’s anything left I’ll take it lol

I would love to work on PyTorch SEWD for the remaining classes.

Hey! I’ll take OpenAIGPT both for TensorFlow & PyTorch

I would like to work on Pytorch/BigBirdPegasus

I’d like to work on Lxmert for TF and PyTorch.

I’m taking up TF Rag

I would like to work on adding type hints for TF MPNet models.

TFConvBERTModel does not have type hints. I would like to work on adding typing hints for it.

PyTorch SEWD model missing some type hints, I’ll work on it and open a PR.

Some type hints are missing in PyTorch UniSpeech, MPNet, and Nystromformer. I’ll take on it and group those changes into a single PR!

I’ll create a PR for FSMT 😃

M2M model missing type hints, I’ll work on it and open a PR.

I’ll take on PyTorch BigBirdPegasus and open a PR 😃

Hi 🤗. Vilt (PyTorch) model remains without type hints, so I would like to work on it.

@asofiaoliveira Feel free to work on this incase you are interested!

Hello, according to the colab notebook, XLMRobertaXL (PyTorch) classes are still missing type annotations. They were mentioned here back on March 12th but there is no PR associated, so I can work on that if @manandey hasn’t already

Hi, I would like to work on TensorFlow : CTRL

Hi, I would like to work on TensorFlow : OpenAIGPT

Can I take GPTNeoxForCausalLM and GPTNeoXModel?

Hi @Rocketknight1 I checked the Colab notebook and I will be adding the missing type hints for CVT (Pytorch)

Hey there @Rocketknight1 I would like to work on MegatronBERT 😄 (edit: pytorch)

@Rocketknight1 It seems that “ConvBert” is already done in PR #16377 for both TensorFlow and Pytorch. So I would like to work on “yoso” Pytorch

Hello @Rocketknight1 I would like to work on ConvBert tensorflow

@Rocketknight1 It appears that all user facing classes in Pytorch VisualBERT are already annotated as mentioned in this PR #16544, so I would like to work on Pytorch BigBirdPegasus

Hi, I would like to work on Pytorch VisualBERT.

Hey, I’d like to work on ProphetNet(Pytorch)

edit: @sgugger Hey Sylvain. Found that some sections in the docs are missing parameter explanations (for example in the ProphetNetConfig there’s no definition for the decoder_start_token_id, and something similar happens for the ProphetNetModel and the ProphetNetEncoder). Not sure if it was done on purpose, so I’m only posting here!

@hiromu166 I have already worked on that model in my PR #16425

I will take IBert, Lxmert for PyTorch.

Hi, I’ll take Funnel

I can take up PyTorch: PLBart and VisualBERT

I’ll work on TF: Blenderbot, and PyTorch: Blenderbot and BlenderbotSmall

I’ll take Vilt for PyTorch seems available.

I’d like to work on TransfoXL if available

This is Awesome! Hey, @Rocketknight1 I’d like to work on ConvBERT for both PyTorch & TF 🤗

I’m working on MobileBert for both TensorFlow & PyTorch.

Happy to take CTRL and MPNet for Tensorflow

I will work on XLNet for TF and PT

Hey, I am looking into the mBART model for the TF and PyTorch implementations. If anyone is interested, do let me know.

Hey, I would like to work on the BigBirdPegasus model of Pytorch.

Nevermind, XLMRoberta relies entirely on Roberta (for TF) I will work on Reformer instead!

Hi, I will work on RAG(pytorch).

Hi, I’ll take Marian (Pytorch)

I will work on YOSO for PT

I will work on XLMRoberta for TF

I will also work on Pegasus for pytorch

I’d like to work on Perceiver for torch

I will also work on GPTNeo for Pytorch

Hello, I will work on SqueezeBERT for Pytorch!

I’ll work on FNet for PyTorch.

I’m going for FlauBERT now !

I’d like to take PoolFormer

I’ll take Splinter and ~Segformer~ Rembert for torch. Edit: @p-mishra1 has Segformer. Taking Rembert instead

I’ll work on GPTJ

Happy to take T5 (PyTorch)

@Rocketknight1 isn’t the list missing ConvNext? If so, I’m happy to take care of that one too 👌

I’ll take Distilbert (TensorFlow)

@robotjellyzone You can! Please note that we accepted a PR yesterday to add the TF decorator to BART, so make sure you’re working on the most recent version of the library before you start your PR!

I will work on XLM (PyTorch)

Hi @Rocketknight1,

I would like to work on BART of both TF and PyTorch

can you please confirm with emoji whether i am eligible to take these or not? @Rocketknight1

I’ll take OpenAIGPT!

segformer pytorch

XLMRobertaXL (PyTorch)

I’d like to claim GPT-2 (PyTorch).

I’ll take T5 (Tensorflow)!

I’d like to work on XLM (Tensorflow)

I can work on Swin (Pytorch)

I can work on ImageGPT.

I would like to work on Clip for pytorch.

I’d like to work on BigBird

Awesome! Hey @Rocketknight1 – I’d like to work on Longformer for both PyTorch & TF!

@Rocketknight1 I switch to Roberta PyTorch because CamemBERT depends on Roberta modeling

I’d like to work on GPT2 (TF).

@Rocketknight1 no worries, will try and do DistilBert instead

@johnryan465 I just did it as an example, I’m sorry! I’m marking off the completed models now.


I’d like to take Hubert & Wav2Vec2 for Pytorch.


Hi, I would like to work on CamemBERT for PT & TF.

I will take a look at LayoutLMv2 after the first one 😃

Edit: Because CamemBert depends on Roberta I will take PyTorch Roberta 👍

Hi, I would like to work on PyTorch ImageGPT

I would love to work on PyTorch Albert🚀