transformers: model.generate with prefix_allowed_tokens_fn throws RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
Environment info
- transformers version: 4.15.0
- Platform: Linux-5.4.0-90-generic-x86_64-with-debian-bullseye-sid
- Python version: 3.7.12
- PyTorch version (GPU?): 1.10.0+cu102 (True)
- Tensorflow version (GPU?): 2.7.0 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
Who can help
Information
Model I am using: T5ForConditionalGeneration
The problem arises when using my own modified scripts: the script to reproduce the error is given below.
The task I am working on is my own task or dataset: the task requires conditional generation from T5 such that the output vocabulary is restricted to a small set.
To reproduce
- Run the following script to reproduce the behaviour.
from transformers import T5Tokenizer, T5ForConditionalGeneration

lm_model = 't5-small'
model = T5ForConditionalGeneration.from_pretrained(lm_model)
tokenizer = T5Tokenizer.from_pretrained(lm_model)

# Restrict the decoder vocabulary: at step 3 only the tokens of ' ',
# otherwise the tokens of the sentinel-delimited phrase below.
def restrict_decode_vocab(batch_idx, prefix_beam):
    if len(prefix_beam) == 3:
        restricted_vocab = tokenizer(' ', return_tensors="pt")['input_ids'].tolist()
    else:
        restricted_vocab = tokenizer('<extra_id_0> cute dog <extra_id_1> the <pad>', return_tensors="pt")['input_ids'].tolist()
    return restricted_vocab

source = ['The <extra_id_0> walks in <extra_id_1> park .']
source_encoding = tokenizer(source[:], padding='longest', return_tensors="pt")
input_ids, attention_mask = source_encoding['input_ids'], source_encoding['attention_mask']
decoded_beams = model.generate(input_ids=input_ids, attention_mask=attention_mask, do_sample=True, num_beams=2, prefix_allowed_tokens_fn=restrict_decode_vocab, min_length=4, max_length=4, remove_invalid_values=True)
print(decoded_beams)
- The above script produces the following stack trace.
/home/jsingh319/uploaded_venvs/venv-koala-torch-1.10-python-3.7.12/lib/python3.7/site-packages/transformers/generation_utils.py:2259: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
next_indices = next_tokens // vocab_size
Traceback (most recent call last):
  File "reproduce_error.py", line 17, in <module>
    decoded_beams = model.generate(input_ids=input_ids, attention_mask=attention_mask, do_sample=True, num_beams=2, prefix_allowed_tokens_fn=restrict_decode_vocab, min_length=4, max_length=4, remove_invalid_values=True)
  File "/home/jsingh319/uploaded_venvs/venv-koala-torch-1.10-python-3.7.12/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/jsingh319/uploaded_venvs/venv-koala-torch-1.10-python-3.7.12/lib/python3.7/site-packages/transformers/generation_utils.py", line 1220, in generate
    **model_kwargs,
  File "/home/jsingh319/uploaded_venvs/venv-koala-torch-1.10-python-3.7.12/lib/python3.7/site-packages/transformers/generation_utils.py", line 2253, in beam_sample
    next_tokens = torch.multinomial(probs, num_samples=2 * num_beams)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
Expected behavior
No error.
Possible solution
The __call__ method of the InfNanRemoveLogitsProcessor class should include the following statement before returning scores:
scores[scores == float("-inf")] = torch.finfo(scores.dtype).min
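For illustration, here is a minimal sketch of what the patched processor could look like. It assumes the transformers 4.15 implementation of InfNanRemoveLogitsProcessor (which already zeroes out nan and clamps +inf); the class name PatchedInfNanRemoveLogitsProcessor is made up for this sketch.

import torch
from transformers import LogitsProcessor

class PatchedInfNanRemoveLogitsProcessor(LogitsProcessor):
    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        # existing behaviour: replace nan with 0.0 and +inf with the
        # largest finite value representable in the scores dtype
        scores[scores != scores] = 0.0
        scores[scores == float("inf")] = torch.finfo(scores.dtype).max
        # proposed addition: also clamp -inf to the smallest finite value,
        # so softmax can never yield an all-zero probability row
        scores[scores == float("-inf")] = torch.finfo(scores.dtype).min
        return scores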
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Reactions: 5
- Comments: 16 (6 by maintainers)
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
I don't think that this is related in any way to the InfNanRemoveLogitsProcessor processor. IMO, the reason for the error here is that in the 3rd generation step, all values of next_token_scores are set to -inf (I think) due to the prefix_allowed_tokens_fn that you've added. This is not a bug IMO with transformers, but with the prefix_allowed_tokens_fn function, as it should not set all values to -inf.
A tip from my side @iamjanvijay would be to do the following. Create the PrefixConstrainedLogitsProcessor object with your function and just play around with it locally (what happens at generation step 3). I think you'll see then that it sets all values to -inf at some point, which it shouldn't do.
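Following that tip, here is a minimal sketch of such a local experiment, reusing restrict_decode_vocab and model from the reproduction script above (the prefix tensor and dummy scores are placeholders, not values taken from a real generation run):

import torch
from transformers import PrefixConstrainedLogitsProcessor

# Build the processor the same way generate() would.
processor = PrefixConstrainedLogitsProcessor(restrict_decode_vocab, num_beams=2)

# Simulate generation step 3: a (num_beams, seq_len=3) prefix and uniform
# dummy logits over the full vocabulary.
fake_prefix = torch.zeros((2, 3), dtype=torch.long)
dummy_scores = torch.zeros((2, model.config.vocab_size))

masked = processor(fake_prefix, dummy_scores)
# A row printing True means every token was masked to -inf at this step,
# which is exactly what makes torch.multinomial fail later.
print(torch.isneginf(masked).all(dim=-1))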
Was running into similar issues when using prefix_allowed_tokens_fn in tandem with beam-search multinomial sampling, and realized the top_k and top_p args were sometimes preventing all the allowed tokens from being used, as they were outside those two tops. no_repeat_ngram_size can have a similar effect. Consider removing top_k and top_p if only allowing certain tokens is more important.
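As a concrete illustration of the previous comment, here is a sketch of the generate() call from the reproduction script with both samplers disabled; top_k=0 and top_p=1.0 are the values that switch off the corresponding filters. This is a workaround sketch, not a fix for a prefix_allowed_tokens_fn that masks everything.

decoded_beams = model.generate(
    input_ids=input_ids,
    attention_mask=attention_mask,
    do_sample=True,
    num_beams=2,
    prefix_allowed_tokens_fn=restrict_decode_vocab,
    top_k=0,    # disable top-k filtering
    top_p=1.0,  # disable nucleus (top-p) filtering
    min_length=4,
    max_length=4,
    remove_invalid_values=True,
)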
I'll close my PR in the meantime. We can reopen it if needed, but I tend to agree with @patrickvonplaten that having everything float(-inf) can be considered a bug already.