transformers: [Bug] whisper pipeline inference bug on transformers master branch

System Info

OS: ubuntu 20.04

transformers version: master branch. pip install git+

Who can help?

No response


  • The official example scripts
  • My own modified scripts


  • An officially supported task in the examples folder (such as GLUE/SQuAD, …)
  • My own task or dataset (give details below)


Run the following code:

import transformers
from packaging.version import Version
import pathlib

def whisper_pipeline():
    task = "automatic-speech-recognition"
    architecture = "openai/whisper-tiny"
    model = transformers.WhisperForConditionalGeneration.from_pretrained(architecture)
    tokenizer = transformers.WhisperTokenizer.from_pretrained(architecture)
    feature_extractor = transformers.WhisperFeatureExtractor.from_pretrained(architecture)
    if Version(transformers.__version__) > Version("4.30.2"):
        model.generation_config.alignment_heads = [[2, 2], [3, 0], [3, 2], [3, 3], [3, 4], [3, 5]]
    return transformers.pipeline(
        task=task, model=model, tokenizer=tokenizer, feature_extractor=feature_extractor
    )

def raw_audio_file():
    # The dataset file comes from
    datasets_path = "/path/to/apollo11_launch.wav"
    return pathlib.Path(datasets_path).read_bytes()

inference_config = {
    "return_timestamps": "word",
    "chunk_length_s": 60,
    "batch_size": 16,
}

whisper = whisper_pipeline()
raw_audio_file_data = raw_audio_file()
prediction = whisper(raw_audio_file_data, return_timestamps="word", chunk_length_s=60, batch_size=16)

The last line raises an error like:

>>> prediction = whisper(raw_audio_file_data, return_timestamps="word", chunk_length_s=60, batch_size=16)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/weichen.xu/miniconda3/envs/py310/lib/python3.10/site-packages/transformers/pipelines/", line 356, in __call__
    return super().__call__(inputs, **kwargs)
  File "/home/weichen.xu/miniconda3/envs/py310/lib/python3.10/site-packages/transformers/pipelines/", line 1132, in __call__
    return next(
  File "/home/weichen.xu/miniconda3/envs/py310/lib/python3.10/site-packages/transformers/pipelines/", line 124, in __next__
    item = next(self.iterator)
  File "/home/weichen.xu/miniconda3/envs/py310/lib/python3.10/site-packages/transformers/pipelines/", line 266, in __next__
    processed = self.infer(next(self.iterator), **self.params)
  File "/home/weichen.xu/miniconda3/envs/py310/lib/python3.10/site-packages/transformers/pipelines/", line 1046, in forward
    model_outputs = self._forward(model_inputs, **forward_params)
  File "/home/weichen.xu/miniconda3/envs/py310/lib/python3.10/site-packages/transformers/pipelines/", line 551, in _forward
    generate_kwargs["num_frames"] = stride[0] // self.feature_extractor.hop_length
TypeError: unsupported operand type(s) for //: 'tuple' and 'int'

Note that this error only happens on the transformers GitHub master branch. With the released version, the code above works well.

Expected behavior

My example code should not raise an error.

About this issue

  • State: closed
  • Created 9 months ago
  • Reactions: 3
  • Comments: 17 (4 by maintainers)

Most upvoted comments

I also have the same issue. Any update on this, @sanchit-gandhi?


I’ve had the same issue. @WeichenXu123's replies were very helpful, thanks man!

First try setting batch size to 1 if that’s not a problem.

Second, you can open the file that the error message points to in the third line from the bottom of the traceback. For me it says:

  File "/home/nofreewill/.local/lib/python3.10/site-packages/transformers/pipelines/", line 552, in _forward

So I opened it, went to line 552, and changed it according to @WeichenXu123's suggestion, from

    generate_kwargs["num_frames"] = stride[0] // self.feature_extractor.hop_length

to

    generate_kwargs["num_frames"] = stride[0][0] // self.feature_extractor.hop_length

And it works now with batch size > 1 as well
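To illustrate why the extra index helps, here is a minimal standalone sketch. It assumes (inferred from the traceback, not checked against the library source) that `stride` is a single tuple whose first element is the chunk length when batch size is 1, and a list of such tuples when batch size is larger. The helper `num_frames_from_stride`, the sample stride values, and the hop length of 160 are all hypothetical and are not part of transformers.

```python
def num_frames_from_stride(stride, hop_length):
    """Hypothetical helper (not part of transformers).

    Accepts either one stride tuple or a list of stride tuples and
    returns the frame count derived from the first chunk length.
    """
    first = stride[0]
    if isinstance(first, (tuple, list)):
        # Batched case: stride[0] is itself a tuple, so index once more.
        first = first[0]
    return first // hop_length


HOP_LENGTH = 160  # assumed value, for illustration only

single = (960000, 0, 160000)                           # assumed unbatched shape
batched = [(960000, 0, 160000), (480000, 160000, 0)]   # assumed batched shape

# Reproduce the failure from the traceback without loading any model.
try:
    batched[0] // HOP_LENGTH
except TypeError as exc:
    error_message = str(exc)

frames_single = num_frames_from_stride(single, HOP_LENGTH)
frames_batched = num_frames_from_stride(batched, HOP_LENGTH)
```

Like the one-line patch above, this reads only the first item's chunk length for the whole batch, so it is a stopgap rather than a proper fix.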

Thanks for the ping. My hunch is that this is due to batch_size being larger than 1. Just to confirm, does the same thing happen if you remove that argument?

Yes, it only happens when the batch size is greater than 1.
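Since the failure only shows up with a batch size above 1, one way to avoid editing the installed package is to retry the call with `batch_size=1` when this specific error appears. The wrapper below is a rough sketch of that idea: `transcribe_with_fallback` is a hypothetical name, and `pipe` stands for any callable that accepts the same keyword arguments as the pipeline call in the reproduction script.

```python
def transcribe_with_fallback(pipe, audio, **kwargs):
    """Hypothetical wrapper: retry with batch_size=1 on the stride TypeError."""
    try:
        return pipe(audio, **kwargs)
    except TypeError as exc:
        is_stride_error = "unsupported operand type(s) for //" in str(exc)
        was_batched = kwargs.get("batch_size", 1) > 1
        if not (is_stride_error and was_batched):
            # Unrelated error, or nothing left to fall back to: re-raise.
            raise
        # Retry with the same arguments, forcing a batch size of 1.
        retry_kwargs = {**kwargs, "batch_size": 1}
        return pipe(audio, **retry_kwargs)
```

With the script above, this would be called as `transcribe_with_fallback(whisper, raw_audio_file_data, **inference_config)`. The retry is slower than batched inference, so it only makes sense as a temporary measure.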