transformers: QuestionAnsweringPipeline cannot handle impossible answer

Environment info

  • transformers version: latest master. I think the bug was introduced by PR #13873, so it has been part of transformers since the 4.11.3 release; I can confirm that I did not see this bug with the 4.11.2 release.
  • Platform: linux
  • Python version: 3.8
  • PyTorch version (GPU?): 1.9 (same with 1.10)
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?:

Who can help

Hi @Narsil, could you please look again at #13873 and check the changes it makes for the case when handle_impossible_answer is True? Thanks a lot!

To reproduce

Steps to reproduce the behavior:

  1. Find the run_pipeline_test test in test_pipelines_question_answering.py
  2. Set handle_impossible_answer to True in the question_answerer call so that the code is the following:
    def run_pipeline_test(self, question_answerer, _):
        outputs = question_answerer(
            question="Where was HuggingFace founded ?", context="HuggingFace was founded in Paris.", handle_impossible_answer=True
        )
        self.assertEqual(outputs, {"answer": ANY(str), "start": ANY(int), "end": ANY(int), "score": ANY(float)})

        outputs = question_answerer(
            question=["In what field is HuggingFace working ?", "In what field is HuggingFace working ?"],
            context="HuggingFace was founded in Paris.",
        )
  3. When running this modified test, it fails with a ValueError (see the sketch after the traceback for what seems to be going on):
            # Normalize logits and spans to retrieve the answer
            start_ = np.exp(start_ - np.log(np.sum(np.exp(start_), axis=-1, keepdims=True)))
            end_ = np.exp(end_ - np.log(np.sum(np.exp(end_), axis=-1, keepdims=True)))

            if handle_impossible_answer:
>               min_null_score = min(min_null_score, (start_[0] * end_[0]).item())
E               ValueError: can only convert an array of size 1 to a Python scalar

../src/transformers/pipelines/question_answering.py:415: ValueError
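For context, the ValueError appears to come from the shape of start_ and end_ after the #13873 refactor: start_[0] now selects a whole row of probabilities rather than a scalar, so .item() cannot convert the product. Below is a minimal sketch of the failure and of one way to get back to a true scalar; the (1, seq_len) shapes and the initialization value are assumptions for illustration, not taken from the pipeline code:

    import numpy as np

    # Illustration only: stand-ins for the normalized start/end probabilities,
    # assuming they keep a leading batch dimension of 1 after the refactor.
    start_ = np.random.rand(1, 10)
    end_ = np.random.rand(1, 10)

    min_null_score = 1_000_000  # "large and positive" starting value, as in the pipeline

    try:
        # start_[0] is a full row of size 10, so the product has size 10 and
        # .item() raises exactly the error shown in the traceback above.
        min_null_score = min(min_null_score, (start_[0] * end_[0]).item())
    except ValueError as err:
        print(err)  # can only convert an array of size 1 to a Python scalar

    # Indexing both axes picks the CLS-position product as a genuine scalar.
    min_null_score = min(min_null_score, (start_[0, 0] * end_[0, 0]).item())
    print(min_null_score)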

Expected behavior

The test should run through successfully.

Additional Info

I came across this problem while upgrading the transformers dependency of Haystack. I ran our tests against different versions of transformers to find the last working and the first failing release: https://github.com/deepset-ai/haystack/pull/1659

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 18 (8 by maintainers)

Most upvoted comments

Perfect, I just found the bug myself and saw this fix. Super cool, thanks!

Thanks for the detailed issue, it was very easy to reproduce, and everything is correct.

I am creating a PR to fix this. However, do you have an example where this argument is needed in an obvious way? I would love to add a meaningful test for this option (I have already added a unit test for it).
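A minimal sketch of such a case (the model name and the unanswerable question below are my own choices, not taken from this thread): with a model fine-tuned on SQuAD 2.0, which contains unanswerable questions, the flag lets the pipeline return the null answer instead of forcing a span out of the context.

from transformers import pipeline

# Assumption: a SQuAD 2.0 style model, i.e. one trained to use the CLS token
# for unanswerable questions (e.g. deepset/roberta-base-squad2).
qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

context = "HuggingFace was founded in Paris."
question = "Who is the CEO of HuggingFace?"  # not answerable from this context

# Without the flag, the pipeline has to pick some span from the context.
print(qa(question=question, context=context))

# With the flag, the null ("impossible") answer can win the ranking instead.
print(qa(question=question, context=context, handle_impossible_answer=True))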

Can you provide a reproducible script?

The one that used to not work:

from transformers import pipeline

# Default question-answering model, with handling of the null ("impossible") answer enabled.
pipe = pipeline("question-answering", handle_impossible_answer=True)
out = pipe(question="This", context="that")
print(" - " * 20)
print(out)
print(" - " * 20)

seems to be working fine.
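For reference (my reading of the pipeline source, not something confirmed in this thread): when handle_impossible_answer=True and the null answer ends up with the best score, the pipeline returns an empty answer roughly of the form below; otherwise the best span is returned as usual.

{"score": 0.98, "start": 0, "end": 0, "answer": ""}  # score value is illustrative only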