transformers: QuestionAnsweringPipeline cannot handle impossible answer
Environment info
- `transformers` version: latest master. I think the bug was introduced by this PR: #13873, so it's part of `transformers` since the 4.11.3 release, and I can confirm that I didn't see this bug with the 4.11.2 release.
 - Platform: linux
 - Python version: 3.8
 - PyTorch version (GPU?): 1.9 (same with 1.10)
 - Using GPU in script?: yes
 - Using distributed or parallel set-up in script?:
 
Who can help
Hi @Narsil, could you take another look at #13873 and check the changes it makes for the case when `handle_impossible_answer` is `True`? Thanks a lot!
To reproduce
Steps to reproduce the behavior:
- Find the `run_pipeline_test` test in `test_pipelines_question_answering.py`
- Set `handle_impossible_answer` to `True` in the `question_answerer` call so that the code is the following:
    def run_pipeline_test(self, question_answerer, _):
        outputs = question_answerer(
            question="Where was HuggingFace founded ?", context="HuggingFace was founded in Paris.", handle_impossible_answer=True
        )
        self.assertEqual(outputs, {"answer": ANY(str), "start": ANY(int), "end": ANY(int), "score": ANY(float)})
        outputs = question_answerer(
            question=["In what field is HuggingFace working ?", "In what field is HuggingFace working ?"],
            context="HuggingFace was founded in Paris.",
        )
- When running this modified test, it fails with a ValueError:
 
            # Normalize logits and spans to retrieve the answer
            start_ = np.exp(start_ - np.log(np.sum(np.exp(start_), axis=-1, keepdims=True)))
            end_ = np.exp(end_ - np.log(np.sum(np.exp(end_), axis=-1, keepdims=True)))
            if handle_impossible_answer:
>               min_null_score = min(min_null_score, (start_[0] * end_[0]).item())
E               ValueError: can only convert an array of size 1 to a Python scalar
../src/transformers/pipelines/question_answering.py:415: ValueError
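The failure can be reproduced in isolation. A minimal numpy sketch, assuming (based on the traceback) that `start_` and `end_` now keep a leading batch dimension of shape `(1, seq_len)` instead of being 1-D, so `start_[0]` selects a whole row rather than the single CLS-token logit; the shapes, variable names, and the `[0, 0]` workaround below are illustrative assumptions, not the official fix:

```python
import numpy as np

# Assumed shapes: (batch=1, seq_len=10) instead of the 1-D (seq_len,)
# that the pre-4.11.3 code apparently expected.
rng = np.random.default_rng(0)
start_ = rng.random((1, 10))
end_ = rng.random((1, 10))

# Same softmax normalization as in the pipeline code quoted above.
start_ = np.exp(start_ - np.log(np.sum(np.exp(start_), axis=-1, keepdims=True)))
end_ = np.exp(end_ - np.log(np.sum(np.exp(end_), axis=-1, keepdims=True)))

# start_[0] is now a full row of shape (10,), so the elementwise product
# has 10 elements and .item() raises the reported ValueError.
raised = False
try:
    (start_[0] * end_[0]).item()
except ValueError:
    raised = True
print("ValueError raised:", raised)

# One hedged workaround (an assumption, not necessarily the merged fix):
# index the CLS-token position on both axes so the product is a scalar.
min_null_score = float(start_[0, 0] * end_[0, 0])
print("null score:", min_null_score)
```

This suggests the regression is a shape mismatch introduced by the batching changes in #13873, not a problem with the softmax itself.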
Expected behavior
The test should pass.
Additional Info
I came across this problem when upgrading the transformers dependency of haystack and ran our tests with different versions of transformers to find the last working release/first failing release: https://github.com/deepset-ai/haystack/pull/1659
About this issue
- Original URL
 - State: closed
 - Created 3 years ago
 - Reactions: 1
 - Comments: 18 (8 by maintainers)
 
Perfect, just found the bug myself and saw this fix. Super cool! thanks!
Thanks for the detailed issue, very easy to reproduce, and everything is correct.
I am creating a PR to fix this. However, do you have an example where this argument is needed in an obvious way? I would love to add a meaningful test for this option (I already added a unit test for it).
Can you provide a reproducible script ?
The one that used to not work seems to be working fine.