cdQA: Predictions for certain paragraphs are inaccurate. What's wrong?

I have been testing cdQA on paragraphs that have been generated from CSV. I convert the structured data into English, then predict answers using BERT.

I’ve described the approach here: https://datascience.stackexchange.com/questions/58186/transform-data-into-english-then-predict-an-answer-using-bert
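The conversion itself is simple templating over the CSV columns. A simplified sketch of what I do (the column names here are illustrative, not my real schema):

```python
import csv
import io

# Hypothetical CSV input -- the columns are made up for illustration.
csv_text = """name,address,awards,start_year,end_year
Melbourne Convention Centre & South Wharf Precinct,"1 Convention Centre Pl, South Wharf VIC 3006, Australia",3,2014,2016
"""

NUMBER_WORDS = {1: "one", 2: "two", 3: "three"}

def row_to_paragraph(row):
    """Turn one structured CSV row into a short English paragraph."""
    awards = NUMBER_WORDS.get(int(row["awards"]), row["awards"])
    return (
        f"According to our website, the {row['name']} project is located at "
        f"{row['address']}. The {row['name']} project has won {awards} awards. "
        f"The project started in {row['start_year']} and was completed in {row['end_year']}."
    )

paragraphs = [row_to_paragraph(r) for r in csv.DictReader(io.StringIO(csv_text))]
print(paragraphs[0])
```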

I combine two or three sentences into a paragraph, gather multiple paragraphs into one dataframe for the cdQA pipeline, and then query the dataset, but the results are often incorrect. An example paragraph:

According to our website, the Melbourne Convention Centre & South Wharf Precinct
project is located at 1 Convention Centre Pl, South Wharf VIC 3006, Australia. The
Melbourne Convention Centre & South Wharf Precinct project has won three awards. The project started in 2014 and was completed in 2016.

And query

how many awards has the Melbourne Convention Centre project won?

Could this form of English writing be too dissimilar to the corpora and datasets on which BERT was pre-trained and fine-tuned? Can you suggest how I could improve results? Thanks.

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 25

Most upvoted comments

Finally I found the problem: readme.md says cdqa_pipeline = QAPipeline(model='bert_qa_vCPU-sklearn.joblib')

should be cdqa_pipeline = QAPipeline(reader='bert_qa_vCPU-sklearn.joblib')

Hi @andrelmfarias ,

I’ve been trying out the newly introduced retriever based on BM25 and it seems to be working great! Thank you for your efforts. Will provide more updates…
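For anyone curious why BM25 helps here, the scoring itself is simple enough to sketch in pure Python. This is a toy Okapi BM25, not cdQA's actual implementation:

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Same token pattern the TF-IDF retriever uses by default.
    return re.findall(r"(?u)\b\w\w+\b", text.lower())

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    toks = [tokenize(d) for d in docs]
    avgdl = sum(len(t) for t in toks) / len(toks)
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for t in toks for term in set(t))
    scores = []
    for t in toks:
        tf = Counter(t)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(t) / avgdl)
            )
        scores.append(score)
    return scores

docs = [
    "The Melbourne Convention Centre project has won three awards.",
    "The Sydney Opera House opened in 1973.",
]
scores = bm25_scores(
    "how many awards has the Melbourne Convention Centre project won?", docs
)
best = scores.index(max(scores))
```

Term saturation (k1) and length normalization (b) are what make BM25 more robust than plain TF-IDF cosine similarity for short, repetitive paragraphs like these.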

Yes, I did try changing some of the parameters (below) but did not notice much improvement: lowercase=True, preprocessor=None, tokenizer=None, stop_words='english', token_pattern=r"(?u)\b\w\w+\b", ngram_range=(1, 2), max_df=0.85, min_df=2, vocabulary=None, paragraphs=None, top_n=3, verbose=False
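Most of those parameters just control tokenization, i.e. which terms the TF-IDF retriever indexes. A quick way to see what a given token_pattern and ngram_range extract, sketched in plain Python to roughly mirror scikit-learn's analyzer (the stop-word list here is abbreviated for the example):

```python
import re

def analyze(text, token_pattern=r"(?u)\b\w\w+\b", ngram_range=(1, 2),
            lowercase=True, stop_words=("the", "has")):
    """Approximate the terms a TF-IDF vectorizer would index for one document."""
    if lowercase:
        text = text.lower()
    tokens = [t for t in re.findall(token_pattern, text) if t not in stop_words]
    lo, hi = ngram_range
    grams = []
    for n in range(lo, hi + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return grams

terms = analyze("The project has won three awards")
# Unigrams and bigrams built from the non-stop-word tokens.
```

If the terms a user queries with ("awards") never survive tokenization, or min_df/max_df filter them out of the vocabulary, the retriever cannot rank the right paragraph no matter how good the reader is.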

@edanweis Indeed the steps for this are not obvious, I am really sorry about that. To test one query on one paragraph, please follow the steps below:

import pandas as pd
from sklearn.externals import joblib
from cdqa.utils.converters import df2squad
from cdqa.utils.download import download_model
from cdqa.reader.bertqa_sklearn import BertProcessor, BertQA

download_model(model='bert-squad_1.1', dir='./models')

paragraph = "According to our website, the Melbourne Convention Centre & South Wharf Precinct project is located at 1 Convention Centre Pl, South Wharf VIC 3006, Australia. The Melbourne Convention Centre & South Wharf Precinct project has won three awards. The project started in 2014 and was completed in 2016."
query = "how many awards has the Melbourne Convention Centre project won?"

# Create dataframe and convert it to squad-like json
df = pd.DataFrame({'title': 'Melbourne Convention', 'paragraphs': [[paragraph]]})
json_data = df2squad(df=df, squad_version='v1.1')

# Add question to json
json_data['data'][0]['paragraphs'][0]['qas'].append({"id":0, "question":query})

# Preprocess json
processor = BertProcessor(do_lower_case=True, is_training=False)
examples, features = processor.fit_transform(X=json_data['data'])

# Load model and predict
qa_model = joblib.load("./models/bert_qa_vCPU-sklearn.joblib")

qa_model.predict(X=(examples, features))

I ran it and the answer I got was correct: “three”

If you do not get the same answer with the cdqa pipeline, you have to improve (fine-tune) your retriever, as I explained above.
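For reference, the SQuAD-v1.1-style structure that the snippet above indexes into (json_data['data'][0]['paragraphs'][0]['qas']) looks roughly like this; the exact keys emitted by df2squad may differ slightly:

```python
# Minimal SQuAD-v1.1-like structure matching the indexing used above.
json_data = {
    "version": "v1.1",
    "data": [
        {
            "title": "Melbourne Convention",
            "paragraphs": [
                {
                    "context": "The Melbourne Convention Centre ... has won three awards.",
                    "qas": [],  # questions are appended here before prediction
                }
            ],
        }
    ],
}

# This mirrors the append step from the walkthrough.
json_data["data"][0]["paragraphs"][0]["qas"].append(
    {"id": 0, "question": "how many awards has the Melbourne Convention Centre project won?"}
)
```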