feat: Add BERT_SCORE to QAAccuracy and update unit/integration tests #314
# for all metrics in qa_accuracy (metrics from both the QAAccuracyScores Transform and the BertScore Transform)
SCORE_NAMES = QA_ACCURACY_SCORE_NAMES + [BERT_SCORE]
This is an incompatible change we should keep tabs on. It's less "dangerous" since we're augmenting the list instead of deleting elements from it, but at the end of the day, we're changing the value of a constant.
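To illustrate the concern, here is a minimal sketch of how a consumer that iterates over SCORE_NAMES could break once the constant grows. The metric name strings and the `report` helper are hypothetical placeholders for illustration, not the library's actual values or API.

```python
from typing import Dict, List

# Placeholder metric names (assumption); the real string values may differ.
QA_ACCURACY_SCORE_NAMES: List[str] = ["f1_score", "exact_match_score"]
BERT_SCORE = "bertscore"

# Concatenation builds a new list, so QA_ACCURACY_SCORE_NAMES itself is unchanged.
SCORE_NAMES = QA_ACCURACY_SCORE_NAMES + [BERT_SCORE]


def report(scores: Dict[str, float], names: List[str]) -> Dict[str, float]:
    """Hypothetical consumer: select the listed scores, failing on a missing one."""
    return {name: scores[name] for name in names}


# Scores produced by code that predates BERT_SCORE.
old_scores = {"f1_score": 0.8, "exact_match_score": 1.0}

# Works with the narrower constant.
selected = report(old_scores, QA_ACCURACY_SCORE_NAMES)

# Fails with the augmented constant because "bertscore" was never computed.
missing = None
try:
    report(old_scores, SCORE_NAMES)
except KeyError as err:
    missing = err.args[0]
```

Callers that only need the original metrics can depend on QA_ACCURACY_SCORE_NAMES instead, which is not affected by the change.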
(non-blocking): should we move the BertScore class out of summarization accuracy metrics since it’s used for multiple eval algos now?
If we do, it will be a breaking change that we should keep track of prior to the next release.
- `qa_accuracy.py`: adds BERT_SCORE by creating `SplitWithDelimiter`, a transform that uses a `target_output_delimiter` to split a target output string into a list of possible targets. This allows us to compute multiple BertScores and take the max over all possible targets.
- `triviaQA_sample_small` with 4 records to be used.
- `qa_accuracy_semantic_robustness.py`: updated to import and use QA_ACCURACY_SCORE_NAMES (QAAccuracy metrics w/o BERT_SCORE) instead of SCORE_NAMES.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
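The split-and-max idea described for `SplitWithDelimiter` can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the function names are hypothetical, the `<OR>` delimiter is an assumed example value, and `token_overlap` is a cheap stand-in for BertScore so the snippet runs without a model.

```python
from typing import Callable, List


def split_with_delimiter(target_output: str, delimiter: str) -> List[str]:
    """Split a target output string into a list of possible targets."""
    return target_output.split(delimiter)


def token_overlap(target: str, prediction: str) -> float:
    """Stand-in similarity in [0, 1] (Jaccard over lowercased tokens); NOT BertScore."""
    target_tokens = set(target.lower().split())
    prediction_tokens = set(prediction.lower().split())
    if not target_tokens or not prediction_tokens:
        return 0.0
    shared = target_tokens & prediction_tokens
    combined = target_tokens | prediction_tokens
    return len(shared) / len(combined)


def max_score_over_targets(
    target_output: str,
    model_output: str,
    delimiter: str,
    score_fn: Callable[[str, str], float] = token_overlap,
) -> float:
    """Score the model output against every possible target and keep the best."""
    possible_targets = split_with_delimiter(target_output, delimiter)
    return max(score_fn(target, model_output) for target in possible_targets)


# "<OR>" is an assumed delimiter value used only for this example.
best = max_score_over_targets("London<OR>Greater London", "london", "<OR>")
```

Taking the max means a model output is credited for matching any one of the acceptable answers, rather than being penalized for not matching all of them.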