Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add BERT_SCORE to QAAccuracy and update unit/integration tests #314

Merged
merged 40 commits into from
Aug 12, 2024

Conversation

kirupang-code
Copy link
Contributor

@kirupang-code kirupang-code commented Jul 19, 2024

  • Added BERT_SCORE to qa_accuracy.py by creating SplitWithDelimiter, a transform that uses a target_output_delimiter to split a target output string into a list of possible targets. This allows us to compute multiple BertScores and take the max over all possible targets.
  • Additionally, the integration tests for qa_accuracy were taking over 10 minutes to run 100 records because of the runtime of the bertscore model, so I created a smaller dataset, triviaQA_sample_small with 4 records to be used.
  • Updated qa_accuracy_semantic_robustness.py to import and use QA_ACCURACY_SCORE_NAMES (QAAccuracy metrics w/o BERT_SCORE) instead of SCORE_NAMES.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@kirupang-code kirupang-code changed the title feat: Added BERTScore to QAAccuracy and QAAccuracySemanticRobustness feat: Added BERT_SCORE to QAAccuracy Jul 24, 2024
@kirupang-code kirupang-code changed the title feat: Added BERT_SCORE to QAAccuracy feat: Added BERT_SCORE to QAAccuracy and updated unit/integration tests Jul 24, 2024
@danielezhu danielezhu changed the title feat: Added BERT_SCORE to QAAccuracy and updated unit/integration tests feat: Add BERT_SCORE to QAAccuracy and update unit/integration tests Jul 25, 2024
]

# for all metrics in qa_accuracy (metrics from both the QAAccuracyScores Transform and the BertScore Transform)
SCORE_NAMES = QA_ACCURACY_SCORE_NAMES + [BERT_SCORE]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an incompatible change we should keep tabs on. It's less "dangerous" since we're augmenting the list instead of deleting elements from it, but at the end of the day, we're changing the value of a constant.

@kirupang-code kirupang-code requested a review from danielezhu July 25, 2024 22:31
Copy link
Contributor

@oyangz oyangz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(non-blocking): should we move the BertScore class out of summarization accuracy metrics since it’s used for multiple eval algos now?

@danielezhu
Copy link
Contributor

(non-blocking): should we move the BertScore class out of summarization accuracy metrics since it’s used for multiple eval algos now?

If we do, it will be a breaking change that we should keep track of prior to the next release.

@xiaoyi-cheng xiaoyi-cheng merged commit 4164457 into aws:main Aug 12, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants