[feature] Add GPTJ synthetic dataset, fix reference removal regex for… #513

Merged
sanagno merged 5 commits into main from sft-hub-dataset on Jan 8, 2023


theblackcat102
Collaborator

A few changes:

  1. Added GPT-J generated answers to the list of datasets for reward model training

  2. Fixed the regex for removing reference blocks and included the other changes proposed by @agoryuno. We needed to continue experiments on the latest reward trainer, so we couldn't wait for your merge.

Collaborator

sanagno left a comment


Looks great to me.

sanagno merged commit 00cddc4 into main on Jan 8, 2023
sanagno deleted the sft-hub-dataset branch on January 8, 2023 at 08:47
2 participants