-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
A more general support for QA and summarization (review and feedback needed) #576
Conversation
you have some merge conflict markers in your files |
Looks good, after the merge! Do you know which models use the SAM optimizer? |
docs/docs/research/general.md
Outdated
@@ -2,6 +2,43 @@ | |||
|
|||
This page lists research papers that are relevant to the project. | |||
|
|||
<<<<<<< HEAD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this means you might need to update from upstream/master
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ekurtulus it might be cleaner if you move your work onto a feature branch in your fork. That way you can keep your main
in sync with LAION-AI/Open-Assistant:main
and might make it easier for then updating your feature branch from main. I find this approach a bit easier to work with given how often main is changing with so much work going on.
Looks great, If you can merge the final conflicts and add a default accuracy metric as before would be great! |
@ekurtulus did you close this on purpose? |
Yes, I realized that I was not using pre-commit files and not branching properly. While trying to set everything up properly, I had to delete and re-fork the main repository. So, it was closed automatically. I will submit a new PR. |
Please see PR 619. |
With this update, I am aiming for generalizing our training framework. I need feedback and review on the structure that I propose. Accordingly, I will make the codebase fully functional and extend it. Additionally, I added more papers to the docs side.
see this issue
I think we need support not only for QA but also summarization. In this PR, I wanted to extend the existing datasets, add support for seq2seq models like T5, PolyLoss and few other modifications. Normally, I was also planning to add support for Sharpness Aware Training, but it requires overriding the Huggingface Trainer's _inner_training_loop function. I can do that but I am sure if it is something we want. If we decide to do so, I can submit another PR.
Furthermore, if the structure of this commit is approved, I can add more tasks, a script for synthetic data generation through data augmentation, and more.