Bth5032/78 blackcat trainer #313

Merged (6 commits, Jan 5, 2023)

Conversation

bth5032 (Contributor) commented Jan 3, 2023

#78

As per discussion with @theblackcat102, I built the RankGen trainer on top of their framework (wandb). The model seems to be training now in fp32; apparently T5 has issues with fp16. Blackcat suggested that scaling the weights might help, as in [1]. That could be a good next step if we want to move forward with this model. Until then, I think this code is worth committing, since it shows how to add new models that are not `AutoModelForSequenceClassification`.
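For context, a minimal sketch of what a reward-model wrapper outside `AutoModelForSequenceClassification` could look like. This is illustrative only, not the PR's code; the class name, the mean-pooling, and the `google/t5-v1_1-base` default are assumptions:

```python
from torch import nn
from transformers import T5EncoderModel


class RankGenRewardModel(nn.Module):
    """Illustrative sketch: scores a (prefix, suffix) pair via the dot
    product of their pooled encoder representations."""

    def __init__(self, model_name: str = "google/t5-v1_1-base"):
        super().__init__()
        # Encoder-only T5, kept in fp32 since T5 is known to be unstable in fp16.
        self.encoder = T5EncoderModel.from_pretrained(model_name)

    def _encode(self, input_ids, attention_mask):
        hidden = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        # Mean-pool over non-padding tokens.
        mask = attention_mask.unsqueeze(-1).to(hidden.dtype)
        return (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

    def forward(self, prefix_ids, prefix_mask, suffix_ids, suffix_mask):
        prefix_emb = self._encode(prefix_ids, prefix_mask)
        suffix_emb = self._encode(suffix_ids, suffix_mask)
        # Reward = similarity between the prompt and the candidate continuation.
        return (prefix_emb * suffix_emb).sum(dim=-1)
```

A projection or scoring head could replace the raw dot product; the point is only that the trainer needs a forward pass mapping tokenized pairs to scalar rewards rather than relying on the sequence-classification auto class.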

@yk yk linked an issue Jan 3, 2023 that may be closed by this pull request
@yk yk requested review from theblackcat102 and sanagno and removed request for yk January 3, 2023 09:21
@yk yk added the ml label Jan 3, 2023
theblackcat102 (Collaborator) commented Jan 3, 2023

@bth5032 your wandb loss looks good, but I wonder why the validation accuracy is missing?

Not trying to nitpick, but just curious: why not make the default value of tokenizer_name the same as model_name in the argument_parser? Then we could remove this if/else in the trainer here.
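A sketch of that suggestion, assuming a plain argparse setup (the repo's actual argument_parser utility and flag names may differ):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--model_name", type=str, default="google/t5-v1_1-base")
parser.add_argument("--tokenizer_name", type=str, default=None)
args = parser.parse_args()

# Fall back to the model name when no tokenizer is given,
# so the trainer never needs its own if/else.
if args.tokenizer_name is None:
    args.tokenizer_name = args.model_name
```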

Overall the trainer looks fine

theblackcat102 (Collaborator) left a comment

looks good

sanagno (Collaborator) left a comment

Thanks a lot!

yk (Collaborator) commented Jan 3, 2023

@bth5032 thank you! Could you run `pre-commit run --all-files` to make the linters happy?

bth5032 (Contributor, Author) commented Jan 4, 2023

> @bth5032 thank you! Could you run `pre-commit run --all-files` to make the linters happy?

Thanks!

> Not trying to nitpick, but just curious: why not make the default value of tokenizer_name the same as model_name in the argument_parser? Then we could remove this if/else in the trainer here.

Ah, I normally use hydra/omegaconf for this and didn't realize you had that utility function; I've cleaned that logic up.

> @bth5032 your wandb loss looks good, but I wonder why the validation accuracy is missing?

I've been trying to figure that out myself. The `compute_metrics` function never runs for me... I found this thread and tried setting `evaluation_strategy=IntervalStrategy.STEPS`, among a few other things. I verified that the condition they listed here does evaluate to true in `prediction_step`, but I still never see `compute_metrics` being called...
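For reference, a hedged sketch of the usual prerequisites for `compute_metrics` to fire in the HF Trainer. The `output_dir`, `eval_steps` value, and metric logic here are illustrative assumptions, not the PR's configuration:

```python
# compute_metrics only runs during evaluation: evaluation must be enabled in
# TrainingArguments, and both eval_dataset and compute_metrics must be passed
# to the Trainer. If eval_dataset is None, the hook is silently skipped.
import numpy as np
from transformers import TrainingArguments, IntervalStrategy

training_args = TrainingArguments(
    output_dir="output",                         # hypothetical output path
    evaluation_strategy=IntervalStrategy.STEPS,  # run eval every eval_steps
    eval_steps=100,
)

def compute_metrics(eval_pred):
    # eval_pred carries the stacked outputs collected by prediction_step.
    predictions, labels = eval_pred.predictions, eval_pred.label_ids
    accuracy = float((np.argmax(predictions, axis=-1) == labels).mean())
    return {"accuracy": accuracy}

# Trainer(model=..., args=training_args, eval_dataset=..., compute_metrics=compute_metrics)
```

In particular, if no eval_dataset is passed, the evaluation loop never runs, so `compute_metrics` is skipped even when `evaluation_strategy` is set.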

I'll keep looking into it and submit a PR if I figure it out.

@theblackcat102 @sanagno should be ready for merge.

@bth5032 bth5032 requested review from theblackcat102 and removed request for andreaskoepf January 4, 2023 21:01
theblackcat102 (Collaborator) left a comment

looks good

theblackcat102 (Collaborator) left a comment

`models.py` is fine

@theblackcat102 theblackcat102 merged commit 325c978 into LAION-AI:main Jan 5, 2023
Successfully merging this pull request may close these issues.

Train a reward model based on RankGen