wandb tracker runs into scheduling problems during the training initiation and training stages #100

Merged

merged 1 commit into main from wandb-tracker-fix on Nov 29, 2024

Conversation

glide-the (Collaborator)

wandb tracker runs into scheduling problems during the training initiation and training stages

Login is not performed as expected during startup.
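
A minimal sketch of the intended startup behavior, assuming the accelerate-based setup used in this repo (the project name and call sites are illustrative, not the actual fix in this PR):

```python
# Minimal sketch (assumed setup): log in to wandb on the main process at
# startup so the tracker is ready before training begins.
from accelerate import Accelerator

accelerator = Accelerator(log_with="wandb")

if accelerator.is_main_process:
    import wandb

    wandb.login()  # explicit login during startup, main process only

# Safe to call on every rank; accelerate only initializes the tracker
# on the main process.
accelerator.init_trackers(project_name="cogvideox-distillation")
```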

sayakpaul merged commit d5cc7c6 into main on Nov 29, 2024
sayakpaul deleted the wandb-tracker-fix branch on November 29, 2024 09:38
glide-the added a commit that referenced this pull request Nov 30, 2024
```
[rank1]:   File "/mnt/ceph/develop/jiawei/cogvideox-distillation/training/cogvideox_image_to_video_lora.py", line 1005, in <module>
[rank1]:     main(args)
[rank1]:   File "/mnt/ceph/develop/jiawei/cogvideox-distillation/training/cogvideox_image_to_video_lora.py", line 884, in main
[rank1]:     "gradient_norm_before_clip": gradient_norm_before_clip,
[rank1]: UnboundLocalError: local variable 'gradient_norm_before_clip' referenced before assignment
```
ref1: #84
ref2: #100
It seems there is a bug involving `accelerator.is_main_process` and `accelerator.distributed_type`: the `is_main_process` check runs into scheduling problems during the training initiation and training stages.
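
One way to avoid the error, sketched below with assumed names from the training loop (`transformer`, `args.max_grad_norm`, `loss`, and `global_step` are illustrative), is to give the variable a default before the conditional clipping branch and only log it when it was actually computed:

```python
# Sketch of a training-loop fragment; variable names are assumptions.
# Give `gradient_norm_before_clip` a default so logging it never raises
# UnboundLocalError on steps/ranks where clipping did not run.
gradient_norm_before_clip = None

if accelerator.sync_gradients and args.max_grad_norm > 0:
    # clip_grad_norm_ returns the total norm before clipping
    gradient_norm_before_clip = accelerator.clip_grad_norm_(
        transformer.parameters(), args.max_grad_norm
    )

logs = {"loss": loss.detach().item()}
if gradient_norm_before_clip is not None:
    logs["gradient_norm_before_clip"] = gradient_norm_before_clip
accelerator.log(logs, step=global_step)
```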
glide-the mentioned this pull request Nov 30, 2024
a-r-r-o-w pushed a commit that referenced this pull request Nov 30, 2024
* there is an error
```
[rank1]:   File "/mnt/ceph/develop/jiawei/cogvideox-distillation/training/cogvideox_image_to_video_lora.py", line 1005, in <module>
[rank1]:     main(args)
[rank1]:   File "/mnt/ceph/develop/jiawei/cogvideox-distillation/training/cogvideox_image_to_video_lora.py", line 884, in main
[rank1]:     "gradient_norm_before_clip": gradient_norm_before_clip,
[rank1]: UnboundLocalError: local variable 'gradient_norm_before_clip' referenced before assignment
```
ref1: #84
ref2: #100
It seems there is a bug involving `accelerator.is_main_process` and `accelerator.distributed_type`: the `is_main_process` check runs into scheduling problems during the training initiation and training stages.

* fix