-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add I2V sft and fix an error #97
Conversation
Thanks! Could you provide some example results as well? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some preliminary comments. Will let @a-r-r-o-w comment on the new script first
LEARNING_RATES=("1e-4") | ||
LR_SCHEDULES=("cosine_with_restarts") | ||
OPTIMIZERS=("adamw") | ||
MAX_TRAIN_STEPS=("20000") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
20000 steps!?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just follow the setting of train_text_to_video_sft.sh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you performed any experiments yourself?
@@ -799,12 +799,15 @@ def load_model_hook(models, input_dir): | |||
# (this is the forward diffusion process) | |||
noisy_video_latents = scheduler.add_noise(video_latents, noise, timesteps) | |||
noisy_model_input = torch.cat([noisy_video_latents, image_latents], dim=2) | |||
|
|||
model_config.patch_size_t if hasattr(model_config, "patch_size_t") else None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems wrong. No assignment to a variable. Is this expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forget to delete early edit. Fixed it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the script and OFS fixes! Do you have any publically available training runs to verify correctness? For example, we did a 20000 step run on T2V SFT before sharing the script, so something similar to verify I2V would be nice (even if lower number of steps)
Seems definitely better to me. At least it learned semantics and better motion (IMO). |
@jiashenggu Looks good to merge! Could you rebase against the main branch? Because it looks like some changes already in main our made here as well |
Thank you so much for this! I've verified that it works. Please feel free to open PRs for speedups or other suggestions for improvements :) |
i am not sure how to set ofs in training, I just follow the inference pipeline setting