
Inference on TPUs instead of GPUs. #415

Open
kennycoder opened this issue Feb 14, 2025 · 0 comments
Hi folks!
Our AI Hypercomputer team has ported the Flux inference implementation to MaxDiffusion and was able to run both the Flux-dev and Flux-schnell models successfully on Google TPUs.

Benchmarking 1024 × 1024 image generation with flash attention and bfloat16 gave the following results:

| Model        | Accelerator | Sharding strategy | Batch size | Steps | Time (s) |
|--------------|-------------|-------------------|------------|-------|----------|
| Flux-dev     | v4-8        | DDP               | 4          | 28    | 23       |
| Flux-schnell | v4-8        | DDP               | 4          | 4     | 2.2      |
| Flux-dev     | v6e-4       | DDP               | 4          | 28    | 5.5      |
| Flux-schnell | v6e-4       | DDP               | 4          | 4     | 0.8      |
| Flux-schnell | v6e-4       | FSDP              | 4          | 4     | 1.2      |
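
For anyone curious what the DDP-style sharding in the table looks like in JAX, here is a minimal, self-contained sketch of data-parallel inference on a TPU slice: parameters are replicated across a 1-D device mesh and the batch dimension is split across it. The tiny "denoiser", its shapes, and the step count are stand-ins for illustration, not the actual MaxDiffusion/Flux code.

```python
# Minimal sketch of DDP-style (data-parallel) inference sharding in JAX.
# The toy "denoiser" below is a placeholder, NOT the real Flux/MaxDiffusion model.
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Build a 1-D mesh over all available devices (e.g. the 4 chips of a v4-8).
mesh = Mesh(jax.devices(), axis_names=("data",))

replicated = NamedSharding(mesh, P())           # params: one full copy per device
batch_sharded = NamedSharding(mesh, P("data"))  # batch dim split across devices

# Toy parameters standing in for the transformer weights.
params = {"w": jnp.ones((64, 64), dtype=jnp.bfloat16)}
params = jax.device_put(params, replicated)

@jax.jit
def denoise_step(params, latents):
    # Placeholder compute; the real model would run its attention blocks here.
    return jnp.tanh(latents @ params["w"])

# Global batch of 4 latents; each device processes batch_size / num_devices.
latents = jnp.ones((4, 64), dtype=jnp.bfloat16)
latents = jax.device_put(latents, batch_sharded)

for _ in range(4):  # e.g. a 4-step schnell-style schedule
    latents = denoise_step(params, latents)

print(latents.sharding)  # the batch stays sharded across the mesh
```

Under FSDP the weights themselves would also be sharded along the mesh (e.g. `P("data", None)` on the weight matrix), trading the memory cost of replication for extra gather/scatter communication, which is consistent with the small latency gap between the DDP and FSDP v6e-4 rows above.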

We'd appreciate it if you could give us feedback on the results above and on our overall approach.
