Lightweight library that accelerates Stable Diffusion and Dreambooth into the fastest inference models with a single line of code.

🔥 Accelerate computer vision, NLP, and other models with voltaML. Up to 10X speedup in inference 🔥
```shell
git clone https://github.com/VoltaML/voltaML-fast-stable-diffusion.git
cd voltaML-fast-stable-diffusion

sudo docker pull voltaml/volta_diffusion:v0.2
sudo docker run -it --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -v $(pwd):/code --rm voltaml/volta_diffusion:v0.2
```
Requirements: to set up your own environment, refer to the `requirements.txt` file. However, we recommend using our `voltaml/volta_diffusion` container or the NVIDIA TensorRT container instead.
Log in to your Hugging Face account through the terminal:

```shell
huggingface-cli login
Token: # enter your Hugging Face token
```
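For non-interactive setups (scripts, CI), the token can also be supplied through the standard Hugging Face Hub environment variable instead of the interactive login. Note that this is generic `huggingface_hub` behavior, not something voltaML-specific:

```shell
# Assumption: standard huggingface_hub behavior, not a voltaML flag.
# Exporting the token lets downstream tools authenticate without a prompt.
export HUGGING_FACE_HUB_TOKEN=hf_xxxxxxxxxxxxxxxx  # replace with your own token
```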
```shell
bash optimize.sh --model='runwayml/stable-diffusion-v1-5' # your model path or Hugging Face model name
```
For TensorRT:

```shell
python3 volta_infer.py --backend='TRT' --prompt='a gigantic robotic bipedal dinosaur, highly detailed, photorealistic, digital painting, artstation, concept art, sharp focus, illustration, art by greg rutkowski and alphonse mucha'
```

For PyTorch:

```shell
python3 volta_infer.py --backend='PT' --prompt='a gigantic robotic bipedal dinosaur, highly detailed, photorealistic, digital painting, artstation, concept art, sharp focus, illustration, art by greg rutkowski and alphonse mucha'
```
To run the built-in benchmark:

```shell
python3 volta_infer.py --backend='TRT' --benchmark
```
The benchmarks below were run generating a 512x512 image at batch size 1 for 50 iterations.
| Model | T4 (it/s) | A10 (it/s) | A100 (it/s) |
|---|---|---|---|
| PyTorch | 4.3 | 8.8 | 15.1 |
| Flash attention (xformers) | 5.5 | 15.6 | 27.5 |
| VoltaML (TRT) | 7.7 | 17.2 | 36.1 |
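As a sanity check on the table above, the speedups and per-image latencies implied by these throughput numbers can be computed directly (50 denoising steps per image, as stated above):

```python
# Throughput (it/s) copied from the benchmark table above.
throughput = {
    "PyTorch":       {"T4": 4.3, "A10": 8.8,  "A100": 15.1},
    "xformers":      {"T4": 5.5, "A10": 15.6, "A100": 27.5},
    "VoltaML (TRT)": {"T4": 7.7, "A10": 17.2, "A100": 36.1},
}

STEPS = 50  # iterations per 512x512 image, as in the benchmark

for gpu in ("T4", "A10", "A100"):
    base = throughput["PyTorch"][gpu]
    trt = throughput["VoltaML (TRT)"][gpu]
    speedup = trt / base          # TRT throughput relative to plain PyTorch
    latency = STEPS / trt         # seconds per image with the TRT backend
    print(f"{gpu}: {speedup:.2f}x over PyTorch, {latency:.1f} s/image")
# → T4: 1.79x over PyTorch, 6.5 s/image
# → A10: 1.95x over PyTorch, 2.9 s/image
# → A100: 2.39x over PyTorch, 1.4 s/image
```

So on an A100, the TensorRT backend roughly 2.4x's plain PyTorch and brings a 50-step image down to about 1.4 seconds.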
- Integrate Flash-attention
- Integrate AITemplate
- Try Flash-attention with TensorRT
We invite the open-source community to contribute and help us improve voltaML. Please check out our contribution guide.