Real-time collaboration for Jupyter Notebooks, Linux Terminals, LaTeX, VS Code, R IDE, and more,
all in one place. Commercial Alternative to JupyterHub.
Path: blob/main/diffusers/SDXL_DreamBooth_LoRA_.ipynb
Fine-tuning Stable Diffusion XL with DreamBooth and LoRA on a free-tier Colab Notebook 🧨
In this notebook, we show how to fine-tune Stable Diffusion XL (SDXL) with DreamBooth and LoRA on a T4 GPU.
SDXL consists of a much larger UNet and two text encoders, which make the cross-attention context considerably larger than in previous variants.
To pull this off, we will use several memory-saving tricks: gradient checkpointing, mixed-precision training, and 8-bit Adam. Hang tight and let's get started 🧪
Setup 🪓
Make sure to install diffusers from main.
Download diffusers SDXL DreamBooth training script.
Dataset 🐶
Let's get our training data! For this example, we'll download some images from the hub.
If you already have a dataset on the hub that you wish to use, you can skip this part and go straight to the "Prep for training 💻" section, where you'll simply specify the dataset name.
If your images are saved locally, and/or you want to add BLIP generated captions, pick option 1 or 2 below.
Option 1: upload example images from your local files:
Option 2: download example images from the hub:
Preview the images:
Generate custom captions with BLIP
Load BLIP to auto caption your images:
Now let's add the concept token identifier (e.g. TOK) to each caption using a caption prefix. Feel free to change the prefix according to the concept you're training on!
For this example we can use "a photo of TOK,". Other options include:
For styles - "In the style of TOK"
For faces - "photo of a TOK person"
You can add additional identifiers to the prefix to help steer the model in the right direction. For this example, instead of "a photo of TOK" we can use "a photo of TOK dog" or "a photo of TOK corgi dog".
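The prefixing step can be sketched as follows. This is a minimal illustration, assuming the BLIP captions are already available as (file name, caption) pairs. The helper names `add_caption_prefix` and `write_metadata`, the example file names, and the example captions are hypothetical; the "prompt" key matches the caption column used for training later in this notebook.

```python
import json
import os
import tempfile

# Prefix carrying the concept token; change it to suit your concept.
CAPTION_PREFIX = "a photo of TOK dog, "


def add_caption_prefix(captions, prefix=CAPTION_PREFIX):
    """Return (file_name, prompt) pairs with the concept prefix prepended."""
    return [(file_name, prefix + caption.strip()) for file_name, caption in captions]


def write_metadata(pairs, folder):
    """Write one JSON object per line, storing the caption under "prompt"."""
    path = os.path.join(folder, "metadata.jsonl")
    with open(path, "w", encoding="utf-8") as f:
        for file_name, prompt in pairs:
            f.write(json.dumps({"file_name": file_name, "prompt": prompt}) + "\n")
    return path


# Hypothetical BLIP output, used here only for illustration.
blip_captions = [
    ("image_0.jpg", "a dog sitting on a bench"),
    ("image_1.jpg", "a dog running on grass "),
]

prefixed = add_caption_prefix(blip_captions)

# A temporary folder stands in for the local image directory.
with tempfile.TemporaryDirectory() as tmp:
    metadata_path = write_metadata(prefixed, tmp)
    with open(metadata_path, encoding="utf-8") as f:
        rows = [json.loads(line) for line in f]

print(rows[0]["prompt"])
```

In practice the metadata file would be written next to the images, so that the image folder can be loaded as a captioned dataset.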
Free some memory:
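A minimal sketch of this step, assuming the captioning objects are held in a dictionary-like namespace. The helper `free_memory` and the names `blip_processor` and `blip_model` are hypothetical; `torch` is imported lazily so the snippet also runs where PyTorch is not installed.

```python
import gc


def free_memory(namespace, names):
    """Drop the named objects, run the garbage collector, and clear the
    CUDA cache when PyTorch with a GPU is available.

    Returns True if the CUDA cache was cleared, False otherwise.
    """
    for name in names:
        namespace.pop(name, None)  # ignore names that are already gone
    gc.collect()
    try:
        import torch
    except ImportError:
        return False
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        return True
    return False


# Placeholder objects stand in for the BLIP processor and model.
workspace = {
    "blip_processor": object(),
    "blip_model": object(),
    "captions": ["a dog sitting on a bench"],
}
free_memory(workspace, ["blip_processor", "blip_model"])
print(sorted(workspace))
```

In a notebook, passing `globals()` as the namespace would remove the variables from the session itself.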
Prep for training 💻
Initialize accelerate:
Log into your Hugging Face account
Pass your write access token so that we can push the trained checkpoints to the Hugging Face Hub:
Train! 🔬
Set Hyperparameters ⚡
To make DreamBooth with LoRA fit on a heavy pipeline like Stable Diffusion XL, we're using:
Gradient checkpointing (--gradient_checkpointing)
8-bit Adam (--use_8bit_adam)
Mixed-precision training (--mixed_precision="fp16")
Launch training 🚀🚀🚀
To allow for custom captions we need to install the datasets library. You can skip that if you want to train solely with --instance_prompt. In that case, specify --instance_data_dir instead of --dataset_name.
Use --output_dir to specify your LoRA model repository name!
Use --caption_column to specify the name of the caption column in your dataset. In this example we used "prompt" to save our captions in the metadata file; change this according to your needs.
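The flags described above can be assembled into a launch command, sketched here as a Python argument list. The flag values are taken from the command recorded in the training log of this notebook; the helper name `build_launch_command` and its parameters are hypothetical, and the sketch only builds and prints the command rather than running it.

```python
import shlex


def build_launch_command(
    output_dir="corgy_dog_LoRA",
    dataset_name="dog",
    caption_column="prompt",
    instance_data_dir=None,
):
    """Build the `accelerate launch` argument list for the training script.

    If `instance_data_dir` is given, train solely with --instance_prompt on a
    local image folder; otherwise use --dataset_name with custom captions.
    """
    cmd = [
        "accelerate", "launch", "train_dreambooth_lora_sdxl.py",
        "--pretrained_model_name_or_path=stabilityai/stable-diffusion-xl-base-1.0",
        "--pretrained_vae_model_name_or_path=madebyollin/sdxl-vae-fp16-fix",
        f"--output_dir={output_dir}",
        "--instance_prompt=a photo of TOK dog",
        "--resolution=1024",
        "--train_batch_size=1",
        "--gradient_accumulation_steps=3",
        "--gradient_checkpointing",       # memory trick 1
        "--use_8bit_adam",                # memory trick 2
        "--mixed_precision=fp16",         # memory trick 3
        "--learning_rate=1e-4",
        "--snr_gamma=5.0",
        "--lr_scheduler=constant",
        "--lr_warmup_steps=0",
        "--max_train_steps=500",
        "--checkpointing_steps=717",
        "--seed=0",
        "--push_to_hub",
    ]
    if instance_data_dir is not None:
        cmd.append(f"--instance_data_dir={instance_data_dir}")
    else:
        cmd.append(f"--dataset_name={dataset_name}")
        cmd.append(f"--caption_column={caption_column}")
    return cmd


command = build_launch_command()
print(shlex.join(command))
```

On a machine with a GPU and the training script downloaded, the list could be passed to `subprocess.run(command, check=True)`; in a notebook the printed string can be pasted after a `!`.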
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.1.0+cu121 with CUDA 1201 (you have 2.1.0+cu118)
Python 3.10.13 (you have 3.10.12)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
2023-11-23 07:06:49.633870: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-11-23 07:06:49.633948: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-11-23 07:06:49.638631: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-11-23 07:06:52.427754: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
11/23/2023 07:06:55 - INFO - __main__ - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda
Mixed precision type: fp16
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
{'thresholding', 'dynamic_thresholding_ratio', 'clip_sample_range', 'variance_type'} was not found in config. Values will be initialized to default values.
{'attention_type', 'reverse_transformer_layers_per_block', 'dropout'} was not found in config. Values will be initialized to default values.
11/23/2023 07:08:28 - WARNING - datasets.builder - Found cached dataset imagefolder (/root/.cache/huggingface/datasets/imagefolder/dog-3c3c059549bb4011/0.0.0/37fbb85cc714a338bea574ac6c7d0b5be5aff46c1862c1989b20e0771199e93f)
100% 1/1 [00:00<00:00, 17.73it/s]
11/23/2023 07:08:29 - INFO - __main__ - ***** Running training *****
11/23/2023 07:08:29 - INFO - __main__ - Num examples = 5
11/23/2023 07:08:29 - INFO - __main__ - Num batches each epoch = 5
11/23/2023 07:08:29 - INFO - __main__ - Num Epochs = 250
11/23/2023 07:08:29 - INFO - __main__ - Instantaneous batch size per device = 1
11/23/2023 07:08:29 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 3
11/23/2023 07:08:29 - INFO - __main__ - Gradient Accumulation steps = 3
11/23/2023 07:08:29 - INFO - __main__ - Total optimization steps = 500
Steps: 100% 500/500 [1:08:02<00:00, 7.96s/it, loss=0.00515, lr=0.0001]Model weights saved in corgy_dog_LoRA/pytorch_lora_weights.safetensors
{'image_encoder', 'feature_extractor'} was not found in config. Values will be initialized to default values.
Loading pipeline components...: 0% 0/7 [00:00<?, ?it/s]Loaded tokenizer_2 as CLIPTokenizer from `tokenizer_2` subfolder of stabilityai/stable-diffusion-xl-base-1.0.
Loading pipeline components...: 14% 1/7 [00:00<00:00, 9.49it/s]Loaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of stabilityai/stable-diffusion-xl-base-1.0.
Loaded text_encoder_2 as CLIPTextModelWithProjection from `text_encoder_2` subfolder of stabilityai/stable-diffusion-xl-base-1.0.
Loading pipeline components...: 43% 3/7 [00:17<00:25, 6.34s/it]Loaded scheduler as EulerDiscreteScheduler from `scheduler` subfolder of stabilityai/stable-diffusion-xl-base-1.0.
Loading pipeline components...: 57% 4/7 [00:17<00:12, 4.19s/it]Traceback (most recent call last):
File "/usr/local/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py", line 47, in main
args.func(args)
File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 994, in launch_command
simple_launcher(args)
File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 636, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_dreambooth_lora_sdxl.py', '--pretrained_model_name_or_path=stabilityai/stable-diffusion-xl-base-1.0', '--pretrained_vae_model_name_or_path=madebyollin/sdxl-vae-fp16-fix', '--dataset_name=dog', '--output_dir=corgy_dog_LoRA', '--caption_column=prompt', '--mixed_precision=fp16', '--instance_prompt=a photo of TOK dog', '--resolution=1024', '--train_batch_size=1', '--gradient_accumulation_steps=3', '--gradient_checkpointing', '--learning_rate=1e-4', '--snr_gamma=5.0', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--mixed_precision=fp16', '--use_8bit_adam', '--max_train_steps=500', '--checkpointing_steps=717', '--seed=0', '--push_to_hub']' died with <Signals.SIGKILL: 9>.
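The figures in the "Running training" summary of the log above can be reproduced with a little arithmetic. This is a sketch that assumes the number of epochs is derived from the step budget by rounding up, which is consistent with the logged values; the variable names are illustrative.

```python
import math

# Values taken from the training log and launch command.
num_examples = 5
per_device_batch_size = 1
num_processes = 1
gradient_accumulation_steps = 3
max_train_steps = 500

# One batch per example, since the per-device batch size is 1.
batches_per_epoch = math.ceil(num_examples / per_device_batch_size)

# Effective batch size seen by each optimizer update.
total_batch_size = (
    per_device_batch_size * num_processes * gradient_accumulation_steps
)

# An optimizer update happens once per `gradient_accumulation_steps` batches.
update_steps_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)

# Epochs needed to reach the requested number of optimizer updates.
num_epochs = math.ceil(max_train_steps / update_steps_per_epoch)

print(batches_per_epoch, total_batch_size, update_steps_per_epoch, num_epochs)
```

With 5 batches per epoch and an accumulation of 3, each epoch yields 2 optimizer updates, so 500 updates take 250 epochs, matching "Num Epochs = 250" and "Total train batch size = 3" in the log.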
Save your model to the hub and check it out 🔥
Your model has finished training.
Access it here: https://huggingface.co/LinoyTsaban/corgy_dog_LoRA
Let's generate some images with it!
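A minimal inference sketch, assuming a CUDA GPU and that the trained LoRA weights are available in the repository linked above. The function name `generate`, the example prompt, and the number of inference steps are illustrative choices; the heavy imports sit inside the function so that defining it does not require a GPU.

```python
BASE_MODEL = "stabilityai/stable-diffusion-xl-base-1.0"
VAE_MODEL = "madebyollin/sdxl-vae-fp16-fix"
LORA_REPO = "LinoyTsaban/corgy_dog_LoRA"


def generate(prompt, num_inference_steps=25):
    """Load SDXL with the fp16-fix VAE, attach the LoRA weights, and
    return one generated PIL image. Requires a CUDA GPU."""
    import torch
    from diffusers import AutoencoderKL, DiffusionPipeline

    vae = AutoencoderKL.from_pretrained(VAE_MODEL, torch_dtype=torch.float16)
    pipe = DiffusionPipeline.from_pretrained(
        BASE_MODEL,
        vae=vae,
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True,
    )
    pipe.load_lora_weights(LORA_REPO)
    pipe.to("cuda")
    return pipe(prompt=prompt, num_inference_steps=num_inference_steps).images[0]


# Example usage (hypothetical prompt; include the same TOK identifier used in training):
# image = generate("a photo of TOK dog in a bucket")
# image.save("tok_dog.png")
```

Keeping the trained identifier ("TOK dog") in the prompt is what triggers the learned concept.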