# Dreambooth for the inpainting model

This script was added by @thedarkzeno.

Please note that this script is not actively maintained; however, you can open an issue and tag @thedarkzeno or @patil-suraj.

```bash
export MODEL_NAME="runwayml/stable-diffusion-inpainting"
export INSTANCE_DIR="path-to-instance-images"
export OUTPUT_DIR="path-to-save-model"

accelerate launch train_dreambooth_inpaint.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=400
```
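
Once training finishes, the weights saved to `$OUTPUT_DIR` can be loaded like any other Stable Diffusion inpainting checkpoint. A minimal inference sketch (the image, mask, and output filenames below are placeholders, not produced by the script):

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

# Load the fine-tuned checkpoint produced by the training run above.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "path-to-save-model", torch_dtype=torch.float16
).to("cuda")

# Placeholder inputs: white pixels in the mask are the region to repaint.
image = Image.open("dog.png").convert("RGB").resize((512, 512))
mask = Image.open("dog_mask.png").convert("L").resize((512, 512))

result = pipe(
    prompt="a photo of sks dog",
    image=image,
    mask_image=mask,
    num_inference_steps=50,
    guidance_scale=7.5,
).images[0]
result.save("sks_dog_inpainted.png")
```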

## Training with prior-preservation loss

Prior preservation is used to avoid overfitting and language drift. Refer to the paper to learn more about it. For prior preservation, we first generate images using the model with a class prompt and then use those images during training along with our data. According to the paper, it's recommended to generate `num_epochs * num_samples` images for prior preservation; 200-300 images work well for most cases.
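
If you want to pre-generate the class images yourself, one way to do it with the base inpainting checkpoint is to repaint a blank canvas under a full mask, so every sample is driven only by the class prompt. This is a rough sketch, not the script's own sampling code; the paths and loop size are placeholders:

```python
import os
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

os.makedirs("path-to-class-images", exist_ok=True)

# A blank canvas plus an all-white mask makes the model repaint the whole frame,
# so each output is effectively a fresh "class" image for the class prompt.
blank = Image.new("RGB", (512, 512))
full_mask = Image.new("L", (512, 512), 255)

num_class_images = 200
for i in range(num_class_images):
    image = pipe(prompt="a photo of dog", image=blank, mask_image=full_mask).images[0]
    image.save(f"path-to-class-images/{i}.png")
```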

```bash
export MODEL_NAME="runwayml/stable-diffusion-inpainting"
export INSTANCE_DIR="path-to-instance-images"
export CLASS_DIR="path-to-class-images"
export OUTPUT_DIR="path-to-save-model"

accelerate launch train_dreambooth_inpaint.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
```

## Training with gradient checkpointing and 8-bit optimizer

With the help of gradient checkpointing and the 8-bit optimizer from bitsandbytes, it's possible to train DreamBooth on a 16GB GPU.

To install bitsandbytes, please refer to its readme.

```bash
export MODEL_NAME="runwayml/stable-diffusion-inpainting"
export INSTANCE_DIR="path-to-instance-images"
export CLASS_DIR="path-to-class-images"
export OUTPUT_DIR="path-to-save-model"

accelerate launch train_dreambooth_inpaint.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=2 --gradient_checkpointing \
  --use_8bit_adam \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
```
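
Roughly speaking, `--gradient_checkpointing` turns on the model's built-in activation checkpointing and `--use_8bit_adam` swaps the standard AdamW for the bitsandbytes 8-bit variant. A hedged Python sketch of that idea (parameter values are illustrative, not the script verbatim):

```python
import bitsandbytes as bnb
from diffusers import UNet2DConditionModel

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-inpainting", subfolder="unet"
)

# Recompute activations during the backward pass, trading compute for memory.
unet.enable_gradient_checkpointing()

# 8-bit Adam stores optimizer state in 8 bits, substantially shrinking its memory footprint.
optimizer = bnb.optim.AdamW8bit(unet.parameters(), lr=5e-6)
```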

## Fine-tune text encoder with the UNet

The script also allows you to fine-tune the `text_encoder` along with the `unet`. It has been observed experimentally that fine-tuning the `text_encoder` gives much better results, especially on faces. Pass the `--train_text_encoder` argument to the script to enable training the `text_encoder`.

Note: Training the text encoder requires more memory; with this option the training won't fit on a 16GB GPU. It needs at least 24GB of VRAM.

```bash
export MODEL_NAME="runwayml/stable-diffusion-inpainting"
export INSTANCE_DIR="path-to-instance-images"
export CLASS_DIR="path-to-class-images"
export OUTPUT_DIR="path-to-save-model"

accelerate launch train_dreambooth_inpaint.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --train_text_encoder \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --use_8bit_adam \
  --gradient_checkpointing \
  --learning_rate=2e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
```
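
Conceptually, `--train_text_encoder` simply adds the text encoder's parameters to the same optimizer as the UNet's, so both receive gradient updates each step (which is why memory use grows). A hedged sketch of that idea, not the script's exact code:

```python
import itertools
import torch
from diffusers import UNet2DConditionModel
from transformers import CLIPTextModel

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-inpainting", subfolder="unet"
)
text_encoder = CLIPTextModel.from_pretrained(
    "runwayml/stable-diffusion-inpainting", subfolder="text_encoder"
)

# With --train_text_encoder both modules are optimized; without it, the text
# encoder stays frozen and only the UNet parameters are passed to the optimizer.
params_to_optimize = itertools.chain(unet.parameters(), text_encoder.parameters())
optimizer = torch.optim.AdamW(params_to_optimize, lr=2e-6)
```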