
CLIP Guided Stable Diffusion using d🧨ffusers

This notebook shows how to do CLIP guidance with Stable Diffusion using the diffusers library. This allows you to use the newly released CLIP models by LAION AI.

This notebook is based on the following amazing repos, all credits to the original authors!

Initial Setup

#@title Install dependencies
!pip install -qqq diffusers==0.11.1 transformers ftfy gradio accelerate
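To confirm that the pinned versions were actually picked up (useful when Colab has a cached environment), you can print them. This check is optional and not part of the original notebook:

# Optional: verify the installed library versions.
import diffusers
import transformers

print(diffusers.__version__, transformers.__version__)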

Authenticate with Hugging Face Hub

To use private and gated models on the 🤗 Hugging Face Hub, you need to log in. If you are only using a public checkpoint (such as CompVis/stable-diffusion-v1-4 in this notebook), you can skip this step.
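If the interactive widget is inconvenient (for example, when running this code outside a notebook), huggingface_hub also provides a programmatic login; the token string below is a placeholder that you would replace with a token from your Hugging Face account settings:

# Optional alternative to the notebook widget below.
# "hf_xxx" is a placeholder, not a real credential.
from huggingface_hub import login

login(token="hf_xxx")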

#@title Login
from huggingface_hub import notebook_login

notebook_login()
Login successful
Your token has been saved to /root/.huggingface/token

CLIP Guided Stable Diffusion

#@title Load the pipeline
import torch
from PIL import Image
from diffusers import LMSDiscreteScheduler, DiffusionPipeline, PNDMScheduler
from transformers import CLIPFeatureExtractor, CLIPModel

model_id = "CompVis/stable-diffusion-v1-4" #@param {type: "string"}
clip_model_id = "laion/CLIP-ViT-B-32-laion2B-s34B-b79K" #@param ["laion/CLIP-ViT-B-32-laion2B-s34B-b79K", "laion/CLIP-ViT-L-14-laion2B-s32B-b82K", "laion/CLIP-ViT-H-14-laion2B-s32B-b79K", "laion/CLIP-ViT-g-14-laion2B-s12B-b42K", "openai/clip-vit-base-patch32", "openai/clip-vit-base-patch16", "openai/clip-vit-large-patch14"] {allow-input: true}
scheduler = "plms" #@param ['plms', 'lms']


def image_grid(imgs, rows, cols):
    assert len(imgs) == rows * cols
    w, h = imgs[0].size
    grid = Image.new("RGB", size=(cols * w, rows * h))
    grid_w, grid_h = grid.size
    for i, img in enumerate(imgs):
        grid.paste(img, box=(i % cols * w, i // cols * h))
    return grid


if scheduler == "lms":
    scheduler = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear")
else:
    scheduler = PNDMScheduler.from_config(model_id, subfolder="scheduler")

feature_extractor = CLIPFeatureExtractor.from_pretrained(clip_model_id)
clip_model = CLIPModel.from_pretrained(clip_model_id, torch_dtype=torch.float16)

guided_pipeline = DiffusionPipeline.from_pretrained(
    model_id,
    custom_pipeline="clip_guided_stable_diffusion",
    custom_revision="main",  # TODO: remove if diffusers>=0.12.0
    clip_model=clip_model,
    feature_extractor=feature_extractor,
    scheduler=scheduler,
    torch_dtype=torch.float16,
)
guided_pipeline = guided_pipeline.to("cuda")
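As a quick sanity check before wiring up the full Gradio demo below, you can call the loaded pipeline once with the same arguments used later in this notebook. This is a minimal sketch assuming the cell above ran on a GPU; the prompt is just an illustrative placeholder:

# Minimal smoke test for the guided pipeline loaded above.
generator = torch.Generator(device="cuda").manual_seed(0)
image = guided_pipeline(
    "a watercolor painting of a lighthouse at dawn",  # placeholder prompt
    num_inference_steps=50,
    guidance_scale=7.5,
    clip_guidance_scale=100,
    num_cutouts=4,
    use_cutouts=False,
    generator=generator,
).images[0]
image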
#@title Generate with Gradio Demo
import gradio as gr
import torch
from torch import autocast
from diffusers import StableDiffusionPipeline
from PIL import Image

last_model = "laion/CLIP-ViT-B-32-laion2B-s34B-b79K"


def infer(prompt, clip_prompt, samples, steps, clip_scale, scale, seed, clip_model, use_cutouts, num_cutouts):
    global last_model, guided_pipeline
    print(last_model)
    # Rebuild the guided pipeline only when a different CLIP model is selected.
    # NOTE: create_clip_guided_pipeline is not defined in this notebook; it is expected to
    # rebuild the pipeline the same way as the "Load the pipeline" cell above.
    if last_model != clip_model:
        guided_pipeline = create_clip_guided_pipeline(model_id, clip_model)
        guided_pipeline = guided_pipeline.to("cuda")
        last_model = clip_model
    num_samples = samples
    num_inference_steps = steps
    guidance_scale = scale
    clip_guidance_scale = clip_scale
    use_cutouts = "True" if use_cutouts else "False"
    unfreeze_unet = "True"
    unfreeze_vae = "True"
    if unfreeze_unet == "True":
        guided_pipeline.unfreeze_unet()
    else:
        guided_pipeline.freeze_unet()
    if unfreeze_vae == "True":
        guided_pipeline.unfreeze_vae()
    else:
        guided_pipeline.freeze_vae()
    generator = torch.Generator(device="cuda").manual_seed(seed)
    images = []
    for i in range(num_samples):
        image = guided_pipeline(
            prompt,
            clip_prompt=clip_prompt if clip_prompt.strip() != "" else None,
            num_inference_steps=num_inference_steps,
            guidance_scale=guidance_scale,
            clip_guidance_scale=clip_guidance_scale,
            num_cutouts=num_cutouts,
            use_cutouts=use_cutouts == "True",
            generator=generator,
        ).images[0]
        images.append(image)
    # image_grid(images, 1, num_samples)
    return images


css = """
.gradio-container { font-family: 'IBM Plex Sans', sans-serif; }
.gr-button { color: white; border-color: black; background: black; }
input[type='range'] { accent-color: black; }
.dark input[type='range'] { accent-color: #dfdfdf; }
.container { max-width: 730px; margin: auto; padding-top: 1.5rem; }
#gallery { min-height: 22rem; margin-bottom: 15px; margin-left: auto; margin-right: auto; border-bottom-right-radius: .5rem !important; border-bottom-left-radius: .5rem !important; }
#gallery>div>.h-full { min-height: 20rem; }
.details:hover { text-decoration: underline; }
.gr-button { white-space: nowrap; }
.gr-button:focus { border-color: rgb(147 197 253 / var(--tw-border-opacity)); outline: none; box-shadow: var(--tw-ring-offset-shadow), var(--tw-ring-shadow), var(--tw-shadow, 0 0 #0000); --tw-border-opacity: 1; --tw-ring-offset-shadow: var(--tw-ring-inset) 0 0 0 var(--tw-ring-offset-width) var(--tw-ring-offset-color); --tw-ring-shadow: var(--tw-ring-inset) 0 0 0 calc(3px + var(--tw-ring-offset-width)) var(--tw-ring-color); --tw-ring-color: rgb(191 219 254 / var(--tw-ring-opacity)); --tw-ring-opacity: .5; }
#advanced-btn { font-size: .7rem !important; line-height: 19px; margin-top: 12px; margin-bottom: 12px; padding: 2px 8px; border-radius: 14px !important; }
#advanced-options { display: none; margin-bottom: 20px; }
.footer { margin-bottom: 45px; margin-top: 35px; text-align: center; border-bottom: 1px solid #e5e5e5; }
.footer>p { font-size: .8rem; display: inline-block; padding: 0 10px; transform: translateY(10px); background: white; }
.dark .footer { border-color: #303030; }
.dark .footer>p { background: #0b0f19; }
.acknowledgments h4 { margin: 1.25em 0 .25em 0; font-weight: bold; font-size: 115%; }
"""

block = gr.Blocks(css=css)

examples = [
    ["A high tech solarpunk utopia in the Amazon rainforest", 2, 45, 7.5, 1024],
    ["A pikachu fine dining with a view to the Eiffel Tower", 2, 45, 7, 1024],
    ["A mecha robot in a favela in expressionist style", 2, 45, 7, 1024],
    ["an insect robot preparing a delicious meal", 2, 45, 7, 1024],
    ["A small cabin on top of a snowy mountain in the style of Disney, artstation", 2, 45, 7, 1024],
]

with block:
    gr.HTML(
        """
        <div style="text-align: center; max-width: 650px; margin: 0 auto;">
          <div style="display: inline-flex; align-items: center; gap: 0.8rem; font-size: 1.75rem;">
            <svg width="0.65em" height="0.65em" viewBox="0 0 115 115" fill="none" xmlns="http://www.w3.org/2000/svg">
              <rect width="23" height="23" fill="white"></rect> <rect y="69" width="23" height="23" fill="white"></rect>
              <rect x="23" width="23" height="23" fill="#AEAEAE"></rect> <rect x="23" y="69" width="23" height="23" fill="#AEAEAE"></rect>
              <rect x="46" width="23" height="23" fill="white"></rect> <rect x="46" y="69" width="23" height="23" fill="white"></rect>
              <rect x="69" width="23" height="23" fill="black"></rect> <rect x="69" y="69" width="23" height="23" fill="black"></rect>
              <rect x="92" width="23" height="23" fill="#D9D9D9"></rect> <rect x="92" y="69" width="23" height="23" fill="#AEAEAE"></rect>
              <rect x="115" y="46" width="23" height="23" fill="white"></rect> <rect x="115" y="115" width="23" height="23" fill="white"></rect>
              <rect x="115" y="69" width="23" height="23" fill="#D9D9D9"></rect> <rect x="92" y="46" width="23" height="23" fill="#AEAEAE"></rect>
              <rect x="92" y="115" width="23" height="23" fill="#AEAEAE"></rect> <rect x="92" y="69" width="23" height="23" fill="white"></rect>
              <rect x="69" y="46" width="23" height="23" fill="white"></rect> <rect x="69" y="115" width="23" height="23" fill="white"></rect>
              <rect x="69" y="69" width="23" height="23" fill="#D9D9D9"></rect> <rect x="46" y="46" width="23" height="23" fill="black"></rect>
              <rect x="46" y="115" width="23" height="23" fill="black"></rect> <rect x="46" y="69" width="23" height="23" fill="black"></rect>
              <rect x="23" y="46" width="23" height="23" fill="#D9D9D9"></rect> <rect x="23" y="115" width="23" height="23" fill="#AEAEAE"></rect>
              <rect x="23" y="69" width="23" height="23" fill="black"></rect>
            </svg>
            <h1 style="font-weight: 900; margin-bottom: 7px;">
              CLIP Guided Stable Diffusion Demo
            </h1>
          </div>
          <p style="margin-bottom: 10px; font-size: 94%">
            Demo allows you to use the newly released
            <a href="https://huggingface.co/laion" style="text-decoration: underline">CLIP models by LAION AI</a>
            with Stable Diffusion
          </p>
        </div>
        """
    )
    with gr.Group():
        with gr.Box():
            with gr.Row().style(mobile_collapse=False, equal_height=True):
                text = gr.Textbox(
                    label="Enter your prompt",
                    show_label=False,
                    max_lines=1,
                    placeholder="Enter your prompt",
                ).style(
                    border=(True, False, True, True),
                    rounded=(True, False, False, True),
                    container=False,
                )
                btn = gr.Button("Generate image").style(
                    margin=False,
                    rounded=(False, True, True, False),
                )
        gallery = gr.Gallery(
            label="Generated images", show_label=False, elem_id="gallery"
        ).style(grid=[2], height="auto")
        advanced_button = gr.Button("Advanced options", elem_id="advanced-btn")
        with gr.Row(elem_id="advanced-options"):
            with gr.Column():
                clip_prompt = gr.Textbox(
                    label="Enter a CLIP prompt if you want it to differ",
                    show_label=False,
                    max_lines=1,
                    placeholder="Enter a CLIP prompt if you want it to differ",
                )
                with gr.Row():
                    samples = gr.Slider(label="Images", minimum=1, maximum=2, value=1, step=1)
                    steps = gr.Slider(label="Steps", minimum=1, maximum=50, value=45, step=1)
                with gr.Row():
                    use_cutouts = gr.Checkbox(label="Use cutouts?")
                    num_cutouts = gr.Slider(label="Cutouts", minimum=1, maximum=16, value=4, step=1)
                with gr.Row():
                    with gr.Column():
                        clip_model = gr.Dropdown(
                            [
                                "laion/CLIP-ViT-B-32-laion2B-s34B-b79K",
                                "laion/CLIP-ViT-L-14-laion2B-s32B-b82K",
                                "laion/CLIP-ViT-H-14-laion2B-s32B-b79K",
                                "laion/CLIP-ViT-g-14-laion2B-s12B-b42K",
                                "openai/clip-vit-base-patch32",
                                "openai/clip-vit-base-patch16",
                                "openai/clip-vit-large-patch14",
                            ],
                            value="laion/CLIP-ViT-B-32-laion2B-s34B-b79K",
                            show_label=False,
                        )
                with gr.Row():
                    scale = gr.Slider(label="Guidance Scale", minimum=0, maximum=50, value=7.5, step=0.1)
                    seed = gr.Slider(label="Seed", minimum=0, maximum=2147483647, step=1, randomize=True)
                    clip_scale = gr.Slider(label="CLIP Guidance Scale", minimum=0, maximum=5000, value=100, step=1)
        ex = gr.Examples(
            examples=examples,
            fn=infer,
            inputs=[text, samples, steps, scale, clip_scale, seed],
            outputs=gallery,
            cache_examples=False,
        )
        ex.dataset.headers = [""]
        text.submit(
            infer,
            inputs=[text, clip_prompt, samples, steps, scale, clip_scale, seed, clip_model, use_cutouts, num_cutouts],
            outputs=gallery,
        )
        btn.click(
            infer,
            inputs=[text, clip_prompt, samples, steps, scale, clip_scale, seed, clip_model, use_cutouts, num_cutouts],
            outputs=gallery,
        )
        advanced_button.click(
            None,
            [],
            text,
            _js="""
            () => {
                const options = document.querySelector("body > gradio-app").querySelector("#advanced-options");
                options.style.display = ["none", ""].includes(options.style.display) ? "flex" : "none";
            }""",
        )
    gr.HTML(
        """
        <div class="footer">
            <p>Model by <a href="https://huggingface.co/CompVis" style="text-decoration: underline;" target="_blank">CompVis</a> and <a href="https://huggingface.co/stabilityai" style="text-decoration: underline;" target="_blank">Stability AI</a> - Gradio Demo by 🤗 Hugging Face</p>
        </div>
        <div class="acknowledgments">
            <p><h4>LICENSE</h4>
            The model is licensed with a <a href="https://huggingface.co/spaces/CompVis/stable-diffusion-license" style="text-decoration: underline;" target="_blank">CreativeML Open RAIL-M</a> license. The authors claim no rights on the outputs you generate; you are free to use them, but you are accountable for their use, which must not go against the provisions set in this license. The license forbids you from sharing any content that violates any laws, produces harm to a person, disseminates personal information meant for harm, spreads misinformation, or targets vulnerable groups. For the full list of restrictions please <a href="https://huggingface.co/spaces/CompVis/stable-diffusion-license" style="text-decoration: underline;" target="_blank">read the license</a>.</p>
            <p><h4>Biases and content acknowledgment</h4>
            Despite how impressive being able to turn text into images is, be aware that this model may output content that reinforces or exacerbates societal biases, as well as realistic faces, pornography and violence. The model was trained on the <a href="https://laion.ai/blog/laion-5b/" style="text-decoration: underline;" target="_blank">LAION-5B dataset</a>, which scraped non-curated image-text pairs from the internet (the exception being the removal of illegal content) and is meant for research purposes. You can read more in the <a href="https://huggingface.co/CompVis/stable-diffusion-v1-4" style="text-decoration: underline;" target="_blank">model card</a>.</p>
        </div>
        """
    )

block.launch(debug=True)
#@title Generate on Colab
prompt = "fantasy book cover, full moon, fantasy forest landscape, golden vector elements, fantasy magic, dark light night, intricate, elegant, sharp focus, illustration, highly detailed, digital painting, concept art, matte, art by WLOP and Artgerm and Albert Bierstadt, masterpiece" #@param {type: "string"}
#@markdown `clip_prompt` is optional, if you leave it blank the same prompt is sent to Stable Diffusion and CLIP
clip_prompt = "" #@param {type: "string"}
num_samples = 1 #@param {type: "number"}
num_inference_steps = 50 #@param {type: "number"}
guidance_scale = 7.5 #@param {type: "number"}
clip_guidance_scale = 100 #@param {type: "number"}
num_cutouts = 4 #@param {type: "number"}
use_cutouts = "False" #@param ["False", "True"]
unfreeze_unet = "True" #@param ["False", "True"]
unfreeze_vae = "True" #@param ["False", "True"]
seed = 3788086447 #@param {type: "number"}

if unfreeze_unet == "True":
    guided_pipeline.unfreeze_unet()
else:
    guided_pipeline.freeze_unet()
if unfreeze_vae == "True":
    guided_pipeline.unfreeze_vae()
else:
    guided_pipeline.freeze_vae()

generator = torch.Generator(device="cuda").manual_seed(seed)

images = []
for i in range(num_samples):
    image = guided_pipeline(
        prompt,
        clip_prompt=clip_prompt if clip_prompt.strip() != "" else None,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
        clip_guidance_scale=clip_guidance_scale,
        num_cutouts=num_cutouts,
        use_cutouts=use_cutouts == "True",
        generator=generator,
    ).images[0]
    images.append(image)

image_grid(images, 1, num_samples)
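To keep the results beyond the Colab session, the generated PIL images can also be written to disk; the filenames below are just illustrative:

# Save each generated image to the working directory (filenames are arbitrary examples).
for i, img in enumerate(images):
    img.save(f"clip_guided_sd_{i}.png")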