SD - 2.How to use diffusion web ui? - img2img

In this article, I will provide a walkthrough of the common features of the img2img tab.

  • I will first go through the parameters unique to this tab, then share my workflows for use cases like outpainting and inpainting.
    • stable-diffusion-webui-img2img-tab
  • In the img2img tab, an image can be used as an additional input to guide generation. This enables a bunch of useful cases: for example, you could turn a realistic photo into an animation-style figure, or use a hand-drawn sketch to create good-looking pictures.

Unique parameters

For the most part, the img2img tab shares the same parameter list as the txt2img tab; please check my previous post for more details. The additional parameters below customize how the input image is processed.

Resize mode

This defines the behavior when the input image size does not match the target resolution you specified.

  • Just resize
    • Resize the image to the target resolution. This may distort the aspect ratio.
  • Crop and resize
    • Resize the image so that the entire target resolution is filled, then crop the parts that stick out.
  • Resize and fill
    • Resize the image so that the entire image fits inside the target resolution, then fill the empty space with colors from the image.
  • Just resize (latent upscale)
    • Same as Just resize, but uses a latent upscaling method (no upscaler model, just the latent decoder) - Ref
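
For intuition, here is a minimal sketch of how the first three modes could be reproduced with Pillow. This is illustrative only (not the webui's actual implementation); the file name and target size are hypothetical.

```python
# Illustrative only: approximating the first three resize modes with Pillow.
from PIL import Image, ImageOps

target = (768, 512)                      # hypothetical target resolution
img = Image.open("input.png")            # hypothetical input file

# Just resize: force the target size, possibly distorting the aspect ratio.
just_resize = img.resize(target)

# Crop and resize: scale so the target is fully covered, then crop the overflow.
crop_and_resize = ImageOps.fit(img, target)

# Resize and fill: scale so the whole image fits inside the target, then pad.
# (The webui fills the padding with colors from the image; here it is a flat color.)
resize_and_fill = ImageOps.pad(img, target, color=(0, 0, 0))
```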

Denoising strength

  • Ranges over [0, 1]. This controls how similar the generated image stays to the original: the closer to 1, the fewer features the output takes from the original input (and the more it follows the prompt).
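
If you prefer to drive img2img programmatically, the webui exposes an HTTP API when launched with --api. Below is a minimal sketch of such a call; the field names reflect my understanding of the /sdapi/v1/img2img endpoint, so verify them against the interactive docs at /docs on your instance. The prompt and file names are placeholders.

```python
# A minimal sketch of calling img2img through the webui API (launch with --api).
# Field names are my understanding of /sdapi/v1/img2img; check them at /docs.
import base64
import requests

with open("input.png", "rb") as f:                        # hypothetical input image
    init_image = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "init_images": [init_image],
    "prompt": "a watercolor painting of a city street",   # placeholder prompt
    "denoising_strength": 0.6,  # 0 keeps the input as-is, 1 takes almost nothing from it
    "resize_mode": 0,           # 0 Just resize, 1 Crop and resize, 2 Resize and fill
    "width": 512,
    "height": 512,
    "steps": 25,
}

r = requests.post("http://127.0.0.1:7860/sdapi/v1/img2img", json=payload)
r.raise_for_status()
with open("output.png", "wb") as f:
    f.write(base64.b64decode(r.json()["images"][0]))
```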

Use cases

Style Transfer

  • Use your image as input, add a prompt, choose a model with the right style, and the AI will do the magic.

  • Below is a showcase of how the same prompt and input image produce outputs in different styles with different models.

  • img2img-example-sketch
  • Image Source

  • img2img-sketch-example-people
  • Sketch image source

  • image-20230418221414301

I have to admit that I used ==ControlNet== to control the basic image structure. With only image and text guidance, you will get fairly randomized output, like the one below.

  • image-20230418221837294

Inpaint - Fix / Modify selected parts of the image

Mask the part you want to regenerate and tweak the prompt a bit; you can fix faulty parts or let the AI inspire you.

Inpaint processing logic

  1. Select Mask Area
  2. Pre-process masked area
  3. Add mask blur
  4. Image generation

Parameter settings

Prompt

  • Prompt describing the masked area (or the unmasked area, depending on the mask mode)
  • You can reuse the prompt used to generate the original image

Mask Blur

  • To what extent the edge of the masked area should be blurred; this controls step 3 above
  • The higher the value, the softer the transition between the inpainted area and the rest of the image
stable-diffusion-inpaint-different-mask-blur-sample
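
Conceptually, mask blur is just a Gaussian blur applied to the mask before generation, so the repainted pixels blend gradually into the untouched ones. Here is a tiny sketch of that idea (not the webui's actual code; the file name and radii are made up):

```python
# Conceptual sketch: mask blur as a Gaussian blur on the binary mask.
from PIL import Image, ImageFilter

mask = Image.open("mask.png").convert("L")       # white = area to inpaint
for radius in (4, 10, 20):                       # roughly analogous to the Mask blur slider
    mask.filter(ImageFilter.GaussianBlur(radius)).save(f"mask_blur_{radius}.png")
```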

Mask mode

  • Tells the model which area to inpaint: the masked area (Inpaint masked) or everything except it (Inpaint not masked)

Masked content

Defines how to preprocess the masked area (see step 2 of the processing logic above)

  • fill
    • Replace the masked area with heavily blurred blocks of color taken from the image
  • original
    • No preprocessing; the original image content is used directly
  • latent noise
    • Fill the masked area with random noise in latent space
  • latent nothing
    • Use a zero-valued latent variable to fill selected area

Normally, fill and original are used for minor improvements, while latent noise and latent nothing are used to generate something new that differs significantly from the original.

Inpaint area

  • Whole picture
    • The whole image is processed at the specified width and height (following the resize mode), and only the masked area is actually changed.
  • Only masked
    • Only repaint the masked area
    • Better for high-resolution images, since the width and height do not need to match the full image
  • Only masked padding, pixels
    • When Inpaint area is Only masked, this defines how many pixels of surrounding context are included around the masked area.
    • The smaller the value, the more of the generation resolution is devoted to the masked area itself, so the filled content is denser and more detailed.
    • With very little surrounding context, the model may generate a brand-new image based on your prompt inside the inpaint area.
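
To tie these inpaint parameters together, here is a sketch of an inpaint request through the same /sdapi/v1/img2img endpoint used earlier. The field names reflect my understanding of the webui API (verify them against /docs); the file names and prompt are placeholders.

```python
# A sketch of an inpaint request via the webui API; field names are my
# understanding of /sdapi/v1/img2img, so double-check them against /docs.
import base64
import requests

def b64(path):
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

payload = {
    "init_images": [b64("original.png")],
    "mask": b64("mask.png"),          # black/white mask, white = area to inpaint
    "prompt": "a red wooden door",    # placeholder prompt for the masked area
    "denoising_strength": 0.75,
    "mask_blur": 10,                  # Mask blur
    "inpainting_mask_invert": 0,      # Mask mode: 0 inpaint masked, 1 inpaint not masked
    "inpainting_fill": 1,             # Masked content: 0 fill, 1 original, 2 latent noise, 3 latent nothing
    "inpaint_full_res": True,         # Inpaint area: True only masked, False whole picture
    "inpaint_full_res_padding": 32,   # Only masked padding, pixels
}

r = requests.post("http://127.0.0.1:7860/sdapi/v1/img2img", json=payload)
r.raise_for_status()
with open("inpainted.png", "wb") as f:
    f.write(base64.b64decode(r.json()["images"][0]))
```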

Other inpaint modes

  • inpaint sketch
    • Lets you paint color guidance onto the mask; the guidance strength can be adjusted via the mask transparency
  • inpaint upload
    • Problems with the normal inpaint tab
      • It is hard to mask problem areas precisely with the mouse
      • A mask drawn once is not saved for reuse
    • The Inpaint upload tab allows you to upload an image together with a prepared mask image
    • Note
      • The mask is black and white, ==white== represents the mask selection area
      • Leave some margin inside the mask around the target area so the inpainted content has room to blend
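
Preparing such a mask does not require an image editor; below is a minimal Pillow sketch (size and coordinates are hypothetical).

```python
# A minimal sketch of preparing a mask for the Inpaint upload tab:
# a black image with the area to regenerate painted white.
from PIL import Image, ImageDraw

w, h = 512, 512                       # must match the uploaded image
mask = Image.new("L", (w, h), 0)      # black = keep unchanged
draw = ImageDraw.Draw(mask)
# White rectangle marks the area to inpaint; leave a little margin around
# the exact object so the model has room to blend.
draw.rectangle([180, 200, 360, 420], fill=255)
mask.save("mask.png")
```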

Outpaint - Expand existing image

  • Under certain circumstances, you may want to expand an existing image. This is when the outpainting scripts come into play.
  • In the bottom-left corner you can find a Script dropdown, which by default offers two outpainting versions; we normally use Outpainting mk2 for more stable results.

Parameter settings

stable-diffusion-webui-outpainting-script-ui
  • Both scripts share two settings - Pixels to expand and Outpainting direction - which are pretty straightforward to understand.
  • For Outpainting mk2, mask blur is applied to the outpainted part during preprocessing to create a soft boundary between the original image and the expanded part.
  • Fall-off exponent controls the "smoothness" of the masked picture (the part to be expanded). A higher value results in a smoother noise pattern, which leads to less detail in the generated area.
  • Like Fall-off exponent, Color variation affects the expanded areas, in this case how much their colors are allowed to vary.
  • If you are interested in the details of how these parameters take effect, check the source code in stable-diffusion-webui/scripts/outpainting_mk_2.py
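
As a rough intuition for the Fall-off exponent (this is a conceptual illustration, not the script's actual code): the higher the exponent, the faster high spatial frequencies are attenuated in the generated noise, so the expanded region ends up smoother and carries less fine detail.

```python
# Conceptual illustration: shaping random noise with a spectral fall-off exponent.
# A larger exponent suppresses high frequencies, producing a smoother pattern.
import numpy as np

rng = np.random.default_rng(0)
size = 256
freq = np.fft.fftfreq(size)
radius = np.sqrt(freq[:, None] ** 2 + freq[None, :] ** 2)
radius[0, 0] = 1.0  # avoid dividing by zero at the DC component

for falloff in (1.0, 2.0, 4.0):
    spectrum = rng.standard_normal((size, size)) * radius ** (-falloff)
    noise = np.real(np.fft.ifft2(spectrum))
    noise /= noise.std()  # normalize so roughness is comparable across exponents
    roughness = np.abs(np.diff(noise, axis=0)).mean()
    print(f"falloff={falloff}: roughness ~ {roughness:.4f}")  # decreases as falloff grows
```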

Example

Let's now look at a concrete example. Say I have a picture of New York City as follows (I put the generation details at the end of this article):

  • stable-diffusion-outpaint-example-ori
  • Now I want to expand the left and right sides to add more detail and make it suitable as a wallpaper. I clicked the Send to img2img button under the generation panel, and all the generation details and the result were passed to the img2img tab.
  • In the Script dropdown, I selected Outpainting mk2. Following the script's instructions, I changed the Sampling method to Euler a, the Sampling steps to 60, and the Denoising strength to 0.8, and got the following result.
    • expand-4maskblur-1falloff-0.05colorvar
  • Though it seems good at first glance, this picture has several problems - the most significant is the clear boundary between the original image and the expanded parts; check the light reflection cut off in the water at the bottom left, or the clear split in the skyline at the top right.
  • I therefore increased mask_blur to create a softer boundary. Setting mask_blur=10, here is what I got:
    • expand-10maskblur-1falloff-0.05colorvar
  • I also played around with other parameter settings; you can compare the impact of the different parameters below.
    • maskblur=10, falloff=1, colorvar=0.5 (increasing Color variation compared to the previous one)
      • expand-10maskblur-1falloff-0.5colorvar
    • maskblur=20, falloff=1, colorvar=0.5 (Increasing mask blur)
      • expand-20maskblur-1falloff-0.5colorvar

Modify the image with instructions (not yet mature)

  • In the spirit of InstructGPT, some may find it useful to modify an image with direct textual instructions.
  • This relies on the wonderful InstructPix2Pix work (source of the image below) presented by Tim Brooks and his colleagues at UC Berkeley.
    • instruct-image-example
  • Perhaps because I didn't get the instruction format right, I could hardly make it work on my own images, though it performs perfectly well on the official examples.

Generation details

  • For cyberpunk New York City - Model Link - deliberate_v11
    •   a picture of New York City with a lot of tall buildings, ((cyberpunk)), rainy, night, CG, Unreal Engine, best quality
      
        Negative prompt: (deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime:1.4), text, close up, cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck
        Steps: 25, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 2970153128, Size: 512x512, Model hash: d8691b4d16, Model: deliberate_v11, Clip skip: 2, ENSD: 31337
