Getting Started With ComfyUI
ComfyUI stands out as a remarkably efficient interface for running Stable Diffusion. Its node-based graphical user interface (GUI) allows far deeper customisation than other Stable Diffusion GUIs such as Automatic1111 or Fooocus. To make getting started easy, we have compiled a Windows stand-alone build of ComfyUI Portable, complete with pre-installed custom nodes and a diverse array of models. The package is compressed into a single downloadable zip file, so there are no dependencies to install. Simply unzip it at your desired location and you’re ready to go. Please note that the zip file is 43GB, so ensure you have sufficient disk space for both downloading and extracting the contents.
Automatic Download and Extraction
To download and extract ComfyUI automatically, navigate to the desired folder using Windows Explorer and enter ‘cmd’ in the address bar.
This action will open a command prompt window. Next, copy and paste the following command and press Enter:
powershell -Command "Invoke-WebRequest 'https://aipixel.pro/download/comfyui_install.bat' -OutFile 'comfyui_install.bat'; Start-Process 'comfyui_install.bat'"
This command downloads and runs a small batch file, which in turn fetches the zip using wget. Unlike a typical browser download, wget can resume an interrupted transfer, a common issue with files this large that would otherwise require a manual retry.
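If you would rather script the download yourself, the same resume behaviour can be reproduced in a few lines of Python. This is a minimal sketch, assuming the `requests` library is installed and the server honours HTTP Range requests; the URL is the direct zip link shown in the next section, while the chunk size and file name are illustrative choices, not part of the official installer.

```python
import os
import requests

URL = "https://aipixel.pro/download/ComfyUI_windows_portable.zip"
DEST = "ComfyUI_windows_portable.zip"

# Resume from wherever a previous attempt stopped by sending a Range header.
start = os.path.getsize(DEST) if os.path.exists(DEST) else 0
headers = {"Range": f"bytes={start}-"} if start else {}

with requests.get(URL, headers=headers, stream=True, timeout=30) as r:
    r.raise_for_status()
    # 206 means the server resumed the transfer; 200 means it restarted.
    mode = "ab" if r.status_code == 206 else "wb"
    with open(DEST, mode) as f:
        for chunk in r.iter_content(chunk_size=1 << 20):  # 1 MiB chunks
            f.write(chunk)
```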
Manual Download and Extraction
Should you prefer to manually download and extract the zip file, please follow this link: https://aipixel.pro/download/ComfyUI_windows_portable.zip
After extraction, the “ComfyUI_windows_portable” folder contains two batch files for running ComfyUI: “run_nvidia_gpu.bat” and “run_cpu.bat”. For good performance, an Nvidia GPU with at least 8GB of VRAM is recommended for SDXL, or 6GB for SD1.5. Running on a CPU is far slower: an SDXL image can take upwards of an hour to generate, versus around a minute on a GPU. SD1.5 models are the better choice for CPU use thanks to their smaller size and shorter generation times. Launching the appropriate batch file will eventually open the ComfyUI GUI in your default web browser.
Your First Text-to-Image Generation
Upon initial startup, the workflow is set to a basic default configuration, allowing you to generate your first AI image simply by entering a prompt and clicking “Queue Prompt”.
If your initial workflow appears blank or different from the default, click the “Load Default” button located on the right side of the screen.
To generate your first image, click the “Queue Prompt” button. The generated image appears in the ‘Save Image’ node on the right, and all generated images are saved in the “ComfyUI\output” folder.
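Under the hood, the “Queue Prompt” button posts the current workflow to a small HTTP API that ComfyUI serves locally. The sketch below queues a generation programmatically; it assumes ComfyUI is running on its default address (127.0.0.1:8188) and that you have exported your workflow in API format (the file name `workflow_api.json` is a placeholder).

```python
import json
import urllib.request

# Load a workflow previously exported from ComfyUI in API format.
with open("workflow_api.json") as f:
    workflow = json.load(f)

# POST it to the /prompt endpoint, as the GUI button does.
payload = json.dumps({"prompt": workflow}).encode("utf-8")
req = urllib.request.Request("http://127.0.0.1:8188/prompt", data=payload)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode("utf-8"))  # includes the queued prompt's ID
```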
Now, experiment by altering the positive text prompt to your preference. Your prompt can be as creative as you wish and may specify the style of the image, such as a photo, anime, drawing or painting. The positive prompt is entered in the top “CLIP Text Encode (Prompt)” node. Here is an illustrative prompt: “RAW photo of a king and queen walking down a muddy jungle path, sunlight, high detail, ultra-realistic”. This prompt is structured as follows (a short sketch after the list assembles the same parts programmatically):
- “RAW photo” indicates a desire for a photo-realistic image.
- “a king and queen walking down a muddy jungle path” describes the subjects and setting.
- “sunlight” specifies the type of lighting, which could also include the time of day or weather conditions.
- “high detail” and “ultra-realistic” are included to enhance the realism and detail of the image.
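To make that structure concrete, here is a toy sketch that assembles a positive prompt from the same labelled parts. It is purely illustrative; ComfyUI simply receives the final string in the CLIP Text Encode node.

```python
# Each part plays a distinct role; reorder or swap them to steer the image.
parts = [
    "RAW photo of",                                       # style
    "a king and queen walking down a muddy jungle path",  # subject and setting
    "sunlight",                                           # lighting
    "high detail, ultra-realistic",                       # quality boosters
]
positive_prompt = " ".join(parts[:2]) + ", " + ", ".join(parts[2:])
print(positive_prompt)
# RAW photo of a king and queen walking down a muddy jungle path,
# sunlight, high detail, ultra-realistic
```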
Bear in mind that Stable Diffusion may not always produce perfect results; it’s common to encounter images with anomalies, such as incorrect limb counts, particularly with SD1.5. SDXL models offer improvements but are not immune to such issues. Advanced techniques for correcting these will be covered in a future tutorial.
Understanding the Nodes
Here’s a brief overview of each node’s function to help you grasp the basics of the workflow:
Load Checkpoint Node
This node loads the selected model checkpoint and outputs three components: the model (used by the KSampler), CLIP (used to encode the text prompts) and the VAE (used to encode and decode images to and from latent space).
CLIP Text Encode (Prompt) Node
This node processes the text prompt with the CLIP input and outputs conditioning for the KSampler’s positive or negative prompts.
Empty Latent Image Node
Generates an empty latent image for the KSampler, which sets the resolution of the final image. Preferred resolutions vary between SD versions: SD1.5 models prefer 512×512, 768×512 or 512×768, depending on the aspect ratio required. SDXL is more sensitive to resolution and prefers 1024×1024, 1344×768 or 768×1344; there are more valid SDXL resolutions, and a custom node is available that provides the full list (a rough enumeration is sketched below).
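As a rough illustration of where those SDXL sizes come from, the sketch below enumerates resolutions whose sides are multiples of 64 and whose area stays near 1024×1024 (about one megapixel). The 5% tolerance is an illustrative assumption, not an official SDXL specification; the custom node mentioned above provides the curated list.

```python
# Enumerate landscape resolutions close to SDXL's native ~1 megapixel area.
TARGET = 1024 * 1024

for w in range(512, 2049, 64):
    for h in range(512, 2049, 64):
        if w >= h and abs(w * h - TARGET) / TARGET < 0.05:
            print(f"{w}x{h}  (aspect ratio {w / h:.2f})")
# Swap width and height for the portrait versions, e.g. 768x1344.
```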
KSampler Node
Utilises the Stable Diffusion model to create the image in its latent form, with a choice of samplers and schedulers controlling how the noise is removed. The euler sampler with the normal scheduler is the fastest; for higher quality, try dpmpp_2m with the karras scheduler.
VAE Decode Node
Converts the latent image into an RGB image for saving and viewing.
Save Image Node
Saves the final image to the “ComfyUI\output” folder and also serves as an image preview.
Saving Your Workflows in ComfyUI
Every image produced by ComfyUI inherently embeds the workflow used to generate it. This means you can effortlessly reload any workflow by simply dragging and dropping a previously generated image back into ComfyUI. Leveraging this feature, we will be incorporating specific workflows into our tutorials for your convenience.
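Because the workflow is stored as ordinary PNG text metadata, you can also inspect it outside of ComfyUI. Here is a minimal sketch using Pillow; the file path is just an example of ComfyUI’s default output naming.

```python
import json
from PIL import Image

img = Image.open("ComfyUI/output/ComfyUI_00001_.png")
workflow_json = img.info.get("workflow")  # stored as a PNG text chunk
if workflow_json:
    workflow = json.loads(workflow_json)
    print(f"Workflow contains {len(workflow.get('nodes', []))} nodes")
```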
Additionally, if you wish to save a workflow for future use or sharing, ComfyUI provides a straightforward option. Click the ‘Save’ button to export your workflow as a JSON file, which can then be easily imported or archived.