
Unlock Your Creative Potential with ComfyUI on Seeweb: A Beginner’s Guide

Unlock your creative potential with ComfyUI on Seeweb, the modular, node-based interface for Stable Diffusion that redefines AI image and video generation.

Welcome, Seeweb users! Are you ready to dive into the exciting world of AI-driven image generation with unparalleled control? ComfyUI is a powerful, node-based graphical user interface for Stable Diffusion that transforms how you create images. Forget rigid menus; ComfyUI offers a visual, modular environment where you design and execute advanced image generation pipelines.

With ComfyUI, you’re not just generating images; you’re orchestrating the entire process. Its key advantages are:

  • Transparency and Control: You see every step of the generation process. This allows for precise control and a deeper understanding of how your final image is created.
  • Modularity and Flexibility: Each operation is a self-contained node. You can connect them in any way you imagine, creating simple or incredibly complex workflows that are impossible with other interfaces.
  • Efficiency: ComfyUI only re-executes the parts of the workflow that have changed, saving you time and computational resources on complex generations.
  • Shareability: The entire workflow can be saved as a single JSON file or embedded within a generated PNG image, making it incredibly easy to share your exact process with others.

Linux Installation Guide

Run the installation commands: The following series of commands will install the necessary tools, create a virtual environment, and start the application.

# 1. Install Virtualenv
apt-get install virtualenv
virtualenv venv

# 2. Activate Virtualenv
source venv/bin/activate

# 3. Install the ComfyUI CLI, then ComfyUI itself
pip install comfy-cli
comfy install

cd ComfyUI

# 4. Start ComfyUI (from within the ComfyUI directory)
python main.py --port 8188

You can now access the interface in your browser, usually at http://127.0.0.1:8188.
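Besides the web UI, the same server accepts workflows over HTTP via the `/prompt` endpoint the UI itself uses. A minimal sketch, assuming the default port above; `build_prompt_request` and the `client_id` value are illustrative names of our own:

```python
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188"  # adjust if you started main.py with another port

def build_prompt_request(workflow: dict, client_id: str = "seeweb-demo") -> urllib.request.Request:
    """Wrap an API-format workflow dict into a POST /prompt request."""
    payload = json.dumps({"prompt": workflow, "client_id": client_id}).encode("utf-8")
    return urllib.request.Request(
        f"{COMFY_URL}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

# Usage (with the server from the step above running):
# with urllib.request.urlopen(build_prompt_request(my_workflow)) as resp:
#     print(json.load(resp))  # the server replies with a prompt_id
```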

How to Update ComfyUI

To update your manual installation:

  1. Pull the latest code:
cd /path/to/your/ComfyUI
git pull

2. Install/Update dependencies: Make sure the venv virtual environment you created earlier is active.

pip install -r requirements.txt

The Graph/Nodes Interface

When you first open the link, you’ll see the ComfyUI interface. It’s a graph where you connect different blocks, called nodes, together.

  • Nodes: These are the rectangular blocks that perform specific tasks (e.g., loading a model, writing a prompt). They have inputs on the left and outputs on the right.
  • Connections (Edges): These are the wires that connect the nodes, allowing data to flow from one node’s output to another’s input.
  • Workflow (Graph): The entire arrangement of nodes and connections is your workflow. It represents a complete process for generating an image.

Your First Image Generation: Text-to-Image

The easiest way to get started is by using a pre-built template, which provides a solid foundation for understanding the fundamental flow of data.

Starting with a Template

ComfyUI comes with several workflow templates that are perfect for getting started. You can browse them by clicking the folder icon 📁 on the right-hand menu. For your first generation, however, the easiest way is to simply click the Load Default button to load a basic, fully functional text-to-image pipeline.


Now, let’s break down this default workflow node by node to understand what each part does in greater detail.

The Workflow, Step-by-Step

1. Load Checkpoint (The Model)

This is the starting point. This node loads the main Stable Diffusion model (the checkpoint), which is a large file containing all the “knowledge” the AI has learned from training on billions of images. It’s the core engine of your image generation.

  • Action: Click the ckpt_name field and select a model like v1-5-pruned.safetensors. Different models have different styles and specialties.
  • Outputs: This node provides the MODEL, CLIP, and VAE.


2. CLIP Text Encode (The Prompts)

The model needs instructions. These nodes take your text and, using the CLIP model, convert it into “conditioning,” a numerical representation (embeddings) that guides the AI.

  • Positive Prompt: (Green node) This is your creative direction. A majestic lion with a fiery mane, photorealistic, 4k, cinematic lighting.
  • Negative Prompt: (Red node) This tells the AI what to avoid. cartoon, drawing, deformed, bad anatomy, blurry, text, watermark.


3. Empty Latent Image (The Canvas)

This node creates a blank canvas of random noise in the “latent space,” which will be the starting point for the generation.

  • Action: Set the width and height. For SD 1.5 models such as v1-5-pruned, 512×512 is the native resolution; for SDXL models, 1024×1024 is a good start. batch_size lets you create multiple images in a single run.


4. KSampler (The Generator)

This is the heart of the operation. The KSampler iteratively “denoises” the latent image, refining the chaos into a coherent picture that matches your vision.

  • seed: This number initializes the random noise. Keeping the seed identical (along with all other settings) allows you to reproduce an image exactly. To get a new variation of your prompt, simply change the seed.
  • steps: The number of denoising steps. A good range is 20-30.
  • cfg: How strictly the AI adheres to your prompt. A value of 7-8 is a good balance.
  • sampler_name: The algorithm used for denoising. euler is fast, while dpmpp_2m_sde is popular for quality.
  • scheduler: Controls the rate of denoising at each step. karras is a popular choice.
  • denoise: A value of 1.0 is standard for text-to-image.
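To see why an identical seed reproduces an image exactly, here is a deliberately toy Python sketch (not ComfyUI's actual sampler, just the determinism idea behind the seed setting):

```python
import random

def toy_denoise(seed: int, steps: int = 20):
    """Toy sketch only: same seed + same settings -> byte-identical result."""
    rng = random.Random(seed)                      # the seed fixes the starting noise
    latent = [rng.gauss(0.0, 1.0) for _ in range(4)]
    for _ in range(steps):                         # each step strips away some noise
        latent = [x * (1.0 - 1.0 / steps) for x in latent]
    return latent

assert toy_denoise(42) == toy_denoise(42)   # same seed: reproducible image
assert toy_denoise(42) != toy_denoise(43)   # new seed: a new variation
```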


5. VAE Decode (From Latent to Pixels)

This node decodes the latent representation back into a standard pixel-based image.


6. Save Image (The Final Output)

This node saves the image and displays a preview in the interface.


Queueing the Prompt

Click Queue Prompt on the menu. The nodes will execute, and your image will appear in the Save Image node.
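For reference, the same default graph can be written down in the API format that the queue sends to the server. This is a hedged sketch of that export; the node IDs, checkpoint name, and all values are illustrative:

```python
import json

# Illustrative sketch of the default text-to-image graph in ComfyUI's API format.
# Each edge is written as [source_node_id, output_index].
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "v1-5-pruned.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",   # positive prompt
          "inputs": {"text": "a majestic lion with a fiery mane", "clip": ["1", 1]}},
    "3": {"class_type": "CLIPTextEncode",   # negative prompt
          "inputs": {"text": "blurry, text, watermark", "clip": ["1", 1]}},
    "4": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 512, "height": 512, "batch_size": 1}},
    "5": {"class_type": "KSampler",
          "inputs": {"seed": 42, "steps": 20, "cfg": 7.5,
                     "sampler_name": "euler", "scheduler": "karras", "denoise": 1.0,
                     "model": ["1", 0], "positive": ["2", 0], "negative": ["3", 0],
                     "latent_image": ["4", 0]}},
    "6": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage",
          "inputs": {"images": ["6", 0], "filename_prefix": "ComfyUI"}},
}

payload = json.dumps({"prompt": workflow})  # body for a POST to the /prompt endpoint
```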


Upscaling Your Image (2-Pass Workflow)

To get higher-resolution images with more detail, you can generate an image at its native resolution and then upscale it in a second pass.


Second Pass: Upscaling and Refinement

  1. Load an Upscale Model: Add a Load Upscale Model node.


2. Upscale the Image: Add an ImageUpscaleWithModel node. Connect the IMAGE from your first pass here.


3. Refine with a Second KSampler: Encode the upscaled image back to latent and pass it to a new KSampler. Crucially, set the denoise value between 0.3 and 0.5 to add detail without changing the composition.

4. Decode and Save: Decode the result from the second sampler and save the final, high-resolution image.
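The only sampler setting that must differ between the two passes is denoise; a sketch with illustrative values (the variable names are hypothetical):

```python
# Hypothetical settings for the two KSampler nodes in a 2-pass upscale workflow.
first_pass = {"steps": 25, "cfg": 7.5, "denoise": 1.0}    # generate from pure noise
second_pass = {**first_pass, "denoise": 0.4}              # refine the upscaled latent

# A low second-pass denoise re-adds only a little noise, so the sampler sharpens
# detail instead of inventing a new composition.
assert 0.3 <= second_pass["denoise"] <= 0.5
```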

From Text to Video with Wan 2.1

ComfyUI isn’t just for static images. You can also generate short, animated video clips starting from a single image. The Wan 2.1 workflow is excellent for this.

The easiest way to get started is the built-in text-to-video template, which provides all the nodes you need.

Here’s a breakdown of the key components for a text-to-video setup:


  1. Load the Video Models

These nodes load everything needed for video generation: the diffusion model, the text encoder (CLIP), and the VAE.

  • Action:
    • For the Diffusion Model, load wan2.1_t2v_1.3B_fp16.safetensors.
    • For the CLIP Model, load umt5_xxl_fp8_e4m3fn_scaled.safetensors.
    • For the VAE Model, load wan_2.1_vae.safetensors.
  2. CLIP Text Encode (The Prompts)

Similar to image generation, these nodes convert your text prompts into numerical representations to guide the video generation. You can modify these if needed.

  • Positive Prompt: (Green node) Describe what you want to see in your video.
  • Negative Prompt: (Red node) Describe what you want to avoid in your video.
  3. Empty Latent Video (The Video Canvas)

This node creates a blank, noisy video canvas.

  • Action: (Optional) Adjust the width and height to your desired video dimensions.
  4. Wan2.1 Img2Vid (The Generator)

This is the core node that orchestrates the video generation.

  • Connect the inputs: Wire up the MODEL, the positive and negative conditioning from the CLIP Text Encode nodes, and the LATENT from the Empty Latent Video node.
  • Key Settings:
    • width/height: Set the resolution of your output video.
    • frames: The total number of frames to generate. 24 frames at 8 fps will give you a 3-second video.
    • frame_rate: Frames per second (fps) for the final video.
    • steps/cfg/sampler_name: These function just like in the KSampler, controlling the quality of each generated frame.
  5. Create the Video File: Video Combine

This node stitches the generated frames into a playable video file.

  • Connect the IMAGE output from the Wan2.1 Img2Vid node to the images input of this node.
  • Settings:
    • frame_rate: Should match the frame rate set in the previous node.
    • format: Choose your output format, such as video/mp4 or image/gif.
    • filename_prefix: Set the name for your saved video file.
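The frames and frame_rate settings together fix the clip length, which is worth sanity-checking before a long render:

```python
def video_duration(frames: int, frame_rate: int) -> float:
    """Clip length in seconds for a given frames / frame_rate pairing."""
    return frames / frame_rate

assert video_duration(24, 8) == 3.0    # the example from the settings above
assert video_duration(48, 16) == 3.0   # smoother motion, same 3-second length
```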

Once everything is connected and configured, click Queue Prompt on the menu or use the shortcut Ctrl (Cmd) + Enter to execute the video generation. ComfyUI will generate and combine the frames into your final video.

From Image to Video with Wan 2.1

Just as you can create images from text, you can also bring static images to life by turning them into short video clips. This process, often called “Image-to-Video,” uses your initial image as a starting point and animates it based on your text prompts. We’ll use the powerful Wan 2.1 workflow for this.

Image created with Wan 2.1

How to Load the Workflow

The easiest way is to use a pre-built template. Click the folder icon 📁 and look for a template named something like “Wan2.1 Image to Video”. This will set up the necessary nodes.


The Workflow, Step-by-Step

1. Load Your Starting Image

Instead of an Empty Latent Image, this workflow begins with a Load Image node. Use it to upload the image you want to animate.

2. Load the Necessary Models

This workflow is more complex and requires specific models:

  • Load Checkpoint: Loads the base Stable Diffusion model.
  • Load LoRA: This is crucial. Load the wan2.1_lora_1.1_fp16.safetensors LoRA. LoRAs are small models that modify the output of the main checkpoint, and this one is essential for the video effect.

3. CLIP Text Encode (Prompts for Motion)

Your prompts here will guide the animation. Describe the motion you want to see.

  • Positive Prompt: cinematic, a gentle breeze rustles the lion’s mane, slow motion.
  • Negative Prompt: static, frozen, still image, shaking.

4. Wan2.1 Img2Vid (The Animator)

This is the core node for this process. It takes your source image, the models, and your motion prompts and generates the animated frames.

5. Video Combine and Save

Just like in the text-to-video workflow, the Video Combine node assembles the frames into a video file, which is then saved and previewed.

6. Generate Your Video

Click Queue Prompt to start the process. The result will be a short video where your initial image is animated according to your instructions.

Advanced Techniques

Image-to-Image (img2img)

This technique, often called img2img, is perfect for transforming a sketch into a detailed image or applying a new style to an existing photo. Instead of starting from pure noise, you use an existing image as a guide.

  1. Add Load Image and VAE Encode nodes.
  2. Connect the encoded LATENT to your KSampler, replacing the Empty Latent Image node.
  3. Adjust Denoise: In the KSampler, lower the denoise value to 0.6-0.8.
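As a rough intuition only (a toy linear blend, not the real diffusion math), denoise decides how much of the source image survives the img2img pass:

```python
# Toy intuition: at denoise 0.0 the source comes back untouched; at 1.0 it is
# ignored entirely, as in plain text-to-image.
def blend(source: float, noise: float, denoise: float) -> float:
    return source * (1.0 - denoise) + noise * denoise

assert blend(10.0, 0.0, 0.0) == 10.0   # denoise 0.0: source returned untouched
assert blend(10.0, 0.0, 1.0) == 0.0    # denoise 1.0: source ignored entirely
```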


Inpainting

Regenerate a specific part of an image.

  1. Right-click a Load Image node, select Open in MaskEditor, and draw your mask.
  2. Use a VAE Encode (for Inpainting) node.
  3. Connect to KSampler. In your positive prompt, describe what you want in the masked area.


Exploring Other Features

  • ControlNet: For precise control over poses and composition.
  • Loading Workflows from Images: Drag and drop a PNG from ComfyUI onto the canvas to instantly load its workflow.
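Because the workflow travels inside the PNG's text chunks, you can even recover it with the standard library alone. A sketch assuming ComfyUI's usual `workflow` tEXt key; the demo PNG below is synthetic, built just to exercise the reader:

```python
import struct
import zlib

def png_text_chunks(data: bytes) -> dict:
    """Read tEXt chunks from PNG bytes (where ComfyUI stores the workflow JSON)."""
    assert data[:8] == b"\x89PNG\r\n\x1a\n", "not a PNG file"
    out, pos = {}, 8
    while pos < len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            key, _, value = body.partition(b"\x00")
            out[key.decode("latin-1")] = value.decode("latin-1")
        pos += 12 + length  # 4 length + 4 type + body + 4 CRC
    return out

def _chunk(ctype: bytes, body: bytes) -> bytes:
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", zlib.crc32(ctype + body))

# Build a minimal stand-in PNG with an embedded "workflow" entry, then read it back.
demo = (b"\x89PNG\r\n\x1a\n"
        + _chunk(b"tEXt", b"workflow\x00{\"nodes\": []}")
        + _chunk(b"IEND", b""))
print(png_text_chunks(demo))  # {'workflow': '{"nodes": []}'}
```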

Conclusion

ComfyUI, available to you via Seeweb, is a powerful framework for exploring AI art. Its node-based interface empowers you to take full control, moving beyond simple prompts to craft truly unique visuals. This tutorial is just the beginning. We encourage you to experiment, explore, and discover new workflows. Happy creating!
