Configuration Guide¶
This guide explains all configuration options available in Smart-Diffusion.
Configuration Levels¶
Smart-Diffusion uses a three-tier configuration system:
1. Model Parameters (Static)¶
Location: chitu_core/config/models/<model>.yaml
Purpose: Define model architecture
Can be changed: No (tied to checkpoint weights)
Examples:

- Number of layers
- Hidden dimensions
- Attention heads
- Model-specific hyperparameters
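A hypothetical sketch of what such a file might look like (the field names and values here are illustrative only, not the actual schema):

```yaml
# chitu_core/config/models/<model>.yaml -- illustrative sketch only;
# the real schema is tied to the checkpoint weights
num_layers: 40      # number of transformer layers
hidden_dim: 5120    # hidden dimension
num_heads: 40       # attention heads
```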
2. User Parameters (Dynamic)¶
Location: DiffusionUserParams class
Purpose: Control per-request generation
Can be changed: Yes (for each generation)
Key Parameters:
```python
DiffusionUserParams(
    # Basic
    role="user1",                # User identifier
    prompt="A cat on grass",     # Text prompt

    # Video properties
    height=480,                  # Video height in pixels
    width=848,                   # Video width in pixels
    num_frames=81,               # Number of frames
    fps=24,                      # Frames per second

    # Generation quality
    num_inference_steps=50,      # Denoising steps (30-100)
    guidance_scale=7.0,          # CFG scale (5.0-15.0)

    # Advanced
    seed=None,                   # Random seed (None = random)
    save_path=None,              # Output path (None = auto)
    flexcache=None,              # Legacy cache strategy field
    flexcache_params=FlexCacheParams(
        strategy="teacache",     # 'teacache' / 'pab' / 'ditango'
        cache_ratio=0.4,         # 0 = quality-first, 1 = speed-first
        warmup=5,                # First 5 steps run full compute
        cooldown=5,              # Last 5 steps run full compute
    ),
)
```
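As a quick cross-check of the video properties above, clip duration follows directly from frame count and frame rate:

```python
# Clip duration implied by the example parameters above
num_frames = 81
fps = 24
duration_s = num_frames / fps
print(f"{duration_s:.3f} s")  # 3.375 s
```

So the default 81 frames at 24 fps yield a clip of roughly 3.4 seconds.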
3. System Parameters (Semi-static)¶
Location: Launch arguments (command line or config files)
Purpose: Configure system behavior
Can be changed: Only at initialization
Categories:

- Model selection
- Parallelism configuration
- Memory management
- Attention backends
- Evaluation settings
Recommended system_config.yaml Template¶
```yaml
launch:
  tag: my-exp
  num_nodes: 1
  gpus_per_node: 4
  python_script: test/test_generate.py
  enable_launch_log: false
parallel:
  cfp: 1  # only 1 or 2
infer:
  attn_type: flash_attn
  low_mem_level: 0
  enable_flexcache: true
  up_limit: 81
output:
  root_dir: outputs
  enable_run_log: true
  enable_timer_dump: true
  hydra_dump_mode: video_dir  # default / video_dir / off
```
Launcher behavior tied to this file:
- launch.tag is exported as CHITU_RUN_TAG and prefixes run output directory names.
- parallel.cfp maps to infer.diffusion.cfg_size.
- infer.diffusion.cp_size is auto-derived as (num_nodes * gpus_per_node) / cfp.
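The auto-derivation works out as follows for the template above (1 node, 4 GPUs, `cfp: 1`):

```python
# cp_size auto-derivation, as described above
num_nodes = 1
gpus_per_node = 4
cfp = 1  # parallel.cfp, which maps to infer.diffusion.cfg_size

world_size = num_nodes * gpus_per_node
cp_size = world_size // cfp
print(cp_size)  # 4
```

With `cfp: 2` on the same hardware, the derived `cp_size` would instead be 2.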
System Configuration¶
Model Selection¶
```bash
# Specify model name
models.name="Wan2.1-T2V-14B"

# Specify checkpoint directory
models.ckpt_dir="/path/to/checkpoint"
```
Supported models:
- Wan2.1-T2V-1.3B
- Wan2.1-T2V-14B
- Wan2.2-T2V-A14B
Attention Backend¶
Options:
- flash_attn - Default FlashAttention (accurate, fast)
- sage - SageAttention (quantized, performance testing in progress)
- sparge - SpargeAttention (sparse, performance testing in progress)
- auto - Automatically select best available
Memory Management¶
Levels:

- 0: All models on GPU (default)
- 1: Enable VAE tiling
- 2: Offload T5 encoder to CPU
- 3+: Offload DiT model to CPU
Parallelism¶
Context Parallelism¶
```bash
# Split sequence across GPUs
infer.diffusion.cp_size=<num_gpus>
infer.diffusion.up_limit=<seq_length>
```
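For instance, pinning context parallelism to the 4-GPU, 81-frame setup used in the template earlier in this guide:

```bash
infer.diffusion.cp_size=4
infer.diffusion.up_limit=81
```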
CFG Parallelism¶
```bash
# Automatically enabled when world_size >= 2 and CFG is active
# Can be explicitly controlled:
infer.diffusion.cfg_size=<num_gpus>
```
Options:
- 1: No CFG parallelism
- 2: Split positive/negative prompts
When using run.sh, prefer setting parallel.cfp (or --cfp) instead of directly setting infer.diffusion.cfg_size.
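The split works because the two CFG branches are independent forward passes that are only combined at the end. A minimal sketch of the standard classifier-free guidance combination (the cross-rank gather step is omitted here):

```python
def apply_cfg(cond, uncond, guidance_scale):
    """Standard classifier-free guidance combination.

    With cfg_size=2, the positive (conditional) and negative
    (unconditional) predictions are produced on separate ranks;
    this combination runs after the results are brought together.
    """
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]

# guidance_scale pushes the output toward the conditional prediction
print(apply_cfg([2.0], [1.0], 7.0))  # [8.0]
```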
FlexCache¶
```bash
# Enable feature cache in system_config.yaml
infer.enable_flexcache=true

# Runtime Hydra override applied by launcher
infer.diffusion.enable_flexcache=true
```
Then set cache type in user parameters:
```python
from chitu_diffusion.task import DiffusionUserParams, FlexCacheParams

DiffusionUserParams(
    prompt="...",
    flexcache_params=FlexCacheParams(
        strategy='teacache',
        cache_ratio=0.4,
        warmup=5,
        cooldown=5,
    ),
)
```
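To make the warmup/cooldown semantics concrete, here is an illustrative sketch (a hypothetical helper, not part of the library) of which steps are guaranteed full compute under the settings above:

```python
def full_compute_steps(num_inference_steps, warmup, cooldown):
    # Illustration only: the first `warmup` and last `cooldown` denoising
    # steps always run full compute; caching only applies in between.
    return [
        i for i in range(num_inference_steps)
        if i < warmup or i >= num_inference_steps - cooldown
    ]

# With num_inference_steps=50, warmup=5, cooldown=5:
print(full_compute_steps(50, 5, 5))  # [0, 1, 2, 3, 4, 45, 46, 47, 48, 49]
```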
Legacy style is still supported:
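The legacy form passes the strategy name directly as a string via the `flexcache` field, as also used in the "Fast, Lower Quality" pattern later in this guide:

```python
from chitu_diffusion.task import DiffusionUserParams

DiffusionUserParams(
    prompt="...",
    flexcache='teacache',  # legacy string field, instead of flexcache_params
)
```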
Evaluation¶
```bash
# Enable automatic evaluation (multi-select)
eval.eval_type=[vbench,fid,psnr]
eval.reference_path=/path/to/reference_videos
```
Options:
- []/null - No evaluation (default)
- vbench - VBench custom-mode evaluation
- fid - Frechet Inception Distance (needs eval.reference_path)
- fvd - Frechet Video Distance (needs eval.reference_path)
- psnr - Peak Signal-to-Noise Ratio (needs eval.reference_path)
- ssim - Structural Similarity (needs eval.reference_path)
- lpips - Perceptual similarity LPIPS (needs eval.reference_path)
If eval.reference_path is missing or invalid, reference-based metrics are skipped with a warning, while the other selected metrics still run.
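As a reference point for the pixel-level metrics, PSNR is derived from the mean squared error against the reference video. A minimal pure-Python sketch for a single flat 8-bit frame:

```python
import math

def psnr(frame_a, frame_b, max_val=255.0):
    """Peak Signal-to-Noise Ratio between two flat 8-bit pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(frame_a, frame_b)) / len(frame_a)
    if mse == 0:
        return float("inf")  # identical frames
    return 10 * math.log10(max_val ** 2 / mse)

# Two 4-pixel "frames" differing by 10 everywhere: MSE = 100
print(round(psnr([0, 50, 100, 150], [10, 60, 110, 160]), 2))  # 28.13
```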
Other Settings¶
```bash
# Random seed
infer.seed=42

# Precision
float_16bit_variant="bfloat16"  # or "float16"

# Output directory
output_dir="./outputs"

# Logging level
logging_level="INFO"  # or "DEBUG"
```
Output and Runtime Metadata¶
`output.hydra_dump_mode` controls where Hydra metadata ends up:

- `default`: keep Hydra metadata in the Hydra runtime output directory
- `video_dir`: relocate `.hydra` into each run's video directory
- `off`: remove the Hydra metadata directory after the run
- `output.enable_timer_dump=true` writes `time_stats.csv` into the run output directory.
- `launch.enable_launch_log=true` writes launch stdout/stderr to `output.root_dir/launch_<timestamp>.log`.
Configuration Files¶
Using Hydra¶
Smart-Diffusion uses Hydra for configuration management.
Default config: config/wan.yaml
Override from command line:
```bash
python test_generate.py \
    models.name=Wan2.1-T2V-14B \
    models.ckpt_dir=/path/to/checkpoint \
    infer.attn_type=sage \
    infer.diffusion.low_mem_level=2
```
Create custom config:
```yaml
# config/my_config.yaml
models:
  name: Wan2.1-T2V-14B
  ckpt_dir: /path/to/checkpoint
infer:
  attn_type: sage
  seed: 42
  diffusion:
    low_mem_level: 2
    cp_size: 1
    enable_flexcache: false
output_dir: ./my_outputs
```
Use with `python test_generate.py --config-name my_config` (Hydra selects the config file by name).
Environment Variables¶
Smart-Diffusion respects several environment variables:
```bash
# Enable debug mode
export CHITU_DEBUG=1

# Select CUDA devices
export CUDA_VISIBLE_DEVICES=0,1

# Distributed execution
export MASTER_ADDR=localhost
export MASTER_PORT=29500
export WORLD_SIZE=2
export RANK=0
export LOCAL_RANK=0
```
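Inside Python, such flags are typically read from the process environment. A minimal sketch, assuming the `CHITU_DEBUG=1` convention shown above:

```python
import os

def env_flag(name: str, default: str = "0") -> bool:
    """True if the environment variable is set to '1'."""
    return os.environ.get(name, default) == "1"

# e.g. enable verbose logging when CHITU_DEBUG=1
if env_flag("CHITU_DEBUG"):
    print("debug mode on")
```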
Common Configuration Patterns¶
High Quality, Slow¶
```bash
python test_generate.py \
    models.name=Wan2.1-T2V-14B \
    infer.attn_type=flash_attn \
    infer.diffusion.low_mem_level=0
```

```python
DiffusionUserParams(
    prompt="...",
    height=720,
    width=1280,
    num_frames=121,
    num_inference_steps=100,
    guidance_scale=9.0,
)
```
Fast, Lower Quality¶
```bash
python test_generate.py \
    models.name=Wan2.1-T2V-1.3B \
    infer.attn_type=sparge \
    infer.diffusion.low_mem_level=1
```

```python
DiffusionUserParams(
    prompt="...",
    height=480,
    width=848,
    num_frames=61,
    num_inference_steps=30,
    guidance_scale=7.0,
    flexcache='teacache',
)
```
Low Memory¶
```bash
python test_generate.py \
    models.name=Wan2.1-T2V-14B \
    infer.attn_type=sage \
    infer.diffusion.low_mem_level=3
```
Configuration Best Practices¶
- Start with defaults: Test with default settings first
- Adjust incrementally: Change one parameter at a time
- Monitor resources: Watch GPU memory and utilization
- Profile performance: Measure impact of each change
- Document your settings: Keep track of what works
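The last practice ("document your settings") can be automated by dumping the effective parameters next to each output. A minimal sketch; the file name and fields here are illustrative, not something the library does for you:

```python
import json
from pathlib import Path

def record_settings(out_dir: str, params: dict) -> Path:
    """Write the generation parameters next to the run's outputs."""
    path = Path(out_dir) / "run_settings.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(params, indent=2, sort_keys=True))
    return path

record_settings("outputs/my-exp", {
    "num_inference_steps": 50,
    "guidance_scale": 7.0,
    "seed": 42,
})
```

Comparing these JSON files across runs makes it easy to see which single parameter changed between two results.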