Installation¶
This guide will help you install Smart-Diffusion on your system.
Prerequisites¶
Before installing Smart-Diffusion, ensure you have:
- Python: Version 3.12 or higher
- CUDA: Version 12.4 or higher (12.8 recommended)
- GPU: NVIDIA GPU with compute capability:
    - 8.0+ (Ampere: A100, A10, etc.)
    - 9.0+ (Hopper: H100, H20, etc.)
    - 10.0+ (Blackwell: B100, B200, etc.; consumer Blackwell such as the RTX 5090 is 12.0)
Installation Methods¶
Method 1: Using uv (Recommended)¶
uv is a fast Python package manager that simplifies the installation process.
Step 1: Clone the Repository¶
```shell
git clone git@github.com:chen-yy20/SmartDiffusion.git
cd SmartDiffusion
git submodule update --init --recursive
```
Step 2: Install uv¶
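The install command itself did not survive here; uv's standalone installer (from the uv documentation) is:

```shell
# Install uv via the official standalone installer (Linux/macOS)
curl -LsSf https://astral.sh/uv/install.sh | sh
```

uv can also be installed with `pip install uv` if you prefer to stay inside pip.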
For more installation options, see the uv documentation.
Step 3: Configure CUDA Version¶
Check your CUDA version:
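The command was lost in this copy; `nvcc` ships with the CUDA toolkit, and `nvidia-smi` reports the driver side:

```shell
# CUDA toolkit version used for compilation
nvcc --version
# Driver-side CUDA version (may differ from the toolkit)
nvidia-smi
```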
Edit pyproject.toml to match your CUDA version. For example, for CUDA 12.8:
```toml
[[tool.uv.index]]
name = "pytorch-cu128"
url = "https://download.pytorch.org/whl/cu128"
explicit = true

[tool.uv.sources]
torch = { index = "pytorch-cu128" }
torchvision = { index = "pytorch-cu128" }
```
Available CUDA versions: cu124, cu126, cu128, cu130
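The index tag is simply `cu` followed by the major and minor CUDA version with the dot removed. A tiny hypothetical helper (not part of Smart-Diffusion) illustrates the mapping:

```python
def cuda_index_tag(cuda_version: str) -> str:
    """Map a CUDA version string like "12.8" to a PyTorch wheel
    index tag like "cu128". Illustration only; not part of the project."""
    major, minor = cuda_version.split(".")[:2]
    return f"cu{major}{minor}"

print(cuda_index_tag("12.8"))  # cu128
```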
Step 4: Configure GPU Architecture¶
Set the TORCH_CUDA_ARCH_LIST in pyproject.toml according to your GPU:
```toml
[tool.uv.extra-build-variables.sageattention]
EXT_PARALLEL = "4"
NVCC_APPEND_FLAGS = "--threads 8"
MAX_JOBS = "32"
TORCH_CUDA_ARCH_LIST = "8.0;9.0"  # Adjust based on your GPU

[tool.uv.extra-build-variables.spas_sage_attn]
EXT_PARALLEL = "4"
NVCC_APPEND_FLAGS = "--threads 8"
MAX_JOBS = "32"
TORCH_CUDA_ARCH_LIST = "8.0;9.0"  # Adjust based on your GPU
```

(TOML does not allow inline tables to span multiple lines, so these entries are written as sub-tables.)
Common architectures:
- Ampere (A100, A10): 8.0
- Hopper (H100, H20): 9.0
- Blackwell (B100, B200): 10.0
- Blackwell (RTX 5090): 12.0
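On a machine with PyTorch installed, `torch.cuda.get_device_capability(i)` returns the `(major, minor)` capability tuple for device `i`. A small hypothetical helper can turn such tuples into the semicolon-separated format `TORCH_CUDA_ARCH_LIST` expects:

```python
def arch_list(capabilities):
    """Format (major, minor) compute-capability tuples as a
    TORCH_CUDA_ARCH_LIST value, e.g. [(8, 0), (9, 0)] -> "8.0;9.0".
    Hypothetical helper; pass torch.cuda.get_device_capability(i)
    for each visible device i."""
    # Sort numerically as tuples so 10.0 lands after 9.0
    return ";".join(f"{major}.{minor}" for major, minor in sorted(set(capabilities)))

print(arch_list([(9, 0), (8, 0), (8, 0)]))  # 8.0;9.0
```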
Step 5: Install Dependencies¶
Basic Installation (FlashAttention only):
Full Installation (with SageAttention and SpargeAttention):
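The exact commands were not preserved above. With uv, installation is normally a sync of the lockfile; the extra name below is an assumption, so check `[project.optional-dependencies]` in `pyproject.toml` for the extras the project actually defines:

```shell
# Basic installation (FlashAttention only)
uv sync

# Full installation -- the extra name "full" is an assumption;
# substitute the extras listed in pyproject.toml
uv sync --extra full
```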
Note: Full installation takes approximately 10 minutes on a machine with 32 cores and 256 GB of RAM.
Method 2: Using pip¶
Step 1: Clone the Repository¶
```shell
git clone git@github.com:chen-yy20/SmartDiffusion.git
cd SmartDiffusion
git submodule update --init --recursive
```
Step 2: Install Dependencies¶
Step 3: Install Smart-Diffusion¶
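The pip commands did not survive in this copy. A typical flow, assuming the repository ships a `requirements.txt` and a standard `pyproject.toml` build (both assumptions worth checking against the repo):

```shell
# Step 2: install dependencies (assumes a requirements.txt at the repo root)
pip install -r requirements.txt

# Step 3: install Smart-Diffusion itself, editable for development
pip install -e .
```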
Method 3: Docker (Coming Soon)¶
Docker support is planned for future releases.
Installing Flash Attention¶
Flash Attention can be installed via prebuilt wheels:

1. Visit the Flash Attention Releases page
2. Download the wheel matching your Python version and CUDA version
3. Install it:

```shell
pip install flash_attn-*.whl
```
Or build from source (requires ~10-15 minutes):
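The source-build invocation below is the one documented by the flash-attn project itself:

```shell
# Build Flash Attention from source (can take 10-15 minutes)
pip install flash-attn --no-build-isolation

# On machines with limited RAM, cap parallel compile jobs:
MAX_JOBS=4 pip install flash-attn --no-build-isolation
```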
Verifying Installation¶
After installation, verify that Smart-Diffusion is correctly installed:
Check available attention backends:
```python
from chitu_diffusion.modules.attention.diffusion_attn_backend import DiffusionAttnBackend

# This will list available attention types
backend = DiffusionAttnBackend("auto")
```
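Before exercising the backend, a quick check with standard PyTorch APIs confirms that the expected torch build is installed and that CUDA is visible:

```python
import torch

# Torch build, the CUDA version it was compiled against,
# and whether a GPU is visible at runtime
print(torch.__version__)
print(torch.version.cuda)
print(torch.cuda.is_available())
```

If `torch.version.cuda` disagrees with `nvcc --version`, revisit the index configuration in `pyproject.toml`.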
Troubleshooting¶
CUDA Version Mismatch¶
If you encounter CUDA version mismatches:
1. Check your installed CUDA version: `nvcc --version`
2. Update `pyproject.toml` to match your CUDA version
3. Reinstall: `uv sync -v --reinstall`
Out of Memory During Build¶
If compilation fails due to memory:
1. Reduce `MAX_JOBS` in `pyproject.toml`
2. Retry the build
Symbol Linking Issues with Flash Attention¶
If you encounter symbol linking issues (e.g. undefined-symbol errors) with Flash Attention, uncomment the source build option in pyproject.toml:

```toml
[tool.uv.extra-build-variables.flash_attn]
FLASH_ATTN_CUDA_ARCHS = "80"
FLASH_ATTENTION_FORCE_BUILD = "TRUE"
```
Next Steps¶
Once installation is complete, proceed to:
- Quick Start Guide - Run your first generation
- Configuration Guide - Learn about configuration options
- Basic Usage - Explore the API
Getting Help¶
If you encounter issues:
- Check the FAQ
- Search existing issues
- Open a new issue with:
    - Your system configuration
    - Installation method used
    - Error messages and logs