Model Families
Each family is a distinct transformer backbone with its own modality, default backend, and release status.
| Family | Modality | Default Backend | Status |
|---|---|---|---|
| flux | image | nunchaku | active |
| zimage | image | nunchaku | active |
| wan | video | torch | coming soon |
Backends
Three inference backends are available, each with different tradeoffs:
| Backend | Stack | Notes |
|---|---|---|
| nunchaku | NVFP4 on Blackwell | Fastest, 4-bit quantized |
| torch | diffusers + CUDA | Flexible, full precision |
| tensorrt | TRT-LLM + ModelOpt | NVIDIA-optimized, production |
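As a rough illustration, the family and backend tables above could be captured in a small lookup. The names `FAMILY_DEFAULT_BACKEND`, `BACKENDS`, and `resolve_backend` are hypothetical and not part of any documented API. This is only a sketch, assuming a caller may override a family's default backend.

```python
# Hypothetical lookup tables mirroring the Model Families and Backends tables.
FAMILY_DEFAULT_BACKEND = {
    "flux": "nunchaku",
    "zimage": "nunchaku",
    "wan": "torch",
}
BACKENDS = {"nunchaku", "torch", "tensorrt"}


def resolve_backend(family, override=None):
    """Return the backend for a family, honoring an optional explicit override."""
    if family not in FAMILY_DEFAULT_BACKEND:
        raise ValueError(f"unknown family: {family!r}")
    backend = override if override is not None else FAMILY_DEFAULT_BACKEND[family]
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    return backend
```

For example, `resolve_backend("flux")` falls back to the table default, while `resolve_backend("flux", "torch")` uses the override. The sketch does not check whether a given backend actually supports a given family, since the tables do not state that.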
Image Formats
| Format | Dimensions | Aspect |
|---|---|---|
| 1024 | 1024×1024 | 1:1 |
| 512 | 512×512 | 1:1 |
| portrait | 768×1024 | 3:4 |
| landscape | 1024×768 | 4:3 |
Video Formats
| Format | Dimensions | Aspect |
|---|---|---|
| 720p | 1280×720 | 16:9 |
| 480p | 832×480 | ~16:9 |
| square | 640×640 | 1:1 |
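The format names above map to fixed pixel dimensions, which can be sketched as a preset lookup. This assumes the first format table lists image presets and the second lists video presets. `IMAGE_FORMATS`, `VIDEO_FORMATS`, `resolve_format`, and `aspect_ratio` are hypothetical names used only for illustration.

```python
from fractions import Fraction

# Hypothetical preset tables copied from the format tables: name -> (width, height).
IMAGE_FORMATS = {
    "1024": (1024, 1024),
    "512": (512, 512),
    "portrait": (768, 1024),
    "landscape": (1024, 768),
}
VIDEO_FORMATS = {
    "720p": (1280, 720),
    "480p": (832, 480),
    "square": (640, 640),
}


def resolve_format(modality, name):
    """Look up (width, height) for a named preset within a modality."""
    presets = {"image": IMAGE_FORMATS, "video": VIDEO_FORMATS}.get(modality)
    if presets is None:
        raise ValueError(f"unknown modality: {modality!r}")
    if name not in presets:
        raise ValueError(f"unknown {modality} format: {name!r}")
    return presets[name]


def aspect_ratio(width, height):
    """Return the exact width:height ratio in lowest terms."""
    return Fraction(width, height)
```

Computing the exact ratio shows why 480p is listed as approximately 16:9: `aspect_ratio(832, 480)` reduces to 26/15 (about 1.73), whereas `aspect_ratio(1280, 720)` is exactly 16/9 (about 1.78).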
Tasks
| Task | Requires | Produces | Description |
|---|---|---|---|
| t2v | prompt | video | text to video |
| i2v | prompt + image | video | animate a still |
| t2i | prompt | image | text to image |
| i2i | prompt + image | image | transform / style transfer |
| edit | prompt + image + mask | image | inpaint / outpaint |
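The Requires column can be read as a per-task input contract. The following is a minimal sketch of checking a request against it. The `TASKS` mapping and the `missing_inputs` helper are hypothetical and only mirror the table above.

```python
# Hypothetical mapping mirroring the Tasks table: task -> (required inputs, output modality).
TASKS = {
    "t2v": ({"prompt"}, "video"),
    "i2v": ({"prompt", "image"}, "video"),
    "t2i": ({"prompt"}, "image"),
    "i2i": ({"prompt", "image"}, "image"),
    "edit": ({"prompt", "image", "mask"}, "image"),
}


def missing_inputs(task, provided):
    """Return the sorted names of required inputs that were not provided."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    required, _produces = TASKS[task]
    return sorted(required - set(provided))
```

For instance, `missing_inputs("edit", ["prompt", "image"])` returns `["mask"]`, and an empty list means the request supplies everything the task requires.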