Wan2.1 is a suite of video foundation models for text-to-video (T2V) generation.
SOTA Performance: Consistently outperforms existing open-source and commercial models across multiple benchmarks.
Powerful Video VAE: Wan-VAE delivers exceptional efficiency and performance, encoding and decoding 1080P videos of any length while preserving temporal information.
Architecture: Built on the mainstream diffusion transformer paradigm, with innovations such as a novel spatio-temporal variational autoencoder (VAE).
T2V-14B: Supports both 480P and 720P resolutions, and sets a new state of the art (SOTA) in performance.
T2V-1.3B: Supports 480P resolution. It can also generate videos at 720P, but results are more stable at 480P.
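The resolution guidance for the two T2V variants can be captured as a small lookup table. The following is a minimal sketch in plain Python, not part of any Wan2.1 code: the names `MODEL_RESOLUTIONS` and `check_resolution` and the `"stable"` / `"less_stable"` labels are hypothetical, chosen only to encode the statements above.

```python
# Illustrative only: encodes the resolution notes from the model list above.
# The dictionary layout and status labels are assumptions, not a Wan2.1 API.
MODEL_RESOLUTIONS = {
    "T2V-14B": {"stable": ("480P", "720P"), "less_stable": ()},
    "T2V-1.3B": {"stable": ("480P",), "less_stable": ("720P",)},
}


def check_resolution(model: str, resolution: str) -> str:
    """Return 'stable' or 'less_stable' for a model/resolution pair.

    Raises ValueError if the model is unknown or the resolution is not
    mentioned for that model.
    """
    try:
        spec = MODEL_RESOLUTIONS[model]
    except KeyError:
        raise ValueError(f"unknown model: {model!r}") from None

    # Look for the resolution under each status label in turn.
    for status, resolutions in spec.items():
        if resolution in resolutions:
            return status
    raise ValueError(f"{model} is not described as supporting {resolution}")


if __name__ == "__main__":
    for name in MODEL_RESOLUTIONS:
        for res in ("480P", "720P"):
            print(name, res, check_resolution(name, res))
```

A caller could use such a check to warn before requesting 720P output from the 1.3B model, rather than rejecting the request outright, since the model is described as capable of it.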
The models are released under the Apache 2.0 License: they may be used freely, provided the terms of the license are followed.
Extensive manual evaluations confirm that Wan2.1 outperforms both closed-source and open-source models.