Wan 2.1 is an open-source suite of video foundation models. It combines a 3D causal VAE with a diffusion transformer and is designed to run on consumer-grade GPUs. The models support both text-to-video and image-to-video generation, and Wan 2.1 is presented as the first video model able to render legible English and Chinese text inside generated videos.

Features:

Outstanding Performance: Reported to outperform both open-source models and commercial alternatives on benchmarks.
Consumer GPU Compatibility: The 1.3B model requires only 8.19GB of VRAM, so it runs on most consumer-grade GPUs, including the RTX 4090.
Versatile Functionality: Capable of performing tasks such as Text-to-Video and Image-to-Video, among others.
Visual Text Generation: The first video model able to generate both English and Chinese text within videos.
Advanced Video VAE: Encodes and decodes 1080P videos of any length while preserving temporal information.
Multi-Resolution Capability: Produces high-quality videos in both 480P and 720P resolutions.
Open-Source Licensing: Available under the Apache 2.0 license, ensuring clear usage rights and robust community backing.
Resource Efficiency: The 1.3B model generates a 5-second 480P video in about 4 minutes on an RTX 4090.

Uses:

Generate videos from written prompts
Transform static images into engaging video presentations
Explore various video styles in an interactive environment
Produce multilingual videos that include text in both English and Chinese
Quickly prototype AI initiatives that focus on video content

FAQ:

Q: What makes Wan 2.1 different from other video AI models? A: Wan 2.1 combines strong benchmark performance with the ability to run on consumer-grade GPUs. The 1.3B model requires only 8.19GB of VRAM, and the suite is reported to outperform both open-source and commercial competitors.
Q: Which video resolutions are supported by Wan 2.1? A: Wan 2.1 can generate videos in 480P and 720P. The 14B model supports both resolutions, while the optimized 1.3B model is specifically tailored for 480P resolution.
Q: Is Wan 2.1 suitable for professional use? A: Yes. The 14B model targets the highest output quality, and for smaller projects or limited hardware, the 1.3B version provides a more accessible solution.
Q: What is distinctive about the architecture of Wan 2.1? A: Wan 2.1 pairs a 3D causal VAE with a diffusion transformer, a design intended to make video generation more efficient.
Q: Can Wan 2.1 handle multiple languages? A: Yes. Wan 2.1 is presented as the first video model capable of generating videos that contain rendered text in both Chinese and English.
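
The resolution support described in the FAQ can be summarized as a small lookup. The following is only an illustrative Python sketch: the table name, the variant labels, and the helper function are hypothetical and are not part of any official Wan 2.1 API. It encodes only what the FAQ states, namely that the 14B model supports 480P and 720P and the 1.3B model is tailored for 480P.

```python
# Hypothetical lookup table built from the FAQ above; not an official Wan 2.1 API.
SUPPORTED_RESOLUTIONS = {
    "14B": ("480P", "720P"),   # 14B model: both resolutions
    "1.3B": ("480P",),         # 1.3B model: tailored for 480P
}


def variants_for(resolution: str) -> list[str]:
    """Return the model variants the FAQ lists as supporting `resolution`."""
    wanted = resolution.strip().upper()
    return [name for name, res in SUPPORTED_RESOLUTIONS.items() if wanted in res]


if __name__ == "__main__":
    print(variants_for("720p"))  # ['14B']
    print(variants_for("480P"))  # ['14B', '1.3B']
```

A lookup like this makes the trade-off explicit: a 720P target narrows the choice to the 14B model, while a 480P target leaves both variants available.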