const axios = require('axios');
const FormData = require('form-data');

const api_key = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/wan-2.6-t2v";

const reqBody = {
  "size": "1280*720",
  "prompt": "Humorous but premium mini-trailer: a rugged caveman explorer sparks \"evolution\" by grunting simple commands that instantly upgrade his world and morph his own form through the ages. Extreme photoreal 4K, cinematic lighting, subtle film grain, smooth camera. No subtitles, no UI, no watermark.\nShot 1 [0-3s] Macro close-up on the caveman's weathered face and hairy knuckles clutching a jagged stone axe in a misty prehistoric dawn. He grunts deeply: \"Better!\"\nShot 2 [3-6s] Hard cut: Bustling ancient forge at golden hour, sparks flying. The caveman, now in leather tunic with a bronze sword, hammers metal confidently. Camera dollies in as he bellows: \"Faster!\"\nShot 3 [6-10s] Hard cut: Steampunk factory amid rainy industrial night, gears whirring. He transforms into a suited inventor with goggles, cranking a massive machine that hums to life. Slow zoom on his evolving eyes as he commands: \"Smarter!\"\nShot 4 [10-15s] Hard cut: Futuristic AI lab bathed in neon glow, holographic interfaces pulsing. The caveman, sleek in neural-linked exosuit, interfaces with a glowing orb; his form subtly digitizes. He smiles knowingly: \"Evolved. What's next?\"",
  "duration": 5,
  "multi_shots": false,
  "negative_prompt": "low resolution, error, worst quality, low quality, defects",
  "enable_prompt_expansion": true
};

(async function () {
  try {
    const formData = new FormData();
    // Append each request field; form-data expects string values,
    // so numbers and booleans are stringified before appending.
    for (const key in reqBody) {
      if (Object.prototype.hasOwnProperty.call(reqBody, key)) {
        formData.append(key, String(reqBody[key]));
      }
    }
    const response = await axios.post(url, formData, {
      headers: {
        'x-api-key': api_key,
        ...formData.getHeaders()
      }
    });
    console.log(response.data);
  } catch (error) {
    console.error('Error:', error.response ? error.response.data : error.message);
  }
})();

seed: Random seed for reproducible generation.
size: An enumeration. Allowed values: 1280*720, 720*1280, 1920*1080, 1080*1920.
audio: Audio file (wav/mp3, 3-30s, ≤15MB) for voice/music synchronization.
prompt: Text prompt for video generation.
duration: An enumeration of clip lengths in seconds; allowed values include 5 and 15.
multi_shots: Enable intelligent multi-shot segmentation (only active when enable_prompt_expansion is enabled). True enables multi-shot segmentation; false generates single-shot content.
negative_prompt: Negative prompt to avoid certain elements.
enable_prompt_expansion: If set to true, the prompt optimizer will be enabled.
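Because multi_shots is inert unless enable_prompt_expansion is also on, it is easy to send a request where the flag silently does nothing. A minimal sketch of a pre-flight check for that combination (the helper name and message are illustrative, not part of the API):

```javascript
// Sketch: multi_shots only takes effect when enable_prompt_expansion is
// enabled (per the parameter notes above). This helper flags the inert
// combination before the request is sent.
function checkShotSettings(body) {
  if (body.multi_shots && !body.enable_prompt_expansion) {
    return 'multi_shots is set but will be ignored: enable_prompt_expansion is false';
  }
  return null; // settings are consistent
}
```

Calling this on the request body before posting catches the mismatch locally instead of after spending credits on a single-shot result.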
To keep track of your credit usage, you can inspect the response headers of each API call. The x-remaining-credits property will indicate the number of remaining credits in your account. Ensure you monitor this value to avoid any disruptions in your API usage.
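The credit check can be folded into the response handling from the sample above. A small sketch that reads the x-remaining-credits header from an axios response; the warning threshold is an illustrative choice, not part of the API:

```javascript
// Sketch: read the x-remaining-credits header from an axios response.
// axios lowercases header names, so the key can be used directly.
function remainingCredits(response) {
  const raw = response.headers && response.headers['x-remaining-credits'];
  return raw === undefined ? null : Number(raw);
}

// Warn when credits drop below an arbitrary threshold (100 here).
function warnIfLow(response, threshold = 100) {
  const credits = remainingCredits(response);
  if (credits !== null && credits < threshold) {
    console.warn(`Low Segmind credits: ${credits} remaining`);
  }
  return credits;
}
```

Calling warnIfLow(response) right after the axios.post in the snippet above keeps usage visible on every request.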
Edited by Segmind Team on December 18, 2025.
Alibaba Wan 2.6 is a high-performance AI model with text-to-video capabilities that converts written prompts, images, audio, or reference clips into cinematic 1080p videos. Wan 2.6 can create multi-shot sequences up to 15 seconds long with precise character consistency and smooth narrative flow, with no traditional filming required. The model syncs audio with visuals and delivers realistic, studio-quality video, making it ideal for marketers, educators, product teams, and social media creators who need high-quality videos on fast turnarounds.
What sets Wan 2.6 apart from basic video creation models is its ability to maintain continuity even through multiple scenes, propelling complex storytelling with dynamic camera work and seamless transitions.
Writing effective prompts:
Parameter recommendations:
Keep enable_prompt_expansion on for richer visual detail and better scene interpretation.
Use multi_shots for narrative content that requires multiple perspectives or scene changes.
Set duration to 15 seconds when telling complex stories; use 5 seconds for simple product shots.
Choose 1920×1080 for YouTube and presentations; 1080×1920 for mobile-first platforms.
Leverage negative_prompt to exclude unwanted elements like "blurry, low quality, distorted faces."
For reproducible results, lock the seed value.
When iterating, adjust only one parameter at a time to understand its impact.
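The last two recommendations, locking the seed and varying one parameter at a time, can be combined in a small helper. A sketch under the assumption that the request body accepts a seed field as described in the parameter list above; the helper name and default seed are illustrative:

```javascript
// Sketch: build a series of request bodies that lock the seed and vary a
// single parameter per variant, so any output difference is attributable
// to that one change.
function buildVariants(baseBody, paramName, values, seed = 12345) {
  return values.map(value => ({
    ...baseBody,
    seed,               // identical seed across variants for comparability
    [paramName]: value, // the one parameter under test
  }));
}

// Example: compare durations while everything else stays fixed.
const variants = buildVariants(
  { prompt: 'A caveman sparks evolution', size: '1280*720' },
  'duration',
  [5, 15]
);
```

Each variant can then be posted with the same axios call shown earlier; comparing the outputs isolates the effect of the varied parameter.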
Is Alibaba Wan 2.6 open-source?
No, Wan 2.6 is a proprietary model developed by Alibaba. It is accessible through API integrations and platforms like Segmind.
How does Wan 2.6 compare to other text-to-video models?
Wan 2.6 stands out with native audio synchronization, realistic lip-sync, and superior multi-shot composition compared to single-scene generators. Its character continuity across shots rivals models used in professional production pipelines.
What audio formats does it support?
The model accepts WAV and MP3 files between 3 and 30 seconds. It supports frame-by-frame audio synchronization for accurate lip movement and timing.
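These constraints can be checked locally before uploading. A minimal sketch that validates a file against the documented limits (wav/mp3, 3-30 seconds, at most 15 MB); the duration is passed in by the caller, since measuring it would require decoding the audio:

```javascript
// Sketch: pre-validate an audio file against the documented constraints
// before uploading. Returns a list of violations; empty means the file passes.
function validateAudio({ filename, bytes, durationSeconds }) {
  const errors = [];
  if (!/\.(wav|mp3)$/i.test(filename)) errors.push('format must be wav or mp3');
  if (durationSeconds < 3 || durationSeconds > 30) errors.push('duration must be 3-30 seconds');
  if (bytes > 15 * 1024 * 1024) errors.push('file must be 15MB or smaller');
  return errors;
}
```

Rejecting an invalid file client-side avoids a round trip and a failed API call.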
Can I control the video aspect ratio?
Yes, you can choose from four presets: 1280×720, 720×1280 (vertical), 1920×1080 (landscape), or 1080×1920 (portrait) based on the specific platform.
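A small lookup keeps the four presets in one place so the size string is never typed by hand. The preset key names below are illustrative labels, not part of the API; only the size strings come from the documentation:

```javascript
// Sketch: map an orientation label to one of the four documented size presets.
const SIZE_PRESETS = {
  landscape_hd:  '1280*720',
  portrait_hd:   '720*1280',
  landscape_fhd: '1920*1080', // YouTube, presentations
  portrait_fhd:  '1080*1920', // mobile-first platforms
};

function sizeFor(key) {
  const size = SIZE_PRESETS[key];
  if (!size) throw new Error(`Unknown size preset: ${key}`);
  return size;
}
```

A typo in the preset key then fails loudly at build time instead of producing a rejected request.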
What parameters should I tweak for the best results?
Start with the parameter recommendations above: keep enable_prompt_expansion on, pick duration and size to match your platform, use negative_prompt to exclude artifacts, and lock the seed when you need reproducible output.
Does the seed parameter affect video content?
Yes, using the same seed with identical parameters produces consistent outputs; it is essential for A/B testing prompts or to maintain brand consistency across video variants.