POST

javascript

const axios = require('axios');
const FormData = require('form-data');


const api_key = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/wan-2.5-i2v";

const reqBody = {
  "seed": 42,
  "image": "https://segmind-resources.s3.amazonaws.com/output/21aeb463-bb17-4536-864b-0bd1e11594a9-EMXN1y8qTgoGdXBsb2FkEg55bGFiLXN0dW50LXNncBo0YWlfcG9ydGFsLzE3NTM5NjM5NTIvbjdwNDlYOURCbS8yZTk0X2wwXzAwMS0wXzAuanBlZw_1000x1000.webp",
  "prompt": "Kitten in a McDonald's uniform stands on a stool, grilling burger patties. It flips the patties with a spatula, watches them sizzle, and occasionally looks around while steam rises from the grill.",
  "duration": 5,
  "resolution": "720p",
  "negative_prompt": "unnecessary clutter, dark shadows",
  "enable_prompt_expansion": true
};

(async function() {
    try {
        const formData = new FormData();
        
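        // Append each field. Values are converted to strings first, since a raw
        // boolean (enable_prompt_expansion) is not a string, Buffer, or stream.
        for (const [key, value] of Object.entries(reqBody)) {
            formData.append(key, String(value));
        }

        // The image is passed as a URL here, so no Base64 conversion is needed.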
        
        const response = await axios.post(url, formData, {
            headers: {
                'x-api-key': api_key,
                ...formData.getHeaders()
            }
        });
        console.log(response.data);
    } catch (error) {
        console.error('Error:', error.response ? error.response.data : error.message);
    }
})();

RESPONSE

image/jpeg

HTTP Response Codes

200 - OK: Image generated

401 - Unauthorized: User authentication failed

404 - Not Found: The requested URL does not exist

405 - Method Not Allowed: The requested HTTP method is not allowed

406 - Not Acceptable: Not enough credits

500 - Server Error: Server had some issue with processing
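On the client, the codes above can be mapped to readable messages. The following is a minimal sketch: the helper name describeStatus is hypothetical, and the messages only restate the list above.

```javascript
// Messages restate the documented HTTP response codes; they are not
// strings returned by the API itself.
const STATUS_MESSAGES = {
  200: 'OK: image generated',
  401: 'Unauthorized: user authentication failed',
  404: 'Not Found: the requested URL does not exist',
  405: 'Method Not Allowed: the requested HTTP method is not allowed',
  406: 'Not Acceptable: not enough credits',
  500: 'Server Error: server had some issue with processing',
};

// describeStatus is a hypothetical helper name used for illustration.
function describeStatus(status) {
  if (Object.prototype.hasOwnProperty.call(STATUS_MESSAGES, status)) {
    return STATUS_MESSAGES[status];
  }
  return `Unexpected status ${status}`;
}
```

With axios, the status of a failed request is available as error.response.status, so the catch block of the request example could log describeStatus(error.response.status) when error.response is set.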

Attributes

seed: int (default: 42)

Sets a random seed for consistent outputs. Use values between 1 and 100 for variation.

audio: str (default: 1)

Upload an audio file for syncing. Use a song clip or melody for dynamic results.

image: str *

Input image for video generation. Choose a high-resolution image for best quality.

prompt: str *

Text description for video creation. Include vivid visuals for creative animations.

duration: enum:str (default: 1)

Sets the video length. Choose 5 seconds for shorter clips and 10 for longer scenes.

Allowed values: 5, 10

resolution: enum:str (default: 720p)

Set video quality. Use 1080p for high-quality renders and 480p for faster results.

Allowed values: 480p, 720p, 1080p

negative_prompt: str (default: unnecessary clutter, dark shadows)

Avoid certain elements in generation. Include unwanted objects or colors for exclusion.

enable_prompt_expansion: bool (default: true)

Activates prompt optimizer for enhanced results. Set to true for more detailed outputs.
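A request body can be checked against these attributes before it is sent. The sketch below is illustrative only: validateRequest is a hypothetical helper, and the allowed values for duration (5, 10) and resolution (480p, 720p, 1080p) are assumptions drawn from the descriptions on this page rather than from an authoritative schema.

```javascript
// Assumed allowed values, inferred from the attribute descriptions.
const ALLOWED_RESOLUTIONS = ['480p', '720p', '1080p'];
const ALLOWED_DURATIONS = [5, 10];

// validateRequest is a hypothetical helper. It returns a list of problems;
// an empty list means the body passed these basic checks.
function validateRequest(body) {
  const problems = [];

  // image and prompt are the two attributes marked with * above.
  for (const field of ['image', 'prompt']) {
    if (typeof body[field] !== 'string' || body[field].trim() === '') {
      problems.push(`${field} is required`);
    }
  }

  if (body.resolution !== undefined &&
      !ALLOWED_RESOLUTIONS.includes(body.resolution)) {
    problems.push(`resolution must be one of ${ALLOWED_RESOLUTIONS.join(', ')}`);
  }

  // Accept 5 or "5": the request example sends a number, the type is enum:str.
  if (body.duration !== undefined &&
      !ALLOWED_DURATIONS.includes(Number(body.duration))) {
    problems.push(`duration must be one of ${ALLOWED_DURATIONS.join(', ')}`);
  }

  return problems;
}
```

Calling validateRequest(reqBody) on the body from the request example returns an empty array.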

To keep track of your credit usage, inspect the response headers of each API call. The x-remaining-credits header reports the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
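Reading that value from a response could look like the sketch below. The helper names remainingCredits and isRunningLow and the threshold of 10 are hypothetical, and it is an assumption that the header value always parses as a number.

```javascript
// Reads x-remaining-credits from a headers object. Node HTTP clients such
// as axios expose response header names in lowercase.
// Returns a number, or null if the header is missing or not numeric.
function remainingCredits(headers) {
  const raw = headers['x-remaining-credits'];
  if (raw === undefined || raw === null || raw === '') {
    return null;
  }
  const value = Number(raw);
  return Number.isNaN(value) ? null : value;
}

// isRunningLow and its default threshold are illustrative choices only.
function isRunningLow(headers, threshold = 10) {
  const credits = remainingCredits(headers);
  return credits !== null && credits < threshold;
}
```

With the request example above, the headers object would be response.headers, for instance remainingCredits(response.headers).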

Wan2.5-Preview: Multimodal AI Video Generation Model

Edited by Segmind Team on September 28, 2025.

What is Wan2.5-Preview?

Wan2.5-Preview is a multimodal AI model for multimedia content creation. It combines text, image, video, and audio inputs into a single, cohesive audio-visual output. It produces high-fidelity, cinematic 1080p videos up to 10 seconds long, with synchronized audio tracks for voice, sound effects, and music. It is aimed at professional creators working across many platforms and industries.

Key Features of Wan2.5-Preview

Advanced multimodal input processing: It accepts text, images, video, and audio as inputs to create high-quality videos.
High-fidelity video: It is designed to render 1080p videos with a customizable duration of 5-10 seconds.
Multi-track audio synchronization: It aligns multiple audio tracks with the video.
Enhanced instruction adherence: It closely follows instructions to deliver precise visual outputs.
Flexible resolution options: It offers 480p, 720p, or 1080p resolutions for different use cases.
Intelligent prompt expansion: It automatically refines prompts for better results.
Controllable generation: It lets users supply negative prompts to keep unwanted elements out of the output.

Best Use Cases

Content creation and editing: It is ideal for making and editing professional videos across multiple platforms.
Music video production: It is perfect for producing videos with precise audio-visual sync.
Marketing and advertising: It helps in creating impactful and result-oriented promotional videos.
Educational content: It is ideal for educational videos with synchronized narration and voice explanations.
Digital art and animation: It supports creative projects in art and animation.
Professional presentation: It is useful for producing professional business presentations.
Social media: It enables content development for engaging social media videos.

Prompt Tips and Output Quality

Write clear, vivid prompts with concrete visual descriptions
Specify the movements and transitions you envision in the video
Use the negative prompt parameter to exclude unwanted elements
Enable prompt expansion for more detailed and accurate outputs
Set a fixed seed (1-100) when you need reproducible results
Upload high-quality source images for sharper video outputs
Use audio files that complement the visual narrative

FAQs

How does Wan2.5-Preview handle audio synchronization?

Wan2.5-Preview processes multiple audio tracks simultaneously and automatically aligns them with the visual elements. You can also upload an audio file via the audio parameter to have the video synced to it.

What's the optimal resolution for different use cases?

With multiple resolution options, you can select 1080p for professional content, 720p for balanced quality and processing time, and 480p for rapid prototyping or preview.
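This guidance can be captured in a small lookup. The sketch below is illustrative: pickResolution and the use-case keys are hypothetical names, and the fallback of 720p mirrors the default listed for the resolution attribute.

```javascript
// Use-case keys are hypothetical labels for the three cases described above.
const RESOLUTION_BY_USE_CASE = {
  professional: '1080p', // professional content
  balanced: '720p',      // balanced quality and processing time
  preview: '480p',       // rapid prototyping or preview
};

// Falls back to 720p, the documented default, for unknown use cases.
function pickResolution(useCase) {
  if (Object.prototype.hasOwnProperty.call(RESOLUTION_BY_USE_CASE, useCase)) {
    return RESOLUTION_BY_USE_CASE[useCase];
  }
  return '720p';
}
```

The result can be assigned to the resolution field of the request body, for example resolution: pickResolution('preview').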

How can I ensure consistent outputs?

You can use the seed parameter (1-100) to maintain consistency across multiple video generations; the same seed with identical inputs will produce similar results.
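One way to apply this is to keep every input fixed and change only the seed when variations are wanted. The sketch below assumes a hypothetical helper, withSeeds, that copies a base request body once per seed; sending each body to the API is left to the request code shown earlier.

```javascript
// withSeeds is a hypothetical helper. It returns one request body per seed
// and leaves the base body untouched, so all other inputs stay identical.
function withSeeds(baseBody, seeds) {
  return seeds.map((seed) => ({ ...baseBody, seed }));
}

// Example: three variations of the same prompt and image.
const variations = withSeeds(
  { prompt: 'Kitten grilling burger patties', duration: 5, seed: 42 },
  [1, 2, 3]
);
```

Resubmitting one of these bodies unchanged reuses its seed, which is what the answer above describes as producing similar results.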

What makes Wan2.5-Preview different from other video generation models?

Wan2.5-Preview takes a unified approach to multimodal inputs and combines it with audio-visual synchronization and high-resolution output. It maintains visual quality while handling complex audio integration.

Can I control the artistic style of the generated videos?

Yes, you can easily control the artistic style of the videos by providing detailed prompting and enabling the prompt expansion option. You may also use specific style descriptions in your prompt for more precise artistic control.
