const axios = require('axios');

const api_key = "YOUR API-KEY"; // replace with your Segmind API key
const url = "https://api.segmind.com/v1/kling-v1-standard-ai-avatar";

// Request body: source image, audio track, and an optional directional prompt.
const data = {
  "image_url": "https://segmind-resources.s3.amazonaws.com/input/1be1777d-afdd-4636-8df2-fa168a9b01db-kling-video-v1-pro-ai-avatar-input.png",
  "audio_url": "https://segmind-resources.s3.amazonaws.com/input/e8cf45f0-2f54-4bfd-922a-70a6d889b04c-kling-std-ai-avatar.mp3",
  "prompt": "Create an energetic welcome message with the AI avatar."
};

(async function() {
  try {
    const response = await axios.post(url, data, { headers: { 'x-api-key': api_key } });
    console.log(response.data);
  } catch (error) {
    // error.response is undefined for network failures, so fall back to the message.
    console.error('Error:', error.response ? error.response.data : error.message);
  }
})();
image_url: The URL of the image for the video background. Use a high-quality image for clear visuals.
audio_url: The URL of the audio to sync with the visuals. Use a high-bitrate file for good sound quality.
prompt: A directional prompt. Give clear instructions to get a specific outcome.
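The three request fields described above (image_url, audio_url, and prompt, as named in the request example) can be checked before sending. The following is a minimal sketch, not part of the Segmind API: buildPayload and isHttpUrl are hypothetical helper names, and the sketch assumes that the API falls back to its default prompt when the prompt field is omitted.

```javascript
// Returns true when `value` parses as an http(s) URL.
function isHttpUrl(value) {
  try {
    const parsed = new URL(value);
    return parsed.protocol === 'http:' || parsed.protocol === 'https:';
  } catch (err) {
    // new URL() throws on anything it cannot parse, including undefined.
    return false;
  }
}

// buildPayload is a hypothetical helper, not something the API provides.
function buildPayload(imageUrl, audioUrl, prompt) {
  if (!isHttpUrl(imageUrl)) throw new Error('image_url must be an http(s) URL');
  if (!isHttpUrl(audioUrl)) throw new Error('audio_url must be an http(s) URL');

  const payload = { image_url: imageUrl, audio_url: audioUrl };

  // Assumption: leaving out `prompt` lets the API use its default prompt.
  if (typeof prompt === 'string' && prompt.trim() !== '') {
    payload.prompt = prompt.trim();
  }
  return payload;
}
```

The returned object can be passed as the request body to axios.post in place of a hand-written data object. This check only confirms that the URLs are well formed; it does not confirm that the files exist or are of good quality.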
To keep track of your credit usage, inspect the response headers of each API call. The x-remaining-credits header indicates the number of credits remaining in your account. Monitor this value to avoid disruptions to your API usage.
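As a rough illustration of the credit check described above, the sketch below reads x-remaining-credits from an axios-style response object. The helper names remainingCredits and warnIfLow and the threshold of 10 are hypothetical, and the sketch assumes the header value is numeric, which the documentation does not state explicitly.

```javascript
// Reads the x-remaining-credits header from an axios-style response.
// Returns a number, or null when the header is missing or not numeric.
function remainingCredits(response) {
  const headers = (response && response.headers) || {};
  // axios exposes response header names in lowercase.
  const raw = headers['x-remaining-credits'];
  if (raw === undefined || raw === null || raw === '') return null;
  const value = Number(raw); // assumption: the header carries a plain number
  return Number.isFinite(value) ? value : null;
}

// Hypothetical threshold; pick one that fits your own usage.
const LOW_CREDIT_THRESHOLD = 10;

// Logs a warning and returns true when credits fall below the threshold.
function warnIfLow(response, threshold = LOW_CREDIT_THRESHOLD) {
  const credits = remainingCredits(response);
  if (credits !== null && credits < threshold) {
    console.warn(`Low credits: ${credits} remaining`);
    return true;
  }
  return false;
}
```

Calling warnIfLow(response) right after a successful axios.post gives an early signal before the account runs out of credits.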
Edited by Segmind Team on October 12, 2025.
Kwaivgi Kling V1 is a recently launched AI model that brings AI avatars to life. Developed for the WaveSpeedAI platform, it turns a basic static image into an avatar that moves naturally in sync with an audio track. Kwaivgi Kling V1 suits creators who want realistic, high-end digital characters for multimedia projects such as videos and storytelling. Its lip-sync technology works together with high visual clarity to produce polished output.
Additionally, structure your prompts with clear directives about:
What input formats does Kwaivgi Kling V1 support? The model accepts image URLs for visuals and audio URLs for speech content; both should be high-quality for the best results.
How detailed should the prompt be? Prompts are optional, and the default prompt creates an energetic welcome message. Clear instructions about the desired presentation style and energy level will give significantly better results.
Can I customize the avatar's expressions? Yes, you can control the avatar's expressions and emotional output through detailed prompts and audio input.
What's the ideal use case for this model? The model is ideal when you need engaging video content with a lifelike AI presenter, for example in educational content, corporate communications, and digital entertainment.
How does the lip-sync feature work? The model processes the audio input to accurately synchronize the avatar’s lip movements with the speech, ensuring a smooth and natural blending of visual expressions with speech.