POST

javascript

const axios = require('axios');


const api_key = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/kling-v1-standard-ai-avatar";

const data = {
  "image_url": "https://segmind-resources.s3.amazonaws.com/input/1be1777d-afdd-4636-8df2-fa168a9b01db-kling-video-v1-pro-ai-avatar-input.png",
  "audio_url": "https://segmind-resources.s3.amazonaws.com/input/e8cf45f0-2f54-4bfd-922a-70a6d889b04c-kling-std-ai-avatar.mp3",
  "prompt": "Create an energetic welcome message with the AI avatar."
};

(async function() {
    try {
        const response = await axios.post(url, data, { headers: { 'x-api-key': api_key } });
        console.log(response.data);
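    } catch (error) {
        // error.response is undefined when no HTTP response arrived (e.g. a network failure)
        console.error('Error:', error.response ? error.response.data : error.message);
    }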
})();

RESPONSE

image/jpeg

HTTP Response Codes

200 - OK: Image Generated

401 - Unauthorized: User authentication failed

404 - Not Found: The requested URL does not exist

405 - Method Not Allowed: The requested HTTP method is not allowed

406 - Not Acceptable: Not enough credits

500 - Server Error: Server had some issue with processing
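
The status codes above can be turned into readable messages on the client side. The following is a minimal sketch, assuming an axios-style error object that exposes `error.response.status`; the names `STATUS_MESSAGES` and `describeError` are hypothetical helpers for illustration and are not part of the Segmind API.

```javascript
// Descriptions taken from the HTTP Response Codes list above.
const STATUS_MESSAGES = {
  200: 'OK: Image Generated',
  401: 'Unauthorized: User authentication failed',
  404: 'Not Found: The requested URL does not exist',
  405: 'Method Not Allowed: The requested HTTP method is not allowed',
  406: 'Not Acceptable: Not enough credits',
  500: 'Server Error: Server had some issue with processing',
};

// Hypothetical helper: convert an axios-style error into a one-line message.
function describeError(error) {
  // No response object means the request never received an HTTP reply.
  if (!error.response) {
    return `Network error: ${error.message}`;
  }
  const status = error.response.status;
  const known = STATUS_MESSAGES[status];
  return known ? `${status} - ${known}` : `${status} - Unexpected status`;
}
```

Inside the `catch` block of a request, `console.error(describeError(error))` would then print a message that distinguishes, for example, a credit shortage (406) from an authentication failure (401).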

Attributes

image_url: str *

The URL for the video background. Choose a high-quality URL for clear visuals.

audio_url: str *

Audio URL for syncing with visuals. Opt for high-bitrate files for quality sound.

prompt: str ( default: Create an energetic welcome message with the AI avatar. )

Provide a directional prompt. Use clear instructions for specific outcomes.
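
The attributes above can be checked before a request is sent. This is a minimal sketch, assuming the `*` marks `image_url` and `audio_url` as required and that `prompt` may be omitted so the server-side default applies; `buildPayload` is a hypothetical helper, not part of the Segmind API.

```javascript
// Hypothetical helper: validate inputs and build the JSON request body.
function buildPayload({ imageUrl, audioUrl, prompt } = {}) {
  const required = [
    ['image_url', imageUrl],
    ['audio_url', audioUrl],
  ];
  for (const [name, value] of required) {
    if (typeof value !== 'string' || value.length === 0) {
      throw new Error(`${name} is required`);
    }
    // The URL constructor throws a TypeError for strings that are not absolute URLs.
    new URL(value);
  }
  const payload = { image_url: imageUrl, audio_url: audioUrl };
  // Only send a prompt when one is given; otherwise the documented default is used.
  if (prompt) {
    payload.prompt = prompt;
  }
  return payload;
}
```

The returned object can be passed as the `data` argument of `axios.post(url, data, ...)` in the request example.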

To keep track of your credit usage, inspect the response headers of each API call. The x-remaining-credits header indicates the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
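
Reading the credit balance from a response might look like the sketch below. It assumes the header value is a numeric string and that the HTTP client exposes headers as an object with lowercase keys, as axios does in Node.js; `remainingCredits` is a hypothetical helper name.

```javascript
// Hypothetical helper: extract the remaining credit count from response headers.
function remainingCredits(headers) {
  const raw = headers['x-remaining-credits'];
  // Header missing or empty: report "unknown" rather than guessing.
  if (raw === undefined || raw === '') {
    return null;
  }
  const value = Number(raw);
  // Assumption: the value is numeric; anything else is treated as unknown.
  return Number.isNaN(value) ? null : value;
}
```

After a successful call, `remainingCredits(response.headers)` returns a number that can be logged or compared against a threshold, or `null` when the header is absent.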

Kwaivgi Kling V1: AI Avatar Generation Model

Edited by Segmind Team on October 12, 2025.

What is Kwaivgi Kling V1?

Kwaivgi Kling V1 is a recently launched AI model that animates AI avatars. Developed for the WaveSpeedAI platform, it turns a static image into an avatar that moves naturally and stays in sync with an audio track. It suits creators who need realistic digital characters for multimedia projects such as videos and storytelling. Its lip-sync technology is paired with high visual clarity to produce polished output.

Key Features of Kwaivgi Kling V1

Advanced lip-sync technology keeps audio and video closely synchronized
High-fidelity avatar generation from static images
Natural animation renders smooth, realistic avatar motion
Customizable expressions and movements let users refine the output
Integration with existing video content lets creators reuse generated avatars in their multimedia projects
Support for various audio input formats makes it usable across platforms
Real-time processing lets you test the output and make changes without delay

Best Use Cases

Virtual Presenters and Digital Hosts
Educational Content Creation
Gaming Character Development
Corporate Training Videos
Virtual Customer Service Representatives
Social Media Content Production
Digital Entertainment Applications
Live Streaming Avatars

Prompt Tips and Output Quality

Image Selection: Use high-resolution portrait images for the best avatar quality
Audio Preparation: Use clear, well-recorded audio files for accurate lip-sync results
Prompt Structure: Include clear and precise instructions to achieve desired emotions and expressions
Background Compatibility: Ensure background images complement avatar placement
Animation Style: Define the energy level and presentation style in the prompt

Additionally, structure your prompts with clear directives about:

Desired emotional tone
Speaking style and pace
Facial expressions and gestures
Overall presentation energy
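
The four directives above can be assembled into a single prompt string. The sketch below is one possible approach; the labeled sentence format is an assumption for illustration, not a format the model is documented to require, and `buildPrompt` is a hypothetical helper.

```javascript
// Hypothetical helper: combine the four directives into one prompt string.
// Any directive left undefined is skipped.
function buildPrompt({ tone, pace, expressions, energy } = {}) {
  const parts = [];
  if (tone) parts.push(`Emotional tone: ${tone}.`);
  if (pace) parts.push(`Speaking style and pace: ${pace}.`);
  if (expressions) parts.push(`Facial expressions and gestures: ${expressions}.`);
  if (energy) parts.push(`Overall presentation energy: ${energy}.`);
  return parts.join(' ');
}
```

The result can be sent as the `prompt` attribute of the request, for example `buildPrompt({ tone: 'warm and friendly', energy: 'high' })`.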

FAQs

What input formats does Kwaivgi Kling V1 support? The model accepts image URLs for visuals and audio URLs for speech content; both should be high-quality for the best results.

How detailed should the prompt be? Prompts are optional, and the default prompt creates an energetic welcome message. Clear instructions about the desired presentation style and energy level generally give better results.

Can I customize the avatar's expressions? Yes, you can control the avatar's expressions and emotional output through detailed prompts and audio input.

What's the ideal use case for this model? The model is ideal when you need to create engaging video content with a lifelike AI presenter, like in educational content, corporate communications, and digital entertainment.

How does the lip-sync feature work? The model processes the audio input to accurately synchronize the avatar’s lip movements with the speech, ensuring a smooth and natural blending of visual expressions with speech.
