const axios = require('axios');
const api_key = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/kling-2.6";
const data = {
"prompt": "Visual: São Paulo, Brazil – a narrow street packed with vibrant murals in neon greens, reds, and yellows. The atmosphere is electric: street vendors, kids playing football, and local dancers vibing in the background. The city feels loud, alive, and unapologetic. Subject: A Black male rapper in raw streetwear — fitted cap, graphic tee, worn sneakers, simple chain. He stands front and center, locked in with the camera, body moving naturally to the rhythm, commanding the street like it’s his stage. Audio: [Male rapper, high-energy, gritty voice] Rapping over a hard street beat blended with Brazilian percussion: “Painted walls talk loud, yeah, they know my name, Came up from the block where the heat stay flame. City never soft, had to hustle, stay sharp, Turned struggle into fire, now I’m leavin’ my mark. Bass hit heavy, feel the ground when I step, From these streets to the world, yeah, I’m reppin’ respect.” Background: Heavy bass layered with live drum hits, claps, and subtle street noise — bikes passing, distant voices, raw urban texture. Camera: Rapid cuts between tight close-ups of his face and hands, wide shots of the colorful murals, quick flashes of dancers and street life. Handheld movement keeps it gritty and real, ending on a strong close-up as the beat hits.",
"negative_prompt": "no noise, no distortion, no clutter",
"cfg_scale": 0.8,
"duration": 10,
"mode": "pro",
"aspect_ratio": "16:9",
"generate_audio": true
};
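(async function() {
  try {
    const response = await axios.post(url, data, { headers: { 'x-api-key': api_key } });
    console.log(response.data);
  } catch (error) {
    // error.response is undefined when the request never reached the server (e.g. a network failure)
    console.error('Error:', error.response ? error.response.data : error.message);
  }
})();

image_url: Link to the image used as the video's starting frame. Use scenic images for dramatic effects.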
prompt: Guides the scene. Try 'tranquil forest morning' for nature-themed visuals.
negative_prompt: Removes unwanted elements. Use 'no noise' for cleaner visuals.
cfg_scale: Adjusts style strength. Set to 0.8 for vivid details.
min: 0, max: 1
duration: Defines video length in seconds. Use '10' for longer scenes.
Allowed values: 5, 10
mode: Controls output quality. Choose 'pro' for polished outcomes.
Allowed values: pro
aspect_ratio: Sets frame dimensions. Use '16:9' for widescreen.
Allowed values: 16:9, 9:16, 1:1
generate_audio: Includes native audio with the video. Set to true for engaging audio.
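The ranges and allowed values above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `validateParams` and `ALLOWED` are hypothetical names, the rule that `prompt` must be non-empty is an assumption, and the server may apply different or additional checks.

```javascript
// Allowed values as stated in this document; the exact server-side rules are an assumption.
const ALLOWED = {
  duration: [5, 10],
  mode: ['pro'],
  aspect_ratio: ['16:9', '9:16', '1:1'],
};

// Hypothetical helper: returns a list of problems, empty when the parameters look valid.
function validateParams(params) {
  const errors = [];
  // Assumption: a non-empty prompt is required.
  if (typeof params.prompt !== 'string' || params.prompt.trim() === '') {
    errors.push('prompt must be a non-empty string');
  }
  if (params.cfg_scale !== undefined &&
      !(typeof params.cfg_scale === 'number' && params.cfg_scale >= 0 && params.cfg_scale <= 1)) {
    errors.push('cfg_scale must be a number between 0 and 1');
  }
  // The document shows duration both as the number 10 and as the strings "5"/"10", so accept either.
  if (params.duration !== undefined && !ALLOWED.duration.includes(Number(params.duration))) {
    errors.push('duration must be 5 or 10');
  }
  for (const key of ['mode', 'aspect_ratio']) {
    if (params[key] !== undefined && !ALLOWED[key].includes(params[key])) {
      errors.push(`${key} must be one of: ${ALLOWED[key].join(', ')}`);
    }
  }
  if (params.generate_audio !== undefined && typeof params.generate_audio !== 'boolean') {
    errors.push('generate_audio must be a boolean');
  }
  return errors;
}
```

Calling `validateParams(data)` on the sample request object should return an empty array; any returned strings describe which field to fix before spending credits on a call.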
To track your credit usage, inspect the response headers of each API call. The x-remaining-credits header reports the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
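As a rough illustration of reading that header, the sketch below parses it from a response's headers object. `getRemainingCredits` is a hypothetical helper, and it assumes the header value is a plain numeric string, which this page does not confirm.

```javascript
// Hypothetical helper: reads the remaining-credit count from a response's headers.
function getRemainingCredits(headers) {
  // Node HTTP clients such as axios expose header names in lowercase.
  const raw = headers['x-remaining-credits'];
  if (raw === undefined || raw === null) return null;
  // Assumption: the value is a numeric string such as "42".
  const credits = Number(raw);
  return Number.isNaN(credits) ? null : credits;
}
```

With the axios sample above, this could be called as `getRemainingCredits(response.headers)` right after the request completes; a `null` result means the header was missing or not numeric.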
Kling 2.6 Pro is an advanced image-to-video generative AI model that turns a single still image plus a text prompt into a short, cinematic clip—with integrated, native audio. Provide an image_url as the visual starting frame and describe the scene in prompt; the model generates smooth motion, coherent visuals, and synchronized sound design (voice-like audio cues, ambience, and effects) to match the on-screen action.
It’s designed for teams who need fast, production-ready video generation via REST API, especially when you want the output to feel like a complete “scene” instead of silent B-roll.
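The image-plus-prompt workflow described above might be wired up as in the sketch below. `buildPayload` and `generateClip` are hypothetical helpers, the default settings are illustrative choices, and sending `image_url` in the same JSON body as the other fields is an assumption, since the sample request on this page does not include it. The response format is not described here, so it is returned as-is.

```javascript
const API_URL = 'https://api.segmind.com/v1/kling-2.6';

// Hypothetical helper: combines the starting image and prompt with illustrative default settings.
function buildPayload(imageUrl, prompt, overrides = {}) {
  return {
    image_url: imageUrl, // assumption: sent alongside the other fields in the JSON body
    prompt,
    mode: 'pro',
    duration: 5,
    aspect_ratio: '16:9',
    generate_audio: true,
    ...overrides, // caller-supplied values replace the defaults above
  };
}

// Sends the request; axios is loaded here so buildPayload can be used without it.
async function generateClip(apiKey, payload) {
  const axios = require('axios');
  const response = await axios.post(API_URL, payload, { headers: { 'x-api-key': apiKey } });
  return response.data;
}
```

For example, `buildPayload('https://example.com/still.jpg', 'slow pan across the skyline', { duration: 10 })` produces a 10-second request body; the image URL there is a placeholder.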
Image-to-video from a single still image (image_url)
Native audio with generate_audio for richer, immersive clips
Clips of 5 or 10 seconds (duration)
Scene control with prompt and negative_prompt for precision
cfg_scale (0–1) to balance prompt adherence vs. natural motion
16:9, 9:16, 1:1 (aspect_ratio) for web, social, and product
mode="pro" for polished outputs

Use negative_prompt to prevent artifacts: “no distortion, no jitter, no clutter”.
Tune cfg_scale between 0 and 1 to trade prompt adherence against natural motion.
Set aspect_ratio early (e.g., 16:9 cinematic, 9:16 social) to avoid reframing.
Set generate_audio=true when you want the scene to feel complete (ambience + SFX).

Is Kling 2.6 Pro text-to-video?
It’s primarily image-to-video: you can include image_url as the visual anchor plus a detailed prompt.
Does it generate audio automatically?
Yes—set generate_audio to true to include native audio alongside the video.
What video lengths are supported?
duration supports "5" or "10" seconds.
How is it different from other image-to-video models?
Its standout is native audio generation plus coherent cinematic motion from a single image.
What parameters should I tweak first for best results?
Start with prompt + negative_prompt, then adjust cfg_scale, duration, and aspect_ratio.
What mode should I use?
Use mode="pro" (the available option) for high-quality, polished outputs.