POST
javascript
const axios = require('axios');

const api_key = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/kling-2.6";

// Request body: prompt plus the generation settings described under Attributes.
const data = {
  "prompt": "Visual: São Paulo, Brazil – a narrow street packed with vibrant murals in neon greens, reds, and yellows. The atmosphere is electric: street vendors, kids playing football, and local dancers vibing in the background. The city feels loud, alive, and unapologetic. Subject: A Black male rapper in raw streetwear — fitted cap, graphic tee, worn sneakers, simple chain. He stands front and center, locked in with the camera, body moving naturally to the rhythm, commanding the street like it’s his stage. Audio: [Male rapper, high-energy, gritty voice] Rapping over a hard street beat blended with Brazilian percussion: “Painted walls talk loud, yeah, they know my name, Came up from the block where the heat stay flame. City never soft, had to hustle, stay sharp, Turned struggle into fire, now I’m leavin’ my mark. Bass hit heavy, feel the ground when I step, From these streets to the world, yeah, I’m reppin’ respect.” Background: Heavy bass layered with live drum hits, claps, and subtle street noise — bikes passing, distant voices, raw urban texture. Camera: Rapid cuts between tight close-ups of his face and hands, wide shots of the colorful murals, quick flashes of dancers and street life. Handheld movement keeps it gritty and real, ending on a strong close-up as the beat hits.",
  "negative_prompt": "no noise, no distortion, no clutter",
  "cfg_scale": 0.8,
  "duration": 10,
  "mode": "pro",
  "aspect_ratio": "16:9",
  "generate_audio": true
};

(async function() {
    try {
        // The API key is sent in the x-api-key header.
        const response = await axios.post(url, data, { headers: { 'x-api-key': api_key } });
        console.log(response.data);
    } catch (error) {
        // error.response is undefined when no response arrived (e.g. a network failure).
        console.error('Error:', error.response ? error.response.data : error.message);
    }
})();
RESPONSE
image/jpeg
HTTP Response Codes
200 - OK: Video generated
401 - Unauthorized: User authentication failed
404 - Not Found: The requested URL does not exist
405 - Method Not Allowed: The requested HTTP method is not allowed
406 - Not Acceptable: Not enough credits
500 - Server Error: The server had an issue processing the request
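
The status codes above can be turned into readable messages on the client. The following is a minimal sketch, not part of the official API: the helper names (describeStatus, describeError, isRetryable) are hypothetical, and retrying only on 500 is an assumed policy rather than documented behavior.

```javascript
// Meanings of the documented status codes.
const STATUS_MESSAGES = {
  200: 'OK',
  401: 'Unauthorized: user authentication failed',
  404: 'Not Found: the requested URL does not exist',
  405: 'Method Not Allowed: the requested HTTP method is not allowed',
  406: 'Not Acceptable: not enough credits',
  500: 'Server Error: the server had an issue processing the request',
};

// Returns a readable message for a status code (hypothetical helper).
function describeStatus(status) {
  return STATUS_MESSAGES[status] || `Unexpected status ${status}`;
}

// Describes an error thrown by an HTTP client such as axios.
// axios attaches `response` only when the server actually replied.
function describeError(error) {
  if (error && error.response) {
    return describeStatus(error.response.status);
  }
  const reason = error && error.message ? error.message : 'unknown error';
  return `No response received: ${reason}`;
}

// Assumed policy: only a server-side failure is worth retrying.
function isRetryable(status) {
  return status === 500;
}
```

In the catch block of a request, describeError(error) could be logged in place of the raw response body.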

Attributes


image_url str ( default: 1 )

URL of the image used as the visual starting frame of the video. Scenic images suit dramatic effects.


prompt str *

Guides the scene. Try 'tranquil forest morning' for nature-themed visuals.


negative_prompt str ( default: no noise, no distortion, no clutter )

Removes unwanted elements. Use 'no noise' for cleaner visuals.


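cfg_scale float ( default: 0.8 )

Balances prompt adherence against natural motion. Higher values follow the prompt's style and detail more strongly; the default of 0.8 favors vivid details.

min : 0, max : 1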


duration enum:str ( default: 10 )

Defines the video length in seconds. Use '10' for longer scenes.

Allowed values: 5, 10


mode enum:str ( default: pro )

Controls output quality. Choose 'pro' for polished outcomes.

Allowed values: pro


aspect_ratio enum:str ( default: 16:9 )

Sets the frame dimensions. Use '16:9' for widescreen.

Allowed values: 16:9, 9:16, 1:1


generate_audio bool ( default: true )

Generates native audio (ambience, effects, and voice-like cues) synchronized with the video. Set to true to include it.
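
The attributes above can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: buildPayload is a hypothetical helper, the allowed values are the ones mentioned elsewhere on this page, and the API may apply additional rules not captured here. Enum values are compared as strings because the attribute list types duration as enum:str while the sample request sends the number 10.

```javascript
// Defaults as listed in the attribute table.
const DEFAULTS = {
  negative_prompt: 'no noise, no distortion, no clutter',
  cfg_scale: 0.8,
  duration: 10, // number, as in the sample request
  mode: 'pro',
  aspect_ratio: '16:9',
  generate_audio: true,
};

// Allowed enum values as stated on this page.
const ALLOWED = {
  duration: ['5', '10'],
  mode: ['pro'],
  aspect_ratio: ['16:9', '9:16', '1:1'],
};

// Merges caller parameters over the defaults and validates the result.
// Hypothetical helper; throws before any network call is made.
function buildPayload(params) {
  const payload = { ...DEFAULTS, ...params };

  if (typeof payload.prompt !== 'string' || payload.prompt.trim() === '') {
    throw new Error('prompt is required');
  }
  if (!Number.isFinite(payload.cfg_scale) || payload.cfg_scale < 0 || payload.cfg_scale > 1) {
    throw new RangeError('cfg_scale must be between 0 and 1');
  }
  for (const [key, values] of Object.entries(ALLOWED)) {
    // Compare as strings so both 10 and "10" pass; the value is sent unchanged.
    if (!values.includes(String(payload[key]))) {
      throw new RangeError(`${key} must be one of: ${values.join(', ')}`);
    }
  }
  return payload;
}
```

For example, buildPayload({ prompt: 'tranquil forest morning', aspect_ratio: '9:16' }) returns a body that can be passed to axios.post, while an out-of-range cfg_scale fails fast without spending credits.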

To track your credit usage, inspect the response headers of each API call. The x-remaining-credits header reports the number of credits remaining in your account. Monitor this value to avoid disruptions to your API usage.
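
Reading the credit balance could look like the sketch below. It assumes the x-remaining-credits value parses as a number, which this page does not confirm. The helper names and the LOW_CREDIT_THRESHOLD value are hypothetical, chosen only for illustration.

```javascript
// Extracts the remaining-credit count from a headers object.
// axios exposes response header names in lowercase, so the key is lowercase.
function getRemainingCredits(headers) {
  const raw = headers ? headers['x-remaining-credits'] : undefined;
  if (raw === undefined || raw === null || raw === '') {
    return null; // header missing
  }
  const credits = Number(raw);
  return Number.isNaN(credits) ? null : credits;
}

// Hypothetical threshold, chosen for illustration only.
const LOW_CREDIT_THRESHOLD = 10;

// True when the header is present and below the threshold.
function isRunningLow(headers, threshold = LOW_CREDIT_THRESHOLD) {
  const credits = getRemainingCredits(headers);
  return credits !== null && credits < threshold;
}
```

After a successful axios.post call, getRemainingCredits(response.headers) returns the current balance, or null when the header is absent or not numeric.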

Kling 2.6 Pro: Image-to-Video Model (with Native Audio)

What is Kling 2.6 Pro?

Kling 2.6 Pro is an advanced image-to-video generative AI model that turns a single still image plus a text prompt into a short, cinematic clip—with integrated, native audio. Provide an image_url as the visual starting frame and describe the scene in prompt; the model generates smooth motion, coherent visuals, and synchronized sound design (voice-like audio cues, ambience, and effects) to match the on-screen action.

It’s designed for teams who need fast, production-ready video generation via REST API, especially when you want the output to feel like a complete “scene” instead of silent B-roll.

Key Features

  • Image-to-video generation from one reference image (image_url)
  • Native audio generation with generate_audio for richer, immersive clips
  • Cinematic motion + realism: smooth camera movement and consistent scene continuity
  • Duration control: generate 5s or 10s videos (duration)
  • Prompt steering + cleanup: prompt and negative_prompt for precision
  • CFG control: cfg_scale (0–1) to balance prompt adherence vs. natural motion
  • Aspect ratios: 16:9, 9:16, 1:1 (aspect_ratio) for web, social, and product
  • Pro mode: set mode="pro" for polished outputs

Best Use Cases

  • Product explainers & promos: animate packshots into lifestyle motion with ambience
  • Cinematic social content: vertical 9:16 reels with camera moves and sound
  • Storytelling shorts: consistent scene building from a keyframe image
  • Brand mood films: landscapes, interiors, and stylized establishing shots
  • App/landing page hero media: loopable 5–10s visuals with audio atmosphere

Prompt Tips and Output Quality

  • Start with a clear scene + action + camera: “slow dolly-in”, “handheld”, “crane up”.
  • Add lighting and mood: “golden hour”, “neon reflections”, “soft fog”.
  • Use negative_prompt to prevent artifacts: “no distortion, no jitter, no clutter”.
  • Tune cfg_scale:
    • Higher (~0.7–1.0): stronger style/detail adherence
    • Lower (~0.3–0.6): more natural motion, fewer “over-directed” frames
  • Pick aspect_ratio early (e.g., 16:9 cinematic, 9:16 social) to avoid reframing.
  • Enable generate_audio=true when you want the scene to feel complete (ambience + SFX).
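
The scene + action + camera + lighting structure from the tips above can be assembled programmatically. This is a minimal sketch: composePrompt is a hypothetical helper, and joining the parts with commas is an assumed convention, not a format the model is documented to require.

```javascript
// Builds a prompt from the components suggested in the tips.
// Empty or missing parts are skipped; surrounding whitespace is trimmed.
function composePrompt({ scene, action, camera, lighting }) {
  return [scene, action, camera, lighting]
    .filter((part) => typeof part === 'string' && part.trim() !== '')
    .map((part) => part.trim())
    .join(', ');
}

// Example: a 9:16 social clip with a lower cfg_scale for more natural motion.
const examplePayload = {
  prompt: composePrompt({
    scene: 'narrow street with neon murals',
    action: 'a dancer spins toward the camera',
    camera: 'slow dolly-in',
    lighting: 'golden hour',
  }),
  negative_prompt: 'no distortion, no jitter, no clutter',
  cfg_scale: 0.5,
  aspect_ratio: '9:16',
  generate_audio: true,
};
```

Keeping the components separate makes it easy to vary one of them, such as the camera move, while holding the rest of the prompt fixed.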

FAQs

Is Kling 2.6 Pro text-to-video?
It’s primarily image-to-video: you can include image_url as the visual anchor plus a detailed prompt.

Does it generate audio automatically?
Yes—set generate_audio to true to include native audio alongside the video.

What video lengths are supported?
duration accepts "5" or "10", giving a 5-second or 10-second video.

How is it different from other image-to-video models?
Its standout is native audio generation plus coherent cinematic motion from a single image.

What parameters should I tweak first for best results?
Start with prompt + negative_prompt, then adjust cfg_scale, duration, and aspect_ratio.
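
One way to compare cfg_scale settings is to hold the prompt fixed and generate one request body per value. The sketch below assumes this workflow; cfgScaleVariants is a hypothetical helper and the default scale values are illustrative. Each variant sent to the API is a separate generation and would consume credits.

```javascript
// Returns one payload per cfg_scale value, leaving the base payload untouched.
function cfgScaleVariants(basePayload, scales = [0.3, 0.5, 0.8]) {
  return scales.map((cfg_scale) => {
    if (!Number.isFinite(cfg_scale) || cfg_scale < 0 || cfg_scale > 1) {
      throw new RangeError('cfg_scale must be between 0 and 1');
    }
    return { ...basePayload, cfg_scale };
  });
}
```

Once a preferred cfg_scale is found, duration and aspect_ratio can be varied the same way.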

What mode should I use?
Use mode="pro" (the available option) for high-quality, polished outputs.