POST
javascript
const axios = require('axios');

const api_key = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/kling-o1-video-to-video-edit";

// Request body: the prompt refers to the first entry of image_urls as @image1
const data = {
  "prompt": "change the background of the video with the background in @image1 maintain the character as it is.",
  "video_url": "https://segmind-inference-inputs.s3.amazonaws.com/c511b717-f9dd-4624-8b8d-8f5a0f688a5d-3044799-hd_1280_720_24fps (1).mp4",
  "image_urls": [
    "https://www.shutterstock.com/image-photo/nicely-decorated-pergola-pots-blue-600nw-507200902.jpg"
  ],
  "keep_audio": false
};

(async function() {
  try {
    const response = await axios.post(url, data, { headers: { 'x-api-key': api_key } });
    console.log(response.data);
  } catch (error) {
    // error.response is undefined when no HTTP response was received (e.g. a network failure)
    console.error('Error:', error.response ? error.response.data : error.message);
  }
})();
RESPONSE
image/jpeg
HTTP Response Codes
200 - OK: Image Generated
401 - Unauthorized: User authentication failed
404 - Not Found: The requested URL does not exist
405 - Method Not Allowed: The requested HTTP method is not allowed
406 - Not Acceptable: Not enough credits
500 - Server Error: Server had some issue with processing
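
The status codes above can be turned into readable messages on the client side. The following is a minimal sketch, assuming an axios-style error object that carries `response.status`. The `STATUS_MESSAGES` table and the `describeError` helper are hypothetical names introduced here for illustration and are not part of the API.

```javascript
// Descriptions taken from the HTTP Response Codes list above.
const STATUS_MESSAGES = {
  200: 'OK',
  401: 'Unauthorized: user authentication failed',
  404: 'Not Found: the requested URL does not exist',
  405: 'Method Not Allowed: the requested HTTP method is not allowed',
  406: 'Not Acceptable: not enough credits',
  500: 'Server Error: server had some issue with processing',
};

// Hypothetical helper: maps an axios-style error to a readable message.
function describeError(error) {
  // error.response is missing when the request never received an HTTP response.
  const status = error && error.response ? error.response.status : undefined;
  if (status === undefined) {
    const reason = error && error.message ? error.message : 'unknown';
    return `Network error: ${reason}`;
  }
  return STATUS_MESSAGES[status] || `Unexpected status ${status}`;
}

// Usage inside a catch block: console.error(describeError(error));
```

Treating 406 separately is useful in practice, since on this endpoint it signals exhausted credits rather than a content-negotiation problem.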

Attributes


prompt (str, required)

Describe the edit. Use @Element1, @Element2 to reference entries in elements, and @image1, @image2 to reference entries in image_urls. Example: 'Replace the character in the video with @Element1, maintaining the same movements'


video_url (str, required)

URL of the input video to edit


image_urls (list, default: 1)

List of reference image URLs for context


elements (list, default: 1)

List of elements with reference and frontal images. Each element should have 'reference_image_urls' (list) and 'frontal_image_url' (str)


keep_audio (bool, default: 1)

Keep the original audio from the input video
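
The attributes above can be assembled into a request body as in the sketch below. The field names (`prompt`, `video_url`, `image_urls`, `elements`, `reference_image_urls`, `frontal_image_url`, `keep_audio`) come from the attribute list. The `buildEditRequest` helper, its camelCase options, and the example.com URLs are hypothetical and used only for illustration. The validation rules are assumptions about what a well-formed request looks like, not documented server behavior.

```javascript
// Hypothetical helper: builds a request body from the documented attributes.
function buildEditRequest({ prompt, videoUrl, imageUrls, elements, keepAudio }) {
  // prompt and video_url are the two required attributes.
  if (typeof prompt !== 'string' || prompt.trim() === '') {
    throw new Error('prompt is required');
  }
  if (typeof videoUrl !== 'string' || videoUrl.trim() === '') {
    throw new Error('video_url is required');
  }

  const body = { prompt, video_url: videoUrl };

  // Optional list of reference image URLs.
  if (Array.isArray(imageUrls) && imageUrls.length > 0) {
    body.image_urls = imageUrls;
  }

  // Each element needs reference_image_urls (list) and frontal_image_url (str).
  if (Array.isArray(elements) && elements.length > 0) {
    body.elements = elements.map((el, i) => {
      if (!Array.isArray(el.reference_image_urls) || typeof el.frontal_image_url !== 'string') {
        throw new Error(`elements[${i}] needs reference_image_urls and frontal_image_url`);
      }
      return {
        reference_image_urls: el.reference_image_urls,
        frontal_image_url: el.frontal_image_url,
      };
    });
  }

  // Only send keep_audio when the caller set it explicitly.
  if (typeof keepAudio === 'boolean') {
    body.keep_audio = keepAudio;
  }
  return body;
}

// Example with placeholder URLs (not real assets).
const examplePayload = buildEditRequest({
  prompt: 'Replace the character in the video with @Element1, maintaining the same movements',
  videoUrl: 'https://example.com/input.mp4',
  elements: [{
    reference_image_urls: ['https://example.com/side.jpg'],
    frontal_image_url: 'https://example.com/front.jpg',
  }],
  keepAudio: true,
});
console.log(JSON.stringify(examplePayload, null, 2));
```

The resulting object can be passed as the second argument to `axios.post` in place of a hand-written body.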

To keep track of your credit usage, inspect the response headers of each API call. The x-remaining-credits header indicates the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
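
Reading the credit header might look like the sketch below. It assumes the header value is a plain number, which the documentation does not state explicitly. The `getRemainingCredits` helper is a hypothetical name. It accepts a plain headers object, such as the one exposed as `response.headers` by axios.

```javascript
// Hypothetical helper: extracts the remaining credit count from response headers.
// Assumes the x-remaining-credits value parses as a number.
function getRemainingCredits(headers) {
  // Look the header up case-insensitively, since header casing can vary by client.
  const key = Object.keys(headers || {}).find(
    (name) => name.toLowerCase() === 'x-remaining-credits'
  );
  if (key === undefined) {
    return null; // header not present
  }
  const value = Number(headers[key]);
  return Number.isFinite(value) ? value : null;
}

// Usage after a successful call:
//   const credits = getRemainingCredits(response.headers);
//   if (credits !== null && credits < 10) console.warn('Low credits:', credits);
```

The threshold of 10 in the usage comment is arbitrary; pick one that matches the cost of your typical request.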

Kling Video O1: AI-Powered Video Editing Model

What is Kling Video O1?

Kling Video O1 is an AI video generation and editing model that lets creators manipulate video content through natural language commands. Unlike traditional video editing tools that require frame-by-frame manual work, Kling Video O1 uses a unified multimodal architecture to interpret and execute complex editing tasks, including background replacements, character swaps, object removal, and video inpainting, while maintaining natural motion dynamics and visual coherence. The model supports output up to 1080p (extendable to 2K/4K) at 30 FPS, which makes it suitable for professional workflows in film, advertising, and content creation.

Key Features

  • Natural Language Video Editing: Execute complex edits using simple text prompts with reference syntax (@Element1, @image1)
  • Motion-Aware Processing: Maintains fluid dynamics, fabric movement, and realistic camera motion throughout edits
  • Multi-Angle Character Replacement: Swap characters or objects across different angles without manual masking
  • Environment & Background Swapping: Seamlessly replace scenes while preserving foreground subjects
  • Object Manipulation: Remove, replace, or modify specific elements with surgical precision
  • Video Inpainting & Interpolation: Fill gaps and smooth transitions automatically
  • High-Resolution Output: Generate videos up to 1080p at 30 FPS, with 2K/4K scaling capabilities
  • Audio Preservation: Optional audio retention during editing operations

Best Use Cases

Film & Video Production: Quickly swap locations, replace props, or modify backgrounds in post-production without reshoots.

Advertising & Marketing: Create multiple variations of product videos by changing environments, colors, or featured elements.

Content Creation: YouTubers and social media creators can repurpose content by updating backgrounds or removing unwanted objects.

Visual Effects: Pre-visualization and concept testing for complex VFX shots before committing to expensive production.

E-commerce: Generate product videos in different settings or lifestyle contexts from a single source video.

Prompt Tips and Output Quality

Use Reference Syntax: Leverage the @ notation to precisely control edits. For example: "Replace the sky with @image1 while keeping buildings intact" tells the model exactly what to modify and what to preserve.

Be Specific About Preservation: Always indicate what elements should remain unchanged. "Swap background to @image2, maintaining character motion" produces better results than vague instructions.

Describe Desired Motion: When working with dynamic scenes, specify motion characteristics: "Change environment to @image1 with smooth camera pan" helps maintain natural cinematography.

Supply Quality References: The image_urls parameter accepts visual references that guide style and composition. Higher-quality reference images yield more accurate replacements.

Element Definition: For complex character or object swaps, use the elements parameter with both reference and frontal images for consistent multi-angle modifications.

Audio Considerations: Set keep_audio to match your use case. Disable it when the edit changes the environment enough that the original audio would no longer fit the scene. Enable it for purely visual edits where the original audio should carry over.
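
Because the @ references in a prompt must line up with the inputs you actually supply, a small pre-flight check can catch mismatches before a request is sent. The sketch below assumes that references are numbered from 1 and that `@imageN` maps to the Nth entry of `image_urls` and `@ElementN` to the Nth entry of `elements`. It also matches names case-insensitively, which is a further assumption. `findUnresolvedReferences` is a hypothetical helper, not part of the API.

```javascript
// Hypothetical helper: returns the @ references in a prompt that have no matching input.
// Assumes 1-based numbering: @image1 -> image_urls[0], @Element1 -> elements[0].
function findUnresolvedReferences(prompt, imageCount, elementCount) {
  const unresolved = [];
  for (const match of prompt.matchAll(/@(image|element)(\d+)/gi)) {
    const kind = match[1].toLowerCase();
    const index = Number(match[2]);
    const available = kind === 'image' ? imageCount : elementCount;
    if (index < 1 || index > available) {
      unresolved.push(match[0]);
    }
  }
  return unresolved;
}

// Example: two images referenced but only one supplied.
const missing = findUnresolvedReferences(
  'Replace foreground with @image1 and background with @image2.',
  1, // number of entries in image_urls
  0  // number of entries in elements
);
console.log(missing);
```

An empty result means every reference in the prompt has a corresponding input under the stated numbering assumption.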

FAQs

How is Kling Video O1 different from Runway or Veo?
Kling Video O1 emphasizes motion integrity and natural camera control in complex editing tasks. Its unified multimodal architecture handles character replacement and environment swapping without the manual masking that traditional tools require, and it is positioned as preserving more realistic physics and fluid dynamics than Runway Aleph or Google Veo 3.1.

Can I use Kling Video O1 for commercial projects?
Yes, Kling Video O1 is suitable for professional workflows including film production, advertising campaigns, and commercial content creation. Always check your specific licensing terms for commercial usage rights.

What video formats and lengths does it support?
The model accepts videos via URL input and supports standard web formats. While specific length limits depend on your implementation, the model is optimized for editing clips commonly used in social media and advertising (typically under 60 seconds).

Do I need to manually mask areas for editing?
No. Kling Video O1 automatically understands spatial relationships and object boundaries through its AI architecture. Simply describe what you want to change in your prompt, and the model handles segmentation and masking internally.

What resolution should my input video be?
The model outputs up to 1080p at 30 FPS (with 2K/4K extension capability). For best results, use input videos at or above your desired output resolution. Lower-quality inputs may limit detail in the final output.

How do I reference multiple images in a single edit?
Use the image_urls parameter to provide an array of reference images, then reference them in your prompt using @image1, @image2, etc. This allows complex edits like "Replace foreground with @image1 and background with @image2."