POST

javascript

const axios = require('axios');


const api_key = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/kling-o1-video-to-video-edit";

const data = {
  "prompt": "Change the background of the video to the background in @image1 while keeping the character as it is.",
  "video_url": "https://segmind-inference-inputs.s3.amazonaws.com/c511b717-f9dd-4624-8b8d-8f5a0f688a5d-3044799-hd_1280_720_24fps (1).mp4",
  "image_urls": [
    "https://www.shutterstock.com/image-photo/nicely-decorated-pergola-pots-blue-600nw-507200902.jpg"
  ],
  "keep_audio": false
};

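// Send the edit request and log the response body.
(async function() {
    try {
        const response = await axios.post(url, data, { headers: { 'x-api-key': api_key } });
        console.log(response.data);
    } catch (error) {
        // error.response is undefined when no response arrived (for example, a network failure).
        console.error('Error:', error.response ? error.response.data : error.message);
    }
})();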

RESPONSE

image/jpeg

HTTP Response Codes

200 - OK: Image Generated

401 - Unauthorized: User authentication failed

404 - Not Found: The requested URL does not exist

405 - Method Not Allowed: The requested HTTP method is not allowed

406 - Not Acceptable: Not enough credits

500 - Server Error: Server had some issue with processing
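
The codes above can be turned into readable messages on the client. The following is a minimal sketch, not part of the Segmind API: `describeStatus` and `describeError` are hypothetical helper names, and the error shape (`error.response.status`) is the one axios uses when the server answers with a non-2xx status.

```javascript
// Status codes and meanings taken from the list above.
const STATUS_MESSAGES = {
  200: 'OK',
  401: 'Unauthorized: user authentication failed',
  404: 'Not Found: the requested URL does not exist',
  405: 'Method Not Allowed: the requested HTTP method is not allowed',
  406: 'Not Acceptable: not enough credits',
  500: 'Server Error: the server had an issue with processing',
};

// Hypothetical helper: returns a readable message for a status code.
function describeStatus(status) {
  return STATUS_MESSAGES[status] || `Unexpected status ${status}`;
}

// Hypothetical helper: describes an axios-style error.
// error.response is undefined when the request never received a response.
function describeError(error) {
  if (error && error.response) {
    return describeStatus(error.response.status);
  }
  const reason = error && error.message ? error.message : 'unknown error';
  return `No response received: ${reason}`;
}
```

In the request example, `describeError(error)` could replace the expression logged in the `catch` block.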

Attributes

prompt (str, required)

Describe the edit. Use @Element1, @Element2 to reference elements. Example: 'Replace the character in the video with @Element1, maintaining the same movements'

video_url (str, required)

URL of the input video to edit

image_urls (list, default: 1)

List of reference image URLs for context

elements (list, default: 1)

List of elements with reference and frontal images. Each element should have 'reference_image_urls' (list) and 'frontal_image_url' (str)

keep_audio (bool, default: 1)

Keep the original audio from the input video
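
To show how these attributes fit together, here is a rough sketch of a request body builder. `buildPayload` is a hypothetical helper, not something the API provides; the field names and the shape of each element come from the attribute descriptions above. Defaulting `keep_audio` to `false` simply mirrors the request example and is an assumption, as is omitting the optional lists when they are empty.

```javascript
// Hypothetical helper: assembles and sanity-checks a request body.
function buildPayload({ prompt, videoUrl, imageUrls = [], elements = [], keepAudio = false }) {
  // prompt and video_url are the two required attributes.
  if (typeof prompt !== 'string' || prompt.trim() === '') {
    throw new Error('prompt is required');
  }
  if (typeof videoUrl !== 'string' || videoUrl.trim() === '') {
    throw new Error('video_url is required');
  }
  // Each element needs a list of reference images and one frontal image.
  for (const element of elements) {
    if (!Array.isArray(element.reference_image_urls) ||
        typeof element.frontal_image_url !== 'string') {
      throw new Error('each element needs reference_image_urls (list) and frontal_image_url (str)');
    }
  }
  const payload = { prompt, video_url: videoUrl, keep_audio: keepAudio };
  // Assumption: optional lists are left out of the body when empty.
  if (imageUrls.length > 0) payload.image_urls = imageUrls;
  if (elements.length > 0) payload.elements = elements;
  return payload;
}
```

The returned object can be passed as the `data` argument of `axios.post` in the request example.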

To keep track of your credit usage, inspect the response headers of each API call. The x-remaining-credits header indicates the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
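
A minimal sketch of reading that header follows. `remainingCredits` is a hypothetical helper; it assumes the header value is numeric and that header names arrive in lowercase, as they do on axios responses in Node.js.

```javascript
// Hypothetical helper: reads x-remaining-credits from a response headers object.
// Returns a number, or null when the header is missing or not numeric.
function remainingCredits(headers) {
  const raw = headers ? headers['x-remaining-credits'] : undefined;
  if (raw === undefined || raw === null || raw === '') {
    return null;
  }
  const value = Number(raw);
  return Number.isNaN(value) ? null : value;
}

// Usage with an axios response (not executed here):
//   const response = await axios.post(url, data, { headers: { 'x-api-key': api_key } });
//   console.log('Credits left:', remainingCredits(response.headers));
```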

Kling Video O1: AI-Powered Video Editing Model

What is Kling Video O1?

Kling Video O1 is an advanced AI video generation and editing model that transforms how creators manipulate video content through natural language commands. Unlike traditional video editing tools that require frame-by-frame manual work, Kling Video O1 uses a unified multimodal architecture to intelligently understand and execute complex editing tasks—from background replacements and character swaps to object removal and video inpainting—all while maintaining natural motion dynamics and visual coherence. The model supports high-resolution output up to 1080p (extendable to 2K/4K) at 30 FPS, making it production-ready for professional workflows in film, advertising, and content creation.

Key Features

Natural Language Video Editing: Execute complex edits using simple text prompts with reference syntax (@Element1, @image1)
Motion-Aware Processing: Maintains fluid dynamics, fabric movement, and realistic camera motion throughout edits
Multi-Angle Character Replacement: Swap characters or objects across different angles without manual masking
Environment & Background Swapping: Seamlessly replace scenes while preserving foreground subjects
Object Manipulation: Remove, replace, or modify specific elements with surgical precision
Video Inpainting & Interpolation: Fill gaps and smooth transitions automatically
High-Resolution Output: Generate videos up to 1080p at 30 FPS, with 2K/4K scaling capabilities
Audio Preservation: Optional audio retention during editing operations

Best Use Cases

Film & Video Production: Quickly swap locations, replace props, or modify backgrounds in post-production without reshoots.

Advertising & Marketing: Create multiple variations of product videos by changing environments, colors, or featured elements.

Content Creation: YouTubers and social media creators can repurpose content by updating backgrounds or removing unwanted objects.

Visual Effects: Pre-visualization and concept testing for complex VFX shots before committing to expensive production.

E-commerce: Generate product videos in different settings or lifestyle contexts from a single source video.

Prompt Tips and Output Quality

Use Reference Syntax: Leverage the @ notation to precisely control edits. For example: "Replace the sky with @image1 while keeping buildings intact" tells the model exactly what to modify and what to preserve.

Be Specific About Preservation: Always indicate what elements should remain unchanged. "Swap background to @image2, maintaining character motion" produces better results than vague instructions.

Describe Desired Motion: When working with dynamic scenes, specify motion characteristics: "Change environment to @image1 with smooth camera pan" helps maintain natural cinematography.

Supply Quality References: The image_urls parameter accepts visual references that guide style and composition. Higher-quality reference images yield more accurate replacements.

Element Definition: For complex character or object swaps, use the elements parameter with both reference and frontal images for consistent multi-angle modifications.

Audio Considerations: Set keep_audio based on your use case. Disable it when the edit changes the environment and the original audio would no longer match the scene; enable it for purely visual edits where the original soundtrack still fits.

FAQs

How is Kling Video O1 different from Runway or Veo?
Kling Video O1 excels in motion integrity and natural camera control, outperforming competitors in accuracy for complex editing tasks. Its unified multimodal architecture handles character replacement and environment swapping without the manual masking required by traditional tools, while maintaining more realistic physics and fluid dynamics than Runway Aleph or Google Veo 3.1.

Can I use Kling Video O1 for commercial projects?
Yes, Kling Video O1 is suitable for professional workflows including film production, advertising campaigns, and commercial content creation. Always check your specific licensing terms for commercial usage rights.

What video formats and lengths does it support?
The model accepts videos via URL input and supports standard web formats. While specific length limits depend on your implementation, the model is optimized for editing clips commonly used in social media and advertising (typically under 60 seconds).

Do I need to manually mask areas for editing?
No. Kling Video O1 automatically understands spatial relationships and object boundaries through its AI architecture. Simply describe what you want to change in your prompt, and the model handles segmentation and masking internally.

What resolution should my input video be?
The model outputs up to 1080p at 30 FPS (with 2K/4K extension capability). For best results, use input videos at or above your desired output resolution. Lower-quality inputs may limit detail in the final output.

How do I reference multiple images in a single edit?
Use the image_urls parameter to provide an array of reference images, then reference them in your prompt using @image1, @image2, etc. This allows complex edits like "Replace foreground with @image1 and background with @image2."
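
Before sending a request, it can help to check that every @imageN reference in the prompt has a matching entry in image_urls. The sketch below does this under one assumption: that @image1 refers to the first entry of the array, @image2 to the second, and so on. `findMissingImageReferences` is a hypothetical helper name.

```javascript
// Hypothetical helper: lists @imageN references that have no matching image URL.
// Assumes 1-based numbering: @image1 maps to imageUrls[0].
function findMissingImageReferences(prompt, imageUrls) {
  const missing = [];
  for (const match of prompt.matchAll(/@image(\d+)/gi)) {
    const index = Number(match[1]);
    if (index < 1 || index > imageUrls.length) {
      missing.push(match[0]);
    }
  }
  return missing;
}

const prompt = 'Replace foreground with @image1 and background with @image2';
console.log(findMissingImageReferences(prompt, ['https://example.com/a.jpg']));
// prints [ '@image2' ]
```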
