const axios = require('axios');

const apiKey = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/ltx-2-19b-t2v";

// Request payload for the LTX-2 text-to-video endpoint.
const data = {
  "prompt": "A dynamic medium shot of a young Indian woman in casual streetwear walking quickly toward the camera through a modern tech office corridor, eyes wide with excitement, gesturing with her hands as she talks, bright daytime fluorescent lighting, smooth handheld tracking shot that keeps her centered in frame, colleagues and Segmind branding softly blurred in the background, natural office ambience as she walks and excitedly shouts in clear Indian English: 'L T X 2 video model is superfast, check it out its live on segmind'",
  "negative_prompt": "blurry, low quality, still frame, frames, watermark, overlay, titles, has blurbox, has subtitles",
  "width": 720,
  "height": 1280,
  "num_frames": 121,
  "fps": 24,
  "seed": 1234567890,
  "guidance_scale": 4
};

(async function () {
  try {
    const response = await axios.post(url, data, { headers: { 'x-api-key': apiKey } });
    console.log(response.data);
  } catch (error) {
    // error.response is undefined for network-level failures, so fall back to the message.
    console.error('Error:', error.response ? error.response.data : error.message);
  }
})();

prompt: Prompt describing the video scene
negative_prompt: Negative prompt to avoid certain elements
width: Width of the output video (min: 256, max: 1280)
height: Height of the output video (min: 256, max: 1280)
num_frames: Number of frames to generate (min: 1, max: 400)
fps: Frames per second (min: 1, max: 30)
seed: Random seed for reproducibility
guidance_scale: Guidance scale for prompt adherence (min: 1, max: 20)
To keep track of your credit usage, inspect the response headers of each API call: the x-remaining-credits header indicates the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
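For example, replacing the body of the try block in the sample code above, the balance can be read from the axios response object (axios normalizes header names to lowercase):

const response = await axios.post(url, data, { headers: { 'x-api-key': apiKey } });
// Response headers carry the remaining-credit balance alongside the payload.
console.log('Remaining credits:', response.headers['x-remaining-credits']);
console.log(response.data);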
LTX-2, by Lightricks, is a next-generation diffusion-based audio-visual foundation model that processes text, images, video clips, and audio prompts to generate synchronized video and audio. Because it produces aligned visual and sound outputs simultaneously, it goes well beyond conventional text-to-video models. LTX-2 features open weights and is tuned for efficient local deployment, making it well suited for practical, resource-conscious multimedia creation in multiple languages. The model is available in variants that favor either speed or fidelity and includes native spatial and temporal upscaling to boost resolution and frame rate.
Effective Prompt Structure: Best results come from detailed, action-oriented descriptions with clear subject placement (e.g., "A dynamic medium shot of..."), explicit lighting conditions, and camera-movement guidance; include specific audio cues when they are relevant to the video, as sketched below.
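As a minimal sketch, a prompt can be assembled from those components; the buildPrompt helper below is purely illustrative and not part of the Segmind API:

// Hypothetical helper: compose a prompt from the recommended components.
function buildPrompt({ shot, subject, action, lighting, camera, audio }) {
  return [shot, subject, action, lighting, camera, audio]
    .filter(Boolean)
    .join(', ');
}

const prompt = buildPrompt({
  shot: 'A dynamic medium shot',
  subject: 'of a young woman in casual streetwear',
  action: 'walking quickly toward the camera through an office corridor',
  lighting: 'bright daytime fluorescent lighting',
  camera: 'smooth handheld tracking shot that keeps her centered in frame',
  audio: 'natural office ambience as she speaks'
});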
Guidance Scale Tuning: Keep guidance_scale around 3-5 for balanced results; lower values (1-3) increase creativity and variation, while higher values (5-7) enforce stricter prompt adherence at the cost of naturalness.
Frame Rate Considerations: 24 FPS delivers standard cinematic motion; higher frame rates such as 30 FPS suit fast action sequences. Match num_frames to your desired video length: 121 frames at 24 FPS produces approximately 5 seconds (see the sketch below).
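A quick sanity check of the frame-count arithmetic (121 / 24 ≈ 5.04 s); these helpers are an illustrative sketch, not part of the API:

// Illustrative helpers for relating frame count, fps, and duration.
function durationSeconds(numFrames, fps) {
  return numFrames / fps;
}

function framesFor(seconds, fps) {
  // Cap at the API's documented maximum of 400 frames.
  return Math.min(Math.round(seconds * fps), 400);
}

console.log(durationSeconds(121, 24)); // ≈ 5.04 seconds
console.log(framesFor(10, 24));        // 240 frames for a ~10-second clip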
Resolution Best Practices: The right resolution yields the best quality-to-performance ratio: 720×1280 for portrait, 1280×720 for landscape. Avoid extreme aspect ratios and stay within the 256-1280 px range for each dimension.
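A minimal guard for those bounds, assuming no constraints beyond the documented 256-1280 range:

// Illustrative check: each dimension must fall within the documented range.
function assertResolution(width, height) {
  const inRange = (v) => v >= 256 && v <= 1280;
  if (!inRange(width) || !inRange(height)) {
    throw new Error('width and height must each be between 256 and 1280');
  }
}

assertResolution(720, 1280); // portrait
assertResolution(1280, 720); // landscape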
Negative Prompts Matter: Explicitly exclude "blurry, low quality, watermark, frames, still frame" to maintain output consistency and prevent common artifacts in generated videos.
Is LTX-2 open-source?
Yes. LTX-2 is released with open weights, allowing local deployment, custom fine-tuning, and integration into private workflows without external API dependencies.
How does LTX-2 differ from other video generation models?
LTX-2 generates synchronized audio and video while supporting multiple input modalities (text, image, audio), and it ships with spatial and temporal upscalers built into the model ecosystem.
What parameters should I adjust for faster generation?
To get output quickly, reduce num_frames (e.g., 60 instead of 121), lower the resolution (e.g., 720×720), and use the speed-optimized model variant. These changes significantly cut generation time while maintaining acceptable quality; an example payload follows.
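For instance, a speed-oriented payload along those lines (values are illustrative):

// Illustrative speed-oriented settings: fewer frames, smaller square resolution.
const fastData = {
  "prompt": "A medium shot of a woman walking through an office corridor",
  "width": 720,
  "height": 720,
  "num_frames": 60, // ~2.5 seconds at 24 FPS, versus ~5 seconds at 121 frames
  "fps": 24,
  "guidance_scale": 4
};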
Can I use LTX-2 for commercial projects?
Check Lightricks' licensing terms for commercial usage. Open weights typically imply flexible licensing, but verify usage rights for your specific application.
What's the difference between spatial and temporal upscalers?
Spatial upscalers increase resolution (pixel dimensions) for sharper detail, while temporal upscalers increase frame count for smoother motion.
Does seed value affect audio generation too?
Yes. The seed controls both video and audio generation, so the same seed value produces reproducible results across both modalities.