const axios = require('axios');
const FormData = require('form-data');

const api_key = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/veo-2-image2video";

// Request parameters: source image URL, text prompt, clip length in seconds, and aspect ratio
const reqBody = {
  "image": "https://segmind-resources.s3.amazonaws.com/input/65a44af9-de76-474e-898f-57e83b0ff3b3-a-photograph-of-a-giant-panda-swimming-i_k_2X05UAQYWIOoZJLuVpyg_eFEiq19KSYO9FzXR4tdibQ.jpeg",
  "prompt": "A photograph of a giant panda swimming in a crystal-clear outdoor pool. The panda is gracefully paddling with its black and white fur glistening in the sunlight, its playful expression clearly visible. The pool is surrounded by lush green foliage and colorful flowers, with a wooden deck leading to a grassy lawn. Soft, natural light bathes the scene, highlighting the water's clarity and the panda's movements.",
  "duration": 5,
  "aspect_ratio": "16:9"
};

(async function() {
  try {
    const formData = new FormData();
    // Append each request field to the multipart form
    for (const key in reqBody) {
      if (Object.prototype.hasOwnProperty.call(reqBody, key)) {
        formData.append(key, reqBody[key]);
      }
    }
    // The image is passed as a URL here, so no Base64 conversion is needed
    const response = await axios.post(url, formData, {
      headers: {
        'x-api-key': api_key,
        ...formData.getHeaders()
      }
    });
    console.log(response.data);
  } catch (error) {
    // Print the API's error body when there is one, otherwise the client-side message
    console.error('Error:', error.response ? error.response.data : error.message);
  }
})();
For new variations. Leave empty to use a random number.
Input image that acts as the starting frame for the generated video. For best results, use a 16:9 or 9:16 image at 1280x720 or 720x1280, depending on the aspect ratio you choose.
Prompt for video generation
Number of seconds of video to generate.
Allowed values:
Aspect ratio for the video.
Allowed values:
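The parameters described above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Segmind API: `validateRequest` is a hypothetical helper, it treats the image and prompt as required (an assumption), and it accepts only the 16:9 and 9:16 aspect ratios mentioned in the image guidance, since the allowed values are not listed here.

```javascript
// Hypothetical client-side check; not part of the Segmind API.
// Assumption: only the two aspect ratios mentioned in the image guidance are accepted.
const ASSUMED_ASPECT_RATIOS = ['16:9', '9:16'];

function validateRequest(body) {
  const errors = [];
  // Assumption: image and prompt are both required, non-empty strings.
  if (typeof body.image !== 'string' || body.image.length === 0) {
    errors.push('image is required');
  }
  if (typeof body.prompt !== 'string' || body.prompt.trim().length === 0) {
    errors.push('prompt is required');
  }
  // The allowed durations are not listed here, so only check for a positive whole number.
  if (!Number.isInteger(body.duration) || body.duration <= 0) {
    errors.push('duration must be a positive whole number of seconds');
  }
  if (!ASSUMED_ASPECT_RATIOS.includes(body.aspect_ratio)) {
    errors.push('aspect_ratio must be one of ' + ASSUMED_ASPECT_RATIOS.join(', '));
  }
  return errors;
}

// Usage: an empty array means no problems were found.
console.log(validateRequest({
  image: 'https://example.com/frame.jpeg',
  prompt: 'A panda swimming',
  duration: 5,
  aspect_ratio: '16:9'
}));
```

Returning a list of messages rather than throwing on the first problem lets a caller report every issue at once before spending credits on a request.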
To keep track of your credit usage, inspect the response headers of each API call. The x-remaining-credits header indicates the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
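Reading that header could look like the sketch below. It assumes the headers arrive as an object keyed by lowercase header name (as axios exposes them in `response.headers`) and that the header value is numeric. The helper names and the threshold of 10 credits are hypothetical.

```javascript
// Reads the remaining-credit count from a response's headers.
// Assumption: headers is an object keyed by lowercase header name,
// and the x-remaining-credits value parses as a number.
function getRemainingCredits(headers) {
  const raw = headers ? headers['x-remaining-credits'] : undefined;
  if (raw === undefined || raw === null || raw === '') return null;
  const credits = Number(raw);
  return Number.isNaN(credits) ? null : credits;
}

// Hypothetical threshold below which a warning is logged.
const LOW_CREDIT_THRESHOLD = 10;

function warnIfLow(headers) {
  const credits = getRemainingCredits(headers);
  if (credits !== null && credits < LOW_CREDIT_THRESHOLD) {
    console.warn(`Only ${credits} credits remaining`);
  }
  return credits;
}

// Usage after an API call would be: warnIfLow(response.headers);
console.log(warnIfLow({ 'x-remaining-credits': '42.5' }));
```

Returning `null` when the header is missing or not numeric keeps a malformed response from being mistaken for a zero balance.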
Google Veo 2, developed by Google DeepMind, is an AI video generation model that turns static images into high-quality videos. An upgrade to its predecessor, Veo, it delivers realistic motion and cinematic visuals, making it a useful tool for developers and creators looking to streamline video production. The model is now accessible through Segmind.
Veo 2 converts images into videos with impressive realism. Google claims support for resolutions up to 4K and durations exceeding two minutes, though current early access limits outputs to 720p and 8 seconds. It offers control over camera angles, lens types, and cinematic effects, allowing users to specify details like "low-angle tracking shot" or "18mm lens." The model’s improved understanding of real-world physics produces natural movement, such as fluid dynamics or human expressions, making it well suited to lifelike video content.
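Camera directions like these go into the prompt text itself. The sketch below shows one way to append them to a scene description. `buildPrompt` is a hypothetical helper, and the "Shot:" and "Lens:" labels are only an illustrative convention: the prompt is free-form text, and nothing here implies a required syntax.

```javascript
// Hypothetical helper: appends camera directions to a scene description.
// The "Shot:" / "Lens:" labels are an illustrative convention, not a required syntax.
function buildPrompt(scene, camera = {}) {
  const parts = [scene.trim()];
  if (camera.shot) parts.push(`Shot: ${camera.shot}.`);
  if (camera.lens) parts.push(`Lens: ${camera.lens}.`);
  return parts.join(' ');
}

const prompt = buildPrompt(
  'A giant panda swimming in a crystal-clear outdoor pool.',
  { shot: 'low-angle tracking shot', lens: '18mm lens' }
);
console.log(prompt);
```

The resulting string can be passed as the prompt field of the request body.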
In head-to-head comparisons on MovieGenBench, a dataset by Meta featuring 1,003 prompts, Veo 2 outperformed competitors like OpenAI’s Sora Turbo and Meta’s MovieGen. Human raters favored Veo 2 for overall preference and prompt adherence, with standout scores against Sora Turbo (58.8% preference) and Minimax (55.7% accuracy). Tested at 720p, Veo 2’s 8-second clips demonstrated superior detail and realism compared to shorter 5-second outputs from other models.
Veo 2 struggles to maintain consistency in complex scenes or intricate motions, occasionally producing artifacts like inconsistent textures or errors in human features (e.g., hands). Early access restrictions (capped resolution and duration) also limit its full potential, though future updates may address these. Complex prompts can sometimes overwhelm the model, leading to deviations from the intended output.
Veo 2 is versatile for developers and creators alike. Filmmakers can prototype scenes, marketers can craft engaging ads from product images, and educators can animate static visuals for lessons. Social media creators benefit from its ability to produce polished vlogs or influencer-style videos, while developers can integrate it into apps via Google Veo 2 APIs for automated video generation.
User feedback has been largely positive, with creators praising Veo 2’s realistic physics and prompt fidelity. Reviews highlight its image-to-video feature as a game-changer, though some note its higher cost compared to rivals. Early testers appreciate the natural results, such as smooth transitions and lifelike movements, but a few criticize lingering inconsistencies. Overall, the creative community sees Veo 2 as a leap forward and awaits broader access and refinements.