POST

javascript

const axios = require('axios');

// Replace with your Segmind API key.
const api_key = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/pixverse-lipsync";

// Both fields are required and must be direct URLs.
const data = {
  "video_url": "https://segmind-inference-inputs.s3.amazonaws.com/60bf2999-263a-4927-9fa7-f7baa9017188-e1101c56-6f75-4879-9783-d6800b1edd9b.mp4",
  "audio_url": "https://segmind-inference-inputs.s3.amazonaws.com/2f714073-f5d2-4fce-b772-72179e845873-48e09aa9-15af-4592-b83f-be2dac195df3.mp3"
};

(async function() {
    try {
        const response = await axios.post(url, data, { headers: { 'x-api-key': api_key } });
        console.log(response.data);
    } catch (error) {
        // error.response is undefined when no HTTP response arrived (e.g. a network failure).
        console.error('Error:', error.response ? error.response.data : error.message);
    }
})();

RESPONSE

image/jpeg

HTTP Response Codes

200 - OK: Image Generated

401 - Unauthorized: User authentication failed

404 - Not Found: The requested URL does not exist

405 - Method Not Allowed: The requested HTTP method is not allowed

406 - Not Acceptable: Not enough credits

500 - Server Error: Server had some issue with processing
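
The codes listed above can be mapped to readable messages in client code. The following is a minimal sketch, not part of any Segmind library: `describeStatus` and `STATUS_MESSAGES` are hypothetical names, and the messages are copied from the list above.

```javascript
// Hypothetical lookup table built from the documented HTTP response codes.
const STATUS_MESSAGES = {
  200: 'OK: Image Generated',
  401: 'Unauthorized: User authentication failed',
  404: 'Not Found: The requested URL does not exist',
  405: 'Method Not Allowed: The requested HTTP method is not allowed',
  406: 'Not Acceptable: Not enough credits',
  500: 'Server Error: Server had some issue with processing'
};

// Returns the documented meaning of a status code, or a generic
// fallback for codes that the list does not cover.
function describeStatus(status) {
  return STATUS_MESSAGES[status] || `Unexpected status ${status}`;
}
```

In an axios `catch` block, this could be called as `describeStatus(error.response.status)` once `error.response` is known to be defined.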

Attributes

video_url (str, required)

Provide the direct URL of the video. Try using high-resolution video for best results.

audio_url (str, required)

Provide the direct URL of the audio. Use clear and distinct audio for accurate syncing.
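
Because both attributes are required, it can help to check the payload before spending a request. This is a minimal sketch under stated assumptions: `validatePayload` is a hypothetical helper, and restricting URLs to http(s) is an assumption about what counts as a "direct URL", not a documented rule.

```javascript
// Hypothetical helper: returns a list of problems found in the payload.
// An empty list means both required attributes look usable.
function validatePayload(payload) {
  const errors = [];
  for (const field of ['video_url', 'audio_url']) {
    const value = payload[field];
    if (typeof value !== 'string' || value.length === 0) {
      errors.push(`${field} is required`);
      continue;
    }
    try {
      // The URL constructor throws on strings that are not absolute URLs.
      const parsed = new URL(value);
      // Assumption: only http(s) links are treated as direct URLs.
      if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
        errors.push(`${field} must be an http(s) URL`);
      }
    } catch (e) {
      errors.push(`${field} is not a valid URL`);
    }
  }
  return errors;
}
```

Calling `validatePayload(data)` before `axios.post` lets the client report a missing or malformed field without making a request.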

To keep track of your credit usage, inspect the response headers of each API call. The x-remaining-credits header indicates the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
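
Reading that value from an axios response could look like the sketch below. `remainingCredits` is a hypothetical helper, and it assumes the header carries a numeric value; axios exposes response header names in lowercase.

```javascript
// Hypothetical helper: extracts the remaining credit count from a
// response's headers object (for axios, pass response.headers).
// Returns null when the header is absent or not numeric.
function remainingCredits(headers) {
  const raw = headers['x-remaining-credits'];
  if (raw === undefined || raw === null) {
    return null;
  }
  const value = Number(raw);
  return Number.isNaN(value) ? null : value;
}
```

Inside the request's `try` block, `remainingCredits(response.headers)` would give the current balance, which the client can log or compare against a threshold of its choosing.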

PixVerse Speech: AI Lip Synchronization Model

What is PixVerse Speech?

PixVerse Speech is an advanced AI model that creates natural, precisely synchronized lip movements for video content by matching mouth animations with audio input. This cutting-edge lip-sync technology enables creators to generate professional-quality talking videos where the speaker's lip movements perfectly align with the accompanying speech or audio track. Whether you're working with pre-recorded videos or generating new content through the API, PixVerse Speech ensures seamless integration between visual and audio elements.

Key Features

High-Precision Lip Synchronization: Advanced AI algorithms ensure accurate mouth movement matching with audio
Flexible Input Options: Supports both video uploads and API-generated video content
Multi-Audio Support: Compatible with various audio types including speech, singing, and advertisements
Multilingual Capability: Handles lip-sync across multiple languages effectively
Real-time Processing: Delivers synchronized content through the API, with status monitoring while a job is processed

Best Use Cases

Content Creation: YouTube videos, educational content, and virtual presentations
Entertainment Production: Animation dubbing, music videos, and film localization
Digital Marketing: Promotional videos, product demonstrations, and advertisements
Virtual Assistants: Creating engaging AI spokespersons and digital avatars
E-Learning: Developing interactive educational content with synchronized speech

Prompt Tips and Output Quality

Video Quality: Use high-resolution video input for optimal results and clearer lip movements
Audio Clarity: Provide clear, well-recorded audio files for more accurate synchronization
Frame Rate Consideration: Maintain consistent frame rates between input video and desired output
Face Positioning: Ensure the speaker's face is clearly visible and well-lit in the video
Audio-Video Length: Match audio duration closely with video length for best results

FAQs

Q: What video formats does PixVerse Speech support? A: The model accepts standard video formats through direct URL input, with high-resolution videos recommended for optimal results.

Q: Can I use PixVerse Speech for multiple languages? A: Yes, the model supports lip synchronization across various languages and accents.

Q: How does the audio input process work? A: You can provide audio through a direct URL, ensuring the audio is clear and distinct for accurate synchronization.

Q: What's the typical processing time for lip synchronization? A: Processing time varies based on video length and complexity, with real-time status monitoring available through the API.

Q: Can I use this for both pre-recorded and generated videos? A: Yes, PixVerse Speech works with both user-uploaded videos and videos generated through the PixVerse API.
