const axios = require('axios');

const api_key = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/dubbing";

const data = {
  "source_url": "https://segmind-resources.s3.amazonaws.com/input/dubbing-source-audio.mp3",
  "target_lang": "hi",
  "source_lang": "en",
  "num_speakers": 1,
  "highest_resolution": true,
  "drop_background_audio": false,
  "use_profanity_filter": false,
  "disable_voice_cloning": false,
  "mode": "automatic"
};

(async function() {
  try {
    const response = await axios.post(url, data, { headers: { 'x-api-key': api_key } });
    console.log(response.data);
  } catch (error) {
    // error.response is undefined for network-level failures, so guard before reading it
    console.error('Error:', error.response ? error.response.data : error.message);
  }
})();

source_url: URL of the source video/audio file to be dubbed. Supports MP3, MP4, and other common formats.
target_lang: The target language to dub the content into. Use 'hi' for Hindi, 'es' for Spanish, 'fr' for French.
source_lang: Language of the source audio. Use 'auto' to detect automatically, or specify it for faster processing.
num_speakers: Number of speakers in the audio (minimum 0). Set to 0 for auto-detection or specify a count for multi-speaker content.
start_time: Start time in seconds for partial dubbing (minimum 0). Useful for dubbing specific segments of longer content.
end_time: End time in seconds for partial dubbing (minimum 0). Combine with start_time to dub a specific clip segment.
highest_resolution: Enable to output the highest available video resolution. Recommended for professional or broadcast content.
drop_background_audio: Remove background audio from the dubbed output. Best for clean speeches, podcasts, or monologues.
use_profanity_filter: Censors profanities in transcripts with '[censored]'. Ideal for family-friendly or corporate content.
disable_voice_cloning: Use ElevenLabs library voices instead of cloning the original speaker's voice. Good for generic dubbing.
CSV URL: URL to a CSV file with custom transcription or translation data. Use with 'manual' mode only.
Foreground audio URL: URL to a custom foreground audio file. Only applicable when using CSV-based manual dubbing mode.
Background audio URL: URL to a custom background audio file. Only applicable when using CSV-based manual dubbing mode.
target_accent: Experimental accent to apply to dubbed voices. Try values like 'american', 'british', or 'australian'.
mode: Dubbing mode selection. Use 'automatic' for standard use or 'manual' when providing a CSV transcript.
FPS: Frames per second for CSV timecode parsing (minimum 0). Leave empty to infer FPS automatically from the file.
To keep track of your credit usage, inspect the response headers of each API call: the x-remaining-credits header indicates the number of credits remaining in your account. Monitor this value to avoid any disruptions in your API usage.
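As a sketch, the header can be read straight off the axios response object; the helper name below is illustrative, and it only assumes the x-remaining-credits header described above (axios lower-cases header keys):

```javascript
// Extract remaining credits from an axios-style headers object.
// Returns null when the header is absent (e.g. on some error responses).
function remainingCredits(headers) {
  const value = headers && headers['x-remaining-credits'];
  return value === undefined || value === null ? null : Number(value);
}

// Example usage with an object shaped like axios's response.headers:
const credits = remainingCredits({ 'x-remaining-credits': '412' });
if (credits !== null && credits < 100) {
  console.warn(`Only ${credits} credits left`);
}
```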
ElevenLabs Dubbing is an AI-powered dubbing API that translates audio and video content into 29 languages while preserving each speaker's original voice, tone, emotion, and timing. Unlike traditional localization workflows that require re-recording with human voice actors, ElevenLabs Dubbing automates the entire pipeline — from speaker separation and transcription to translation, speech synthesis, and audio re-sync — in a single API call.
Built on ElevenLabs' Multilingual v2 model, it handles complex real-world content: overlapping dialogue, background music, ambient noise, whispers, and shouted lines. The result is natural-sounding, voice-cloned multilingual audio that maintains your original speaker's identity across languages.
Prefer source_lang: 'auto' unless you know the source language precisely; auto-detection is accurate and simplifies your workflow. For content with a known, fixed language, specifying it directly speeds up processing.
Set num_speakers manually for dense dialogue. The default auto-detection works well for 1–3 speakers, but for panel discussions, interviews, or multi-character audio, providing an explicit count improves speaker separation quality significantly.
Use start_time and end_time for iteration. When testing output quality on long-form video, dub a representative 2–3 minute segment first before committing to full-file processing.
Keep drop_background_audio: false for most content. ElevenLabs Dubbing's ability to retain background music is a core differentiator — disabling it is best reserved for clean voiceover or podcast-only content.
Enable highest_resolution: true when dubbing video destined for broadcast, YouTube, or professional distribution.
Avoid CSV/manual mode in production. The manual mode with custom CSV transcripts is experimental and better suited for testing edge cases, not live pipelines.
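The segment-testing and speaker-count tips above can be sketched as a small request builder. The helper name and sample URL are illustrative, but the parameters are the documented ones:

```javascript
// Build a request body that dubs only a short test segment of a longer file.
// startSec/endSec map to the start_time/end_time parameters (in seconds).
function buildSegmentTest(sourceUrl, targetLang, startSec, endSec, numSpeakers) {
  return {
    source_url: sourceUrl,
    target_lang: targetLang,
    start_time: startSec,
    end_time: endSec,
    num_speakers: numSpeakers,    // explicit count helps dense dialogue
    drop_background_audio: false, // keep music/SFX, per the tip above
    mode: "automatic"
  };
}

// Dub minutes 1:00–4:00 of a two-speaker interview as a quality check:
const body = buildSegmentTest("https://example.com/interview.mp4", "es", 60, 240, 2);
// POST `body` to https://api.segmind.com/v1/dubbing as in the snippet above.
```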
How long does dubbing take for a 30-minute video?
Processing is asynchronous and scales with content length. A 30-minute video can take several minutes to process. Use the status polling endpoint to check job completion; avoid setting fixed timeouts.
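Because the status endpoint and its response shape are not specified here, the sketch below keeps the polling logic generic: checkFn is a placeholder for whatever status request your integration makes, and the status strings are assumed names.

```javascript
// Poll a job until it reaches a terminal state, instead of using a fixed
// timeout. checkFn is any async function returning a status string such as
// 'processing', 'done', or 'failed' (placeholder names, not a documented enum).
async function pollUntilDone(checkFn, { intervalMs = 10000, maxTries = 180 } = {}) {
  for (let i = 0; i < maxTries; i++) {
    const status = await checkFn();
    if (status === 'done' || status === 'failed') return status;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('dubbing job did not reach a terminal state');
}
```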
Which audio and video formats are supported?
The API accepts MP3, MP4, and most common audio/video formats via URL. You can also pass direct URLs from YouTube, TikTok, or cloud storage buckets.
Does voice cloning work for all 29 languages?
Yes — voice cloning is applied by default across all supported languages using the Multilingual v2 model. Set disable_voice_cloning: true if you prefer generic ElevenLabs library voices instead.
What happens to background music during dubbing?
By default, background audio (music, ambient sound, SFX) is separated and re-layered into the dubbed output. Set drop_background_audio: true only if you want a clean speech-only track.
Can I target a specific accent for dubbed voices?
The target_accent parameter (e.g., "american", "british") is available but experimental. It's not recommended for production use and may produce inconsistent results across languages.
Is there a character or length limit?
The API applies a character limit of approximately 3,000 characters per minute of content. Plan your content segmentation accordingly for very long or text-dense files.
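Under the approximately 3,000 characters/minute figure above, a quick budget check can flag files that need splitting before you submit them (the helper is illustrative):

```javascript
// True when a transcript fits the approximate per-minute character budget.
function fitsCharacterBudget(transcriptChars, durationMinutes, charsPerMinute = 3000) {
  return transcriptChars <= durationMinutes * charsPerMinute;
}

// A 30-minute video has a budget of roughly 90,000 characters:
const ok = fitsCharacterBudget(85000, 30);        // fits
const tooDense = fitsCharacterBudget(120000, 30); // needs segmentation
```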