const axios = require('axios');

// Replace with your Segmind API key.
const api_key = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/dubbing";

// Request payload for the dubbing endpoint.
const data = {
  "source_url": "https://segmind-sd-models.s3.amazonaws.com/display_images/dubbing-op.mp3",
  "target_lang": "hi",
  "source_lang": "auto",   // 'auto' detects the source language
  "num_speakers": 0,       // 0 detects the number of speakers
  "highest_resolution": false,
  "drop_background_audio": false,
  "use_profanity_filter": false,
  "disable_voice_cloning": false,
  "mode": "automatic"
};

(async function() {
  try {
    const response = await axios.post(url, data, { headers: { 'x-api-key': api_key } });
    console.log(response.data);
  } catch (error) {
    // error.response is undefined when the request never reached the server.
    console.error('Error:', error.response ? error.response.data : error.message);
  }
})();

URL of the source video/audio file.
The target language to dub the content into.
Source language. Set to 'auto' to automatically detect.
Number of speakers to use for dubbing. Set to 0 to automatically detect.
min: 0, max: 999999999999999
Start time of the source video/audio file in seconds.
min: 0, max: 999999999999999
End time of the source video/audio file in seconds.
min: 0, max: 999999999999999
Whether to use the highest resolution available.
Whether to drop background audio from the final dub. Improves quality for speeches or monologues.
[BETA] Whether transcripts should have profanities censored with '[censored]'.
Use voices from ElevenLabs Voice Library instead of voice cloning.
URL to CSV file containing transcription/translation metadata.
URL to foreground audio file. For use only with CSV input.
URL to background audio file. For use only with CSV input.
[Experimental] An accent to apply when selecting voices and informing translation dialect.
Dubbing mode. Use 'automatic' for auto-processing or 'manual' with CSV transcript.
Allowed values: automatic, manual
Frames per second for parsing CSV file. If not provided, FPS will be inferred from timecodes.
min: 0, max: 999999999999999
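The constraints described above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the official API: buildDubbingPayload is a hypothetical helper, the field names and defaults are copied from the request example, and treating source_url and target_lang as required and num_speakers as an integer are assumptions.

```javascript
// Hypothetical helper: fills in defaults and validates a dubbing request body.
// Field names and defaults mirror the request example; the checks are assumptions.
const ALLOWED_MODES = ['automatic', 'manual'];

function buildDubbingPayload(options) {
  const payload = {
    source_lang: 'auto',
    num_speakers: 0,
    highest_resolution: false,
    drop_background_audio: false,
    use_profanity_filter: false,
    disable_voice_cloning: false,
    mode: 'automatic',
    ...options, // caller-supplied values override the defaults
  };

  if (typeof payload.source_url !== 'string' || payload.source_url === '') {
    throw new Error('source_url is required');
  }
  if (typeof payload.target_lang !== 'string' || payload.target_lang === '') {
    throw new Error('target_lang is required');
  }
  if (!Number.isInteger(payload.num_speakers) || payload.num_speakers < 0) {
    throw new RangeError('num_speakers must be a non-negative integer');
  }
  if (!ALLOWED_MODES.includes(payload.mode)) {
    throw new RangeError("mode must be 'automatic' or 'manual'");
  }
  return payload;
}
```

The returned object can be passed as the request body, for example axios.post(url, buildDubbingPayload({ source_url: '...', target_lang: 'hi' }), { headers: { 'x-api-key': api_key } }).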
To track your credit usage, inspect the response headers of each API call. The x-remaining-credits header reports the number of credits remaining in your account. Monitor this value to avoid disruptions to your API usage.
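As an illustration of reading that header, here is a small sketch. The helper names remainingCredits and isRunningLow are hypothetical, and it is an assumption that the header value parses as a plain number. With axios on Node.js, response header names are available in lowercase.

```javascript
// Hypothetical helper: extracts the remaining-credit count from response headers.
// Returns null when the header is absent or is not a number (format is assumed).
function remainingCredits(headers) {
  const raw = headers ? headers['x-remaining-credits'] : undefined;
  if (raw === undefined || raw === null || raw === '') {
    return null;
  }
  const value = Number(raw);
  return Number.isFinite(value) ? value : null;
}

// Hypothetical helper: true when the credit count is known and below the threshold.
function isRunningLow(headers, threshold) {
  const credits = remainingCredits(headers);
  return credits !== null && credits < threshold;
}
```

After a successful call, remainingCredits(response.headers) gives the current balance, and isRunningLow(response.headers, 10) could trigger a warning before credits run out.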
ElevenLabs Dubbing is an AI model that translates and dubs audio content. It makes your audio multilingual so you can reach a wider audience without a traditional recording studio or separate voice actors for each target language.
Audio Input: Upload audio files directly.
Language Selection: The model can automatically identify the source language of your audio, or you can choose it manually from the list of supported languages. The model supports 29 languages, and you can dub your content between any pair of them.
Target Language Selection: Select the language you want your audio translated into. ElevenLabs offers 29 languages at present: Chinese, Korean, Dutch, Turkish, Swedish, Indonesian, Filipino, Japanese, Ukrainian, Greek, Czech, Finnish, Romanian, Russian, Danish, Bulgarian, Malay, Slovak, Croatian, Classic Arabic, Tamil, English, Polish, German, Spanish, French, Italian, Hindi and Portuguese.
AI-powered Dubbing: The model will translate the audio content while attempting to match the speaker's voice characteristics, intonation, and emotional delivery in the target language.
Simplified Workflow: Eliminate the need for traditional dubbing studios and voice actors for each target language. Translate and dub your audio content efficiently within a single platform.
Multilingual Reach: Expand the reach of your audio content by making it accessible to audiences speaking different languages.
Cost-effective Solution: Potentially reduce production costs associated with traditional dubbing methods.
Time-saving: Streamline your audio translation and dubbing process compared to conventional methods.
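The list of 29 supported languages given earlier can be turned into a simple client-side check. This is a sketch only: SUPPORTED_LANGUAGES and isSupportedLanguage are hypothetical names, and the check works on language names as listed here. The API itself takes language codes such as "hi", and the name-to-code mapping is not given in this document, so it is deliberately left out.

```javascript
// Language names copied from the supported-language list in this document.
const SUPPORTED_LANGUAGES = [
  'Chinese', 'Korean', 'Dutch', 'Turkish', 'Swedish', 'Indonesian',
  'Filipino', 'Japanese', 'Ukrainian', 'Greek', 'Czech', 'Finnish',
  'Romanian', 'Russian', 'Danish', 'Bulgarian', 'Malay', 'Slovak',
  'Croatian', 'Classic Arabic', 'Tamil', 'English', 'Polish', 'German',
  'Spanish', 'French', 'Italian', 'Hindi', 'Portuguese',
];

// Hypothetical helper: case-insensitive membership test on the language name.
function isSupportedLanguage(name) {
  if (typeof name !== 'string') {
    return false;
  }
  const wanted = name.trim().toLowerCase();
  return SUPPORTED_LANGUAGES.some((lang) => lang.toLowerCase() === wanted);
}
```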