const axios = require('axios');

const api_key = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/sts-eleven-labs";

// Request body: source audio URL, target voice, seed, and noise-removal flag.
const data = {
  "input_audio": "https://segmind-sd-models.s3.amazonaws.com/display_images/sad_talker/sad_talker_audio_input.mp3",
  "voice": "Sarah",
  "seed": 0,
  "remove_background_noise": false
};

(async function() {
  try {
    const response = await axios.post(url, data, { headers: { 'x-api-key': api_key } });
    console.log(response.data);
  } catch (error) {
    // error.response is undefined when no response was received (e.g. a network failure).
    console.error('Error:', error.response ? error.response.data : error.message);
  }
})();
Input Audio URL
Voice name
Allowed values:
ElevenLabs voice ID (e.g., '21m00Tcm4TlvDq8ikWAM'). If not provided, voice parameter will be used.
Model identifier
Allowed values:
Voice settings overriding stored settings for the given voice. They are applied only on the given request. Needs to be sent as a JSON encoded string.
Seed for reproducible dialogue generation. Use 0 for random.
min: 0, max: 999999999999999
If set, will remove the background noise from your audio input using our audio isolation model. Only applies to Voice Changer.
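The parameter descriptions above can be turned into a small payload builder. This is a minimal sketch, not the API's official client: `buildStsPayload` is a hypothetical helper, the `voice_settings` key name is an assumption (the parameter name is not shown in this page), and treating the seed as an integer is also an assumption. The keys `input_audio`, `voice`, `seed`, and `remove_background_noise` are taken from the request sample.

```javascript
// Sketch only: names marked "assumed" are not confirmed by this page.
const SEED_MIN = 0;
const SEED_MAX = 999999999999999;

// Hypothetical helper that assembles a request body from the documented parameters.
function buildStsPayload({ inputAudio, voice, seed = 0, removeBackgroundNoise = false, voiceSettings }) {
  if (typeof inputAudio !== 'string' || inputAudio.length === 0) {
    throw new TypeError('inputAudio must be a non-empty URL string');
  }
  // The documented range is 0 to 999999999999999; integer-only is an assumption.
  if (!Number.isInteger(seed) || seed < SEED_MIN || seed > SEED_MAX) {
    throw new RangeError(`seed must be an integer between ${SEED_MIN} and ${SEED_MAX}`);
  }
  const payload = {
    input_audio: inputAudio,
    voice,
    seed,
    remove_background_noise: removeBackgroundNoise,
  };
  if (voiceSettings !== undefined) {
    // Assumed key name. The description requires a JSON-encoded string, not a nested object.
    payload.voice_settings = JSON.stringify(voiceSettings);
  }
  return payload;
}
```

The returned object could be passed as the `data` argument of `axios.post` in the request sample. Validating the seed locally avoids spending a request on a value outside the documented range.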
To track your credit usage, inspect the response headers of each API call. The x-remaining-credits header reports the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
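Reading the credit header might look like the sketch below. `remainingCredits` and `isRunningLow` are hypothetical helpers, and the sketch assumes the header value is numeric and that the headers object exposes lowercase header names (as axios does in Node.js).

```javascript
// Hypothetical helper: extract the remaining credit count from a response headers object.
// Returns null when the header is missing or is not a number.
function remainingCredits(headers) {
  const raw = headers['x-remaining-credits'];
  if (raw === undefined || raw === null || raw === '') {
    return null;
  }
  const value = Number(raw);
  return Number.isNaN(value) ? null : value;
}

// Hypothetical helper: true when the credit count is known and below the given threshold.
function isRunningLow(headers, threshold) {
  const credits = remainingCredits(headers);
  return credits !== null && credits < threshold;
}
```

With the request sample, this would be called as `remainingCredits(response.headers)` after the `axios.post` call resolves. Returning `null` for a missing header keeps "unknown" distinct from "zero credits".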
Eleven Labs Speech-to-Speech (STS) uses deep learning to convert the voice in an audio recording. It lets users modify various aspects of recorded speech for applications in content creation, media production, and accessibility.
Speaker Identity Conversion: Transform the speaker's voice in an audio file while preserving the original content. Choose from a library of diverse voice styles and genders for a customized output.
Emotional Style Transfer: Infuse the converted speech with desired emotions, such as happiness, anger, or sadness. This functionality enhances the expressiveness and impact of audio content.
Language Translation with Voice Conversion: Achieve seamless audio translation while maintaining a natural-sounding voice in the target language. This feature expands the reach and accessibility of multilingual content.
Real-time Voice Cloning: Generate a synthetic voice clone that replicates a specific speaker's voice characteristics. This allows for voiceover creation or speech modification tasks.
Advanced Audio Editing: Utilize functionalities like noise reduction, silence removal, and audio mixing for professional-grade audio editing within the Eleven Labs platform.
Content Personalization: Enhance the engagement of your audience by tailoring the voice and emotional delivery of audio content.
Accessibility Improvements: Create multilingual audio content with natural-sounding voices, removing language barriers for global audiences.
Streamlined Content Creation: Generate voiceovers or modify existing audio speech efficiently, accelerating production workflows.
Preserving Speaker Identity: Maintain the speaker's voice characteristics while enhancing audio quality or modifying language for broader reach.
Creative Voice Exploration: Experiment with diverse voice styles and emotions to inject new life into your audio projects.