POST

javascript

const axios = require('axios');


const api_key = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/sts-eleven-labs";

const data = {
  "input_audio": "https://segmind-sd-models.s3.amazonaws.com/display_images/sad_talker/sad_talker_audio_input.mp3",
  "voice": "Rachel",
  "seed": 0,
  "remove_background_noise": false
};

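(async function() {
    try {
        const response = await axios.post(url, data, { headers: { 'x-api-key': api_key } });
        console.log(response.data);
    } catch (error) {
        // error.response is undefined when no HTTP response arrived (e.g. a network failure)
        console.error('Error:', error.response ? error.response.data : error.message);
    }
})();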

RESPONSE

image/jpeg
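
The sample request logs response.data directly, which is of little use when the body is binary. The following is a minimal sketch, assuming the endpoint returns raw bytes and that the request is made with the standard axios option responseType: 'arraybuffer'. The helper names (extensionFor, saveBinaryResponse) and the extension table are hypothetical and not part of the Segmind API; the file extension is chosen from whatever Content-Type header the response actually carries.

```javascript
const fs = require('fs');

// Illustrative mapping from Content-Type to file extension (an assumption,
// not an exhaustive list of what the API may return).
const EXTENSIONS = {
  'image/jpeg': '.jpg',
  'audio/mpeg': '.mp3',
  'audio/wav': '.wav',
};

// Pick an extension for a Content-Type header value, ignoring parameters
// such as "; charset=...". Falls back to ".bin" for unknown types.
function extensionFor(contentType) {
  const mime = String(contentType || '').split(';')[0].trim().toLowerCase();
  return EXTENSIONS[mime] || '.bin';
}

// Write the raw response body to `<basePath><ext>` and return the full path.
function saveBinaryResponse(body, contentType, basePath) {
  const target = basePath + extensionFor(contentType);
  fs.writeFileSync(target, Buffer.from(body));
  return target;
}

// Usage with axios (shown as a comment so the sketch makes no network call):
//   const response = await axios.post(url, data, {
//     headers: { 'x-api-key': api_key },
//     responseType: 'arraybuffer',
//   });
//   saveBinaryResponse(response.data, response.headers['content-type'], 'output');
```

Keeping the network call separate from the file handling makes the saving logic easy to check without spending credits.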

HTTP Response Codes

200 - OK: Image Generated

401 - Unauthorized: User authentication failed

404 - Not Found: The requested URL does not exist

405 - Method Not Allowed: The requested HTTP method is not allowed

406 - Not Acceptable: Not enough credits

500 - Server Error: Server had some issue with processing
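
As a rough sketch of how a client might act on these codes, the helper below maps a status to its documented meaning. The function names (describeStatus, explainAxiosError) are hypothetical, and treating only 500 as retryable is an assumption rather than documented behavior. The error shape (error.response.status) is the one axios uses for non-2xx responses.

```javascript
// Status codes and meanings taken from the list above.
const STATUS_MESSAGES = {
  200: 'OK',
  401: 'Unauthorized: user authentication failed',
  404: 'Not Found: the requested URL does not exist',
  405: 'Method Not Allowed: the requested HTTP method is not allowed',
  406: 'Not Acceptable: not enough credits',
  500: 'Server Error: server had some issue with processing',
};

// Assumption: only a 500 is worth retrying; the other errors need a change
// on the caller's side (API key, URL, HTTP method, or credit balance).
function describeStatus(status) {
  const message = STATUS_MESSAGES[status] || `Unexpected status ${status}`;
  return { ok: status === 200, retryable: status === 500, message };
}

// Turn an axios error into a one-line explanation. error.response is only
// present when the server actually replied.
function explainAxiosError(error) {
  if (error && error.response) {
    return describeStatus(error.response.status).message;
  }
  return `No response received: ${error && error.message}`;
}
```

In the sample request, the catch block could call explainAxiosError(error) instead of reading error.response.data directly.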

Attributes

input_audio str *

Input Audio URL

voice enum:str ( default: Rachel )

Voice name

Allowed values:

voice_id str ( default: 1 )

ElevenLabs voice ID (e.g., '21m00Tcm4TlvDq8ikWAM'). If not provided, voice parameter will be used.

model_id enum:str *

Model identifier

Allowed values:

voice_settings str ( default: 1 )

Voice settings overriding stored settings for the given voice. They are applied only on the given request. Needs to be sent as a JSON encoded string.
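
Because voice_settings must be a JSON encoded string rather than a nested object, the value has to pass through JSON.stringify before it goes into the request body. A minimal sketch follows; the helper withVoiceSettings is hypothetical, and the field names stability and similarity_boost are assumed from the ElevenLabs voice settings object, since this page does not list them.

```javascript
// Assumed field names; check the ElevenLabs voice settings reference
// before relying on them.
const voiceSettings = { stability: 0.5, similarity_boost: 0.75 };

// Return a copy of the payload with voice_settings encoded as a JSON string.
function withVoiceSettings(payload, settings) {
  return { ...payload, voice_settings: JSON.stringify(settings) };
}

const payload = withVoiceSettings(
  { input_audio: 'https://example.com/input.mp3', voice: 'Rachel' },
  voiceSettings
);

console.log(payload.voice_settings);
// prints {"stability":0.5,"similarity_boost":0.75}
```

Sending the settings as a plain object instead of a string would not match the str type given for this attribute.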

seed int ( default: 1 )

Seed for reproducible dialogue generation. Use 0 for random.

min: 0, max: 999999999999999

remove_background_noise boolean ( default: 1 )

If set, will remove the background noise from your audio input using our audio isolation model. Only applies to Voice Changer.
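
The constraints in the attribute list can be checked on the client before a request is sent. The sketch below is one way to do that, with a hypothetical helper named buildPayload. It validates only what this page states (input_audio is required, seed is an integer between 0 and 999999999999999, remove_background_noise is a boolean). model_id is marked required above, but its allowed values are not shown here, so it is passed through unchecked.

```javascript
const SEED_MIN = 0;
// Below Number.MAX_SAFE_INTEGER, so the value is represented exactly.
const SEED_MAX = 999999999999999;

// Validate the documented constraints and return a shallow copy of the
// options to use as the request body. Fields not checked here (voice,
// voice_id, model_id, voice_settings) are passed through unchanged.
function buildPayload(options) {
  const { input_audio, seed, remove_background_noise } = options;

  if (typeof input_audio !== 'string' || input_audio.length === 0) {
    throw new TypeError('input_audio must be a non-empty URL string');
  }
  if (seed !== undefined &&
      (!Number.isInteger(seed) || seed < SEED_MIN || seed > SEED_MAX)) {
    throw new RangeError(`seed must be an integer in [${SEED_MIN}, ${SEED_MAX}]`);
  }
  if (remove_background_noise !== undefined &&
      typeof remove_background_noise !== 'boolean') {
    throw new TypeError('remove_background_noise must be a boolean');
  }
  return { ...options };
}
```

Failing early on the client avoids spending a request on a payload the server would reject.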

To keep track of your credit usage, inspect the response headers of each API call. The x-remaining-credits header indicates the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
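
A minimal sketch of reading that value from an axios response follows. axios exposes header names in lower case, so the lookup uses 'x-remaining-credits'. The helper names and the threshold of 10 are hypothetical, and the sketch assumes the header carries a plain number.

```javascript
// Parse the x-remaining-credits header from a headers object.
// Returns a number, or null when the header is missing or not numeric.
function remainingCredits(headers) {
  const raw = headers ? headers['x-remaining-credits'] : undefined;
  if (raw === undefined || raw === null || String(raw).trim() === '') {
    return null;
  }
  const value = Number(raw);
  return Number.isFinite(value) ? value : null;
}

// Arbitrary illustrative threshold; choose one that fits your usage.
const LOW_CREDIT_THRESHOLD = 10;

// True when the header is present and below the threshold.
function isRunningLow(headers, threshold = LOW_CREDIT_THRESHOLD) {
  const credits = remainingCredits(headers);
  return credits !== null && credits < threshold;
}

// Usage after a request (shown as a comment so no network call is made):
//   const response = await axios.post(url, data, { headers: { 'x-api-key': api_key } });
//   if (isRunningLow(response.headers)) console.warn('Credits are running low');
```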

ElevenLabs Speech-to-Speech

Eleven Labs Speech-to-Speech (STS) uses deep learning to convert the voice in a recording into a different voice. It lets users modify several aspects of recorded speech, with applications in content creation, media production, and accessibility.

Core Functionalities of Eleven Labs Speech-to-Speech

Speaker Identity Conversion: Transform the speaker's voice in an audio file while preserving the original content. Choose from a library of diverse voice styles and genders for a customized output.
Emotional Style Transfer: Infuse the converted speech with desired emotions, such as happiness, anger, or sadness. This functionality enhances the expressiveness and impact of audio content.
Language Translation with Voice Conversion: Achieve seamless audio translation while maintaining a natural-sounding voice in the target language. This feature expands the reach and accessibility of multilingual content.
Real-time Voice Cloning: Generate a synthetic voice clone that replicates a specific speaker's voice characteristics. This allows for voiceover creation or speech modification tasks.
Advanced Audio Editing: Utilize functionalities like noise reduction, silence removal, and audio mixing for professional-grade audio editing within the Eleven Labs platform.

Benefits of Utilizing Eleven Labs Speech-to-Speech

Content Personalization: Enhance the engagement of your audience by tailoring the voice and emotional delivery of audio content.
Accessibility Improvements: Create multilingual audio content with natural-sounding voices, removing language barriers for global audiences.
Streamlined Content Creation: Generate voiceovers or modify existing audio speech efficiently, accelerating production workflows.
Preserving Speaker Identity: Maintain the speaker's voice characteristics while enhancing audio quality or modifying language for broader reach.
Creative Voice Exploration: Experiment with diverse voice styles and emotions to inject new life into your audio projects.
