POST

javascript

const axios = require('axios');
const fs = require('fs');
const path = require('path');

// Return an image as a base64 string. Accepts an http(s) URL or a local file path.
async function toB64(imgPath) {
    if (/^https?:\/\//i.test(imgPath)) {
        const res = await axios.get(imgPath, { responseType: 'arraybuffer' });
        return Buffer.from(res.data).toString('base64');
    }
    return fs.readFileSync(path.resolve(imgPath)).toString('base64');
}

const api_key = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/v-express";

(async function() {
    try {
        // Build the payload inside the async function so that toB64() can be awaited.
        const data = {
          "input_image": await toB64('https://segmind-sd-models.s3.amazonaws.com/display_images/v_express/v-express-ip.jpg'),
          "input_audio": "https://segmind-sd-models.s3.amazonaws.com/display_images/v_express/v_express_audio.mp3",
          "fps": 30,
          "num_inference_steps": 20,
          "guidance_scale": 2,
          "retarget_strategy": "fix_face",
          "base64": false
        };

        const response = await axios.post(url, data, { headers: { 'x-api-key': api_key } });
        console.log(response.data);
    } catch (error) {
        // error.response is undefined for network-level failures, so fall back to the message.
        console.error('Error:', error.response ? error.response.data : error.message);
    }
})();

RESPONSE

image/jpeg

HTTP Response Codes

200 - OK: Image generated

401 - Unauthorized: User authentication failed

404 - Not Found: The requested URL does not exist

405 - Method Not Allowed: The requested HTTP method is not allowed

406 - Not Acceptable: Not enough credits

500 - Server Error: Server had some issue with processing
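
On the client side, the response codes listed above can be mapped to readable messages. The following is a minimal sketch only: `describeStatus` and `STATUS_MESSAGES` are hypothetical names introduced for illustration, and the messages are copied from the list above rather than taken from any Segmind client library.

```javascript
// Status descriptions copied from the response-code list in this document.
const STATUS_MESSAGES = {
  200: 'OK: Image generated',
  401: 'Unauthorized: User authentication failed',
  404: 'Not Found: The requested URL does not exist',
  405: 'Method Not Allowed: The requested HTTP method is not allowed',
  406: 'Not Acceptable: Not enough credits',
  500: 'Server Error: Server had some issue with processing',
};

// Hypothetical helper: return a readable message for an HTTP status code.
// Codes that are not in the list get a generic fallback message.
function describeStatus(status) {
  return STATUS_MESSAGES[status] || `Unexpected status ${status}`;
}
```

In an axios error handler, this could be called as `describeStatus(error.response.status)` whenever `error.response` is defined.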

Attributes

input_image: image *

Input image of a talking head.

input_audio: str *

Input audio file. Avoid special symbols in the filename, as they may cause ffmpeg errors.

fps: int ( default: 30 ) Affects Pricing

Output frames per second.

min: 10, max: 60

num_inference_steps: int ( default: 20 ) Affects Pricing

Number of steps to generate.

min: 5, max: 50

guidance_scale: int ( default: 2 ) Affects Pricing

Scale for classifier-free guidance.

min: 1, max: 15

retarget_strategy: str ( default: fix_face ) Affects Pricing

Retarget strategy.

base64: boolean ( default: 1 )

Base64 encoding of the output image.
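
The documented ranges can be checked before a request is sent, which avoids spending a call on a payload the server would reject. The sketch below is illustrative: `validatePayload` and `RANGES` are hypothetical names, the limits are copied from the attribute list above, and treating the `*` marker as "required" is an assumption.

```javascript
// Numeric limits copied from the attribute list in this document.
const RANGES = {
  fps: { min: 10, max: 60 },
  num_inference_steps: { min: 5, max: 50 },
  guidance_scale: { min: 1, max: 15 },
};

// Hypothetical helper: return a list of problems found in a request payload.
// An empty list means no problem was detected by these checks.
function validatePayload(payload) {
  const problems = [];
  // Assumption: attributes marked with "*" are required.
  for (const field of ['input_image', 'input_audio']) {
    if (!payload[field]) problems.push(`${field} is required`);
  }
  for (const [field, { min, max }] of Object.entries(RANGES)) {
    const value = payload[field];
    if (value === undefined) continue; // omitted, so the documented default applies
    if (!Number.isInteger(value) || value < min || value > max) {
      problems.push(`${field} must be an integer between ${min} and ${max}`);
    }
  }
  return problems;
}
```

A caller might run `validatePayload(data)` before posting and abort if the returned list is not empty.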

To keep track of your credit usage, inspect the response headers of each API call. The x-remaining-credits header reports the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
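
Reading the x-remaining-credits value after each call can be sketched as follows. `remainingCredits` is a hypothetical helper, and the sketch assumes that the headers are available as an object keyed by lowercase header names (as axios exposes them in Node.js) and that the value is a plain number; neither assumption is confirmed by this page.

```javascript
// Hypothetical helper: extract the remaining credit count from response headers.
// Returns null when the header is missing or is not a number.
function remainingCredits(headers) {
  const raw = headers['x-remaining-credits'];
  if (raw === undefined || raw === '') return null;
  const value = Number(raw);
  return Number.isNaN(value) ? null : value;
}
```

After a request, `remainingCredits(response.headers)` could be logged or compared against a threshold of your choosing.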

V-Express

V-Express is a generative model that creates portrait videos from a single image. It combines deep learning techniques with progressive training and conditional dropout operations, and it conditions generation on pose, the input image, and audio. V-Express addresses the challenge of balancing control signals of different strengths: whether the signal is text, audio, pose, or an image reference, the model is designed so that weaker conditions still contribute effectively to the final output.

Applications of V-Express

Content Creation: Writers, filmmakers, and artists can harness V-Express to craft moving narratives. Imagine generating heartfelt monologues or poignant dialogues effortlessly.
Chatbots with Empathy: Mental health chatbots powered by V-Express can empathize with users. When words alone aren’t enough, V-Express bridges the gap.
Character Animation: Game designers and animators can breathe life into characters. V-Express infuses emotions into their expressions, making them relatable.
Music Videos: V-Express isn’t limited to faces. It can create soul-stirring music videos, syncing lyrics with visuals.
