POST
javascript
const axios = require('axios');
const fs = require('fs');
const path = require('path');

// Reads a local image file and returns its contents base64-encoded.
async function toB64(imgPath) {
    const data = fs.readFileSync(path.resolve(imgPath));
    return Buffer.from(data).toString('base64');
}

const api_key = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/ominicontrol";

const data = {
  // The image attribute is documented as a URL, so the link is passed directly
  // rather than through toB64, which only reads local files.
  "image": "https://segmind-sd-models.s3.us-east-1.amazonaws.com/display_images/Object+24.png",
  "prompt": "photo of this orange sofa in a modern living room",
  "steps": 8,
  "seed": 4710825087,
  "image_format": "png",
  "image_quality": 90,
  "base64": false
};

(async function() {
    try {
        const response = await axios.post(url, data, { headers: { 'x-api-key': api_key } });
        console.log(response.data);
    } catch (error) {
        // error.response is undefined on network failures, so fall back to the message.
        console.error('Error:', error.response ? error.response.data : error.message);
    }
})();
RESPONSE
image/jpeg
HTTP Response Codes
200 - OK: Image Generated
401 - Unauthorized: User authentication failed
404 - Not Found: The requested URL does not exist
405 - Method Not Allowed: The requested HTTP method is not allowed
406 - Not Acceptable: Not enough credits
500 - Server Error: Server had some issue with processing
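
The codes above can be turned into readable messages on the client. The following is a minimal sketch, assuming a hypothetical helper named describeStatus; the table is copied from the list above, and the fallback text for unlisted codes is an assumption.

```javascript
// Status codes and meanings copied from the list above.
const STATUS_MESSAGES = {
  200: 'OK: Image Generated',
  401: 'Unauthorized: User authentication failed',
  404: 'Not Found: The requested URL does not exist',
  405: 'Method Not Allowed: The requested HTTP method is not allowed',
  406: 'Not Acceptable: Not enough credits',
  500: 'Server Error: Server had some issue with processing'
};

// Hypothetical helper: returns a readable message for a status code,
// or a generic fallback for codes that are not in the table.
function describeStatus(status) {
  return STATUS_MESSAGES[status] || `Unexpected status ${status}`;
}

console.log(describeStatus(406));
```

In an axios catch block, describeStatus(error.response.status) could replace a raw dump of the error body, provided error.response is defined.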

Attributes


image : image *

URL of the input image.


prompt : str *

Prompt for the image generation.


steps : int ( default: 8 )

Number of inference steps for image generation.

min : 4,

max : 40


seed : int ( default: 12467 )

Random seed for generation


image_format : enum:str ( default: png )

Output image format

Allowed values:


image_quality : int ( default: 95 )

Image quality setting for output

min : 10,

max : 100


base64 : boolean ( default: 1 )

Base64 encoding of the output image.
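
The ranges and defaults listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official SDK: the helper names checkRange and buildPayload are hypothetical, and the decision to throw on out-of-range values (rather than clamp them) is an assumption.

```javascript
// Limits and defaults taken from the attribute list above.
const LIMITS = {
  steps: { min: 4, max: 40, fallback: 8 },
  image_quality: { min: 10, max: 100, fallback: 95 }
};

// Hypothetical helper: applies the documented default when the value is
// omitted and throws if the value is not an integer in the documented range.
function checkRange(name, value) {
  const { min, max, fallback } = LIMITS[name];
  const v = value === undefined ? fallback : value;
  if (!Number.isInteger(v) || v < min || v > max) {
    throw new RangeError(`${name} must be an integer between ${min} and ${max}`);
  }
  return v;
}

// Hypothetical helper: builds a request body and rejects missing required fields.
// Any other attributes (seed, image_format, base64) are passed through unchanged.
function buildPayload({ image, prompt, steps, image_quality, ...rest }) {
  if (!image) throw new TypeError('image is required');
  if (!prompt) throw new TypeError('prompt is required');
  return {
    image,
    prompt,
    steps: checkRange('steps', steps),
    image_quality: checkRange('image_quality', image_quality),
    ...rest
  };
}

const payload = buildPayload({
  image: 'https://segmind-sd-models.s3.us-east-1.amazonaws.com/display_images/Object+24.png',
  prompt: 'photo of this orange sofa in a modern living room',
  steps: 8
});
console.log(payload.steps, payload.image_quality);
```

Failing early on the client avoids spending a request on a value the API would presumably reject.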

To keep track of your credit usage, inspect the response headers of each API call. The x-remaining-credits header indicates the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
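
Reading the credit balance might look like the sketch below. The helper name remainingCredits, the stand-in headers object, and the warning threshold of 50 are assumptions for illustration; the header is also assumed to carry a plain numeric value.

```javascript
// Hypothetical helper: reads x-remaining-credits from a headers object.
// Header names are matched case-insensitively; returns null when the
// header is absent or its value is not a finite number.
function remainingCredits(headers) {
  for (const [name, value] of Object.entries(headers || {})) {
    if (name.toLowerCase() === 'x-remaining-credits') {
      const credits = Number(value);
      return Number.isFinite(credits) ? credits : null;
    }
  }
  return null;
}

// Stand-in headers object; the value is made up for illustration.
const exampleHeaders = { 'X-Remaining-Credits': '42.5' };
const credits = remainingCredits(exampleHeaders);
if (credits !== null && credits < 50) {
  console.warn(`Low balance: ${credits} credits left`);
}
```

With axios, the same helper could be called as remainingCredits(response.headers) after each request.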

OminiControl

OminiControl is a framework that extends Diffusion Transformer (DiT) models with image-conditioned control for image generation tasks. It stands out for its parameter efficiency and universal control features, which make it suitable for a wide range of image conditioning tasks.

Key Features of OminiControl

  • Minimal Architectural Changes: OminiControl achieves its functionality with only 0.1% additional parameters compared to traditional methods, significantly reducing the complexity associated with model modifications.

  • Unified Control Mechanism: The framework integrates various image conditioning tasks—such as subject-driven generation and spatially-aligned conditions (e.g., edges and depth)—into a single model architecture, allowing for versatile applications without the need for separate modules.

  • Parameter Reuse Mechanism: By leveraging existing components within the DiT architecture, OminiControl minimizes the need for additional control modules, which are common in other frameworks like ControlNet and T2I-Adapter.

Technical Innovations of OminiControl

  • Multi-Modal Attention Processing: OminiControl utilizes a multi-modal attention mechanism that allows for flexible interactions between condition tokens and noisy image tokens. This approach facilitates both spatially aligned and non-aligned tasks without rigid spatial constraints.

  • Dynamic Positioning Strategy: The model employs a dynamic positioning strategy for condition tokens, which adjusts based on whether the task is spatially aligned or not. This flexibility enhances performance across diverse generation scenarios.

  • Automated Data Synthesis Pipeline: To support its training, OminiControl introduces a novel data synthesis pipeline that generates high-quality, identity-consistent images. This pipeline has produced the Subjects200K dataset, comprising over 200,000 images tailored for subject-driven generation tasks.
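
The dynamic positioning idea can be illustrated with plain index arithmetic. This is a conceptual sketch, not OminiControl's actual code: the function names gridPositions and positionsFor, the 2 x 2 grid, and the choice to shift non-aligned condition tokens by one grid width are all assumptions made for illustration.

```javascript
// Conceptual sketch only; names, grid size, and offset are illustrative assumptions.
// Returns [row, col] position indices for a height x width grid of tokens,
// with every column index shifted by colOffset.
function gridPositions(height, width, colOffset = 0) {
  const positions = [];
  for (let row = 0; row < height; row++) {
    for (let col = 0; col < width; col++) {
      positions.push([row, col + colOffset]);
    }
  }
  return positions;
}

// Spatially aligned tasks (e.g. edges, depth): condition tokens reuse the
// image token positions. Non-aligned tasks (e.g. subject-driven generation):
// condition tokens are shifted so they do not overlap the image token positions.
function positionsFor(height, width, spatiallyAligned) {
  const imagePositions = gridPositions(height, width);
  const conditionPositions = spatiallyAligned
    ? gridPositions(height, width)
    : gridPositions(height, width, width);
  return { imagePositions, conditionPositions };
}

const aligned = positionsFor(2, 2, true);
const shifted = positionsFor(2, 2, false);
console.log(aligned.conditionPositions[0], shifted.conditionPositions[0]);
```

In the aligned case each condition token shares a position with the image token it describes; in the shifted case the two sets of positions are disjoint, so attention is not tied to a pixel-to-pixel correspondence.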

Benefits of Using OminiControl

  • OminiControl excels in generating images based on specific subjects. This capability is particularly useful in industries such as advertising and media, where personalized content is essential.

  • The model supports advanced image editing tasks, including filling in missing parts of an image seamlessly; creating images that adhere to specified edge outlines, which is useful in graphic design and illustration; and changing or enhancing backgrounds while preserving the integrity of the main subjects.