const axios = require('axios');
const FormData = require('form-data');

const api_key = "YOUR API-KEY"; // replace with your own API key
const url = "https://api.segmind.com/v1/hidream-l1-fast";

// Request parameters
const reqBody = {
  "seed": -1,
  "prompt": "a cute panda holding a sign that says \"Keep Calm and Keep Building\"",
  "model_type": "fast",
  "resolution": "1024 × 1024 (Square)",
  "speed_mode": "Lightly Juiced 🍊 (more consistent)",
  "output_format": "webp",
  "output_quality": 100
};

(async function () {
  try {
    // Send the parameters as multipart/form-data fields
    const formData = new FormData();
    for (const key in reqBody) {
      if (Object.prototype.hasOwnProperty.call(reqBody, key)) {
        formData.append(key, reqBody[key]);
      }
    }

    const response = await axios.post(url, formData, {
      headers: {
        'x-api-key': api_key,
        // Adds the multipart Content-Type header, including the boundary
        ...formData.getHeaders()
      }
    });
    console.log(response.data);
  } catch (error) {
    // Prefer the API's error payload when the server sent a response
    console.error('Error:', error.response ? error.response.data : error.message);
  }
})();
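| Parameter | Description | Constraints |
|----------------|-------------|-------------|
| seed | Seed (-1 for random) | |
| prompt | Prompt | |
| model_type | An enumeration | Enumerated value; the example request uses "fast" |
| resolution | Image resolution | Enumerated value; the example request uses "1024 × 1024 (Square)" |
| speed_mode | Quality vs. speed | Enumerated value; the example request uses "Lightly Juiced 🍊 (more consistent)" |
| output_format | Output format | Enumerated value; the example request uses "webp" |
| output_quality | Output image quality (for jpg and webp) | min: 1, max: 100 |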
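
The documented constraints can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `validateRequest` is a hypothetical helper, and it assumes that the prompt is required and that the seed is an integer. Enumerated fields are not checked, because their allowed values are not listed here.

```javascript
// Hypothetical helper (not part of the API): checks a request body against
// the documented constraints and returns a list of problems (empty if none).
function validateRequest(body) {
  const problems = [];

  // Assumption: a non-empty prompt is required.
  if (typeof body.prompt !== 'string' || body.prompt.trim() === '') {
    problems.push('prompt must be a non-empty string');
  }

  // Assumption: seed is an integer, with -1 meaning "random".
  if (body.seed !== undefined && !Number.isInteger(body.seed)) {
    problems.push('seed must be an integer (-1 for random)');
  }

  // Documented range: min 1, max 100.
  if (body.output_quality !== undefined) {
    const q = body.output_quality;
    if (!Number.isFinite(q) || q < 1 || q > 100) {
      problems.push('output_quality must be between 1 and 100');
    }
  }

  return problems;
}
```

Calling `validateRequest(reqBody)` before `axios.post` and aborting when the returned list is non-empty avoids spending a request on input the API would likely reject.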
To track your credit usage, inspect the response headers of each API call. The x-remaining-credits header reports the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
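
As a rough sketch of how the header might be read in Node.js: `getRemainingCredits` and `isRunningLow` are hypothetical helpers, and the code assumes the header value is numeric and that header names arrive lower-cased, which is how axios exposes them on `response.headers` in Node.js. The threshold of 10 is an arbitrary illustration.

```javascript
// Hypothetical helper: extract the remaining credit count from response headers.
// Assumes lower-cased header names and a numeric header value.
function getRemainingCredits(headers) {
  const raw = headers ? headers['x-remaining-credits'] : undefined;
  if (raw === undefined || raw === null || raw === '') {
    return null; // header missing
  }
  const value = Number(raw);
  return Number.isFinite(value) ? value : null; // null if not numeric
}

// Hypothetical helper: true when the remaining credits fall below a threshold.
function isRunningLow(headers, threshold = 10) {
  const remaining = getRemainingCredits(headers);
  return remaining !== null && remaining < threshold;
}
```

With the request from the example, this could be used as `getRemainingCredits(response.headers)` right after the `axios.post` call resolves.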
HiDream-I1 is a state-of-the-art, open-source text-to-image model built for exceptional image generation quality, accurate prompt adherence, and broad commercial usability. It's designed for creators, developers, and researchers looking for high performance without licensing constraints.
| Feature | Description |
|-------------------------------|-------------|
| Superior Image Quality | Consistently produces high-fidelity images across styles: photorealistic, cartoon, concept art, and more. Scores highly on the HPS v2.1 benchmark, which aligns with human aesthetic preferences. Great at rendering text within images. |
| Best-in-Class Prompt Following | Achieves top-tier scores on GenEval and DPG benchmarks. Outperforms all other open-source models in prompt accuracy, ensuring precise visual outputs from user instructions. |
| Open Source (MIT License) | Freely available for personal, academic, and commercial use. Ideal for developers and startups seeking to integrate a powerful model without licensing headaches. |
| Commercial-Ready | Outputs can be used for business applications like product mockups, ads, UI/UX design, and content creation, without additional licensing requirements. |
| Multiple Versions Available | Choose from: Full (highest quality), Dev (quality-performance balance), Fast (optimized for real-time use). |
| Component | Details |
|------------------|---------|
| Architecture | Based on Mixture of Experts (MoE) using a Diffusion Transformer (DiT) backbone for modular and efficient processing. |
| Text Encoders | Integrates multiple encoders for richer semantic understanding: OpenCLIP, OpenAI CLIP, T5-XXL, and Llama-3.1-8B-Instruct. |
| Routing | Uses dynamic routing to selectively activate expert pathways based on the input prompt, boosting both quality and efficiency. |
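
To make the routing idea concrete, here is a toy sketch of generic top-k expert routing. It is not HiDream-I1's actual router: the function names, the scalar "experts", and the gate logits are all illustrative assumptions. It only shows the general pattern of scoring experts, activating the best few, and mixing their outputs.

```javascript
// Toy illustration of top-k expert routing (generic technique, not the
// model's real implementation). All names here are hypothetical.

// Numerically stable softmax over an array of gate logits.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Select the k highest-scoring experts and renormalize their weights to sum to 1.
function routeTopK(gateLogits, k) {
  const ranked = softmax(gateLogits)
    .map((p, index) => ({ index, p }))
    .sort((a, b) => b.p - a.p)
    .slice(0, k);
  const total = ranked.reduce((acc, e) => acc + e.p, 0);
  return ranked.map((e) => ({ expert: e.index, weight: e.p / total }));
}

// Weighted sum of the selected experts' outputs; unselected experts never run.
function mixtureOutput(x, experts, gateLogits, k) {
  return routeTopK(gateLogits, k).reduce(
    (acc, { expert, weight }) => acc + weight * experts[expert](x),
    0
  );
}
```

In a real model the gate logits would be computed from the input itself and each expert would be a neural network block; only the selected experts are evaluated, which is where the efficiency gain comes from.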