POST

javascript

const axios = require('axios');

const fs = require('fs');
const path = require('path');

// Helper that converts a local image file into a base64 string.
// It is not used by the text-only request below.
async function toB64(imgPath) {
    // readFileSync already returns a Buffer, so it can be encoded directly
    const data = fs.readFileSync(path.resolve(imgPath));
    return data.toString('base64');
}

const api_key = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/gemini-2.5-pro";

const data = {
  "messages": [
    {
      "role": "user",
      "content" : "tell me a joke on cats"
    },
    {
      "role": "assistant",
      "content" : "here is a joke about cats..."
    },
    {
      "role": "user",
      "content" : "now a joke on dogs"
    },
  ]
};

(async function() {
    try {
        const response = await axios.post(url, data, { headers: { 'x-api-key': api_key } });
        console.log(response.data);
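    } catch (error) {
        // error.response is undefined when no reply was received (e.g. a network failure)
        console.error('Error:', error.response ? error.response.data : error.message);
    }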
})();

RESPONSE

application/json

HTTP Response Codes

200 - OK: Image Generated

401 - Unauthorized: User authentication failed

404 - Not Found: The requested URL does not exist

405 - Method Not Allowed: The requested HTTP method is not allowed

406 - Not Acceptable: Not enough credits

500 - Server Error: Server had some issue with processing
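
The codes listed above can be turned into readable error messages on the client side. The following is a minimal sketch, not part of the Segmind API: `STATUS_DESCRIPTIONS` and `describeError` are hypothetical names, and the error shape (`error.response.status`, `error.message`) is assumed to match what axios passes to a `catch` block.

```javascript
// Descriptions taken from the HTTP response codes listed in this page.
const STATUS_DESCRIPTIONS = {
  401: 'Unauthorized: user authentication failed',
  404: 'Not Found: the requested URL does not exist',
  405: 'Method Not Allowed: the requested HTTP method is not allowed',
  406: 'Not Acceptable: not enough credits',
  500: 'Server Error: server had some issue with processing',
};

// Hypothetical helper: converts an axios-style error into a short message.
function describeError(error) {
  // error.response is only present when the server actually replied.
  if (!error || !error.response) {
    const reason = error && error.message ? error.message : 'unknown';
    return `Network error: ${reason}`;
  }
  const status = error.response.status;
  const description = STATUS_DESCRIPTIONS[status] || 'Unexpected status';
  return `${status} - ${description}`;
}
```

In the request example, this could replace the body of the `catch` block with `console.error(describeError(error));`.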

Attributes

messages (Array)

An array of objects containing the role and content

role (str)

Could be "user", "assistant" or "system".

content (str)

A string containing the user's query or the assistant's response.
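
Checking the payload before sending it can catch malformed requests early. The sketch below validates a `messages` array against the attributes described above. `validateMessages` and `ALLOWED_ROLES` are hypothetical names, and whether the API itself rejects such payloads in the same way is an assumption. It only covers string content, as documented here.

```javascript
// Roles named in the attribute description above.
const ALLOWED_ROLES = ['user', 'assistant', 'system'];

// Hypothetical helper: returns a list of problems found in the payload.
// An empty list means the messages match the documented shape.
function validateMessages(messages) {
  if (!Array.isArray(messages)) {
    return ['messages must be an array'];
  }
  const problems = [];
  messages.forEach((message, index) => {
    if (message === null || typeof message !== 'object') {
      problems.push(`messages[${index}] must be an object`);
      return;
    }
    if (!ALLOWED_ROLES.includes(message.role)) {
      problems.push(`messages[${index}].role must be one of: ${ALLOWED_ROLES.join(', ')}`);
    }
    if (typeof message.content !== 'string') {
      problems.push(`messages[${index}].content must be a string`);
    }
  });
  return problems;
}
```

Calling `validateMessages(data.messages)` on the request example above should return an empty array.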

To track your credit usage, inspect the response headers of each API call. The x-remaining-credits header reports the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
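
Reading the header might look like the sketch below. `remainingCredits` is a hypothetical helper, and it assumes the header value is numeric and that the HTTP client exposes header names in lowercase (as axios does in Node.js via `response.headers`).

```javascript
// Hypothetical helper: extracts the remaining credits from a headers object.
// Returns a number, or null when the header is missing or not numeric.
function remainingCredits(headers) {
  if (!headers) {
    return null;
  }
  const raw = headers['x-remaining-credits'];
  // Number('') would be 0, so treat empty values as missing.
  if (raw === undefined || raw === null || raw === '') {
    return null;
  }
  const value = Number(raw);
  return Number.isNaN(value) ? null : value;
}
```

With the request example above, `remainingCredits(response.headers)` could be logged after each call, or compared against a threshold of your choosing to warn before credits run out.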

Gemini 2.5 Pro: Multimodal AI Model

Edited by Segmind Team on October 27, 2025.

What is Gemini 2.5 Pro?

Gemini 2.5 Pro, developed by Google, is an AI model designed for complex multimodal reasoning tasks. It is available through Google Cloud's Vertex AI platform and can process and interpret text, code, images, audio, and video together. Its ability to combine these data formats when solving cross-domain problems makes it useful for developers and enterprises.

Key Features of Gemini 2.5 Pro

Multimodal Processing: It can handle text, code, images, audio, and video inputs
Large Context Windows: It supports large inputs for comprehensive analysis
Google Search Integration: It can ground responses in search results
Code Execution: It has a built-in ability to run and analyze code
Function Calling: It can call external functions to integrate with APIs
Structured Output: It can deliver organized, formatted responses
Flexible Input Handling: It accepts text prompts and image inputs

Best Use Cases

Content Creation: It is an ideal model to generate and optimize blog posts, articles, and marketing copy
Software Development: It can perform code analysis, debugging, and documentation
Data Analysis: It can promptly process and interpret complex datasets
Visual Projects: It can execute image analysis and visual content strategy
Enterprise Solutions: It can carry out large-scale data processing and problem-solving
Research: It can analyze multiple data sources and synthesize findings

Prompt Tips and Output Quality

Provide clear, specific prompts that outline your desired outcome
Include proper context and vivid examples to guide the model
For visual outputs, provide text prompts with high-quality images
Use properly structured prompts for complex queries (e.g., bullet points or numbered lists)
Leverage multimodal capabilities by providing different input types
Divide complex tasks into smaller, targeted prompts for more precise results
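
As one illustration of the structured-prompt tip, the sketch below assembles a goal and a numbered list of requirements into a single prompt and wraps it in the `messages` format used by the request example. `buildStructuredPrompt` and `toPayload` are hypothetical helpers, and the "Requirements:" layout is only one possible convention, not something the model requires.

```javascript
// Hypothetical helper: formats a goal plus requirements as a numbered list.
function buildStructuredPrompt(goal, requirements) {
  const numbered = requirements.map((item, index) => `${index + 1}. ${item}`);
  return [goal, '', 'Requirements:', ...numbered].join('\n');
}

// Hypothetical helper: wraps a prompt in the messages payload shape
// shown in the request example.
function toPayload(prompt) {
  return { messages: [{ role: 'user', content: prompt }] };
}

const prompt = buildStructuredPrompt(
  'Write a product description for a reusable water bottle.',
  ['Keep it under 80 words', 'Mention that it is dishwasher safe', 'Use a friendly tone']
);
const payload = toPayload(prompt);
```

To follow the last tip, a larger task can be split into several such payloads, each with its own narrow goal, and sent as separate requests.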

FAQs

How is Gemini 2.5 Pro different from other AI models? Gemini 2.5 Pro combines multimodal capabilities, Google Search integration, and enterprise-grade features on the Vertex AI platform.

What types of inputs does Gemini 2.5 Pro accept? Gemini 2.5 Pro accepts text prompts and images as primary inputs, and it can also process and analyze code, audio, and video content.

Can Gemini 2.5 Pro handle enterprise-scale tasks? Yes, it is an advanced AI model designed for enterprise use cases with large context windows and the ability to process complex, multi-format datasets.

How can I optimize my prompts to achieve better results? To achieve better results, provide clear and specific instructions, offer relevant context, and use text and image inputs (when appropriate) for your use case.

Is Gemini 2.5 Pro available through API integration? Yes, it can be accessed through Google Cloud's Vertex AI platform and supports function calling for seamless API integration.
