const axios = require('axios');
const fs = require('fs');
const path = require('path');

// Helper to convert a local image into a base64 string.
// It is not used by the text-only request below.
async function toB64(imgPath) {
  const data = await fs.promises.readFile(path.resolve(imgPath));
  return data.toString('base64');
}

const api_key = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/qwen2p5-vl-32b-instruct";

// Conversation history: earlier turns are sent along with the new user message.
const data = {
  "messages": [
    {
      "role": "user",
      "content": "tell me a joke on cats"
    },
    {
      "role": "assistant",
      "content": "here is a joke about cats..."
    },
    {
      "role": "user",
      "content": "now a joke on dogs"
    }
  ]
};

(async function() {
  try {
    const response = await axios.post(url, data, { headers: { 'x-api-key': api_key } });
    console.log(response.data);
  } catch (error) {
    // error.response is undefined when no response was received (e.g. a network failure)
    console.error('Error:', error.response ? error.response.data : error.message);
  }
})();
messages: An array of objects, each containing a role and content.
role: One of "user", "assistant", or "system".
content: A string containing the user's query or the assistant's response.
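The request fields described above can be checked locally before a call is made. The following is a minimal sketch, not part of the API: `validateMessages` and `ALLOWED_ROLES` are hypothetical names, and the requirement that the array be non-empty is an assumption rather than a documented rule.

```javascript
// Roles listed in the parameter description above.
const ALLOWED_ROLES = ['user', 'assistant', 'system'];

// Hypothetical helper: throws a TypeError if the messages array does not match
// the documented shape (an array of { role, content } objects).
// Assumption: an empty array is treated as invalid.
function validateMessages(messages) {
  if (!Array.isArray(messages) || messages.length === 0) {
    throw new TypeError('messages must be a non-empty array');
  }
  messages.forEach((msg, i) => {
    if (msg === null || typeof msg !== 'object') {
      throw new TypeError(`messages[${i}] must be an object`);
    }
    if (!ALLOWED_ROLES.includes(msg.role)) {
      throw new TypeError(`messages[${i}].role must be one of: ${ALLOWED_ROLES.join(', ')}`);
    }
    if (typeof msg.content !== 'string') {
      throw new TypeError(`messages[${i}].content must be a string`);
    }
  });
  return messages;
}
```

Calling `validateMessages(data.messages)` before sending the request would surface a malformed payload as a local error instead of a failed API call.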
To track your credit usage, inspect the response headers of each API call. The x-remaining-credits header reports the number of credits remaining in your account. Monitor this value to avoid interruptions to your API usage.
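As a rough illustration of reading that header, the sketch below parses x-remaining-credits from a headers object such as axios's `response.headers`, where header names are lower-cased. The helper names `remainingCredits` and `isLowOnCredits`, the default threshold of 10, and the assumption that the header value is numeric are all hypothetical and not taken from the API documentation.

```javascript
// Hypothetical helper: returns the remaining credits as a number,
// or null if the header is missing or cannot be parsed.
// Assumption: the header value is a numeric string.
function remainingCredits(headers) {
  const raw = headers ? headers['x-remaining-credits'] : undefined;
  if (raw === undefined || raw === null || raw === '') {
    return null;
  }
  const value = Number(raw);
  return Number.isFinite(value) ? value : null;
}

// Hypothetical helper: true when the credit count is known and below the threshold.
// The default threshold is an arbitrary illustrative value.
function isLowOnCredits(headers, threshold = 10) {
  const credits = remainingCredits(headers);
  return credits !== null && credits < threshold;
}
```

With the request example above, this could be used as `remainingCredits(response.headers)` right after the `axios.post` call resolves.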
Qwen2.5-VL 32B Instruct is a multimodal AI model from the Qwen team at Alibaba Cloud. Built on 33 billion parameters, it processes both text and image inputs, making it suited to complex instruction-following across modalities. With a context window of up to 125,000 tokens, Qwen2.5-VL can handle long documents, extended conversations, and deep multi-step reasoning. The model supports fine-tuning on domain-specific data and offers serverless deployment for automatic scaling and low-latency inference.
Q: What types of inputs does Qwen2.5-VL 32B support?
A: It accepts free-form text prompts and image URLs or binary data for analysis and generation tasks.
Q: How long is the maximum context length?
A: Up to 125,000 tokens, enabling the processing of entire books, code repositories, or lengthy legal contracts.
Q: Can I fine-tune Qwen2.5-VL 32B on my own data?
A: Yes. The model provides a fine-tuning API that tailors responses to your domain, style, or industry vocabulary.
Q: Is serverless deployment available?
A: Yes. Qwen2.5-VL can be deployed via serverless endpoints that handle auto-scaling and reduce operational overhead.
Q: What are common applications for Qwen2.5-VL?
A: Popular use cases include multimodal chatbots, document QA, image captioning, code analysis, and research summarization.