const axios = require('axios');
const fs = require('fs');
const path = require('path');

// Helper that converts a local image file into a base64 string.
// It is not used by the text-only request below.
async function toB64(imgPath) {
  const data = await fs.promises.readFile(path.resolve(imgPath));
  return data.toString('base64');
}

const api_key = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/qwen3.5-plus";

// Conversation history: earlier turns are sent together with the new user message.
const data = {
  "messages": [
    {
      "role": "user",
      "content": "tell me a joke on cats"
    },
    {
      "role": "assistant",
      "content": "here is a joke about cats..."
    },
    {
      "role": "user",
      "content": "now a joke on dogs"
    }
  ]
};

(async function() {
  try {
    const response = await axios.post(url, data, { headers: { 'x-api-key': api_key } });
    console.log(response.data);
  } catch (error) {
    // error.response is undefined when no HTTP response arrived (for example, a network failure).
    console.error('Error:', error.response ? error.response.data : error.message);
  }
})();

messages: An array of objects, each containing a role and content.
role: Could be "user", "assistant" or "system".
content: A string containing the user's query or the assistant's response.
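The message shape described above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Segmind API: validateMessages and ALLOWED_ROLES are hypothetical names, and the non-empty-array rule is an assumption rather than a documented requirement.

```javascript
// Hypothetical client-side check; the API itself may accept or reject other shapes.
const ALLOWED_ROLES = ['user', 'assistant', 'system'];

function validateMessages(messages) {
  // Assumption: at least one message is needed for a meaningful request.
  if (!Array.isArray(messages) || messages.length === 0) {
    throw new TypeError('messages must be a non-empty array');
  }
  messages.forEach((message, index) => {
    if (message === null || typeof message !== 'object') {
      throw new TypeError(`messages[${index}] must be an object`);
    }
    if (!ALLOWED_ROLES.includes(message.role)) {
      throw new TypeError(`messages[${index}].role must be one of: ${ALLOWED_ROLES.join(', ')}`);
    }
    if (typeof message.content !== 'string') {
      throw new TypeError(`messages[${index}].content must be a string`);
    }
  });
  return messages;
}

// A conversation that opens with a system instruction.
const checked = validateMessages([
  { role: 'system', content: 'You are a concise assistant.' },
  { role: 'user', content: 'tell me a joke on cats' }
]);
console.log(checked.length);
```

Returning the array unchanged lets the call be used inline, for example as the value of the messages field in the request body.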
To keep track of your credit usage, inspect the response headers of each API call. The x-remaining-credits header indicates the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
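As a rough illustration of the credit check described above, the sketch below reads x-remaining-credits from a headers object. The helper name remainingCredits is hypothetical, and it assumes the header value is a numeric string, which the text above does not specify.

```javascript
// Hypothetical helper: extracts the remaining credit count from response headers.
// Assumption: the header value parses as a number.
function remainingCredits(headers) {
  const raw = headers ? headers['x-remaining-credits'] : undefined;
  if (raw === undefined || raw === null || raw === '') {
    return null; // header absent
  }
  const credits = Number(raw);
  return Number.isNaN(credits) ? null : credits;
}

// Stand-in for the headers of a real response.
console.log(remainingCredits({ 'x-remaining-credits': '1234.5' }));
```

With axios, header names are exposed in lower case, so the helper can be called as remainingCredits(response.headers) right after the request completes.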
Qwen 3.5 Plus is Alibaba Cloud's hosted flagship multimodal AI model, released February 16, 2026. It is the production-ready version of Qwen3.5-397B-A17B, a Mixture-of-Experts model with 397 billion total parameters of which only 17 billion are active per forward pass, which keeps inference efficient without sacrificing quality.
Unlike models that treat text and vision as separate pipelines, Qwen 3.5 Plus was trained natively on text, images, and video from the ground up using an early text-vision fusion architecture. It therefore handles all three modalities jointly rather than as bolt-on features.
With a 1 million token context window (one of the largest available), it is built for developers and enterprises who need to handle long documents, complex reasoning chains, and multi-turn agentic workflows in a single API call.
Qwen 3.5 Plus is ideal for:
To get the best results from Qwen 3.5 Plus:
Output quality is consistently high across text, code, and visual tasks. Benchmark scores include 83.6 on LiveCodeBench v6, 91.3 on AIME26, and 88.4 on GPQA Diamond — outperforming GPT-5.2 and Claude Opus 4.5 on over 80% of evaluated categories.
What is the difference between Qwen 3.5 Plus and Qwen3.5-397B-A17B?
Qwen3.5-397B-A17B is the open-weight model you can self-host. Qwen 3.5 Plus is the hosted production version with additional features: 1M context (vs 256K base), built-in tools, and three operational modes.

Does Qwen 3.5 Plus support video input?
Yes. It natively processes video clips up to 60 seconds. Pass video as a URL or base64-encoded data alongside your prompt.

What operational modes are available?
Auto (adaptive thinking + web search + code interpreter), Thinking (deep step-by-step reasoning), and Fast (instant, low-latency responses).

How does pricing compare to other models?
Input tokens are priced at $0.50 per 1M tokens and output at $3.00 per 1M tokens, which is competitive with mid-tier frontier models while offering significantly better multimodal capabilities.

Can Qwen 3.5 Plus use external tools natively?
Yes. In Auto mode, built-in tools include web search and a code interpreter, so no external orchestration or LangChain is required.

Is Qwen 3.5 Plus suitable for production enterprise workloads?
Absolutely. The MoE architecture delivers 8.6x-19x faster throughput than predecessor dense models, and the 1M context window reduces RAG complexity significantly.
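
The per-token prices quoted in the pricing answer above can be turned into a quick cost estimate. This is only a sketch of the arithmetic: estimateCostUSD and the constant names are hypothetical, the token counts are made-up inputs, and actual billing may differ (for example if it is expressed in credits).

```javascript
// Prices taken from the pricing answer: USD per 1M tokens.
const INPUT_USD_PER_MILLION = 0.5;
const OUTPUT_USD_PER_MILLION = 3.0;

// Hypothetical helper: estimated cost in USD for one request.
function estimateCostUSD(inputTokens, outputTokens) {
  const inputCost = (inputTokens / 1e6) * INPUT_USD_PER_MILLION;
  const outputCost = (outputTokens / 1e6) * OUTPUT_USD_PER_MILLION;
  return inputCost + outputCost;
}

// Example: a 200,000-token prompt with a 5,000-token reply.
// 0.2 * $0.50 + 0.005 * $3.00 = $0.10 + $0.015
console.log(estimateCostUSD(200000, 5000).toFixed(4)); // prints "0.1150"
```

Because output tokens cost six times as much as input tokens at these rates, long prompts with short replies stay comparatively cheap.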