POST

javascript

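const axios = require('axios');

const apiKey = "YOUR API-KEY"; // replace with your Segmind API key
const url = "https://api.segmind.com/v1/qwq-plus";

// Conversation history: earlier turns give the model context for the final user message.
const data = {
  "messages": [
    {
      "role": "user",
      "content": "tell me a joke on cats"
    },
    {
      "role": "assistant",
      "content": "here is a joke about cats..."
    },
    {
      "role": "user",
      "content": "now a joke on dogs"
    }
  ]
};

(async function() {
    try {
        // The API key is sent in the x-api-key request header.
        const response = await axios.post(url, data, { headers: { 'x-api-key': apiKey } });
        console.log(response.data);
    } catch (error) {
        // error.response is undefined for network failures, so fall back to the error message.
        console.error('Error:', error.response ? error.response.data : error.message);
    }
})();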

RESPONSE

application/json

HTTP Response Codes

200 - OK: Response generated

401 - Unauthorized: User authentication failed

404 - Not Found: The requested URL does not exist

405 - Method Not Allowed: The requested HTTP method is not allowed

406 - Not Acceptable: Not enough credits

500 - Server Error: Server had some issue with processing
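The status codes listed above can be turned into readable error messages on the client. The following is a minimal sketch, not part of any Segmind SDK: describeStatus and isRetryable are hypothetical helper names, and treating only 500 as retryable is an assumption rather than documented behavior.

```javascript
// Status codes and meanings as listed in this section.
const STATUS_MESSAGES = {
  200: 'OK',
  401: 'Unauthorized: user authentication failed',
  404: 'Not Found: the requested URL does not exist',
  405: 'Method Not Allowed: the requested HTTP method is not allowed',
  406: 'Not Acceptable: not enough credits',
  500: 'Server Error: the server had an issue with processing',
};

// Hypothetical helper: map an HTTP status code to a readable message.
function describeStatus(status) {
  return STATUS_MESSAGES[status] ?? `Unexpected status ${status}`;
}

// Assumption: only a 500 is worth retrying; the other errors need a
// change on the caller's side (key, URL, HTTP method, or credits).
function isRetryable(status) {
  return status === 500;
}

console.log(describeStatus(406));
console.log(isRetryable(401));
```

In an axios catch block, the status would typically be read from error.response.status when error.response is defined.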

Attributes

messages (Array)

An array of objects containing the role and content

role (str)

Could be "user", "assistant" or "system".

content (str)

A string containing the user's query or the assistant's response.
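Checking the messages array before sending it can catch malformed requests early. The sketch below validates only what the attribute list above states (an array of objects, a role of "user", "assistant" or "system", and a string content); validateMessages is a hypothetical helper, and the API may enforce further rules that are not documented here.

```javascript
const ALLOWED_ROLES = new Set(['user', 'assistant', 'system']);

// Hypothetical helper: returns a list of problems, empty when the input looks valid.
function validateMessages(messages) {
  if (!Array.isArray(messages)) {
    return ['messages must be an array'];
  }
  const errors = [];
  messages.forEach((message, index) => {
    if (message === null || typeof message !== 'object') {
      errors.push(`messages[${index}] must be an object`);
      return;
    }
    if (!ALLOWED_ROLES.has(message.role)) {
      errors.push(`messages[${index}].role must be "user", "assistant" or "system"`);
    }
    if (typeof message.content !== 'string') {
      errors.push(`messages[${index}].content must be a string`);
    }
  });
  return errors;
}

console.log(validateMessages([
  { role: 'user', content: 'tell me a joke on cats' },
  { role: 'bot', content: 42 },
]));
```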

To track your credit usage, inspect the response headers of each API call. The x-remaining-credits header reports the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
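As a rough illustration of the credit check described above, the helper below reads x-remaining-credits from a headers object such as the response.headers returned by an HTTP client. getRemainingCredits is a hypothetical name, and the assumption that the header carries a plain numeric value is not confirmed by this page.

```javascript
// Hypothetical helper: look up x-remaining-credits case-insensitively.
// Assumption: the header value parses as a number.
function getRemainingCredits(headers) {
  for (const [name, value] of Object.entries(headers ?? {})) {
    if (name.toLowerCase() === 'x-remaining-credits') {
      const credits = Number.parseFloat(value);
      return Number.isNaN(credits) ? null : credits;
    }
  }
  return null; // header missing
}

// Example with a hand-written headers object standing in for a real response.
const sampleHeaders = { 'content-type': 'application/json', 'x-remaining-credits': '42.5' };
const remaining = getRemainingCredits(sampleHeaders);
if (remaining !== null && remaining < 50) {
  console.warn(`Low credit balance: ${remaining}`);
}
```

The warning threshold of 50 is arbitrary; pick one that matches your own usage.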

QwQ Plus — Advanced Reasoning Language Model

What is QwQ Plus?

QwQ Plus is Alibaba Cloud's flagship reasoning language model, built on the QwQ-32B architecture and significantly enhanced through reinforcement learning. Unlike standard language models that respond immediately, QwQ Plus operates in thinking-only mode — it always deliberates internally before generating a final answer. This deep reasoning process makes it exceptionally capable on tasks that require multi-step logic, mathematical derivation, and complex problem decomposition.

With a 131,072-token context window and 32.5 billion parameters, QwQ Plus handles long documents, intricate prompts, and multi-turn reasoning sessions with ease. It achieves benchmark performance comparable to DeepSeek-R1 on AIME 24/25 and LiveCodeBench, making it one of the most capable open-weight reasoning models available via API.

Key Features

Thinking-Only Mode: QwQ Plus always reasons before responding, exposing its chain-of-thought in a reasoning_content field for full transparency.
131K Token Context: Processes extensive inputs including long codebases, research papers, and detailed system prompts without truncation.
Reinforcement Learning Enhanced: Post-training via RL dramatically improves accuracy on math, science, and logic tasks.
OpenAI-Compatible API: Integrate directly using the OpenAI SDK — no custom client required.
DashScope Inference Backend: Served via Alibaba's high-performance DashScope infrastructure for low-latency production use.

Best Use Cases

QwQ Plus is purpose-built for tasks where accuracy and reasoning depth matter more than raw speed:

Mathematics & Science: Solve multi-step equations, proofs, and quantitative reasoning problems with verifiable chain-of-thought.
Code Generation & Debugging: Reason through algorithmic challenges, write clean production code, and diagnose complex bugs.
Legal & Financial Analysis: Parse dense documents and synthesize structured conclusions from unstructured text.
Research Assistance: Summarize papers, compare hypotheses, and generate well-reasoned literature reviews.
Technical Q&A: Answer developer and engineering questions with detailed, step-by-step explanations.

Prompt Tips and Output Quality

QwQ Plus works best when prompts are clear and goal-oriented. For mathematical or coding problems, include all relevant context and constraints upfront. Because the model reasons internally, you will receive both a reasoning_content block (the thinking trace) and a content block (the final answer) — use the reasoning trace to audit correctness or understand the model's approach.
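To audit the thinking trace separately from the final answer, the two fields can be pulled out of the response body. The sketch below assumes an OpenAI-style envelope in which both fields sit under choices[0].message; this page names reasoning_content and content but does not show the full response shape, so adjust the path to match what the endpoint actually returns. splitReasoning is a hypothetical helper.

```javascript
// Hypothetical helper. Assumption: the body follows the OpenAI-style
// shape { choices: [{ message: { reasoning_content, content } }] }.
function splitReasoning(responseBody) {
  const message = responseBody?.choices?.[0]?.message ?? {};
  return {
    reasoning: message.reasoning_content ?? '',
    answer: message.content ?? '',
  };
}

// Hand-written sample body, not real model output.
const sampleBody = {
  choices: [
    {
      message: {
        role: 'assistant',
        reasoning_content: 'The user wants a dog joke after the cat joke...',
        content: 'Why did the dog sit in the shade? It did not want to be a hot dog.',
      },
    },
  ],
};

const { reasoning, answer } = splitReasoning(sampleBody);
console.log('Reasoning length:', reasoning.length);
console.log('Answer:', answer);
```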

Recommended parameters: temperature 0.6, top-p 0.95, and a presence penalty between 0 and 2. Avoid greedy decoding (temperature 0), which can produce repetitive outputs.
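The recommended settings can be applied as defaults when building a request body. In this sketch the field names temperature, top_p and presence_penalty follow the OpenAI convention and are assumptions; this page does not list which sampling fields the endpoint accepts. buildRequestBody is a hypothetical helper.

```javascript
// Defaults taken from the recommendation above; field names are assumed.
const RECOMMENDED_SAMPLING = Object.freeze({ temperature: 0.6, top_p: 0.95 });

// Hypothetical helper: merge defaults with caller overrides and reject
// settings the recommendation advises against.
function buildRequestBody(messages, overrides = {}) {
  const body = { messages, ...RECOMMENDED_SAMPLING, ...overrides };
  if (body.temperature === 0) {
    throw new RangeError('temperature 0 (greedy decoding) can produce repetitive outputs');
  }
  const penalty = body.presence_penalty;
  if (penalty !== undefined && (penalty < 0 || penalty > 2)) {
    throw new RangeError('presence_penalty should be between 0 and 2');
  }
  return body;
}

const requestBody = buildRequestBody(
  [{ role: 'user', content: 'Prove that the sum of two even numbers is even.' }],
  { presence_penalty: 1 }
);
console.log(requestBody);
```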

FAQs

What makes QwQ Plus different from Qwen or GPT-4o? QwQ Plus is a dedicated reasoning model — it always thinks before answering, making it slower but significantly more accurate on hard problems.

Does QwQ Plus support function calling or tool use? QwQ Plus is optimized for deep reasoning text generation. For agentic tool-use workflows, consider pairing it with a planning layer.

What is the context limit? 131,072 tokens — sufficient for long codebases, research papers, and multi-turn conversations.

How is billing calculated? Input and output tokens are billed separately. Thinking tokens (in reasoning_content) are billed as output tokens.

Is QwQ Plus open source? The underlying QwQ-32B weights are open-source on Hugging Face. QwQ Plus is the production-optimized API version served by Alibaba Cloud via Segmind.

Which model should I use for speed vs. accuracy? For maximum accuracy on complex tasks, use QwQ Plus. For faster responses on simpler queries, consider Qwen3-Plus or a smaller Qwen3 model.
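The billing rule in the FAQ above (input and output tokens billed separately, thinking tokens counted as output) can be sketched as a small cost estimate. The per-million-token prices below are placeholders, not Segmind's actual rates, and estimateCost with its field names is hypothetical.

```javascript
// Hypothetical helper: thinking tokens are added to the output side,
// as described in the billing FAQ entry.
function estimateCost({ inputTokens, answerTokens, reasoningTokens }, prices) {
  const outputTokens = answerTokens + reasoningTokens;
  const cost =
    (inputTokens / 1e6) * prices.inputPerMillion +
    (outputTokens / 1e6) * prices.outputPerMillion;
  return { outputTokens, cost };
}

// Placeholder prices for illustration only; check the pricing page for real values.
const placeholderPrices = { inputPerMillion: 1.0, outputPerMillion: 2.0 };

const estimate = estimateCost(
  { inputTokens: 1000, answerTokens: 200, reasoningTokens: 800 },
  placeholderPrices
);
console.log(`Billed output tokens: ${estimate.outputTokens}`);
console.log(`Estimated cost: ${estimate.cost.toFixed(6)}`);
```

Because the thinking trace is often much longer than the final answer, the reasoning tokens tend to dominate the output side of the bill.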
