QwQ-32B

2025-03-06
Chat, Reasoning
By Qwen
Budget Friendly

Input: ￥2.00 / M tokens Output: ￥8.00 / M tokens
Features: Function Calling, Reasoning, Streaming, Text Input, Text Output
Context Window: 32K
Maximum Output: 8K
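
As a rough illustration of how the per-million-token prices above translate into a per-request cost, here is a minimal sketch. The function name `estimate_cost_cny` and the token counts in the usage example are hypothetical; only the two prices come from this listing, and actual billing may round or meter differently.

```python
from decimal import Decimal

# Prices taken from the listing above (CNY per million tokens).
INPUT_PRICE_PER_M = Decimal("2.00")
OUTPUT_PRICE_PER_M = Decimal("8.00")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_cny(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request in CNY (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # Example: a 24,000-token prompt with an 8,000-token reply.
    print(estimate_cost_cny(24_000, 8_000))
```

Decimal arithmetic is used here so that small per-request amounts do not pick up floating-point noise when summed over many calls.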

Model Description

QwQ-32B is a medium-sized reasoning model from the Qwen series. Compared with conventional instruction-tuned models, it is built to reason before answering, which improves performance on downstream tasks, particularly hard problems that require deep reasoning. Its transformer architecture uses RoPE, SwiGLU, RMSNorm, and attention QKV bias, with 64 layers, 40 query heads, and 8 key-value heads (grouped-query attention, GQA). It supports a full context length of 131,072 tokens, though YaRN must be enabled for prompts exceeding 8,192 tokens. The model was pretrained and then post-trained with supervised fine-tuning and reinforcement learning, and it achieves competitive results against leading reasoning models such as DeepSeek-R1 and o1-mini. Users can try it in Qwen Chat or consult the official resources for deployment guidelines.
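
To make the 40-query-head / 8-key-value-head layout concrete, the following sketch shows how grouped-query attention shares each key-value head across several query heads. The contiguous grouping (query heads 0 to 4 share key-value head 0, and so on) is an assumption based on the common convention, not something stated in this description, and the helper name `kv_head_for_query_head` is hypothetical.

```python
# Head counts taken from the model description above.
NUM_QUERY_HEADS = 40
NUM_KV_HEADS = 8

# Each key-value head is shared by this many query heads.
GROUP_SIZE = NUM_QUERY_HEADS // NUM_KV_HEADS


def kv_head_for_query_head(query_head: int) -> int:
    """Return the key-value head a query head attends with.

    Assumes contiguous grouping, which is an illustrative convention
    and not confirmed by the description above.
    """
    if not 0 <= query_head < NUM_QUERY_HEADS:
        raise IndexError("query head index out of range")
    return query_head // GROUP_SIZE


if __name__ == "__main__":
    print("query heads per KV head:", GROUP_SIZE)
    for q in (0, 4, 5, 39):
        print(f"query head {q} -> KV head {kv_head_for_query_head(q)}")
```

Because keys and values are stored for 8 heads rather than 40, the key-value cache is one fifth the size it would be with one key-value head per query head, which matters most at long context lengths.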

Recommended Models

DeepSeek-R1

Chat, Reasoning
DeepSeek

Performance on par with OpenAI-o1. Fully open-source model and technical report. Code and models are released under the MIT License, so they can be distilled and commercialized freely.

2025-01-20

gpt-4.1-nano-2025-04-14

Chat, Vision
OpenAI

GPT-4.1 nano is the fastest, most cost-effective GPT-4.1 model.

2025-04-14

DeepClaude-3-7-sonnet

Chat, Reasoning
JuheAI

DeepSeek-R1 + claude-3-7-sonnet-20250219. The Deep series pairs the chain-of-thought reasoning of the DeepSeek-R1 (671B) model with other models, making full use of DeepSeek's chain-of-thought capabilities. It supplements that reasoning with other, more powerful models to improve the overall capability of the combined model.

2025-02-19