Qwen3.7-Plus

Qwen3.7-Plus is the cost-effective multimodal model in Alibaba’s Qwen3.7 series, made by the Qwen team. It takes text and images as input and returns text, and it offers vision-language understanding, a 1-million-token context window, and strong agentic abilities for coding, tool use, and productivity work.

The Qwen team built Qwen3.7-Plus as a versatile agent foundation: it can read screens, reason over images, and write code from visual references, while keeping text quality close to their flagship Max model.

Capabilities

Text and image input, streaming text output
1-million-token context window, up to 65,536 output tokens
Vision-language understanding and visual reasoning
Strong coding and tool use

Inputs

prompt: the text prompt to send to the model
system_prompt: guides the model’s behavior
image: optional list of images for visual understanding
max_tokens, temperature, top_p, presence_penalty, frequency_penalty: standard sampling controls
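
The inputs above can be assembled into a single request payload. The sketch below is illustrative only: build_input is a hypothetical helper, not part of any official client, and it assumes that images are passed as a list of URL strings and that max_tokens is capped at the 65,536 output tokens listed under Capabilities. The default of 1024 for max_tokens is also an assumption. The helper makes no network call; it only builds and checks the dictionary.

```python
from typing import Any, Dict, List, Optional

# Output limit taken from the Capabilities list above.
MAX_OUTPUT_TOKENS = 65_536

# Sampling controls named in the Inputs list.
SAMPLING_KEYS = {"temperature", "top_p", "presence_penalty", "frequency_penalty"}


def build_input(
    prompt: str,
    system_prompt: Optional[str] = None,
    image: Optional[List[str]] = None,  # assumption: images given as URL strings
    max_tokens: int = 1024,             # assumption: default chosen for this sketch
    **sampling: float,
) -> Dict[str, Any]:
    """Build an input dictionary using the field names from the Inputs list."""
    unknown = set(sampling) - SAMPLING_KEYS
    if unknown:
        raise ValueError(f"unknown sampling controls: {sorted(unknown)}")
    if not 1 <= max_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError(f"max_tokens must be between 1 and {MAX_OUTPUT_TOKENS}")

    payload: Dict[str, Any] = {"prompt": prompt, "max_tokens": max_tokens}
    # Optional fields are included only when the caller supplies them.
    if system_prompt is not None:
        payload["system_prompt"] = system_prompt
    if image:
        payload["image"] = list(image)
    payload.update(sampling)
    return payload


example = build_input(
    "Write HTML that matches this mockup.",
    system_prompt="You are a careful front-end engineer.",
    image=["https://example.com/mockup.png"],
    temperature=0.2,
)
```

The resulting dictionary can then be handed to whichever client or HTTP call is used to reach the model. Check the exact accepted ranges for the sampling controls against the model’s own schema.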
