
Qwen3.5-397B-A17B

Qwen3.5-397B-A17B is Alibaba's flagship open-weight MoE model. It features native multimodal reasoning, a 1M context window, and a 19x decoding throughput improvement over previous dense flagship models.

Multimodal · MoE · Open-Weights · Agentic AI · Reasoning
Alibaba · Qwen3.5 · Released February 16, 2026
Context
1.0M tokens
Max Output
66K tokens
Input Price
$0.40 / 1M tokens
Output Price
$2.40 / 1M tokens
Modality: Text, Image, Video
Capabilities: Vision, Tools, Streaming, Reasoning
Benchmarks
GPQA
88.4%
GPQA: Graduate-Level Science Q&A. A rigorous benchmark with 448 multiple-choice questions in biology, physics, and chemistry created by domain experts. PhD experts only achieve 65-74% accuracy, while non-experts score just 34% even with unlimited web access (hence 'Google-proof'). Qwen3.5-397B-A17B scored 88.4% on this benchmark.
HLE
28.7%
HLE: Humanity's Last Exam. A frontier benchmark of expert-written questions spanning dozens of specialized domains, designed to remain difficult after older benchmarks saturated. Even top models score far below human expert level. Qwen3.5-397B-A17B scored 28.7% on this benchmark.
MMLU
88.4%
MMLU: Massive Multitask Language Understanding. A comprehensive benchmark with 16,000 multiple-choice questions across 57 academic subjects including math, philosophy, law, and medicine. Tests broad knowledge and reasoning capabilities. Qwen3.5-397B-A17B scored 88.4% on this benchmark.
MMLU Pro
87.8%
MMLU Pro: MMLU Professional Edition. An enhanced version of MMLU with 12,032 questions using a harder 10-option multiple choice format. Covers Math, Physics, Chemistry, Law, Engineering, Economics, Health, Psychology, Business, Biology, Philosophy, and Computer Science. Qwen3.5-397B-A17B scored 87.8% on this benchmark.
SimpleQA
41.3%
SimpleQA: Factual Accuracy Benchmark. Tests a model's ability to provide accurate, factual responses to straightforward questions. Measures reliability and reduces hallucinations in knowledge retrieval tasks. Qwen3.5-397B-A17B scored 41.3% on this benchmark.
IFEval
92.6%
IFEval: Instruction Following Evaluation. Measures how well a model follows specific instructions and constraints. Tests the ability to adhere to formatting rules, length limits, and other explicit requirements. Qwen3.5-397B-A17B scored 92.6% on this benchmark.
AIME 2025
91.3%
AIME 2025: American Invitational Math Exam. Competition-level mathematics problems from the prestigious AIME exam designed for talented high school students. Tests advanced mathematical problem-solving requiring abstract reasoning, not just pattern matching. Qwen3.5-397B-A17B scored 91.3% on this benchmark.
MATH
74.1%
MATH: Mathematical Problem Solving. A comprehensive math benchmark testing problem-solving across algebra, geometry, calculus, and other mathematical domains. Requires multi-step reasoning and formal mathematical knowledge. Qwen3.5-397B-A17B scored 74.1% on this benchmark.
GSM8k
93.7%
GSM8k: Grade School Math 8K. 8,500 grade school-level math word problems requiring multi-step reasoning. Tests basic arithmetic and logical thinking through real-world scenarios like shopping or time calculations. Qwen3.5-397B-A17B scored 93.7% on this benchmark.
MGSM
92%
MGSM: Multilingual Grade School Math. The GSM8k benchmark translated into 10 languages including Spanish, French, German, Russian, Chinese, and Japanese. Tests mathematical reasoning across different languages. Qwen3.5-397B-A17B scored 92% on this benchmark.
MathVista
90.3%
MathVista: Mathematical Visual Reasoning. Tests the ability to solve math problems that involve visual elements like charts, graphs, geometry diagrams, and scientific figures. Combines visual understanding with mathematical reasoning. Qwen3.5-397B-A17B scored 90.3% on this benchmark.
SWE-Bench
76.4%
SWE-Bench: Software Engineering Benchmark. AI models attempt to resolve real GitHub issues in open-source Python projects with human verification. Tests practical software engineering skills on production codebases. Top models went from 4.4% in 2023 to over 70% in 2024. Qwen3.5-397B-A17B scored 76.4% on this benchmark.
HumanEval
79.3%
HumanEval: Python Programming Problems. 164 hand-written programming problems where models must generate correct Python function implementations. Each solution is verified against unit tests. Top models now achieve 90%+ accuracy. Qwen3.5-397B-A17B scored 79.3% on this benchmark.
LiveCodeBench
83.6%
LiveCodeBench: Live Coding Benchmark. Tests coding abilities on continuously updated, real-world programming challenges. Unlike static benchmarks, uses fresh problems to prevent data contamination and measure true coding skills. Qwen3.5-397B-A17B scored 83.6% on this benchmark.
MMMU
85%
MMMU: Multimodal Understanding. Massive Multi-discipline Multimodal Understanding benchmark testing vision-language models on college-level problems across 30 subjects requiring both image understanding and expert knowledge. Qwen3.5-397B-A17B scored 85% on this benchmark.
MMMU Pro
79%
MMMU Pro: MMMU Professional Edition. Enhanced version of MMMU with more challenging questions and stricter evaluation. Tests advanced multimodal reasoning at professional and expert levels. Qwen3.5-397B-A17B scored 79% on this benchmark.
ChartQA
88%
ChartQA: Chart Question Answering. Tests the ability to understand and reason about information presented in charts and graphs. Requires extracting data, comparing values, and performing calculations from visual data representations. Qwen3.5-397B-A17B scored 88% on this benchmark.
DocVQA
90.8%
DocVQA: Document Visual Q&A. Document Visual Question Answering benchmark testing the ability to extract and reason about information from document images including forms, reports, and scanned text. Qwen3.5-397B-A17B scored 90.8% on this benchmark.
Terminal-Bench
52.5%
Terminal-Bench: Terminal/CLI Tasks. Tests the ability to perform command-line operations, write shell scripts, and navigate terminal environments. Measures practical system administration and development workflow skills. Qwen3.5-397B-A17B scored 52.5% on this benchmark.
ARC-AGI
12%
ARC-AGI: Abstraction & Reasoning. Abstraction and Reasoning Corpus for AGI - tests fluid intelligence through novel pattern recognition puzzles. Each task requires discovering the underlying rule from examples, measuring general reasoning ability rather than memorization. Qwen3.5-397B-A17B scored 12% on this benchmark.

About Qwen3.5-397B-A17B

Learn about Qwen3.5-397B-A17B's capabilities, features, and how it can help you achieve better results.

High-Efficiency Mixture of Experts

Qwen3.5-397B-A17B is a flagship native multimodal model built on a hybrid architecture that fuses linear attention via Gated Delta Networks with a sparse Mixture-of-Experts (MoE). Although it contains 397 billion total parameters, its sparse design activates only 17 billion per forward pass, delivering exceptional inference efficiency and speed without compromising reasoning quality. It is optimized for both language and visual tasks, with a 250k-token vocabulary and support for over 201 languages and dialects.
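The sparse activation described above can be illustrated with a toy top-k gating function. This is a simplified sketch, not the model's actual router; the gate scores, expert count, and k below are purely illustrative.

```javascript
// Toy Mixture-of-Experts router: score every expert for a token,
// then activate only the top-k — the remaining experts stay idle.
function topKExperts(gateScores, k) {
  return gateScores
    .map((score, expertId) => ({ expertId, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((e) => e.expertId);
}

// Fraction of total weights touched per forward pass,
// using the figures quoted above (17B active of 397B total).
function activeFraction(totalParamsB, activeParamsB) {
  return activeParamsB / totalParamsB;
}

const chosen = topKExperts([0.1, 0.7, 0.05, 0.9, 0.2], 2);
console.log(chosen);                  // [3, 1] — the two highest-scoring experts
console.log(activeFraction(397, 17)); // ≈ 0.043 — only ~4% of weights per token
```

That ~4% active fraction is the source of the efficiency claims: compute per token scales with the 17B active parameters, while total capacity scales with all 397B.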

Native Multimodal Agentic Workflows

The model excels as a native multimodal agent, capable of processing up to one million tokens of context, which is equivalent to approximately two hours of video. It introduces a specialized Thinking Mode for complex logical reasoning and is natively equipped for agentic workflows, including web development, GUI navigation, and real-world spatial intelligence. Its architecture supports FP8 end-to-end training and a disaggregated training-inference framework, making it one of the most scalable and efficient models for enterprise-grade AI applications.
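The context claim can be sanity-checked with back-of-the-envelope arithmetic: if 1M tokens corresponds to roughly two hours of video, the implied ingestion rate is about 139 tokens per second of footage. The per-second rate is our inference from the two quoted figures, not a published specification.

```javascript
// Rough context-budget math from the figures quoted above:
// 1,000,000 tokens ≈ 120 minutes of video.
const contextTokens = 1_000_000;
const videoSeconds = 2 * 60 * 60; // two hours

const tokensPerSecond = contextTokens / videoSeconds;
console.log(Math.round(tokensPerSecond)); // ≈ 139 tokens per second of video

// Budget left for text prompts alongside a 90-minute clip:
const clipTokens = 90 * 60 * tokensPerSecond;
console.log(contextTokens - clipTokens); // 250000 tokens to spare
```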

Open Weights for Global Accessibility

Released under the Apache 2.0 license, this model provides the open-source community with frontier-level capabilities previously restricted to proprietary systems. It bridges the gap between massive parameter counts and practical deployment, allowing organizations to run state-of-the-art reasoning tasks on private infrastructure with significantly lower compute overhead than dense 400B alternatives.

Qwen3.5-397B-A17B

Use Cases

Discover the different ways you can use Qwen3.5-397B-A17B to achieve great results.

Long-Horizon Video Analysis

Analyze up to two hours of video content to extract logic, reverse-engineer code from footage, or generate structured summaries.

PhD-Level STEM Research

Solve graduate-level PhD science questions and olympiad-level math problems using its adaptive deep-thinking mode.

Autonomous GUI Agents

Automate interactions with smartphones and computers to handle office workflows and cross-app mobile navigation.

Visual Software Engineering

Execute 'vibe coding' by turning natural language instructions and UI sketches into functional frontend code.

Document Intelligence

Process complex documents, charts, and handwritten sketches to extract structured data and reverse-engineer layouts.

Spatial AI Applications

Understand pixel-level relationships for embodied AI tasks like autonomous driving scene analysis and robotic navigation.

Strengths

Superior Video Support: Supports 1 million tokens of context, allowing native processing of up to 120 minutes of video for agentic and coding tasks.
MoE Inference Efficiency: The 397B total / 17B active architecture provides a 19x decoding throughput boost compared to previous dense flagship models.
State-of-the-Art Reasoning: Achieves 91.3% on AIME and 88.4% on GPQA, rivaling top closed-source models in PhD-level science and math.
Apache 2.0 Open Weights: Offers frontier-level intelligence with the freedom of open weights, allowing private, on-premise deployment.

Limitations

Massive Hardware Barrier: Full deployment requires server-grade GPU racks with over 800GB of VRAM at uncompressed 16-bit precision.
HLE Knowledge Gap: Despite high scores in science and math, it reaches only 28.7% on Humanity's Last Exam (HLE), indicating a gap at the frontier of expert knowledge.
Tool Overconfidence: In autonomous agent scenarios, the model occasionally hallucinates tool outputs or ignores results in favor of internal predictions.
Terminal Task Performance: Scores 52.5% on Terminal-Bench 2.0, trailing competitors in complex command-line interaction tasks.

API Quick Start

alibaba/qwen3.5-plus

View Documentation
alibaba SDK
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.DASHSCOPE_API_KEY,
  baseURL: 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1',
});

async function main() {
  const completion = await client.chat.completions.create({
    model: 'qwen3.5-plus',
    messages: [{ role: 'user', content: 'Analyze the logic of this MoE architecture.' }],
    // DashScope-specific extension. `extra_body` is a Python-SDK idiom;
    // in the Node SDK, pass the extra field directly in the request body.
    enable_thinking: true,
  });
  console.log(completion.choices[0].message.content);
}

main().catch(console.error);

Install the SDK and start making API calls in minutes.

Community Feedback

See what the community thinks about Qwen3.5-397B-A17B

Qwen3.5-397B is essentially a GPT-5 class model but open-weight. The DeltaNet architecture is solving the MoE latency issues perfectly.
u/DeepLearningLover
reddit
Native multimodal reasoning on Qwen3.5 looks incredible. 1M context + video analysis is going to change agent workflows.
@AiDevDaily
twitter
The decision to use FP8 end-to-end training while maintaining BF16 in sensitive layers is a masterclass in stability optimizations.
cold_fusion
hackernews
This is the first time I've seen an open model actually beat Gemini 1.5 Pro on complex multimodal agent tasks.
AI Revolution
youtube
The 19x decoding throughput improvement over Qwen3-Max makes this a viable alternative for production-level agents.
u/ModelTester2026
reddit
I was surprised by how well it handles 4-bit quantization. It keeps almost all the reasoning capability on a dual A100 setup.
@GlobalTechReview
twitter

Related Videos

Watch tutorials, reviews, and discussions about Qwen3.5-397B-A17B

A 397 billion parameter model, but with 17 billion parameters active.

When decoding at 256K, this model is 19 times faster than Qwen 3 Max.

The native vision-language reasoning is what sets this apart for agentic workflows.

This beats most closed models on the standard math benchmarks.

Running this locally is tough, but the quantized versions are workable on high-end Macs.

397 billion parameter model with 17 billion active parameters. It's natively multimodal.

It's probably currently the best open-source multimodal model.

The ability to process two hours of video natively is a massive advantage.

Look at these logic scores, it's hitting GPT-4o levels consistently.

The Apache license makes this very attractive for corporate data privacy.

OCR structured extraction. You have a messy PDF... and you need to turn that into clean JSON. This model excels there.

You get the intelligence of a 400 billion parameter giant... but you pay the compute cost of a 17 billion parameter model.

It handles long-context retrieval better than the previous version.

The tool use integration is built into the base training, not an afterthought.

Thinking mode allows it to correct its own logic before outputting.

More than just prompts

Supercharge your workflow with AI Automation

Automatio combines the power of AI agents, web automation, and smart integrations to help you accomplish more in less time.

AI Agents
Web Automation
Smart Workflows

Pro Tips

Expert tips to help you get the most out of Qwen3.5-397B-A17B and achieve better results.

Toggle Thinking Mode

Pass the 'enable_thinking: true' parameter in your API call to activate deep reasoning for math, coding, and complex logic puzzles.
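A small helper makes the toggle explicit. Note that `enable_thinking` is a DashScope-specific extension to the OpenAI-compatible schema, and the model name below mirrors the Quick Start above; the helper itself is a hypothetical convenience, not part of any SDK.

```javascript
// Build a chat-completions payload, switching deep reasoning on or off.
// `enable_thinking` is DashScope's extension — standard OpenAI-compatible
// servers may reject unknown fields, so only attach it when requested.
function buildRequest(prompt, { thinking = false } = {}) {
  const payload = {
    model: 'qwen3.5-plus',
    messages: [{ role: 'user', content: prompt }],
  };
  if (thinking) payload.enable_thinking = true;
  return payload;
}

const deep = buildRequest('Prove that the sum of two odd numbers is even.', { thinking: true });
const fast = buildRequest('What is the capital of France?');
console.log(deep.enable_thinking);      // true
console.log('enable_thinking' in fast); // false
```

Reserving the flag for math, coding, and logic tasks keeps simple queries from paying the latency and token cost of the thinking phase.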

Utilize Fast Mode

Use 'Fast' mode for simple queries to get instant answers without consuming tokens on unnecessary internal thinking phases.

Optimize Video Prompts

When analyzing video, prompt the model to focus on the final dynamic outcome rather than frame-by-frame analysis for better temporal coherence.

Leverage Quantization

Use 4-bit or 8-bit quantization (GGUF/EXL2) to shrink the memory footprint dramatically; even the 4-bit variant still needs roughly 200GB of combined (V)RAM, so this is workstation or multi-GPU territory rather than typical consumer hardware.
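The VRAM figures quoted here and in the Limitations section are consistent with simple weight-size arithmetic (this ignores KV cache and activation memory, which add real overhead on top):

```javascript
// Approximate weight footprint: parameter count × bits per parameter.
// 397B parameters at 16-bit (2 bytes) vs 4-bit (0.5 bytes) precision.
function weightGB(paramsBillions, bitsPerParam) {
  return (paramsBillions * 1e9 * bitsPerParam) / 8 / 1e9; // decimal GB
}

console.log(weightGB(397, 16)); // 794 GB — roughly the "over 800GB" 16-bit figure
console.log(weightGB(397, 4));  // 198.5 GB — why ~200GB is quoted for 4-bit
```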

Testimonials

What Our Users Say

Join thousands of satisfied users who have transformed their workflow

Jonathan Kogan

Jonathan Kogan

Co-Founder/CEO, rpatools.io

Automatio is one of the most used for RPA Tools both internally and externally. It saves us countless hours of work and we realized this could do the same for other startups and so we choose Automatio for most of our automation needs.

Mohammed Ibrahim

Mohammed Ibrahim

CEO, qannas.pro

I have used many tools over the past 5 years, Automatio is the Jack of All trades.. !! it could be your scraping bot in the morning and then it becomes your VA by the noon and in the evening it does your automations.. its amazing!

Ben Bressington

Ben Bressington

CTO, AiChatSolutions

Automatio is fantastic and simple to use to extract data from any website. This allowed me to replace a developer and do tasks myself as they only take a few minutes to setup and forget about it. Automatio is a game changer!

Sarah Chen

Sarah Chen

Head of Growth, ScaleUp Labs

We've tried dozens of automation tools, but Automatio stands out for its flexibility and ease of use. Our team productivity increased by 40% within the first month of adoption.

David Park

David Park

Founder, DataDriven.io

The AI-powered features in Automatio are incredible. It understands context and adapts to changes in websites automatically. No more broken scrapers!

Emily Rodriguez

Emily Rodriguez

Marketing Director, GrowthMetrics

Automatio transformed our lead generation process. What used to take our team days now happens automatically in minutes. The ROI is incredible.


Related AI Models

GPT-5.1

OpenAI

GPT-5.1 is OpenAI’s advanced reasoning flagship featuring adaptive thinking, native multimodality, and state-of-the-art performance in math and technical...

400K context
$1.25/$10.00/1M
Kimi K2.5

Moonshot

Discover Moonshot AI's Kimi K2.5, a 1T-parameter open-source agentic model featuring native multimodal capabilities, a 262K context window, and SOTA reasoning.

256K context
$0.60/$3.00/1M
Grok-4

xAI

Grok-4 by xAI is a frontier model featuring a 2M token context window, real-time X platform integration, and world-record reasoning capabilities.

2M context
$3.00/$15.00/1M
Claude Opus 4.5

Anthropic

Claude Opus 4.5 is Anthropic's most powerful frontier model, delivering record-breaking 80.9% SWE-bench performance and advanced autonomous agency for coding.

200K context
$5.00/$25.00/1M
Gemini 3.1 Flash-Lite

Google

Gemini 3.1 Flash-Lite is Google's fastest, most cost-efficient model. Features 1M context, native multimodality, and 363 tokens/sec speed for scale.

1M context
$0.25/$1.50/1M
Claude Sonnet 4.6

Anthropic

Claude Sonnet 4.6 offers frontier performance for coding and computer use with a massive 1M token context window for only $3/1M tokens.

1M context
$3.00/$15.00/1M
Gemini 3 Flash

Google

Gemini 3 Flash is Google's high-speed multimodal model featuring a 1M token context window, elite 90.4% GPQA reasoning, and autonomous browser automation tools.

1M context
$0.50/$3.00/1M
GLM-5

Zhipu (GLM)

GLM-5 is Zhipu AI's 744B parameter open-weight powerhouse, excelling in long-horizon agentic tasks, coding, and factual accuracy with a 200k context window.

200K context
$1.00/$3.20/1M

Frequently Asked Questions

Find answers to common questions about Qwen3.5-397B-A17B