Current Model Studio lanes, organized for buying decisions

Qwen3.7

Qwen3.7-Max

Input $2.5

The largest Qwen3.7 model for the highest-value text, coding, office, and long-running agent tasks. Use it when answer quality matters more than cost.

Model code

qwen3.7-max

Capability

Flagship text reasoning

Qwen3.6

Qwen3.6-Plus

Input $0.5

A practical production lane for vision-language work, agentic coding, front-end programming, OCR-like extraction, and object localization.

Model code

qwen3.6-plus

Capability

Balanced multimodal production

DeepSeek

DeepSeek V4 Pro

Input $1.65

A high-end MoE reasoning lane for technical research, sophisticated office workflows, long-form analysis, code, and math-heavy tasks.

Model code

deepseek-v4-pro

Capability

Reasoning, code, math

Kimi

Kimi K2.6

Input $0.8939

A long-context model for coding, instruction following, visual input, reasoning and non-reasoning dialogue, and agent-based tasks.

Model code

kimi-k2.6

Capability

Long context and visual agent tasks

GLM

GLM 5.1

Input $0.825

A long-horizon model for logical reasoning, long-text understanding, code generation, summaries, and developer assistance.

Model code

glm-5.1

Capability

Long-horizon logic

MiniMax

MiniMax M2.5

Input $0.304

A MiniMax flagship lane for programming, tool invocation, search, office work, and productivity scenarios.

Model code

MiniMax-M2.5

Capability

Agent and tool-use productivity

Flagship (12), Cost-optimized (11), Visual (17), Wan (8), Audio (20), Multimodal (7), Embeddings (3), Third-party (8), Older (10), Token Plan models and supported tools (16)

Source checked from Alibaba Cloud Model Studio Console on 2026-06-08. Final quotes must confirm region, account route, deployment mode, official availability, quota, ModelSmarter service fee, taxes, and promotion status.
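
The confirmation items above can be tracked as a simple checklist before a quote is finalized. The following is a minimal sketch, not part of the console or any official tooling; the class name `QuoteChecklist` and the helper `unconfirmed` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class QuoteChecklist:
    """Items the catalog note says must be confirmed before a final quote.

    Hypothetical structure for illustration; field names mirror the note.
    """
    region: bool = False
    account_route: bool = False
    deployment_mode: bool = False
    official_availability: bool = False
    quota: bool = False
    service_fee: bool = False       # the ModelSmarter service fee
    taxes: bool = False
    promotion_status: bool = False


def unconfirmed(checklist: QuoteChecklist) -> list:
    """Return the names of items still unconfirmed, in declaration order."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]


# A draft quote where only region and quota have been confirmed so far.
draft = QuoteChecklist(region=True, quota=True)
print(unconfirmed(draft))
```

A quote would be treated as final only when `unconfirmed` returns an empty list.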

Official taxonomy

The console currently separates the catalog into nine lanes

This map follows the visible Model Studio console tabs. Some official models appear in more than one lane; the site keeps those lane placements while routing each unique model code to one detail page.
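
The routing rule described here (several lane placements, one detail page per unique model code) can be sketched as a grouping step. This is an illustrative sketch only: the rows are copied from this catalog, while the function names and the `/models/<code>` path scheme are assumptions, not the site's actual implementation.

```python
from collections import defaultdict

# (lane, model name, model code) rows copied from this catalog.
PLACEMENTS = [
    ("Flagship", "Qwen3.6-Open-Source", "qwen3.6-27b"),
    ("Cost-optimized", "Qwen3.6-Open-Source", "qwen3.6-27b"),
    ("Flagship", "Qwen3-Open-Source", "qwen3-coder-next"),
    ("Cost-optimized", "Qwen3-Open-Source", "qwen3-coder-next"),
    ("Visual", "Qwen3-Open-Source", "qwen3-coder-next"),
    ("Flagship", "Qwen3.7-Max", "qwen3.7-max"),
]


def lanes_by_code(placements):
    """Group lane placements under each unique model code, keeping lane order."""
    grouped = defaultdict(list)
    for lane, _name, code in placements:
        if lane not in grouped[code]:
            grouped[code].append(lane)
    return dict(grouped)


def detail_path(code):
    """Hypothetical URL scheme: one detail page per unique model code."""
    return f"/models/{code}"


for code, lanes in lanes_by_code(PLACEMENTS).items():
    print(detail_path(code), "listed in:", ", ".join(lanes))
```

Six placements collapse to three detail pages, and each page still knows every lane it was listed under.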

Official catalog map

Catalog category

Flagship

Versatile high-intelligence models for premium reasoning, coding, agent, and multilingual workloads.

Qwen3.7

Qwen3.7-Plus

Input $0.4

Cost-effective Qwen3.7 model with upgraded vision-language ability, coding, tool use, GUI perception, and productivity workflow support.

Model code

qwen3.7-plus

Capability

Multimodal agent foundation

Qwen3.7

Qwen3.7-Max

Input $2.5

The largest Qwen3.7 model for the highest-value text, coding, office, and long-running agent tasks. Use it when answer quality matters more than cost.

Model code

qwen3.7-max

Capability

Flagship text reasoning

Qwen3.6

Qwen3.6-Plus

Input $0.5

A practical production lane for vision-language work, agentic coding, front-end programming, OCR-like extraction, and object localization.

Model code

qwen3.6-plus

Capability

Balanced multimodal production

Qwen3.6

Qwen3.6-Max

Input $1.3

Preview Max lane with stronger vibe coding, coding-agent execution, front-end development, and long-tail knowledge retention.

Model code

qwen3.6-max-preview

Capability

Preview flagship text

Qwen3.6

Qwen3.6-Open-Source

Input $0.6

Dense 27B vision-language model with improved agentic coding, STEM reasoning, object localization, detection, and OCR-like work.

Model code

qwen3.6-27b

Capability

Open-source vision-language

Qwen3.5

Qwen3.5-Plus

Input $0.4

Prior-generation Qwen3.5 Plus lane, still listed in the official catalog but not used as a current primary homepage recommendation.

Model code

qwen3.5-plus

Capability

Prior generation plus

Qwen3.5

Qwen3.5-Open-Source

Open-source Qwen3.5 native vision-language model with hybrid architecture and efficient inference.

Model code

qwen3.5-27b

Capability

Prior open-source lane

Qwen3

Qwen3-Max

Input $1.2

Earlier Qwen3 Max catalog item. It remains in the official catalog map but should not replace Qwen3.7 in primary messaging.

Model code

qwen3-max

Capability

Earlier flagship

Qwen

Qwen-Plus

Input $0.4

Enhanced general Qwen lane with thinking and non-thinking mode support in current snapshots.

Model code

qwen-plus

Capability

General plus

Qwen3

Qwen3-Coder-Plus

Input $1

Code generation model with strong coding-agent, tool invocation, and environment interaction ability.

Model code

qwen3-coder-plus

Capability

Coding agent

Qwen

Qwen-Plus-Character

Input $0.5

Role-playing model optimized for character instruction following, conversation progression, and empathy.

Model code

qwen-plus-character

Capability

Role-play

Qwen3

Qwen3-Open-Source

Open-source Qwen3 family covering hybrid, thinking, and non-thinking model styles.

Model code

qwen3-coder-next

Capability

Open-source family
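
To compare Flagship input prices for a given workload, the listed figures can be multiplied out. This is a rough sketch under a loud assumption: the catalog does not state the unit behind "Input $X", so the code assumes USD per 1,000,000 input tokens. Output-token charges, service fees, and taxes are not covered, and the function name `input_cost` is hypothetical.

```python
from decimal import Decimal

# "Input" prices as listed in the Flagship lane of this catalog.
INPUT_PRICE = {
    "qwen3.7-plus": Decimal("0.4"),
    "qwen3.7-max": Decimal("2.5"),
    "qwen3.6-plus": Decimal("0.5"),
    "qwen3-max": Decimal("1.2"),
}

# ASSUMPTION: prices are per 1,000,000 input tokens (unit not stated in the catalog).
TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def input_cost(code, input_tokens):
    """Estimated input-side cost for a model code and an input token count."""
    return INPUT_PRICE[code] * Decimal(input_tokens) / TOKENS_PER_PRICE_UNIT


# Compare models, cheapest first, for a workload of 50 million input tokens.
for code in sorted(INPUT_PRICE, key=INPUT_PRICE.get):
    print(code, input_cost(code, 50_000_000))
```

Under that assumption, the same 50-million-token workload costs 20 on `qwen3.7-plus` and 125 on `qwen3.7-max`, which puts a number on the quality-versus-cost trade-off the lane descriptions refer to.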

Catalog category

Cost-optimized

Fast, cost-effective models for high-volume production, support, translation, and lighter agent work.

Qwen3.6

Qwen3.6-Flash

Input $0.25

Cost-optimized Qwen3.6 lane with stronger coding, math, spatial intelligence, object localization, and detection than the prior flash generation.

Model code

qwen3.6-flash

Capability

Fast multimodal

Qwen3.6

Qwen3.6-Open-Source

Input $0.6

Open-source Qwen3.6 dense model listed in the cost-optimized section for economical evaluation.

Model code

qwen3.6-27b

Capability

Open-source cost lane

Qwen3.5

Qwen3.5-Flash

Input $0.1

Prior-generation Flash lane, still listed by the official console but not a current homepage recommendation.

Model code

qwen3.5-flash

Capability

Prior flash lane

Qwen3.5

Qwen3.5-Open-Source

Open-source Qwen3.5 native vision-language model also visible in the cost-optimized section of the console.

Model code

qwen3.5-27b

Capability

Prior open-source cost lane

Qwen MT

Qwen-MT-Lite

Input $0.12

Translation model for 32 languages with terminology, format preservation, and domain adaptation support.

Model code

qwen-mt-lite

Capability

Low-cost translation

Qwen3

Qwen-Flash

Fast general Qwen3-series lane with thinking/non-thinking switching and long context support.

Model code

qwen-flash

Capability

Fast general model

Qwen3

Qwen3-Coder-Flash

Cost-optimized coding model focused on repository-level understanding and stable tool calling.

Model code

qwen3-coder-flash

Capability

Fast coding agent

Qwen MT

Qwen-MT-Flash

Input $0.16

Fast translation model for 92 languages with terminology and format preservation.

Model code

qwen-mt-flash

Capability

Fast translation

Qwen MT

Qwen-MT-Plus

Input $2.46

Higher-quality translation lane for accurate and natural specialized-domain translation.

Model code

qwen-mt-plus

Capability

Flagship translation

Qwen

Qwen-Flash-Character

Fast character interaction model for multilingual role consistency and scenario-based dialogue.

Model code

qwen-flash-character

Capability

Fast role-play

Qwen3

Qwen3-Open-Source

Open-source Qwen3 family also visible in the cost-optimized console section.

Model code

qwen3-coder-next

Capability

Open-source family

Catalog category

Visual

Visual understanding, image generation, image editing, video generation, and visual-agent models.

HappyHorse

HappyHorse-I2V

Image-to-video generation with source-image consistency and fluid dynamic rendering.

Model code

happyhorse-1.0-i2v

Capability

Image to video

HappyHorse

HappyHorse-T2V

Text-to-video generation for cinematic creative output and realistic motion details.

Model code

happyhorse-1.0-t2v

Capability

Text to video

HappyHorse

HappyHorse-R2V

Reference-to-video generation that preserves subject and scene references across motion.

Model code

happyhorse-1.0-r2v

Capability

Reference to video

HappyHorse

HappyHorse-Video-Edit

Natural-language local or global video editing with reference images and motion preservation.

Model code

happyhorse-1.0-video-edit

Capability

Video editing

Qwen Image

Qwen-Image-2.0

Image $0.035

Accelerated image generation and editing model with stronger text rendering and realistic texture control.

Model code

qwen-image-2.0

Capability

Image generation / editing

Qwen Image

Qwen-Image-2.0-Pro

Full-featured Qwen Image 2.0 lane with the strongest text rendering and lifelike texture quality in the series.

Model code

qwen-image-2.0-pro

Capability

High-quality image generation

Qwen Image

Qwen-Image-Max

High-quality Qwen image model for realism, human material texture, and better text rendering.

Model code

qwen-image-max

Capability

Image generation

Qwen Image

Qwen-Image-Edit-Max

Image editing model for industrial design, geometry, character consistency, and broader editing functions.

Model code

qwen-image-edit-max

Capability

Image editing

Z-Image

Z-Image-Turbo

Image std $0.015

Efficient image-generation model focused on photo-realistic output and bilingual text rendering.

Model code

z-image-turbo

Capability

Open-source image generation

Qwen3 VL

Qwen3-VL-Plus

Input $0.2

Visual-language model with visual agent, visual coding, spatial perception, and multimodal reasoning upgrades.

Model code

qwen3-vl-plus

Capability

Visual understanding

Qwen3 VL

Qwen3-VL-Flash

Fast visual understanding model for long videos, documents, spatial awareness, and localization tasks.

Model code

qwen3-vl-flash

Capability

Fast visual understanding

Qwen VL

Qwen-VL-OCR

Input $0.07

OCR-oriented visual-language model for image-text recognition, parsing, and processing tasks.

Model code

qwen-vl-ocr

Capability

OCR-like visual extraction

Qwen3

Qwen3-Open-Source

Open-source Qwen3 family listed in the visual section of the console for visual and hybrid model coverage.

Model code

qwen3-coder-next

Capability

Open-source visual family

Qwen Image

Qwen-Image-Plus

Image foundation model with complex text rendering and precise image editing capabilities.

Model code

qwen-image-plus

Capability

Image generation

Qwen Image

Qwen-Image-Edit-Plus

Image editing plus lane with faster inference, stability, and multi-image return support.

Model code

qwen-image-edit-plus

Capability

Image editing

Wan2.7

Wan-I2V

Image-to-video model that preserves subject, style, and text details through dynamic transitions.

Model code

wan2.7-i2v

Capability

Image to video

Wan2.7

Wan-T2V

Text-to-video model for smooth motion generation and cinematic direction control.

Model code

wan2.7-t2v

Capability

Text to video

Catalog category

Wan

Alibaba Wan image and video generation models for text, image, reference, editing, and VACE workflows.

Wan2.7

Wan-R2V

Reference-to-video model that preserves the look and voice of people or objects from reference material.

Model code

wan2.7-r2v

Capability

Reference to video

Wan2.7

Wan-VideoEdit

Prompt-driven video editing for local/global edits, video reshaping, and video transfer.

Model code

wan2.7-videoedit

Capability

Video editing

Wan2.7

Wan-I2V

Image-to-video generation with subject and style consistency.

Model code

wan2.7-i2v

Capability

Image to video

Wan2.7

Wan-T2V

Text-to-video generation with cinematic aesthetics and instruction adherence.

Model code

wan2.7-t2v

Capability

Text to video

Wan2.6

Wan-T2I

Text-to-image generation for photorealistic texture and accurate text rendering.

Model code

wan2.6-t2i

Capability

Text to image

Wan2.7

Wan2.7-Image

Wan2.7 image lane for generation, editing, style transformation, and visual production. Confirm exact price in the official console.

Model code

wan2.7-image

Capability

Image generation / editing

Wan2.7

Wan-Image

Wan image model for local edits, style transformations, object replacement, and contextual consistency.

Model code

wan2.7-image-pro

Capability

Image generation / editing

Wan2.1

Wan2.1-VACE-Plus

Video Generation (std) $0.1/s

Unified video editing and generation model for repainting, expansion, extension, and image referencing.

Model code

wan2.1-vace-plus

Capability

Unified video editing

Catalog category

Audio

Speech recognition, translation, voice design, voice cloning, and real-time speech synthesis models.

Qwen3.5

Qwen3.5-LiveTranslate-Flash-Realtime

Input Audio: $7.5

Realtime multilingual audio and video interpretation model with broad language coverage.

Model code

qwen3.5-livetranslate-flash-realtime

Capability

Realtime translation

Qwen Speech

Qwen-Voice-Design

Voice $0.2

Voice design service that creates a suitable voice from a text description.

Model code

qwen-voice-design

Capability

Voice design

Qwen3 TTS

Qwen3-TTS-VD-Realtime

Voice $0.2

Realtime speech synthesis using voices designed by Qwen voice-design services.

Model code

qwen3-tts-vd-realtime

Capability

Realtime voice-design TTS

Qwen3 TTS

Qwen3-TTS-VC-Realtime

Voice $0.2

Realtime speech synthesis using replicated voices with consistent timbre across languages.

Model code

qwen3-tts-vc-realtime

Capability

Realtime voice-clone TTS

Qwen3

Qwen3-LiveTranslate-Flash

Price: check official console

Multilingual simultaneous audio/video interpretation for offline and realtime workflows.

Model code

qwen3-livetranslate-flash

Capability

Live translation

Qwen Speech

Qwen-Voice-Enrollment

TTS $0.01

Voice replication service that can clone a similar voice from short audio samples.

Model code

qwen-voice-enrollment

Capability

Voice enrollment

Fun-ASR

Fun-ASR-Realtime

Price: check official console

Realtime speech recognition with hotword, punctuation, ITN, dialect, and noise-robust capabilities.

Model code

fun-asr-realtime

Capability

Realtime ASR

CosyVoice

TTS $0.26

Generative speech model that converts text into natural human-like speech.

Model code

cosyvoice-v3-plus

Capability

Speech synthesis

Qwen3 TTS

Qwen3-TTS-VC

TTS $0.26

Offline speech synthesis using replicated voices with adaptive tone control.

Model code

qwen3-tts-vc

Capability

Voice-clone TTS

Qwen3 TTS

Qwen3-TTS-VD

TTS $0.26

Offline speech synthesis using designed voices and adaptive tone control.

Model code

qwen3-tts-vd

Capability

Voice-design TTS

Qwen3 TTS

Qwen3-TTS-Instruct-Flash

TTS $0.115

Speech synthesis model that adjusts emotion and expression through natural-language instructions.

Model code

qwen3-tts-instruct-flash

Capability

Instructional TTS

Qwen3 TTS

Qwen3-TTS-Flash

TTS $0.1

Fast offline TTS model with expressive voices, multilingual support, and stable synthesis.

Model code

qwen3-tts-flash

Capability

Fast TTS

Qwen3 TTS

Qwen3-TTS-Flash-Realtime

TTS $0.13

Realtime TTS model for low-latency and stable speech generation.

Model code

qwen3-tts-flash-realtime

Capability

Realtime fast TTS

Qwen3 TTS

Qwen3-TTS-Instruct-Flash-Realtime

TTS $0.143

Realtime instructable speech synthesis with emotion and expression control.

Model code

qwen3-tts-instruct-flash-realtime

Capability

Realtime instructional TTS

Fun-ASR

Audio Duration $0.000035/s

Speech recognition model for Mandarin, English, Japanese, dialects, and noisy environments.

Model code

fun-asr

Capability

Speech recognition

Qwen3

Qwen3-LiveTranslate-Flash-Realtime

Input Audio: $10

Realtime version of Qwen3 live translation for simultaneous interpretation workflows.

Model code

qwen3-livetranslate-flash-realtime

Capability

Realtime live translation

Qwen3 ASR

Qwen3-ASR-Flash-Filetrans

Audio Duration $0.000035/s

Large-file transcription model based on Qwen3-ASR-Flash.

Model code

qwen3-asr-flash-filetrans

Capability

File transcription

Qwen3 ASR

Qwen3-ASR-Flash

Audio Duration $0.000035/s

Accurate multilingual speech recognition model based on a large language model.

Model code

qwen3-asr-flash

Capability

Speech recognition

Qwen3 ASR

Qwen3-ASR-Flash-Realtime

Audio Duration $0.00009/s

Realtime speech recognition model for multilingual, complex audio environments.

Model code

qwen3-asr-flash-realtime

Capability

Realtime speech recognition

CosyVoice

Voice-Enrollment

Price: check official console

Voice cloning service that extracts and prepares voices for speech generation.

Model code

voice-enrollment

Capability

Voice cloning
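
Several speech recognition entries in this lane are priced per second of audio ("Audio Duration $0.000035/s" and "Audio Duration $0.00009/s"). The arithmetic below converts those two listed rates into a cost for a recording of known length. It is a sketch only: the function name `audio_cost` is hypothetical, and billing rules such as rounding or minimum duration are not stated in the catalog and are not modeled.

```python
from decimal import Decimal

# Per-second "Audio Duration" rates as listed in the Audio lane.
RATE_LOWER = Decimal("0.000035")   # $/s
RATE_HIGHER = Decimal("0.00009")   # $/s

SECONDS_PER_HOUR = 3600


def audio_cost(rate_per_second, duration_seconds):
    """Cost of transcribing audio billed per second of duration."""
    return rate_per_second * Decimal(duration_seconds)


# Cost of one hour of audio at each listed rate.
print("lower rate, 1 hour: ", audio_cost(RATE_LOWER, SECONDS_PER_HOUR))
print("higher rate, 1 hour:", audio_cost(RATE_HIGHER, SECONDS_PER_HOUR))
```

One hour of audio works out to $0.126 at the lower rate and $0.324 at the higher rate.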

Catalog category

Multimodal

Omni-modal models supporting text, images, audio, and video understanding and interaction.

Qwen3.5 Omni

Qwen3.5-Omni-Flash-Realtime

Input Audio: $4.5

Realtime omni-modal model for voice assistants, multimedia analysis, interruptions, and tool calls.

Model code

qwen3.5-omni-flash-realtime

Capability

Realtime omni-modal

Qwen3.5 Omni

Qwen3.5-Omni-Flash

Input Audio: $3

Omni-modal flash model for long audio and audio-visual understanding.

Model code

qwen3.5-omni-flash

Capability

Fast omni-modal

Qwen3.5 Omni

Qwen3.5-Omni-Plus

Input Audio: $11

Plus omni-modal lane for audio, video, image, and text understanding and interaction.

Model code

qwen3.5-omni-plus

Capability

Plus omni-modal

Qwen3.5 Omni

Qwen3.5-Omni-Plus-Realtime

Input Audio: $16.5

Realtime plus omni-modal model for controllable voice dialogue and complex function calls.

Model code

qwen3.5-omni-plus-realtime

Capability

Realtime plus omni-modal

Qwen3 Omni

Qwen3-Omni-Flash-Realtime

Input Text: $0.52

Realtime Qwen3 Omni Flash model for efficient multimodal understanding and speech generation.

Model code

qwen3-omni-flash-realtime

Capability

Realtime omni-modal

Qwen3 Omni

Qwen3-Omni-Flash

Input Text: $0.43

Qwen3 Omni Flash model supporting text, image, audio, and video interaction.

Model code

qwen3-omni-flash

Capability

Omni-modal flash

Qwen3 Omni

Qwen3-Omni-30B-A3B-Captioner

Input Audio: $3.81

Fine-grained audio analysis model for generating accurate descriptions of complex audio scenes.

Model code

qwen3-omni-30b-a3b-captioner

Capability

Audio captioning

Catalog category

Embeddings

Embedding and reranking models for search, RAG, clustering, classification, recall, and ranking.

Qwen3

Qwen-Rerank

Text Input $0.1

Text-ranking model for reranking query and document relevance across 100+ languages.

Model code

qwen3-rerank

Capability

Reranking

Qwen3

Qwen-Embedding

Text Input $0.07

Multilingual text vector model for retrieval, clustering, classification, and RAG.

Model code

text-embedding-v4

Capability

Text embeddings

Tongyi embedding

Domain Embedding

Image Input $0.09

Domain-specific multimodal representation model for e-commerce, security, photo albums, and autonomous driving retrieval.

Model code

tongyi-embedding-vision-plus

Capability

Domain embedding
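
Embedding and reranking models are commonly paired in a two-stage search: a fast vector recall step followed by a more careful reranking step. The sketch below shows that shape with toy data and makes no API calls. The vectors stand in for output from an embedding model such as `text-embedding-v4`, the fixed scores stand in for a reranking model such as `qwen3-rerank`, and all function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(query_vec, doc_vecs, k):
    """Stage 1: return the k document ids closest to the query vector."""
    ranked = sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)
    return ranked[:k]


def rerank(candidates, score):
    """Stage 2: reorder the recalled candidates by a relevance score."""
    return sorted(candidates, key=score, reverse=True)


# Toy 2-dimensional vectors standing in for real embeddings.
DOC_VECS = {"a": [1.0, 0.0], "b": [0.8, 0.6], "c": [0.0, 1.0]}
query = [1.0, 0.1]

top = recall(query, DOC_VECS, k=2)

# Fixed scores standing in for a reranking model's relevance output.
stub_scores = {"a": 0.2, "b": 0.9}
final = rerank(top, score=stub_scores.get)
print(top, final)
```

Recall narrows three documents to two by vector similarity, and the reranker then promotes `b` over `a`, which is the division of labor between the two model types in this lane.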

Catalog category

Third-party

Third-party model families available through the official Model Studio route, subject to region and account support.

DeepSeek

DeepSeek V4 Pro

Input $1.65

A high-end MoE reasoning lane for technical research, sophisticated office workflows, long-form analysis, code, and math-heavy tasks.

Model code

deepseek-v4-pro

Capability

Reasoning, code, math

Kimi

Kimi K2.6

Input $0.8939

A long-context model for coding, instruction following, visual input, reasoning and non-reasoning dialogue, and agent-based tasks.

Model code

kimi-k2.6

Capability

Long context and visual agent tasks

GLM

GLM 5.1

Input $0.825

A long-horizon model for logical reasoning, long-text understanding, code generation, summaries, and developer assistance.

Model code

glm-5.1

Capability

Long-horizon logic

MiniMax

MiniMax M2.5

Input $0.304

A MiniMax flagship lane for programming, tool invocation, search, office work, and productivity scenarios.

Model code

MiniMax-M2.5

Capability

Agent and tool-use productivity

DeepSeek

DeepSeek V4 Flash

Input $0.138

Lightweight DeepSeek MoE model for fast, low-latency, cost-effective everyday dialogue, content, RAG, and batch work.

Model code

deepseek-v4-flash

Capability

Fast reasoning

DeepSeek

DeepSeek V3.2

Input $0.287

DeepSeek model with sparse attention and reasoning integrated into tool usage.

Model code

deepseek-v3.2

Capability

Stable reasoning and tool use

Kimi

Kimi K2.5

Input $0.574

Kimi stable lane for long-context business and coding review, subject to console availability.

Model code

kimi-k2.5

Capability

Long-context stable lane

GLM

GLM 5

Input $0.573

GLM stable lane for general enterprise assistant and reasoning workloads, subject to console availability.

Model code

glm-5

Capability

General stable lane

Catalog category

Older

Older models still visible in the official catalog. Use them only after confirming support and deprecation status.

Qwen visual reasoning

QVQ-Max

Input $1.2

Older visual reasoning model for math, programming, visual analysis, creation, and general tasks.

Model code

qvq-max

Capability

Older visual reasoning

Qwen2.5

Qwen-Max

Input $1.6

Older super-large Qwen model. Check official lifecycle status before use.

Model code

qwen-max

Capability

Older flagship

QwQ

Qwen-QwQ-Plus

Input $0.8

Older reasoning model trained from Qwen2.5 and marked as retiring in the official catalog.

Model code

qwq-plus

Capability

Older reasoning

Qwen

Qwen-Turbo

Older fast Qwen lane. Check current snapshot and lifecycle before recommending.

Model code

qwen-turbo

Capability

Older fast text

Qwen VL

Qwen-VL-Max

Input $0.8

Older large vision-language model for high-level visual perception and cognition.

Model code

qwen-vl-max

Capability

Older vision-language

Qwen VL

Qwen-VL-Plus

Input $0.21

Older enhanced vision-language model for detail and text recognition.

Model code

qwen-vl-plus

Capability

Older vision-language plus

Qwen MT

Qwen-MT-Turbo

Input $0.16

Older fast translation model for 92 languages with terminology and format support.

Model code

qwen-mt-turbo

Capability

Older fast translation

Qwen2.5

Qwen2.5-Open-Source

Input Text: $0.1

Older Qwen2.5 open-source family covering text, visual, and multimodal models.

Model code

qwen2.5-omni-7b

Capability

Older open-source family

Qwen Omni

Qwen-Omni-Turbo

Input Text: $0.07

Older omni-modal model for mixed input understanding and streaming text/speech generation.

Model code

qwen-omni-turbo

Capability

Older omni-modal

Qwen Omni

Qwen-Omni-Turbo-Realtime

Input Text: $0.27

Older realtime omni-modal model for audio interaction scenarios.

Model code

qwen-omni-turbo-realtime

Capability

Older realtime omni-modal
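
Because the Older lane asks for a support and deprecation check before use, a deployment could flag these model codes automatically. This is a minimal sketch: the set is copied from the model codes listed in this lane, while the function name `needs_lifecycle_check` and the idea of a pre-deployment gate are assumptions for illustration.

```python
# Model codes listed under the Older lane of this catalog.
OLDER_CODES = frozenset({
    "qvq-max",
    "qwen-max",
    "qwq-plus",
    "qwen-turbo",
    "qwen-vl-max",
    "qwen-vl-plus",
    "qwen-mt-turbo",
    "qwen2.5-omni-7b",
    "qwen-omni-turbo",
    "qwen-omni-turbo-realtime",
})


def needs_lifecycle_check(model_code):
    """True if the code is in the Older lane and its status should be confirmed."""
    return model_code.strip().lower() in OLDER_CODES


for code in ("qwen-max", "qwen3.7-max"):
    status = "confirm lifecycle first" if needs_lifecycle_check(code) else "current lane"
    print(code, "->", status)
```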