Model selection

Choose a Model

Select a model by workload, context length, modality, region, latency, and cost. The model list changes over time, so final quotes should reference the current official catalog.


Who this is for

Sales and technical leads preparing a model recommendation.

Configuration reference

Values to confirm before setup

Flagship lane

qwen3.7-max for highest-value reasoning, coding, office, and agent work

Balanced lane

qwen3.7-plus or qwen3.6-plus for multimodal production and business automation

Fast lane

qwen3.6-flash or deepseek-v4-flash for lighter high-speed work

Other lanes

qwen-image-2.0, wan2.7-image, kimi-k2.6, glm-5.1, and MiniMax-M2.5 subject to Token Plan support
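The lanes above can be kept as a small lookup table so that a recommendation names a lane first and a model second. This is a minimal sketch: `LANES` and `candidates_for` are illustrative names, not part of any official SDK, and the model identifiers are copied from this guide and should be checked against the current official catalog.

```python
# Hypothetical lane table. Model names are taken from this guide and may change.
LANES = {
    "flagship": ["qwen3.7-max"],
    "balanced": ["qwen3.7-plus", "qwen3.6-plus"],
    "fast": ["qwen3.6-flash", "deepseek-v4-flash"],
    # "Other" models are subject to Token Plan support (see the guide text).
    "other": ["qwen-image-2.0", "wan2.7-image", "kimi-k2.6", "glm-5.1", "MiniMax-M2.5"],
}


def candidates_for(lane: str) -> list[str]:
    """Return a copy of the candidate models for a lane; raise if the lane is unknown."""
    try:
        return list(LANES[lane])
    except KeyError:
        raise ValueError(f"unknown lane: {lane!r}") from None
```

Returning a copy keeps callers from editing the shared table by accident, for example when they filter candidates by region.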

Setup flow

Practical steps

  1. Write down the customer task in plain language.
  2. Classify the workload: text, vision, image, video, speech, embeddings, or coding tool use.
  3. Confirm context length and output size.
  4. Check the region/deployment mode where that model is available.
  5. Compare estimated input/output tokens and latency requirements.
  6. Pick a fallback model for cost or quota constraints.
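The steps above can be sketched as a small filter-and-rank routine. Everything here is an assumption for illustration: the `Requirement` and `ModelInfo` types, the field names, and the per-million-token prices are hypothetical placeholders, not real catalog data. Latency is left out to keep the sketch short.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    task: str            # step 1: the customer task in plain language
    workload: str        # step 2: e.g. "text", "vision", "embeddings"
    context_tokens: int  # step 3: expected input size
    output_tokens: int   # step 3: expected output size
    region: str          # step 4: region or deployment mode


@dataclass
class ModelInfo:
    name: str
    workloads: set[str]
    max_context: int
    max_output: int
    regions: set[str]
    input_price: float   # placeholder price per 1M input tokens
    output_price: float  # placeholder price per 1M output tokens


def estimate_cost(model: ModelInfo, req: Requirement) -> float:
    """Step 5: estimated cost of one request, in the catalog's currency."""
    total = req.context_tokens * model.input_price + req.output_tokens * model.output_price
    return total / 1_000_000


def recommend(req: Requirement, catalog: list[ModelInfo]):
    """Return (primary, fallback). The catalog is assumed to be ordered by preference."""
    eligible = [
        m for m in catalog
        if req.workload in m.workloads
        and req.context_tokens <= m.max_context
        and req.output_tokens <= m.max_output
        and req.region in m.regions
    ]
    if not eligible:
        return None, None
    primary = eligible[0]
    others = eligible[1:]
    # Step 6: the fallback is the cheapest remaining eligible model, if any.
    fallback = min(others, key=lambda m: estimate_cost(m, req)) if others else None
    return primary, fallback
```

In this sketch the primary choice follows catalog order rather than price, so a lane preference can be expressed by how the catalog list is sorted before calling `recommend`.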

Model lanes

For a customer-facing quote, use model lanes instead of a raw catalog dump: flagship for difficult reasoning, balanced for daily production, fast for support/classification/extraction, and specialist lanes for multimodal or retrieval tasks.
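As an illustration of quoting by lane, the mapping below turns a task category into a lane name. The category keys and the `lane_for` helper are hypothetical, and falling back to the balanced lane for unlisted categories is an assumption of this sketch, not a documented rule.

```python
# Hypothetical mapping from task category to lane, following the wording above.
LANE_BY_TASK = {
    "difficult_reasoning": "flagship",
    "daily_production": "balanced",
    "support": "fast",
    "classification": "fast",
    "extraction": "fast",
    "multimodal": "specialist",
    "retrieval": "specialist",
}


def lane_for(task_category: str) -> str:
    """Return the lane for a task category."""
    # Assumption: unlisted categories default to the balanced lane.
    return LANE_BY_TASK.get(task_category, "balanced")
```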

Procurement wording

Say 'recommended model lane subject to official availability' until the exact account, region, and billing route are confirmed.

Common mistakes

Check these before escalating

  • A model name available in one deployment mode may not be available in another.
  • Prices in examples are references and can change.
  • Image/video models may use different APIs than text models.
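The three checks above could be run as a short pre-escalation routine. This is only a sketch: the function name, its parameters, and the idea of an availability set of (model, deployment mode) pairs are assumptions made for illustration, and the real values would come from the official catalog and the customer's account.

```python
def pre_escalation_issues(
    model: str,
    mode: str,
    available: set[tuple[str, str]],
    price_confirmed: bool,
    model_api: str,
    client_api: str,
) -> list[str]:
    """Return human-readable issues to resolve before escalating."""
    issues = []
    # Check 1: availability is per (model, deployment mode), not per model.
    if (model, mode) not in available:
        issues.append(f"{model} is not confirmed for deployment mode {mode}")
    # Check 2: example prices are references until confirmed.
    if not price_confirmed:
        issues.append("price is a reference value, not a confirmed quote")
    # Check 3: image/video models may use a different API than text models.
    if model_api != client_api:
        issues.append(f"client calls the {client_api} API but the model uses the {model_api} API")
    return issues
```

An empty list means none of the three common mistakes was detected by this sketch. It does not replace checking the official catalog.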
