Cost-optimized

Qwen-Flash

Fast, general-purpose model in the Qwen3 series with switchable thinking and non-thinking modes and long-context support.

Model details

Model code

qwen-flash

Category

Cost-optimized

Family

Qwen3

Capability

Fast general model

Modality

Text -> Text

Release / status

2025-08-01

Snapshot

qwen-flash-2025-07-28

Source region

Console

Official detail price

Input

$0.05 / 1M tokens

Input(Implicit Cache)

$0.01 / 1M tokens

Input(Batch File)

$0.025 / 1M tokens

Explicit Cache Creation

$0.063 / 1M tokens

Explicit Cache Read

$0.005 / 1M tokens
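
The per-million-token rates above translate into a bill by simple proportion. The following is a minimal sketch of that arithmetic, not an official calculator: the function name `estimate_input_cost` and the dictionary keys are hypothetical labels chosen for illustration, and how a given workload's tokens are split across the rates (standard input, implicit cache, batch file, explicit cache) is an assumption that must be checked against actual billing. Output-token pricing is not listed on this page, so it is left out.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the price list on this page.
# The key names are hypothetical labels, not official billing identifiers.
PRICE_PER_MILLION = {
    "input": Decimal("0.05"),
    "input_implicit_cache": Decimal("0.01"),
    "input_batch_file": Decimal("0.025"),
    "explicit_cache_creation": Decimal("0.063"),
    "explicit_cache_read": Decimal("0.005"),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_input_cost(token_counts):
    """Estimate the input-side cost in USD for {rate_name: token_count}.

    Assumes each token is billed at exactly one of the listed rates.
    """
    total = Decimal("0")
    for rate_name, tokens in token_counts.items():
        if rate_name not in PRICE_PER_MILLION:
            raise KeyError(f"unknown rate: {rate_name}")
        if tokens < 0:
            raise ValueError(f"negative token count for {rate_name}")
        total += PRICE_PER_MILLION[rate_name] * Decimal(tokens) / TOKENS_PER_UNIT
    return total


# Example: 10M standard input tokens plus 4M tokens served from implicit cache.
monthly = estimate_input_cost({
    "input": 10_000_000,
    "input_implicit_cache": 4_000_000,
})
print(f"Estimated input cost: ${monthly}")
```

`Decimal` is used instead of floats so that small per-token amounts add up without rounding drift. Taxes, promotions, and any service fee are outside this estimate.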

Detail checked

Source region: International. This is a summary copied from the official Model Studio detail page, checked on 2026-06-08. Final quotes still require confirmation in the official console for region, account route, quota, promotions, taxes, and current availability.

Buyer review

Questions to confirm before purchase

Does this exact model code support the buyer's region?
Is this for Token Plan, direct API, or both?
Does the workload need text, image, video, audio, or embeddings?
Are context length, rate limits, and quota enough for production?
Are official usage cost and ModelSmarter service fee separated?
Is there a lower-cost fallback model if usage grows?

Source note

Catalog taxonomy and model detail price summaries were checked against Alibaba Cloud Model Studio Console on 2026-06-08. Availability, region, account route, quota, taxes, promotions, and official terms must be confirmed before purchase.
