Text

Text Generation

Text generation covers chat, reasoning, translation, summarization, coding, extraction, classification, and agent workflows.

Model inferenceOfficial source

Who this is for

Customers choosing a language model for business workflows.

Configuration reference

Values to confirm before setup

Common API

OpenAI-compatible Chat Completions

Typical models

Qwen Max / Plus / Flash families subject to region

Typical controls

streaming, tool calls, structured output, thinking mode where supported

Setup flow

Practical steps

  1. 01Identify the task type.
  2. 02Select a flagship, balanced, or fast model.
  3. 03Choose streaming or non-streaming.
  4. 04Set context and output limits.
  5. 05Test prompt quality before scaling traffic.

Choosing a text lane

Use stronger models for reasoning, planning, and high-value customer-facing answers. Use faster models for classification, routing, extraction, support drafts, and high-volume internal work.

Common mistakes

Check these before escalating

  • A cheaper model can become expensive if it needs repeated retries.
  • Long context increases token cost.
  • Structured output should be tested before production.

Related guides