Reasoning models can think for milliseconds or for many seconds. Foundry exposes that choice as a four-level dial in Settings → Reasoning Effort.
Pro feature. Reasoning Effort is available on Pro plans.
Screenshot coming. Effort dropdown.

The four levels

Level    | When to use                                | What it buys you
Instant  | Quick lookups, casual replies              | No reasoning. Fastest. Cheapest.
Low      | Light synthesis, simple multi-step         | Brief reasoning trace. Snappy.
Medium   | Coding, analysis, comparisons              | Balanced. The default for serious work.
High     | Hard math, long planning, complex research | Deepest trace. Slowest. Highest token cost.

The dial applies to every routed reply until you change it.

What “Instant” means

Instant turns reasoning off, even when the routed model is reasoning-capable. The reply behaves like a regular chat completion. The Reasoning block doesn’t render.

Picking a level

Most people pick Medium and leave it there. Move to High for one-off hard problems, Low when you want speed, Instant when you don’t want any thinking. The footer of every reply tells you which level was used so you can audit cost and latency.
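
If you want to write the guidance above down as a rule of thumb, a small lookup is enough. The sketch below is purely illustrative: the task category names and the recommend_level helper are hypothetical and are not part of Foundry. Only the four level names and their suggested uses come from this page.

```python
# Hypothetical task categories mapped to the four levels described on this page.
# The category names are made up for illustration; Foundry does not define them.
RECOMMENDED_LEVEL = {
    "quick_lookup": "Instant",
    "casual_reply": "Instant",
    "light_synthesis": "Low",
    "simple_multi_step": "Low",
    "coding": "Medium",
    "analysis": "Medium",
    "comparison": "Medium",
    "hard_math": "High",
    "long_planning": "High",
    "complex_research": "High",
}


def recommend_level(task: str) -> str:
    """Suggest a level for a task category, falling back to Medium.

    Medium is the fallback because it is the level most people leave the dial on.
    """
    return RECOMMENDED_LEVEL.get(task, "Medium")


print(recommend_level("hard_math"))
print(recommend_level("something_else"))
```

Falling back to Medium mirrors the advice to pick it and only move the dial for specific cases.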

Limits

  • Reasoning Effort only affects reasoning-capable models. If the router picks a non-reasoning model for your prompt, the level is ignored.
  • High runs can take 30+ seconds on the hardest prompts. Pair with browser notifications so you don’t sit and watch.
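
The interaction between the dial and the router can be summarized in a few lines. This is a minimal sketch of the rules described on this page, not Foundry's implementation: the Effort enum, the effective_effort function, and the model_supports_reasoning flag are all assumed names for illustration.

```python
from enum import Enum
from typing import Optional


class Effort(Enum):
    """The four dial positions (names taken from this page)."""
    INSTANT = "instant"
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


def effective_effort(selected: Effort, model_supports_reasoning: bool) -> Optional[Effort]:
    """Return the level that actually applies to a reply, or None if no reasoning runs.

    Hypothetical helper: it only restates the two rules documented here.
    """
    # Rule 1: the level is ignored when the router picks a non-reasoning model.
    if not model_supports_reasoning:
        return None
    # Rule 2: Instant turns reasoning off even on a reasoning-capable model.
    if selected is Effort.INSTANT:
        return None
    return selected


print(effective_effort(Effort.HIGH, model_supports_reasoning=True))
print(effective_effort(Effort.HIGH, model_supports_reasoning=False))
print(effective_effort(Effort.INSTANT, model_supports_reasoning=True))
```

Returning None for both the Instant case and the non-reasoning case reflects that, either way, the reply behaves like a regular chat completion.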

Live reasoning view

Watch the trace stream.

Reasoning models

Concept overview.