$ man how-to/model-selection-strategy

Cost Efficiency · beginner

Model Selection Strategy

Match the model to the task — stop overpaying for simple work


The Core Principle

Not every task needs the most capable model. Using a capable model (Opus tier) on a simple reformatting task is like hiring a senior architect to paint a wall. Using a fast model (Sonnet tier) on a complex architecture decision is like hiring an intern to design the building. The core principle: match the model to the task. Simple tasks get fast models. Complex tasks get capable models. Everything in between is a judgment call, and the framework below helps you make it.
PATTERN

The Matching Framework

Fast models work for: reformatting content, scanning files, simple code edits, copy-paste-and-adapt tasks, straightforward data transformations, and building pages that mirror existing patterns. These tasks have clear inputs, clear outputs, and low ambiguity.

Capable models work for: architecture decisions, complex debugging, creative writing with nuanced voice, multi-step reasoning chains, research synthesis, and anything where the agent needs to make judgment calls. These tasks involve ambiguity and tradeoffs, and they require the model to reason carefully.

The dividing line: does this task require judgment, or is it mechanical? Judgment tasks get the capable model. Mechanical tasks get the fast model. If you are unsure, start with the fast model. If the output is bad, escalate. It is cheaper to try fast and upgrade than to default to expensive on everything.
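The judgment-versus-mechanical rule and the "try fast, then escalate" fallback can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: `Task`, `pick_tier`, `run_with_escalation`, and the `run` and `is_good_enough` callbacks are hypothetical names, and `"fast"` and `"capable"` are placeholder tier labels, not real model IDs.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

FAST = "fast"        # placeholder tier label (assumption), not a real model ID
CAPABLE = "capable"  # placeholder tier label (assumption), not a real model ID


@dataclass
class Task:
    name: str
    # True = needs judgment, False = mechanical, None = you are unsure.
    requires_judgment: Optional[bool]


def pick_tier(task: Task) -> str:
    """Judgment tasks get the capable tier; mechanical or unsure tasks start fast."""
    return CAPABLE if task.requires_judgment is True else FAST


def run_with_escalation(
    task: Task,
    run: Callable[[Task, str], str],        # hypothetical: runs the task on a tier
    is_good_enough: Callable[[str], bool],  # hypothetical: your own quality check
) -> Tuple[str, str]:
    """Run on the chosen tier; if a fast-tier result is bad, retry once on capable."""
    tier = pick_tier(task)
    output = run(task, tier)
    if tier == FAST and not is_good_enough(output):
        tier = CAPABLE
        output = run(task, tier)
    return tier, output
```

In this sketch an unsure task costs one fast attempt plus, at worst, one capable retry, which mirrors the guide's point that trying fast first is cheaper than defaulting to the capable tier everywhere.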
PRO TIP

Model Selection for Parallel Agents

When running parallel agents, assign models per task. The orchestrating agent uses the capable model because it needs to reason about dependencies, context, and sequencing. Sub-agents doing straightforward work (mirror an existing page, update a config, run a build check) use fast models. Sub-agents doing heavy creative work (writing 17 wiki entries, architecting a new feature) use the capable model.

This is not about being cheap. It is about being efficient. A fast model that completes a simple task in 30 seconds is better than a capable model that takes 2 minutes on the same task with identical quality. Speed compounds across parallel agents: five fast agents running in parallel on simple tasks finish before one capable agent works through the same five tasks.
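The per-role assignment and the timing argument can be made concrete with a small Python sketch. The role labels, `AgentTask`, and the helper functions are hypothetical, and the 30-second and 2-minute figures are the guide's illustrative numbers, not measurements.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AgentTask:
    name: str
    role: str  # hypothetical labels: "orchestrator", "mechanical", or "creative"


# Assumed mapping that follows the guide: only mechanical sub-agents go fast.
TIER_BY_ROLE = {
    "orchestrator": "capable",
    "mechanical": "fast",
    "creative": "capable",
}


def assign_tiers(tasks: List[AgentTask]) -> Dict[str, str]:
    """Map each task name to the tier its role calls for."""
    return {t.name: TIER_BY_ROLE[t.role] for t in tasks}


def parallel_wall_clock(durations_s: List[int]) -> int:
    """Agents running at the same time finish when the slowest one does."""
    return max(durations_s, default=0)


def sequential_wall_clock(durations_s: List[int]) -> int:
    """One agent working through tasks in order pays for each in turn."""
    return sum(durations_s)


# Five simple tasks, using the guide's example figures.
five_fast_in_parallel = parallel_wall_clock([30] * 5)       # 30 seconds
one_capable_in_sequence = sequential_wall_clock([120] * 5)  # 600 seconds
```

Under these assumed figures the parallel fast agents finish in 30 seconds of wall-clock time, while a single capable agent needs 600 seconds for the same five tasks.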
FORMULA

The Daily Tracking Method

Track your model usage for one week. At the end of each day, note which tasks used which model and whether the output quality was sufficient. Look for two patterns:

1. Capable model sessions where a fast model would have produced the same quality. These are overspend. Switch those task types to fast models.
2. Fast model sessions where the output was bad and you had to redo the work. These are false economy. Switch those task types to capable models.

After one week, you will have a clear map of which tasks need which model. Apply that map going forward. Revisit quarterly as models improve (today's capable model becomes tomorrow's fast model).
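A week of notes can be reduced to that map with a short script. The sketch below assumes a hand-kept log where each entry records the task type, the tier used, whether the output was sufficient, and, for capable sessions, your own judgment on whether a fast model would have matched it. `Session`, `review`, and `build_map` are hypothetical names for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Session:
    task_type: str
    tier: str                       # "fast" or "capable" (placeholder labels)
    quality_ok: bool                # was the output sufficient?
    fast_would_match: bool = False  # your end-of-day judgment, capable sessions only


def review(sessions: List[Session]) -> Tuple[Counter, Counter]:
    """Count the two patterns: overspend and false economy, per task type."""
    overspend: Counter = Counter()
    false_economy: Counter = Counter()
    for s in sessions:
        if s.tier == "capable" and s.fast_would_match:
            overspend[s.task_type] += 1
        elif s.tier == "fast" and not s.quality_ok:
            false_economy[s.task_type] += 1
    return overspend, false_economy


def build_map(sessions: List[Session]) -> Dict[str, str]:
    """Turn the week's log into task-type -> tier switches.

    If a task type shows both patterns, this sketch lets "capable" win,
    which is an assumption; you may prefer to review those by hand.
    """
    overspend, false_economy = review(sessions)
    mapping = {task_type: "fast" for task_type in overspend}
    mapping.update({task_type: "capable" for task_type in false_economy})
    return mapping
```

Task types that show neither pattern do not appear in the map, which means the tier you already use for them is working.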

related guides
Credit and Token Management · Parallel Agent Patterns · Orchestrating Multi-Agent Workflows