Which Model?

Stop overthinking it. Match the shape of the job to the shape of the model.

Quick picks Start with the common task shapes and the model traits that matter. Compare Scan the current model families side-by-side without chasing release names.

Good questions to ask first:

is quality or speed the bottleneck? does it need tools? how much context must fit? what mistake is expensive?

Quick picks

Skip the quiz. Six common asks, six ways to narrow the field.

Write polished prose

Pick for voice writing quality over benchmark rank

Use a model whose drafts need the fewest style edits. The best writing pick is the one you rewrite least.

CheckAsk for three versions and choose the one with the strongest default taste.

Code a feature in your app

Pick for tool use repo access, tests, edit loop

Coding work needs context and verification more than a clever single answer. Favor models and apps that can inspect files, edit, and run tests.

CheckGive it a small real bug and see whether it changes the right file.

Crack hard math, logic, or research

Pick reasoning slower, pricier, more deliberate

Reasoning models are worth the extra latency when each mistake is expensive and the answer has multiple dependent steps.

CheckAsk for assumptions, not just the final answer.

Summarize a huge document or codebase

Pick long context fits the source without chopping

The model cannot reason over pages it never saw. Context window matters most when the source is long and cross-referenced.

CheckAsk it to cite where in the source each claim came from.

Run high-volume production tasks

Pick cheap and fast latency and unit cost first

For tagging, extraction, routing, and simple Q&A, the right answer is usually the cheapest model that clears your eval.

CheckRun a 50-item sample before paying flagship prices.

Self-host or keep data local

Pick open weights control beats convenience

Open-weight models trade managed polish for privacy, offline use, customization, and predictable infrastructure costs.

CheckRead the license before building a business on it.

Don't see your ask? Scroll to the full directory and compare the traits underneath the recommendation.

Decode the AI lingo

Why these picks are these picks. Eight terms that quietly decide every choice.

working memory

Context window

Everything you paste — prompt, attachments, prior chat — has to fit. Bigger window means more code or more pages of a contract before you start chunking.

Use bigger windows for whole repos, long contracts, or many attached PDFs. Use smaller ones when the prompt is short and latency matters.

thinks before it answers

Reasoning model

A second class of model that runs an internal "scratchpad" before responding. Slower and pricier per query, but lands harder problems — math, multi-step logic, code refactors.

Best for hard math, planning, and codebase refactors. Overkill for rewriting a paragraph or classifying a support ticket.

handles more than text

Multimodal

Reads images, audio, video, or PDFs as input — not just text. Native multimodal models trained on all four feel sharper than ones bolted onto text-only bases.

Choose this when the task includes screenshots, charts, PDFs, audio, or video. Text-only work does not need the extra surface area.

the unit you pay by

Tokens

Sub-words the model reads and writes — roughly 0.75 word per token. Pricing is "$X per million input tokens / $Y per million output." Output is usually 3–5× more expensive than input.

Long inputs and long outputs both cost money. Summaries are cheap; repeated analysis loops over giant files are where bills grow.

fast vs. flagship

Tier (Fast · Mid · Flagship)

Each lab ships a tiered family. Fast tiers are 5–20× cheaper and answer in <1s — built for production pipelines. Flagship tiers cost more, think harder, and only make sense when quality matters more than latency.

Start with the cheapest tier that can do the job, then move up only when you see quality failures you can name.

can you run it yourself

Open vs. closed weights

Open-weight models publish the trained parameters — you can self-host, fine-tune, or run offline. Closed-weight models live behind an API only. Open trades some quality for full control and zero per-call cost.

Use open weights for privacy, offline use, fine-tuning, or predictable unit economics. Use hosted APIs when managed quality matters more.

can it call other things

Tool use / agents

Whether the model can call APIs, run code, search the web, or read files mid-conversation — instead of just generating text. The foundation under every "agent" headline.

Tool access is what turns a chat answer into a workflow: search, fetch, calculate, write, verify, repeat.

how recent its world is

Knowledge cutoff

The date training data ends. Anything after — a release, a news event, a library version — the model has to be told or look up. Models with web search smooth this over; pure-LLM answers don't.

For current events, prices, docs, or laws, use a model with retrieval or bring the sources yourself. Guessing from memory is the trap.

Match the term to the task. The Quick Picks above are shorthand — these are the dials underneath them.

All 15 model families, side-by-side

A deliberately stable snapshot as of April 2026. The labels are family-level on purpose: exact release names move faster than this page should. For what landed this week, see AI Radar →

Showing 15 models, sorted by their strongest practical signal.

Claude Opus

Anthropic

10 elite

The flagship shape. Use it when the answer has to be deeply considered.

Reasoning

Coding

Writing

Speed

Cost Efficiency

Long Context

hardest reasoning tasks
complex code architecture
nuanced long-form writing

DeepSeek reasoning

DeepSeek

10 elite

Open-source reasoning powerhouse. Shockingly cheap.

Reasoning

Coding

Writing

Speed

Cost Efficiency

Long Context

hard reasoning on a budget
open-source reasoning tasks
complex code problems

Midjourney

10 elite

The artist. Don't ask it to write code.

Image Generation

Image Understanding

—

Speed

Cost Efficiency

Open Source

—

Ecosystem

stunning visuals
creative direction
artistic imagery

OpenAI reasoning

OpenAI

10 elite

The heavyweight thinker. Bring your hard problems.

Reasoning

Coding

Writing

Speed

Cost Efficiency

Long Context

hard math and logic
complex code problems
deep analysis

Claude Sonnet

Anthropic

9 elite

The sweet spot. Smart, fast, and surprisingly affordable.

Reasoning

Coding

Writing

Speed

Cost Efficiency

Long Context

thoughtful writing
long documents
careful reasoning

Flux Pro

Black Forest Labs

9 elite

The new hotness in image gen. Punches up.

Image Generation

Image Understanding

—

Speed

Cost Efficiency

Open Source

Ecosystem

high-quality image generation
photorealistic outputs
open-weight option

Gemini Pro

Google

9 elite

The context monster. Feed it your whole codebase.

Reasoning

Coding

Writing

Speed

Cost Efficiency

Long Context

massive documents
video understanding
multimodal analysis

OpenAI mini reasoning

OpenAI

9 elite

The smart cheap one. Reasoning without the flagship price tag.

Reasoning

Coding

Writing

Speed

Cost Efficiency

Long Context

reasoning on a budget
hard math and code
high-volume reasoning tasks

DALL-E

OpenAI

8 strong

Easy image gen inside the OpenAI world.

Image Generation

Image Understanding

—

Speed

Cost Efficiency

Open Source

—

Ecosystem

quick image generation
text in images
ChatGPT integration

#10

GPT flagship

OpenAI

8 strong

The all-around OpenAI pick. Strong ecosystem, strong tool use.

Reasoning

Coding

Writing

Speed

Cost Efficiency

Long Context

all-around tasks
tool integrations
multimodal work

#11

Llama

Gemini Flash

Google

7 strong

Fast, cheap, and surprisingly capable.

Reasoning

Coding

Writing

Speed

Cost Efficiency

Long Context

fast multimodal tasks
large context on a budget
quick prototyping

#13

Mistral Large

Mistral

7 strong

The European contender. Solid all-around.

Reasoning

Coding

Writing

Speed

Cost Efficiency

Long Context

European data residency
multilingual work
balanced open-source option

#14

Claude Haiku

Anthropic

6 solid

Claude's fast little sibling. Great bang for buck.

Reasoning

Coding

Writing

Speed

Cost Efficiency

Long Context

high-volume processing
quick classification
budget API usage

#15

GPT mini

OpenAI

5 solid

Cheap, fast, and good enough for most things.

Reasoning

Coding

Writing

Speed

Cost Efficiency

Long Context

high-volume tasks
quick classification
budget-friendly apps

Models retire and new ones land. Spot something missing? let me know.