OpenClaw Model Picker

Find the best LLM for your Mac — local or cloud, matched to your hardware and task.

Your Mac Configuration

Speed slider: Quality ↔ Speed
#1

Codex

Cloud / API · Fit: Great
  • Runs in the cloud — no local hardware requirements
  • Top-tier Coding performance (96/100)
  • Specialized cloud coding agent, autonomous task execution

task: 96 × 0.525 + fit: 100 × 0.250 + speed: 80 × 0.175 + cost: 47 × 0.050 = 91.8

#2

GPT-5.3

Cloud / API · Fit: Great
  • Runs in the cloud — no local hardware requirements
  • Top-tier Coding performance (94/100)
  • Latest frontier model, top-tier across all tasks

task: 94 × 0.525 + fit: 100 × 0.250 + speed: 80 × 0.175 + cost: 20 × 0.050 = 89.3

#3

Claude Sonnet 4.6

Cloud / API · Fit: Great
  • Runs in the cloud — no local hardware requirements
  • Top-tier Coding performance (90/100)
  • Excellent all-rounder via API, strong coding and reasoning

task: 90 × 0.525 + fit: 100 × 0.250 + speed: 80 × 0.175 + cost: 60 × 0.050 = 89.3

#4

GPT-5.2

Cloud / API · Fit: Great
  • Runs in the cloud — no local hardware requirements
  • Top-tier Coding performance (92/100)
  • Advanced reasoning and coding, agentic capabilities

task: 92 × 0.525 + fit: 100 × 0.250 + speed: 80 × 0.175 + cost: 33 × 0.050 = 89.0

#5

Claude Opus 4.6

Cloud / API · Fit: Great
  • Runs in the cloud — no local hardware requirements
  • Top-tier Coding performance (95/100)
  • Frontier reasoning and coding model, highest quality

task: 95 × 0.525 + fit: 100 × 0.250 + speed: 80 × 0.175 + cost: 0 × 0.050 = 88.9

#6

GPT-4o

Cloud / API · Fit: Great
  • Runs in the cloud — no local hardware requirements
  • Strong Coding performance (85/100)
  • Versatile multimodal model, fast and capable

task: 85 × 0.525 + fit: 100 × 0.250 + speed: 80 × 0.175 + cost: 73 × 0.050 = 87.3

How scoring works

Each model is scored using four components with hybrid weights controlled by the speed slider:

final_score = task × 0.525 + fit × 0.250 + speed × 0.175 + cost × 0.050

  • Task score (0-100): How well the model performs on the selected task, based on public benchmarks.
  • Mac fit (0, 60, or 100): Whether the model fits in your RAM. 100 = great (4 GB+ headroom), 60 = okay (1 GB+ headroom), 0 = doesn't fit.
  • Speed (0-100): Estimated tokens/second based on your chip's memory bandwidth divided by model file size. 30+ tok/s = 100, <5 tok/s = 0.
  • Cost (0-100): Local models score 100 (free). Cloud models are scored inversely to their estimated monthly cost.
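The weighted sum above can be sketched in a few lines of Python. This is a minimal illustration, not the picker's actual implementation: the component scores are assumed to be precomputed on a 0-100 scale, and the exact boundary handling for the fit thresholds is an assumption.

```python
# Hybrid scoring weights at the default slider position, matching the
# breakdowns shown in the rankings above.
WEIGHTS = {"task": 0.525, "fit": 0.250, "speed": 0.175, "cost": 0.050}

def fit_score(headroom_gb: float) -> int:
    """Mac fit per the stated thresholds: 4 GB+ headroom is great,
    1 GB+ is okay, less than that counts as not fitting (assumed
    boundary handling)."""
    if headroom_gb >= 4:
        return 100
    if headroom_gb >= 1:
        return 60
    return 0

def final_score(task: float, fit: float, speed: float, cost: float) -> float:
    """Unrounded weighted sum of the four component scores."""
    return (task * WEIGHTS["task"] + fit * WEIGHTS["fit"]
            + speed * WEIGHTS["speed"] + cost * WEIGHTS["cost"])

# Codex breakdown from rank #1: task 96, fit 100, speed 80, cost 47
# gives a raw score of 91.75, displayed as 91.8 after rounding.
codex = final_score(96, 100, 80, 47)
```

Because the weights sum to 1.0, a model that scores 100 on every component scores exactly 100 overall.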

Models that don't fit in RAM are excluded from recommendations. The speed slider shifts weight between task quality and inference speed.
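The speed heuristic described above can also be sketched directly. Decode speed on Apple Silicon is roughly memory-bandwidth-bound, so tokens/second is estimated as bandwidth divided by model file size. The two stated anchor points (<5 tok/s = 0, 30+ tok/s = 100) come from the text; the linear interpolation between them, and the example bandwidth and model-size figures, are assumptions for illustration.

```python
def estimate_tok_per_s(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough decode speed: each generated token streams the whole
    model file from memory once."""
    return bandwidth_gb_s / model_size_gb

def speed_score(tok_per_s: float) -> float:
    """Map estimated tok/s onto 0-100 using the stated anchors;
    the linear ramp in between is an assumption."""
    if tok_per_s >= 30:
        return 100.0
    if tok_per_s < 5:
        return 0.0
    return (tok_per_s - 5) / (30 - 5) * 100

# Illustrative figures (assumed): ~150 GB/s of memory bandwidth
# running an 18 GB model file gives roughly 8 tok/s.
tps = estimate_tok_per_s(150, 18)
score = speed_score(tps)
```

Under this sketch, halving the model file size (e.g. via a smaller quantization) roughly doubles the estimated tok/s, which is why quantized local models score better on the speed component.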