Ollama models by GPU VRAM

Pick a command from the tier that matches your GPU's VRAM, then open the calculator when context length, quantization, or intended use affects whether a model fits.

8 GB

Small local models and quick assistants.

12 GB to 24 GB

Coding helpers, agents, and stronger local chat models.

48 GB

Larger local models and heavier experiments.

8 GB VRAM mainstream GPU

Good for 7B/8B Q4 models
4 commands

12 GB VRAM local agent GPU

Local routing, agents, and model testing
4 commands

24 GB VRAM homelab workstation

Heavy local models and homelab inference
4 commands

48 GB VRAM workstation

Large local models and long context
4 commands
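
As a rough illustration of why 7B/8B Q4 models land in the 8 GB tier, the sketch below estimates weight memory as parameters times bits per weight, then picks the smallest tier listed above that covers it. This is a back-of-the-envelope approximation, not the calculator's method: the function names and the flat 20% headroom for KV cache and runtime overhead are assumptions made for this example, and real Q4 variants usually spend somewhat more than 4 bits per weight.

```python
# Rough sketch: estimate weight memory for a quantized model and pick a VRAM tier.
# Assumptions (not from this page): weights take params * bits / 8 bytes, and a
# flat 20% headroom stands in for KV cache and runtime overhead.

TIERS_GB = [8, 12, 24, 48]  # the VRAM tiers listed on this page


def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def smallest_tier(params_billion: float, bits_per_weight: float,
                  headroom: float = 0.20):
    """Return the smallest listed tier that covers weights plus headroom,
    or None if the estimate exceeds every tier."""
    need = weight_gb(params_billion, bits_per_weight) * (1 + headroom)
    for tier in TIERS_GB:
        if need <= tier:
            return tier
    return None


if __name__ == "__main__":
    for params, bits in [(8, 4), (70, 4), (70, 16)]:
        print(params, "B at", bits, "bits ->", smallest_tier(params, bits))
```

Longer contexts raise the real requirement beyond this estimate, which is where the calculator is the better tool.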