
Ollama models by GPU VRAM

Pick a command from the clean-fit tier that matches your GPU, then open the calculator when context length, quantization, or purpose matters.
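As a rough illustration of the tier selection described above, the following sketch filters the four models listed on this page by a VRAM budget and prints an `ollama run` command for each. The model tags and approximate Q4 sizes are copied from the listings below; the `Model` class, the `clean_fits` and `run_command` helpers, and the 1.5 GB headroom default are hypothetical choices made for this example, not values used by the calculator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    tag: str          # Ollama model tag as listed on this page
    approx_gb: float  # approximate Q4 footprint as listed on this page


CATALOG = [
    Model("Qwen3 4B Thinking 2507", "qwen3:4b-thinking-2507-q4_K_M", 3.2),
    Model("Gemma 3 4B", "gemma3:4b", 3.5),
    Model("Qwen2.5 Coder 7B", "qwen2.5-coder:7b", 5.5),
    Model("Mistral 7B", "mistral:7b", 5.5),
]


def clean_fits(vram_gb: float, headroom_gb: float = 1.5) -> list[Model]:
    """Return models whose listed size fits in VRAM minus a safety margin.

    headroom_gb is an assumed allowance for context and runtime overhead.
    """
    budget = vram_gb - headroom_gb
    return [m for m in CATALOG if m.approx_gb <= budget]


def run_command(model: Model) -> str:
    """Build the shell command that starts the model with the Ollama CLI."""
    return f"ollama run {model.tag}"


if __name__ == "__main__":
    for model in clean_fits(8):
        print(run_command(model))
```

With the assumed 1.5 GB margin, all four listed models pass the filter on an 8 GB card. A larger margin would be a reasonable choice when long prompts are expected.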


8GB

Small local models and quick assistants.

12GB-24GB

Coding helpers, agents, and stronger local chat models.

48GB

Larger local models and heavier experiments.

8 GB VRAM mainstream GPU

Good for 7B/8B Q4 models

4 commands

Qwen3 4B Thinking 2507

Runs locally | Q4 | about 3.2 GB | qwen3:4b-thinking-2507-q4_K_M | Coding, Agents, Reasoning

Clean Q4 fit for 8 GB VRAM mainstream GPU.


Gemma 3 4B

Runs locally | Q4 | about 3.5 GB | gemma3:4b | Agents, Vision, Chat

Clean Q4 fit for 8 GB VRAM mainstream GPU.


Qwen2.5 Coder 7B

Runs locally | Q4 | about 5.5 GB | qwen2.5-coder:7b | Coding, Agents, Chat

Clean Q4 fit for 8 GB VRAM mainstream GPU.


Mistral 7B

Runs locally | Q4 | about 5.5 GB | mistral:7b | Agents, Chat

Clean Q4 fit for 8 GB VRAM mainstream GPU.


12 GB VRAM local agent GPU

Local routing, agents, and model testing

4 commands

Qwen3 4B Thinking 2507

Runs locally | Q4 | about 3.2 GB | qwen3:4b-thinking-2507-q4_K_M | Coding, Agents, Reasoning

Clean Q4 fit for 12 GB VRAM local agent GPU.


Gemma 3 4B

Runs locally | Q4 | about 3.5 GB | gemma3:4b | Agents, Vision, Chat

Clean Q4 fit for 12 GB VRAM local agent GPU.


Qwen2.5 Coder 7B

Runs locally | Q4 | about 5.5 GB | qwen2.5-coder:7b | Coding, Agents, Chat

Clean Q4 fit for 12 GB VRAM local agent GPU.


Mistral 7B

Runs locally | Q4 | about 5.5 GB | mistral:7b | Agents, Chat

Clean Q4 fit for 12 GB VRAM local agent GPU.


24 GB VRAM homelab workstation

Heavy local models and homelab inference

4 commands

Qwen3 4B Thinking 2507

Runs locally | Q4 | about 3.2 GB | qwen3:4b-thinking-2507-q4_K_M | Coding, Agents, Reasoning

Clean Q4 fit for 24 GB VRAM homelab workstation.


Gemma 3 4B

Runs locally | Q4 | about 3.5 GB | gemma3:4b | Agents, Vision, Chat

Clean Q4 fit for 24 GB VRAM homelab workstation.


Qwen2.5 Coder 7B

Runs locally | Q4 | about 5.5 GB | qwen2.5-coder:7b | Coding, Agents, Chat

Clean Q4 fit for 24 GB VRAM homelab workstation.


Mistral 7B

Runs locally | Q4 | about 5.5 GB | mistral:7b | Agents, Chat

Clean Q4 fit for 24 GB VRAM homelab workstation.


48 GB VRAM workstation

Large local models and long context

4 commands

Qwen3 4B Thinking 2507

Runs locally | Q4 | about 3.2 GB | qwen3:4b-thinking-2507-q4_K_M | Coding, Agents, Reasoning

Clean Q4 fit for 48 GB VRAM workstation.


Gemma 3 4B

Runs locally | Q4 | about 3.5 GB | gemma3:4b | Agents, Vision, Chat

Clean Q4 fit for 48 GB VRAM workstation.


Qwen2.5 Coder 7B

Runs locally | Q4 | about 5.5 GB | qwen2.5-coder:7b | Coding, Agents, Chat

Clean Q4 fit for 48 GB VRAM workstation.


Mistral 7B

Runs locally | Q4 | about 5.5 GB | mistral:7b | Agents, Chat

Clean Q4 fit for 48 GB VRAM workstation.
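
The larger tiers are described as suited to long context. As a rough illustration of why context length changes the fit, the sketch below estimates key-value cache size for a Mistral 7B style architecture. The defaults (32 layers, 8 key-value heads, head dimension 128, 16-bit cache entries) are assumptions taken from the published Mistral 7B configuration, and the `kv_cache_bytes` helper is hypothetical; the runtime may use a different cache precision or a shorter default context, so treat the result as an order-of-magnitude estimate in GiB rather than a measured figure.

```python
def kv_cache_bytes(
    context_tokens: int,
    n_layers: int = 32,       # assumed: Mistral 7B layer count
    n_kv_heads: int = 8,      # assumed: grouped-query attention KV heads
    head_dim: int = 128,      # assumed: per-head dimension
    bytes_per_elem: int = 2,  # assumed: 16-bit cache entries
) -> int:
    """Estimate KV cache size: one key and one value vector per layer per token."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return per_token * context_tokens


def to_gib(n_bytes: int) -> float:
    """Convert bytes to GiB (2**30 bytes)."""
    return n_bytes / 2**30


if __name__ == "__main__":
    for ctx in (4096, 8192, 32768):
        print(f"{ctx:>6} tokens: about {to_gib(kv_cache_bytes(ctx)):.1f} GiB of KV cache")
```

Under these assumptions the cache grows linearly with context, which is why a model that fits comfortably at a short context can need noticeably more VRAM at a long one.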
