GPU compatibility

What LLMs can run on a 16 GB VRAM creator GPU?

Comfortable for 14B models at Q4, plus some 20B-class models. Examples: RTX 4060 Ti 16GB, RTX 4080.


Preset VRAM: 16 GB
System memory: 64 GB
Use case: Comfortable 14B Q4, some 20B-class models

Q4 model fit table

Model                         Size    Q4 need   Status
Gemma 3 4B                    4B      3.50 GB   Runs locally
Qwen2.5 Coder 7B              7B      5.50 GB   Runs locally
Mistral 7B                    7B      5.50 GB   Runs locally
Llama 3.1 8B Instruct         8B      6.00 GB   Runs locally
Qwen3 8B                      8B      6.00 GB   Runs locally
DeepSeek R1 Distill Qwen 8B   8B      6.00 GB   Runs locally
Gemma 3 12B                   12B     9.00 GB   Runs locally
Qwen2.5 Coder 14B             14B     10.5 GB   Runs locally
DeepSeek R1 Distill Qwen 14B  14B     10.5 GB   Runs locally
Phi-4 14B                     14B     10.5 GB   Runs locally
Gemma 3 27B                   27B     18 GB     RAM offload
Qwen2.5 Coder 32B             32B     21 GB     RAM offload
DeepSeek R1 Distill Qwen 32B  32B     21 GB     RAM offload
Mixtral 8x7B                  46.7B   28 GB     RAM offload
Llama 3.1 70B Instruct        70B     44 GB     RAM offload
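
The Status column can be approximated with a simple comparison against the preset values. The sketch below is an illustration, not the calculator's actual logic: the `classify` function, its thresholds, and the "Does not fit" label are assumptions introduced here. It treats a model as running locally when its Q4 need fits in VRAM, and as needing RAM offload when it fits only in VRAM plus system memory. A real tool would likely also reserve headroom for context and runtime overhead.

```python
# Preset values taken from this page.
VRAM_GB = 16.0
SYSTEM_RAM_GB = 64.0

# Q4 memory need in GB, copied from the fit table on this page.
Q4_NEED_GB = {
    "Gemma 3 4B": 3.5,
    "Qwen2.5 Coder 7B": 5.5,
    "Mistral 7B": 5.5,
    "Llama 3.1 8B Instruct": 6.0,
    "Qwen3 8B": 6.0,
    "DeepSeek R1 Distill Qwen 8B": 6.0,
    "Gemma 3 12B": 9.0,
    "Qwen2.5 Coder 14B": 10.5,
    "DeepSeek R1 Distill Qwen 14B": 10.5,
    "Phi-4 14B": 10.5,
    "Gemma 3 27B": 18.0,
    "Qwen2.5 Coder 32B": 21.0,
    "DeepSeek R1 Distill Qwen 32B": 21.0,
    "Mixtral 8x7B": 28.0,
    "Llama 3.1 70B Instruct": 44.0,
}


def classify(q4_need_gb, vram_gb=VRAM_GB, ram_gb=SYSTEM_RAM_GB):
    """Hypothetical fit rule (an assumption, not the site's documented method)."""
    if q4_need_gb <= vram_gb:
        return "Runs locally"      # whole model fits in VRAM
    if q4_need_gb <= vram_gb + ram_gb:
        return "RAM offload"       # remainder spills into system memory
    return "Does not fit"          # assumed label; not present in the table


if __name__ == "__main__":
    for name, need in Q4_NEED_GB.items():
        print(f"{name:30s}{need:6.2f} GB  {classify(need)}")
```

Under this rule every row in the table receives the same status the page lists, which suggests the simple threshold is at least consistent with the published data for this preset.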