Size / Speed Frontier
Model size on the x-axis, generation speed on the y-axis, bubble area scaled by VRAM use. This is the main “what actually wins here?” chart.
Comparison context
Crown Citadel
Host ciru • NixOS • Linux 6.19.9 • llama.cpp Vulkan • 16C / 32T • 64 / 64 split
Prompt-processing retention versus each model’s own 4096 baseline. Brighter is more stable as context grows.
VRAM, system RAM, and GTT at the active context, sorted by VRAM pressure.
Raw throughput view at the active context. Symbol size tracks model size, not memory.
Prompt-processing throughput across all tested contexts, one line per model family entry.
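The retention view above compares each model's prompt-processing throughput to its own 4096-context baseline. A minimal sketch of that ratio follows, assuming retention is simply PP t/s at a given context divided by PP t/s at 4096. The function name `pp_retention`, the dictionary layout, and the sample numbers are hypothetical and not taken from the dashboard.

```python
from typing import Dict


def pp_retention(pp_by_context: Dict[int, float],
                 baseline_context: int = 4096) -> Dict[int, float]:
    """Return PP throughput at each context as a fraction of the baseline.

    pp_by_context maps context length -> prompt-processing tokens/second
    for a single model (hypothetical layout, assumed for illustration).
    """
    if baseline_context not in pp_by_context:
        raise KeyError(f"no measurement at baseline context {baseline_context}")
    baseline = pp_by_context[baseline_context]
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    # 1.0 means no slowdown relative to the baseline; lower means less retained.
    return {ctx: tps / baseline for ctx, tps in sorted(pp_by_context.items())}


# Made-up measurements for one model, not real benchmark data.
sample = {4096: 800.0, 16384: 600.0, 32768: 400.0}
retention = pp_retention(sample)
print(retention)
```

A value near 1.0 would correspond to a brighter, more stable cell in the retention chart, and values well below 1.0 to throughput that degrades as context grows.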
| Date | Model | Family | Size | Context | PP t/s | TG t/s | VRAM | RAM | GTT |
|---|---|---|---|---|---|---|---|---|---|