By the RunAIatHome editorial team. Buyer guidance shaped by VRAM-first local AI testing and current market fit.
How to Choose a GPU for AI
The definitive buyer's guide for AI hardware in 2025.
Our Recommendation
For most people: RTX 4060 Ti 16 GB. To stretch VRAM: a used RTX 3090.
If you want a straightforward purchase that is compatible with almost the entire local stack, the RTX 4060 Ti 16 GB remains the cleanest balance point. If you prioritize capacity for larger models and can accept higher power draw and the second-hand market, the RTX 3090 is still the best step up the VRAM ladder.
- Best pick: RTX 4060 Ti 16 GB for chat, coding, SDXL, and a friction-free experience.
- Value pick: a used RTX 3090 if your priority is opening the door to 24 GB of VRAM.
- Only if you already live on macOS: Apple Silicon with plenty of unified memory, not as a dedicated purchase for raw throughput.
1. VRAM Matters Most
When choosing a GPU for AI, forget about gaming benchmarks. The single most important specification is VRAM (Video Random Access Memory). Here is why:
- AI models must be loaded entirely into VRAM to run at full speed. A model that does not fit in your VRAM either will not run or will fall back to slow CPU inference.
- Model sizes are growing rapidly. The 7B models that were state-of-the-art in 2023 have been surpassed by 70B+ models in 2024-2025.
- Quantization helps (reducing model precision from FP16 to INT4 cuts size by 4x), but you still need enough VRAM for the quantized model plus overhead.
VRAM Requirements by Model Size
| Model Params | FP16 | Q8 | Q4 |
|---|---|---|---|
| 3B | 6 GB | 3 GB | 2 GB |
| 7B | 14 GB | 7 GB | 4 GB |
| 13B | 26 GB | 13 GB | 7 GB |
| 34B | 68 GB | 34 GB | 18 GB |
| 70B | 140 GB | 70 GB | 36 GB |
Q4 = 4-bit quantization (most common for local inference). Add 1-2 GB overhead for context window.
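The numbers in the table follow a simple rule of thumb: parameter count times bits per weight, divided by 8 bits per byte, plus context overhead. A minimal sketch (the helper name and the 1.5 GB overhead default are illustrative assumptions, not a standard tool):

```python
# Rough VRAM estimate for a quantized model: parameters (in billions)
# times bits per weight, divided by 8 bits per byte, plus context overhead.
# The function name and 1.5 GB default overhead are illustrative assumptions.
def estimate_vram_gb(params_b: float, bits: int, overhead_gb: float = 1.5) -> float:
    weights_gb = params_b * bits / 8  # 1B params at 8-bit is roughly 1 GB
    return weights_gb + overhead_gb

# A 7B model at Q4: 3.5 GB of weights + 1.5 GB overhead
print(estimate_vram_gb(7, 4))  # 5.0
```

Actual footprints vary with the quantization scheme (Q4_K_M is not exactly 4.0 bits per weight) and with how much context you allocate, so treat these as floor estimates.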
Memory Bandwidth: The Second Key Metric
After VRAM capacity, memory bandwidth (measured in GB/s) is the second most important spec. It directly determines how fast tokens are generated during inference. Higher bandwidth means faster responses.
This is why the RTX 4090 (1008 GB/s) generates tokens significantly faster than the RTX 4060 Ti (288 GB/s), even when running the same model at the same quantization level.
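The bandwidth-to-speed link can be sketched with a back-of-the-envelope calculation: token generation is memory-bound, and each generated token requires roughly one full read of the model weights. A hedged sketch (the helper name and the memory-bound simplification are assumptions; real throughput lands well below this ceiling):

```python
# Decode speed ceiling for a memory-bound LLM: each generated token reads
# every weight once, so tokens/sec cannot exceed bandwidth / model size.
# This is an upper bound; overheads keep real throughput well below it.
def peak_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    return bandwidth_gb_s / model_size_gb

# RTX 4090 (1008 GB/s) vs RTX 4060 Ti (288 GB/s) on a ~4 GB Q4 7B model
print(peak_tokens_per_sec(1008, 4))  # 252.0
print(peak_tokens_per_sec(288, 4))   # 72.0
```

The 3.5x ratio between the two ceilings mirrors the real-world gap you see between these cards at the same quantization level.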
2. NVIDIA vs AMD vs Apple Silicon
NVIDIA (Recommended)
- Pros: Universal CUDA support, best software compatibility, tensor cores for AI acceleration, largest community and documentation.
- Cons: More expensive, higher power consumption, premium pricing on high-VRAM cards.
- Best for: Anyone who wants guaranteed compatibility with all AI software.
AMD
- Pros: Better price-per-VRAM ratio, ROCm improving rapidly, good Linux support, competitive performance on supported software.
- Cons: Limited software compatibility (ROCm only works on Linux), fewer optimized kernels, smaller AI community.
- Best for: Linux users on a budget who are comfortable with manual setup.
Apple Silicon (M1/M2/M3/M4)
- Pros: Unified memory (system RAM = GPU VRAM), excellent power efficiency, silent operation, Metal GPU acceleration well-supported by Ollama and llama.cpp.
- Cons: Lower memory bandwidth than discrete GPUs, limited to macOS, no CUDA support, slower token generation per GB of memory.
- Best for: Mac users who want a quiet, energy-efficient setup. M4 Pro/Max with 36-128 GB unified memory can run very large models.
NVIDIA GPU Comparison for Local AI (prices in euros)
The table below compares the best NVIDIA GPUs for local AI, with approximate European market prices, available VRAM, real-world inference speed, and the best-suited user profile. Useful for comparing at a glance before deciding what to buy.
| GPU | VRAM | Approx. price (€) | Speed (tok/s) | Use case |
|---|---|---|---|---|
| RTX 4090 | 24 GB | ~€1,999 | ~90 tok/s | Production / large models |
| RTX 4080 Super | 16 GB | ~€899 | ~65 tok/s | Advanced development |
| RTX 4070 Ti Super | 16 GB | ~€699 | ~55 tok/s | Price/performance balance |
| RTX 4070 Super | 12 GB | ~€549 | ~45 tok/s | Gaming + local AI |
| RTX 4070 | 12 GB | ~€449 | ~40 tok/s | Entry point to 12 GB |
| RTX 4060 Ti | 8 / 16 GB | ~€399 | ~35 tok/s | Mid-range budget |
| RTX 4060 | 8 GB | ~€299 | ~25 tok/s | Getting started with local AI |
| RTX 3090 (used) | 24 GB | ~€799 | ~70 tok/s | Budget 24 GB alternative |
| RTX 3060 | 12 GB | ~€269 | ~30 tok/s | Budget entry |
Indicative prices in euros for the European market (April 2026). Speeds measured with Llama 3.1 8B Q4_K_M in Ollama. Used prices correspond to platforms such as Wallapop, eBay.es, or Back Market.
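One quick way to read the table is price per GB of VRAM, which favors capacity-focused buyers. A small sketch using a few of the listed figures (the approximate prices above, not live market data):

```python
# Price per GB of VRAM: a quick value metric for capacity-focused buyers.
# Figures are the approximate euro prices from the table, not live data.
cards = {
    "RTX 4090": (24, 1999),        # (VRAM in GB, approx. price in EUR)
    "RTX 4060 Ti 16GB": (16, 399),
    "RTX 3090 (used)": (24, 799),
    "RTX 3060": (12, 269),
}
# Sort by euros per GB, cheapest capacity first
for name, (vram, price) in sorted(cards.items(), key=lambda kv: kv[1][1] / kv[1][0]):
    print(f"{name}: {price / vram:.0f} €/GB")
```

By this metric the RTX 3060 and the RTX 4060 Ti 16 GB lead on cheap capacity, while the used RTX 3090 buys its 24 GB at a moderate premium and the RTX 4090 pays mostly for speed.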
3. Budget Recommendations by Tier
Entry Tier: Used-value / low-budget
6-8 GB VRAM. Good for small models (7B quantized), Whisper, and basic Stable Diffusion.
- New: RTX 4060 (8 GB), RTX 3060 (12 GB, refurbished)
- Used: RTX 3060 12 GB, RTX 2060 12 GB
- Can run: Llama 3.1 8B (Q4), Mistral 7B, Phi-3 Mini, Whisper, SD 1.5
Mid Tier: Mainstream / upper-mid
12-16 GB VRAM. The sweet spot for most users. Handles 13B models and SDXL comfortably.
- Best pick: RTX 4060 Ti 16 GB
- Alternative: RTX 4070 (12 GB), used RTX 3090 (24 GB)
- Can run: Llama 3.1 13B, Mixtral 8x7B (Q4), CodeLlama 34B (Q4), SDXL, Flux
High Tier: Enthusiast consumer
24 GB VRAM. Run almost any model at good quality. Future-proof for 2-3 years.
- Best pick: RTX 4090 (24 GB) or a used RTX 3090
- Alternative: RTX 4080 Super (16 GB)
- Can run: Llama 3.1 70B (Q4), all image gen models, Whisper Large, multi-model setups
Enthusiast Tier: Workstation / halo
48+ GB VRAM. For researchers and power users who need maximum model quality.
- Best pick: 2x RTX 3090 (48 GB total), RTX A6000 (48 GB)
- Alternative: Mac Studio M4 Ultra (128 GB unified)
- Can run: 70B+ models at high quantization, fine-tuning, multi-model serving
4. Upgrade Paths
If you already have a GPU and want to upgrade, here are the most common paths:
GTX 1060/1070/1080 (6-8 GB) → RTX 3060 12 GB or RTX 4060 Ti 16 GB
Doubles your VRAM and adds tensor core support. Biggest bang for buck upgrade.
RTX 3060/3070 (8-12 GB) → RTX 4070 Ti Super 16 GB or used RTX 3090 24 GB
Gets you into 24 GB territory for large model support.
RTX 3080/3090 (10-24 GB) → RTX 4090 or wait for RTX 5090
Already well-equipped. Consider adding a second GPU or waiting for next gen.
Pro tip: Before upgrading your GPU, check your power supply. High-end GPUs like the RTX 4090 need a 850W+ PSU with the right connectors. Also ensure your case has enough physical space for the card.
Not Sure Which GPU You Need?
Our tools can help you decide based on the models you want to run.