Intermediate · 10 min read

By the RunAIatHome editorial team. Buyer guidance shaped by VRAM-first local AI testing and current market fit.

How to Choose a GPU for AI

The definitive buyer's guide for AI hardware in 2025.

Our recommendation

For most people: the RTX 4060 Ti 16 GB. To squeeze the most VRAM: a used RTX 3090.

If you want a simple purchase that is compatible with almost the entire local AI stack, the RTX 4060 Ti 16 GB remains the cleanest balance point. If you prioritize capacity for larger models and accept higher power draw and the second-hand market, the RTX 3090 is still the best VRAM ladder.

  • Best pick: RTX 4060 Ti 16 GB for chat, coding, SDXL, and a friction-free experience.
  • Value pick: used RTX 3090 if your priority is unlocking 24 GB of VRAM.
  • Only if you already live on macOS: Apple Silicon with plenty of unified memory, not as a dedicated purchase for raw throughput.

1. VRAM Matters Most

When choosing a GPU for AI, forget about gaming benchmarks. The single most important specification is VRAM (Video Random Access Memory). Here is why:

  • AI models must be loaded entirely into VRAM to run at full speed. A model that does not fit in your VRAM either will not run or will fall back to slow CPU inference.
  • Model sizes are growing rapidly. The 7B models that were state-of-the-art in 2023 have been surpassed by 70B+ models in 2024-2025.
  • Quantization helps (reducing model precision from FP16 to INT4 cuts size by 4x), but you still need enough VRAM for the quantized model plus overhead.

VRAM Requirements by Model Size

Params    FP16      Q8       Q4
3B        6 GB      3 GB     2 GB
7B        14 GB     7 GB     4 GB
13B       26 GB     13 GB    7 GB
34B       68 GB     34 GB    18 GB
70B       140 GB    70 GB    36 GB

Q4 = 4-bit quantization (most common for local inference). Add 1-2 GB overhead for context window.
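The table above follows a simple rule of thumb: the weights take roughly parameters × bytes per parameter, plus 1-2 GB of overhead. A minimal sketch of that arithmetic (the exact 1.5 GB overhead figure is my assumption; the guide says 1-2 GB):

```python
# Rough VRAM estimate for a model at a given quantization, matching the
# table above.  Bytes per parameter: FP16 = 2.0, Q8 = 1.0, Q4 = 0.5.
BYTES_PER_PARAM = {"fp16": 2.0, "q8": 1.0, "q4": 0.5}

def estimate_vram_gb(params_billion: float, quant: str = "q4",
                     overhead_gb: float = 1.5) -> float:
    """Approximate VRAM requirement in GB: weights plus context overhead."""
    weights_gb = params_billion * BYTES_PER_PARAM[quant]
    return weights_gb + overhead_gb

def fits(params_billion: float, quant: str, gpu_vram_gb: float) -> bool:
    """True if the model should fit on a card with the given VRAM."""
    return estimate_vram_gb(params_billion, quant) <= gpu_vram_gb

# A 13B model at Q4 needs ~8 GB, so it fits a 12 GB card;
# a 70B model at Q4 (~36.5 GB) does not fit 24 GB.
```

Treat the output as a floor, not a guarantee: long context windows and multimodal models push the overhead well past 2 GB.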

Memory Bandwidth: The Second Key Metric

After VRAM capacity, memory bandwidth (measured in GB/s) is the second most important spec. It directly determines how fast tokens are generated during inference. Higher bandwidth means faster responses.

This is why the RTX 4090 (1008 GB/s) generates tokens significantly faster than the RTX 4060 Ti (288 GB/s), even when running the same model at the same quantization level.
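As a sanity check, token generation is memory-bound: producing each token reads the full set of weights once, so bandwidth divided by model size gives a rough ceiling on tokens per second. A back-of-the-envelope sketch (the 4.5 GB size for an 8B Q4 model is my estimate, not a figure from this guide; real throughput lands well below the ceiling, but the ratio between cards holds):

```python
# Decode-speed ceiling: each generated token streams every weight from
# VRAM once, so tok/s is bounded by bandwidth / model size.
def max_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    return bandwidth_gb_s / model_size_gb

MODEL_GB = 4.5  # assumed on-disk size of an 8B model at Q4 (my estimate)

rtx_4090 = max_tokens_per_sec(1008, MODEL_GB)    # ~224 tok/s ceiling
rtx_4060_ti = max_tokens_per_sec(288, MODEL_GB)  # ~64 tok/s ceiling
# The 4090's ceiling is exactly 3.5x higher -- which is why it generates
# tokens so much faster at the same model and quantization.
```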

2. NVIDIA vs AMD vs Apple Silicon

NVIDIA (Recommended)

Best Choice
  • Pros: Universal CUDA support, best software compatibility, tensor cores for AI acceleration, largest community and documentation.
  • Cons: More expensive, higher power consumption, premium pricing on high-VRAM cards.
  • Best for: Anyone who wants guaranteed compatibility with all AI software.

AMD

  • Pros: Better price-per-VRAM ratio, ROCm improving rapidly, good Linux support, competitive performance on supported software.
  • Cons: Limited software compatibility (ROCm only works on Linux), fewer optimized kernels, smaller AI community.
  • Best for: Linux users on a budget who are comfortable with manual setup.

Apple Silicon (M1/M2/M3/M4)

  • Pros: Unified memory (system RAM = GPU VRAM), excellent power efficiency, silent operation, Metal GPU acceleration well-supported by Ollama and llama.cpp.
  • Cons: Lower memory bandwidth than discrete GPUs, limited to macOS, no CUDA support, slower token generation per GB of memory.
  • Best for: Mac users who want a quiet, energy-efficient setup. M4 Pro/Max with 36-128 GB unified memory can run very large models.

NVIDIA GPU comparison for local AI — prices in euros

The table below compares the best NVIDIA GPUs for local AI with their approximate prices on the European market, available VRAM, real-world inference speed, and the best-fit user profile. Useful for an at-a-glance comparison before deciding what to buy.

GPU                  VRAM        Approx. price (€)   Speed (tok/s)   Use case
RTX 4090             24 GB       ~1,999 €            ~90             Production / large models
RTX 4080 Super       16 GB       ~899 €              ~65             Advanced development
RTX 4070 Ti Super    16 GB       ~699 €              ~55             Price/performance balance
RTX 4070 Super       12 GB       ~549 €              ~45             Gaming + local AI
RTX 4070             12 GB       ~449 €              ~40             Entry point to 12 GB
RTX 4060 Ti          8 / 16 GB   ~399 €              ~35             Mid budget
RTX 4060             8 GB        ~299 €              ~25             Getting started with local AI
RTX 3090 (used)      24 GB       ~799 €              ~70             Budget 24 GB alternative
RTX 3060             12 GB       ~269 €              ~30             Budget entry

Approximate euro prices for the European market (April 2026). Speeds measured with Llama 3.1 8B Q4_K_M in Ollama. Second-hand prices are from platforms such as Wallapop, eBay.es, and Back Market.
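If you want to rank the table by value, a few lines of Python turn it into euros per GB of VRAM and euros per tok/s. The prices and speeds are copied from the table above (orientative April 2026 figures, not live market data), and the subset of cards is arbitrary:

```python
# Value metrics from the comparison table: EUR per GB of VRAM and
# EUR per tok/s.  Figures copied from the guide's table; illustrative only.
CARDS = {
    # name: (vram_gb, price_eur, tok_s)
    "RTX 4090":          (24, 1999, 90),
    "RTX 4070 Ti Super": (16,  699, 55),
    "RTX 4060 Ti 16GB":  (16,  399, 35),
    "RTX 3090 (used)":   (24,  799, 70),
    "RTX 3060":          (12,  269, 30),
}

for name, (vram, price, tok_s) in CARDS.items():
    print(f"{name:<18} {price / vram:6.1f} EUR/GB  {price / tok_s:6.1f} EUR/(tok/s)")
```

The used RTX 3090 comes out as the cheapest path to 24 GB (~33 €/GB versus ~83 €/GB for the RTX 4090), which is exactly why it is the value pick above.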

3. Budget Recommendations by Tier

Entry Tier: Used-value / low-budget

6-12 GB VRAM

Good for small models (7B quantized), Whisper, and basic Stable Diffusion.

  • New: RTX 4060 (8 GB), RTX 3060 (12 GB, refurbished)
  • Used: RTX 3060 12 GB, RTX 2060 12 GB
  • Can run: Llama 3.1 8B (Q4), Mistral 7B, Phi-3 Mini, Whisper, SD 1.5


Mid Tier: Mainstream / upper-mid

12-16 GB VRAM

The sweet spot for most users. Handles 13B models and SDXL comfortably.

  • Best pick: RTX 4060 Ti 16 GB
  • Alternative: RTX 4070 (12 GB), used RTX 3090 (24 GB)
  • Can run: Llama 3.1 13B, Mixtral 8x7B (Q4), CodeLlama 34B (Q4), SDXL, Flux


High Tier: Enthusiast consumer

24 GB VRAM

Run almost any model at good quality. Future-proof for 2-3 years.

  • Best pick: RTX 4090 (24 GB), RTX 3090 used
  • Alternative: RTX 4080 Super (16 GB)
  • Can run: Llama 3.1 70B (Q4), all image gen models, Whisper Large, multi-model setups


Enthusiast Tier: Workstation / halo

48+ GB VRAM

For researchers and power users who need maximum model quality.

  • Best pick: 2x RTX 3090 (48 GB total), RTX A6000 (48 GB)
  • Alternative: Mac Studio M4 Ultra (128 GB unified)
  • Can run: 70B+ models at high quantization, fine-tuning, multi-model serving
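The four tiers above can be collapsed into a rough picker: estimate the Q4 VRAM need (0.5 bytes per parameter plus overhead, as in the earlier table) and map it to a tier. The thresholds and card shortlists are my simplification of the guide's tiers, with some headroom built in:

```python
# Sketch of a tier picker: map a target model size to one of the guide's
# four budget tiers.  Thresholds are assumptions with headroom, not
# hard limits from the guide.
def vram_needed_q4(params_billion: float, overhead_gb: float = 1.5) -> float:
    """Q4 weights are ~0.5 bytes per parameter, plus context overhead."""
    return params_billion * 0.5 + overhead_gb

def recommend_tier(params_billion: float) -> str:
    need = vram_needed_q4(params_billion)
    if need <= 6:
        return "Entry tier: RTX 4060 / RTX 3060 12 GB"
    if need <= 16:
        return "Mid tier: RTX 4060 Ti 16 GB"
    if need <= 24:
        return "High tier: RTX 4090 / used RTX 3090"
    return "Enthusiast tier: 2x RTX 3090 / RTX A6000"

# 7B -> Entry, 13B -> Mid, 34B -> High, 70B -> Enthusiast
```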

4. Upgrade Paths

If you already have a GPU and want to upgrade, here are the most common paths:

GTX 1060/1070/1080 (6-8 GB) → RTX 3060 12 GB or RTX 4060 Ti 16 GB

Doubles your VRAM and adds tensor-core support. The biggest bang-for-buck upgrade.

RTX 3060/3070 (8-12 GB) → RTX 4070 Ti Super 16 GB or used RTX 3090 24 GB

Gets you a solid 16 GB, or into 24 GB territory with the used RTX 3090, for large-model support.

RTX 3080/3090 (10-24 GB) → RTX 4090 or wait for the RTX 5090

Already well equipped. Consider adding a second GPU or waiting for the next generation.

Pro tip: Before upgrading your GPU, check your power supply. High-end GPUs like the RTX 4090 need a 850W+ PSU with the right connectors. Also ensure your case has enough physical space for the card.

Not Sure Which GPU You Need?

Our tools can help you decide based on the models you want to run.