Run AI
Step by Step
In-depth technical guides for running AI on your own hardware. From zero to chatting with Llama 3 in under 30 minutes: no accounts, no cloud, no subscriptions.
RunAIatHome Guides: analysis of popular local AI tools (2026)
Getting Started with Local AI
Everything you need to know to run your first AI model at home. From hardware basics to software setup, this guide covers the complete journey from zero to running LLMs locally.
Complete Ollama Guide
Master Ollama, the easiest way to run LLMs locally. Step-by-step installation, model management, advanced configuration, and tips for getting the best performance.
How to Choose a GPU for AI
VRAM, bandwidth, tensor cores — what actually matters when buying a GPU for AI? Budget recommendations by tier, NVIDIA vs AMD vs Apple Silicon comparison, and upgrade paths.
How to Run DeepSeek Locally
Run DeepSeek R1 on your PC using Ollama. Exact VRAM requirements for the 8B, 14B, and 32B distill models, step-by-step setup, and a correction to the common "200 GB" myth about the full 671B model.
GPU Benchmarks for Local AI (2025)
Real benchmarks for RTX 4090, RTX 3090, RX 7900 XTX, Apple M4 Max, RTX A6000, and more. VRAM, tokens/sec for Llama 3 70B, Stable Diffusion speed, power consumption, and pricing.
How to Run Llama 3 Locally: Complete Guide
From zero to chatting with Llama 3 on your own hardware. Hardware requirements, Ollama setup, quantization explained (Q4_K_M, Q8, FP16), performance tuning, benchmarks by GPU, and troubleshooting.
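The quantization trade-off the guide explains comes down to simple arithmetic: weight footprint is roughly parameter count times bits per weight. A minimal sketch of that estimate (the bits-per-weight figures for Q4_K_M and Q8_0 are approximate averages, and the fixed overhead allowance for KV cache and runtime buffers is an assumption, not a measured value):

```python
# Rough VRAM estimate: parameters x bits-per-weight, plus a fixed
# overhead guess for KV cache and runtime buffers.
BITS_PER_WEIGHT = {
    "Q4_K_M": 4.5,   # approximate average across tensor types
    "Q8_0": 8.5,
    "FP16": 16.0,
}

def estimated_vram_gb(params_billions: float, quant: str,
                      overhead_gb: float = 1.5) -> float:
    """Approximate VRAM needed to load a model at a given quantization."""
    weight_gb = params_billions * 1e9 * BITS_PER_WEIGHT[quant] / 8 / 1e9
    return round(weight_gb + overhead_gb, 1)

# Llama 3 8B at the three levels covered in the guide
for quant in ("Q4_K_M", "Q8_0", "FP16"):
    print(quant, estimated_vram_gb(8, quant))
```

The same arithmetic explains why the 70B model is out of reach for single consumer GPUs: at Q4_K_M it lands around 41 GB, which is why it shows up only in the benchmark guide's multi-GPU and Apple Silicon unified-memory setups.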
Guides by level and required hardware
| Guide | Level | Read time | Required hardware |
|---|---|---|---|
| Getting Started with Local AI | Beginner | 12 min read | 8 GB VRAM GPU + 16 GB RAM |
| Complete Ollama Guide | Beginner | 15 min read | Any modern GPU or CPU |
| How to Choose a GPU for AI | Intermediate | 10 min read | No hardware required (buying guide) |
| How to Run DeepSeek Locally | Intermediate | 18 min read | 8 GB VRAM (8B), 10 GB (14B), 20 GB (32B) |
| GPU Benchmarks for Local AI (2025) | Intermediate | 22 min read | Reference guide — any GPU |
| How to Run Llama 3 Locally: Complete Guide | Beginner | 20 min read | 8 GB VRAM GPU + 16 GB RAM |
Affiliate disclosure: links below are Amazon affiliate links. If you purchase through them, we may earn a small commission at no extra cost to you. Prices and availability may change.
Compatible GPUs for these guides
The hardware requirements above map to three GPU tiers. Pick the tier that matches your target guide.
RTX 4060
Runs the Getting Started and Ollama guides comfortably — Llama 3 8B at full speed with headroom for quantized 13B models.
RTX 4060 Ti 16GB
Covers the DeepSeek 8B and 14B variants and the Llama 3 guides. 16 GB falls short of the 20 GB the 32B tier calls for, but quantized 32B models can still run with partial CPU offload at reduced speed.
RTX 4090
Unlocks every guide on this page, including the quantized DeepSeek 32B distill (~20 GB VRAM) and the GPU benchmark guide's top-tier setups.
Local AI software comparison: Ollama vs LM Studio vs llama.cpp
The three main options for running AI models locally have very different profiles. This table summarises the key differences to help you pick the right tool for your use case.
| Criterion | Ollama | LM Studio | llama.cpp |
|---|---|---|---|
| Ease of use | ⭐⭐⭐⭐⭐ One command | ⭐⭐⭐⭐ Friendly GUI | ⭐⭐ Power-user CLI |
| REST API | Yes (OpenAI-compatible) | Yes (OpenAI-compatible) | Yes (server mode) |
| Available models | Curated library | HuggingFace GGUF | Any GGUF file |
| Performance | High (tuned) | High | Maximum (native) |
| OS support | Win, Mac, Linux | Win, Mac, Linux | Win, Mac, Linux |
| Best for | Development, production | Exploring models | Raw performance |
Verdict: which software should you use for local AI?
Ollama is the best choice for most users. Install it with a single command, pull models in seconds, and expose an OpenAI-compatible API that works with any client. It's fast, stable, and actively maintained.
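Because the API is OpenAI-compatible, any HTTP client works against it. A minimal stdlib sketch, assuming Ollama's default port 11434 and an already-pulled `llama3` model (the model name is just an example tag):

```python
import json
import urllib.request

def build_chat_request(prompt: str, model: str = "llama3",
                       base_url: str = "http://localhost:11434") -> urllib.request.Request:
    """Build a chat-completions request for Ollama's OpenAI-compatible API."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )

# With an Ollama server running locally:
# with urllib.request.urlopen(build_chat_request("Why is the sky blue?")) as resp:
#     reply = json.load(resp)["choices"][0]["message"]["content"]
```

The same request shape works unchanged against LM Studio's local server, since both expose the standard chat-completions format.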
LM Studio is ideal if you prefer a graphical interface or want to browse HuggingFace models without touching the terminal. It's the most accessible option for non-technical users.
llama.cpp is for power users who need maximum performance or full control over inference parameters. Requires manual compilation but delivers the best raw throughput.
RunAIatHome recommendation: start with Ollama + Open WebUI. If you need to browse models, add LM Studio. If raw performance is the priority, try llama.cpp with the same GGUF files.