By the RunAIatHome editorial team. Practical local AI setup notes based on current home-lab workflows.
How to Run DeepSeek Locally: Complete Guide (2025)
Step-by-step instructions to run DeepSeek R1 on your own hardware — with exact VRAM numbers for every model variant.
1. What Is DeepSeek?
DeepSeek R1 is a reasoning-focused large language model released by DeepSeek AI in early 2025. It uses chain-of-thought reasoning — showing its work before giving a final answer — which makes it particularly strong for coding, math, and logic tasks.
Running it locally gives you full privacy (no data sent to external servers), zero API costs, and offline access. Your documents and code stay on your machine.
One important clarification upfront: the full DeepSeek R1 model (671B parameters) is designed for data centers and requires hundreds of gigabytes of VRAM. For home use, the distill versions — 8B, 14B, and 32B — are the practical choice. They retain most of R1's reasoning capability at a fraction of the hardware requirements.
2. Which DeepSeek Model Can You Run?
VRAM is the limiting factor. Here are the exact requirements for each DeepSeek variant at Q4 quantization — the standard balance of quality and size:
| Model | Params | Q4 VRAM | Min GPU |
|---|---|---|---|
| DeepSeek R1 Distill 8B | 8B | ~4.8 GB | RTX 3060, GTX 1080 Ti |
| DeepSeek R1 Distill 14B | 14B | ~8.4 GB | RTX 3060 12GB, RTX 4070 |
| DeepSeek R1 Distill 32B | 32B | ~19.2 GB | RTX 4090, RTX 3090 |
| DeepSeek R1 Full | 671B MoE | ~403 GB | Multi-GPU data center |
Correction for common guides
Most guides say DeepSeek R1 needs "200 GB at Q4." That's wrong — it requires ~403 GB at Q4 quantization. The 200 GB figure refers to Q2, which degrades output quality significantly and is not suitable for serious tasks. For home use, stick to the Distill models.
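The Q4 figures above follow a simple rule of thumb: roughly 0.5 GB per billion parameters for the weights, plus about 20% overhead for the KV cache and runtime buffers. The 20% factor is our approximation, not an exact measurement — real usage varies with context length. A quick sanity check:

```shell
# Estimate Q4 VRAM: params (billions) x 0.5 GB, plus ~20% runtime overhead.
# The 20% overhead factor is a rule-of-thumb assumption, not a measured value.
for p in 8 14 32 671; do
  awk -v p="$p" 'BEGIN { printf "%sB -> ~%.1f GB at Q4\n", p, p * 0.5 * 1.2 }'
done
```

This reproduces the table above: ~4.8 GB (8B), ~8.4 GB (14B), ~19.2 GB (32B), and ~403 GB (671B).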
Not sure what your GPU can handle? Use our VRAM Calculator to check your exact GPU →
3. Requirements
GPU (VRAM)
8 GB VRAM: Runs Distill 8B at Q4 (4.8 GB). Minimum viable setup. RTX 3070, RTX 4060.
12 GB VRAM: Runs Distill 14B at Q4 (8.4 GB). Sweet spot for quality-to-VRAM ratio. RTX 3060 12GB, RTX 4070.
24 GB VRAM: Runs Distill 32B at Q4 (~19 GB). Best local reasoning quality. RTX 3090, RTX 4090.
System RAM and Storage
| Model | Min RAM | Disk Space |
|---|---|---|
| Distill 8B | 16 GB | ~5 GB |
| Distill 14B | 16 GB | ~10 GB |
| Distill 32B | 32 GB | ~20 GB |
4. Step-by-Step: Run DeepSeek with Ollama
Step 1: Check Your VRAM

Before downloading anything, verify you have enough VRAM for your target model. On Windows, open Task Manager → Performance → GPU. On Linux with an NVIDIA GPU:

```
nvidia-smi
```

Use the VRAM Calculator to confirm which DeepSeek model your GPU can handle.
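If you want to gate the download on actual free memory, here is a small sketch. The 5000 MiB threshold is our assumption, sized for the 8B model's ~4.8 GB Q4 footprint:

```shell
# Read free VRAM in MiB from the first GPU; fall back to 0 if nvidia-smi
# is unavailable so the script still reports something useful.
free_mib=$(nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits 2>/dev/null | head -n1)
free_mib=${free_mib:-0}
# ~4.8 GB for deepseek-r1:8b at Q4, rounded up to 5000 MiB (our assumption).
if [ "$free_mib" -ge 5000 ]; then
  echo "OK: ${free_mib} MiB free - deepseek-r1:8b should fit in VRAM"
else
  echo "Only ${free_mib} MiB free - expect CPU spillover or pick a smaller model"
fi
```

Swap the threshold for ~8600 (14B) or ~19700 (32B) if you are targeting a larger distill.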
Step 2: Install Ollama

Download from ollama.com/download for macOS or Windows. On Linux, run:

```
curl -fsSL https://ollama.com/install.sh | sh
```

See the Complete Ollama Guide for detailed installation steps per OS.
Step 3: Pull DeepSeek

Download the model that fits your VRAM.

8B (recommended for most setups):

```
ollama pull deepseek-r1:8b
```

14B (12 GB+ VRAM):

```
ollama pull deepseek-r1:14b
```

32B (24 GB+ VRAM):

```
ollama pull deepseek-r1:32b
```
Step 4: Run

Start an interactive chat session:

```
ollama run deepseek-r1:8b
```

The `<think>` block is normal. DeepSeek R1 shows its chain-of-thought reasoning before giving the final answer. This can take 30–60 seconds on complex questions. It is not an error or a hang.
Step 5: Use via API (optional, for developers)

Ollama exposes a REST API on port 11434. Query it from any script or app:

```
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:8b",
  "prompt": "Explain recursion with a simple example",
  "stream": false
}'
```
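With "stream": false, the reply arrives as a single JSON object whose answer sits in the "response" field (the real reply also carries timing and context fields). One dependency-free way to pull it out, shown here on an abridged sample reply rather than a live server:

```shell
# Abridged sample of a non-streaming /api/generate reply (a real reply has
# additional fields such as created_at, context, and total_duration).
sample='{"model":"deepseek-r1:8b","response":"Recursion is when a function calls itself.","done":true}'
# Extract the "response" field using only python3's standard library.
echo "$sample" | python3 -c 'import json,sys; print(json.load(sys.stdin)["response"])'
```

Pipe the output of the curl command above through the same one-liner to get just the model's answer.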
5. Common Issues
"Model is slow"
Check your VRAM usage with nvidia-smi. If the model doesn't fit entirely in VRAM, it spills to system RAM (CPU inference), which is 10–20x slower. Either switch to a smaller model or upgrade to a GPU with more VRAM.
"Out of memory" error
Your GPU doesn't have enough VRAM for the selected model. Switch to a smaller variant: if 14B fails, try 8B. If 8B still fails, you may need to close other applications consuming VRAM (browsers, games, other models).
"The model keeps thinking forever"
This is expected behavior for reasoning models. DeepSeek R1's chain-of-thought can run 30–60 seconds on complex problems. If it runs for several minutes without output, it may be stuck — try a shorter or more specific prompt.
6. DeepSeek R1 vs Distill: What's the Difference?
The full DeepSeek R1 (671B) uses a Mixture of Experts (MoE) architecture — it activates only a subset of parameters per token, which is why it needs less compute per inference than a dense 671B model, but still requires ~403 GB VRAM to load. That's multi-GPU server territory.
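The MoE trade-off can be put in numbers. DeepSeek-V3/R1's published figures are 671B total parameters with roughly 37B activated per token; using a rule-of-thumb 0.5 GB per billion parameters at Q4 plus ~20% overhead (our approximation):

```shell
# MoE in numbers: every parameter must be loaded, but only a small slice
# is active per token. 671B total / ~37B active are DeepSeek's published
# figures; the 0.6 GB-per-billion Q4 factor is a rule-of-thumb assumption.
awk 'BEGIN {
  total = 671; active = 37
  printf "Loaded for inference: %dB params (~%.0f GB at Q4)\n", total, total * 0.6
  printf "Active per token:     %dB params (~%.1f%% of the model)\n", active, active / total * 100
}'
```

That is why the full model is cheap on compute per token yet still out of reach for home VRAM budgets.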
The Distill models (8B, 14B, 32B) are dense models fine-tuned from R1's outputs. They are not MoE — they are smaller, fully-dense networks trained to replicate R1's reasoning behavior using knowledge distillation. The result: they load in a fraction of the VRAM and run fast on consumer GPUs.
The distill models retain 90%+ of R1's reasoning capability for everyday tasks — coding, math, analysis — in a fraction of the VRAM. For home use, they are the right choice, not a compromise.
Recommended GPUs for Running DeepSeek Locally

As a shopping rule of thumb: Distill 8B fits on 8 GB cards, 14B on 12 GB cards and up, and 32B on 24 GB cards (the Q4 weights themselves need ~4.8, ~8.4, and ~19.2 GB respectively, leaving headroom for context).

Prices and availability may change. Affiliate links.
| Tier | GPU | VRAM |
|---|---|---|
| Entry (8–12 GB) | RTX 4060 | 8 GB |
| Entry (8–12 GB) | RTX 3060 | 12 GB |
| Mid (12–16 GB) | RTX 4060 Ti 16GB | 16 GB |
| Mid (12–16 GB) | RTX 4070 | 12 GB |
| High (24 GB) | RTX 4090 | 24 GB |
| High (24 GB) | RTX 3090 | 24 GB |

Check Which DeepSeek Model Fits Your GPU
Enter your GPU model and get exact VRAM headroom for each DeepSeek variant.
VRAM Calculator →