Personal local AI assistant
Users who want privacy and would rather skip cloud subscriptions
94 models with exact VRAM requirements at FP16, Q8, Q4, and Q2. Select any model to see which GPUs can run it and at what quality.
Llama 3.1 8B at Q4 is the best entry point; it runs on any GPU with 6 GB+ VRAM. Step up to Mistral 7B at higher precision or a 13B-class model if you have 12 GB+ VRAM.
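The VRAM figures behind these recommendations can be approximated with a simple rule of thumb: weight memory is the parameter count times the bytes per weight at a given quantization, plus a fixed overhead for the KV cache and runtime buffers. The sketch below is a back-of-the-envelope estimate under assumed overhead values, not the catalog's exact per-model numbers.

```python
# Rough VRAM estimate for running an LLM locally: weight memory plus
# a flat overhead for the KV cache and runtime buffers. The overhead
# value is an assumption; real usage varies with context length and
# inference engine.

BYTES_PER_PARAM = {
    "FP16": 2.0,   # 16-bit floats
    "Q8": 1.0,     # ~8 bits per weight
    "Q4": 0.5,     # ~4 bits per weight
    "Q2": 0.25,    # ~2 bits per weight
}

def estimate_vram_gb(params_billions: float, quant: str,
                     overhead_gb: float = 1.5) -> float:
    """Approximate VRAM (GB) needed to load and run a model."""
    weights_gb = params_billions * BYTES_PER_PARAM[quant]
    return round(weights_gb + overhead_gb, 1)

# An 8B model at Q4 fits in a 6 GB GPU with headroom:
print(estimate_vram_gb(8, "Q4"))    # ~5.5 GB
# The same model at FP16 needs a high-end card:
print(estimate_vram_gb(8, "FP16"))  # ~17.5 GB
```

This is why quantization is the main lever for entry-level hardware: dropping from FP16 to Q4 cuts weight memory by roughly 4x at a modest quality cost.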
View Llama 3.1 8B
Intent-first guidance: These routes tie your intent to a minimum VRAM target plus a recommended model and GPU before entering the full model catalog.
Users who want privacy and would rather skip cloud subscriptions
Journalists, researchers, healthcare professionals
Creators and digital artists
General-purpose LLMs for conversation and complex reasoning
Specialized models for writing, reviewing, and explaining code
Models that process images and text together
Diffusion models for generating and editing images locally
Speech models for transcription and translation