From 212803e057300c197e810d78fba01408817c0e77 Mon Sep 17 00:00:00 2001
From: Nicolas Mowen
Date: Fri, 25 Oct 2024 07:00:03 -0600
Subject: [PATCH] Make note of genai on CPU

---
 docs/docs/configuration/genai.md | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/docs/docs/configuration/genai.md b/docs/docs/configuration/genai.md
index cdaf0adbe..8e2e4fbc2 100644
--- a/docs/docs/configuration/genai.md
+++ b/docs/docs/configuration/genai.md
@@ -29,7 +29,13 @@ cameras:
 
 ## Ollama
 
-[Ollama](https://ollama.com/) allows you to self-host large language models and keep everything running locally. It provides a nice API over [llama.cpp](https://github.com/ggerganov/llama.cpp). It is highly recommended to host this server on a machine with an Nvidia graphics card, or on a Apple silicon Mac for best performance. CPU inference is not recommended.
+:::warning
+
+Using Ollama on CPU is not recommended; high inference times make using generative AI impractical.
+
+:::
+
+[Ollama](https://ollama.com/) allows you to self-host large language models and keep everything running locally. It provides a nice API over [llama.cpp](https://github.com/ggerganov/llama.cpp). It is highly recommended to host this server on a machine with an Nvidia graphics card, or on an Apple silicon Mac for best performance.
 
 Most of the 7b parameter 4-bit vision models will fit inside 8GB of VRAM. There is also a [docker container](https://hub.docker.com/r/ollama/ollama) available.
 
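
For context on the section this patch touches: pointing Frigate at a GPU-backed Ollama server comes down to a few lines of config. Below is a minimal sketch, assuming the `genai` options (`provider`, `base_url`, `model`) documented elsewhere in this same file; the localhost URL and the `llava` model name are illustrative placeholders for wherever your Ollama server runs and whichever vision model you have pulled:

```yaml
# Sketch only: the values below are placeholders, not a definitive setup.
genai:
  enabled: True
  provider: ollama
  # Point this at your Ollama server; 11434 is Ollama's default port.
  base_url: http://localhost:11434
  # A 7b-parameter 4-bit vision model such as llava fits in roughly 8GB of VRAM.
  model: llava
```

With something like this in place, Frigate sends tracked-object thumbnails to the Ollama server for description, which is why the GPU-versus-CPU hosting choice the warning calls out directly determines whether the feature is usable.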