Make note of genai on CPU

Nicolas Mowen 2024-10-25 07:00:03 -06:00 committed by GitHub
parent 4631e7b3b5
commit 212803e057


@@ -29,7 +29,13 @@ cameras:
## Ollama
:::warning
Using Ollama on CPU is not recommended; high inference times make using generative AI impractical.
:::

[Ollama](https://ollama.com/) allows you to self-host large language models and keep everything running locally. It provides a nice API over [llama.cpp](https://github.com/ggerganov/llama.cpp). It is highly recommended to host this server on a machine with an Nvidia graphics card, or on an Apple silicon Mac for best performance.
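For context, a minimal sketch of what pointing Frigate's `genai` config at a self-hosted Ollama server can look like; the `base_url`, port, and model name (`llava`) are illustrative assumptions, so adjust them to your deployment:

```yaml
# Sketch of a Frigate genai config backed by Ollama (values are assumptions).
genai:
  enabled: True
  provider: ollama
  base_url: http://localhost:11434 # Ollama's default API port; point this at your server
  model: llava # any vision-capable model you have pulled into Ollama
```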
Most of the 7b parameter 4-bit vision models will fit inside 8GB of VRAM. There is also a [docker container](https://hub.docker.com/r/ollama/ollama) available.
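If you go the container route, a compose file along these lines is one way to run it; the service layout, volume name, and port mapping below are assumptions based on the image's defaults rather than anything Frigate-specific:

```yaml
# Sketch of running the Ollama container with docker compose (layout assumed).
services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    ports:
      - "11434:11434" # expose the Ollama API so Frigate can reach it
    volumes:
      - ollama:/root/.ollama # persist pulled models across restarts
    # For Nvidia acceleration, add a GPU device reservation here
    # (or pass --gpus=all with docker run); CPU-only inference will be slow.

volumes:
  ollama:
```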