feat: Add remote embedding providers for semantic search

Adds support for remote embedding providers (OpenAI, Ollama) for semantic search.

This change introduces a new `provider` option in the `semantic_search` configuration, allowing users to choose between the existing local Jina AI models and the new remote providers.

For vision embeddings, the remote providers use a two-step process:
1. A text description of the image is generated using the configured GenAI provider.
2. An embedding is created from that description using the configured remote embedding provider.

This requires a GenAI provider to be configured when using a remote provider for semantic search.
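The two-step flow can be sketched as follows; `describe_image` and `embed_text` are hypothetical stand-ins for the configured GenAI provider and remote embedding provider, not functions from this commit:

```python
def describe_image(image: bytes) -> str:
    # Step 1: in Frigate this call goes to the configured GenAI provider.
    return "a person walking a dog past a parked car"

def embed_text(text: str) -> list[float]:
    # Step 2: in Frigate this call goes to the remote embedding provider
    # (OpenAI or Ollama); here we fake a tiny deterministic vector.
    return [float(ord(c)) for c in text[:4]]

def embed_image(image: bytes) -> list[float]:
    """Two-step vision embedding: describe the image, then embed the text."""
    return embed_text(describe_image(image))
```

The consequence is that image search quality with a remote provider depends on the description quality of the GenAI model, not on a joint image/text embedding space as with CLIP.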

The configuration for remote providers has been updated to allow customizing the prompt used for the vision model.

Documentation for the new feature has been added to the Semantic Search documentation page.
user 2025-12-10 12:27:57 -05:00
parent 9cdc10008d
commit 39fc9e37e1
7 changed files with 358 additions and 51 deletions


@@ -5,13 +5,13 @@ title: Semantic Search
 Semantic Search in Frigate allows you to find tracked objects within your review items using either the image itself, a user-defined text description, or an automatically generated one. This feature works by creating _embeddings_ — numerical vector representations — for both the images and text descriptions of your tracked objects. By comparing these embeddings, Frigate assesses their similarities to deliver relevant search results.
-Frigate uses models from [Jina AI](https://huggingface.co/jinaai) to create and save embeddings to Frigate's database. All of this runs locally.
+Frigate can run models locally or be configured to use a remote service. All local processing runs on your own hardware.
 Semantic Search is accessed via the _Explore_ view in the Frigate UI.
 ## Minimum System Requirements
-Semantic Search works by running a large AI model locally on your system. Small or underpowered systems like a Raspberry Pi will not run Semantic Search reliably or at all.
+When running models locally, Semantic Search works by running a large AI model on your system. Small or underpowered systems like a Raspberry Pi will not run Semantic Search reliably or at all.
 A minimum of 8GB of RAM is required to use Semantic Search. A GPU is not strictly required but will provide a significant performance increase over CPU-only systems.
@@ -35,7 +35,11 @@ If you are enabling Semantic Search for the first time, be advised that Frigate
 :::
-### Jina AI CLIP (version 1)
+### Local Providers
+Frigate uses models from [Jina AI](https://huggingface.co/jinaai) to create and save embeddings to Frigate's database. All of this runs locally.
+#### Jina AI CLIP (version 1)
 The [V1 model from Jina](https://huggingface.co/jinaai/jina-clip-v1) has a vision model which is able to embed both images and text into the same vector space, which allows `image -> image` and `text -> image` similarity searches. Frigate uses this model on tracked objects to encode the thumbnail image and store it in the database. When searching for tracked objects via text in the search box, Frigate will perform a `text -> image` similarity search against this embedding. When clicking "Find Similar" in the tracked object detail pane, Frigate will perform an `image -> image` similarity search to retrieve the closest matching thumbnails.
@@ -46,14 +50,14 @@ Differently weighted versions of the Jina models are available and can be select
 ```yaml
 semantic_search:
   enabled: True
-  model: "jinav1"
-  model_size: small
+  local_model: "jinav1"
+  local_model_size: small
 ```
 - Configuring the `large` model employs the full Jina model and will automatically run on the GPU if applicable.
 - Configuring the `small` model employs a quantized version of the Jina model that uses less RAM and runs on CPU with a very negligible difference in embedding quality.
-### Jina AI CLIP (version 2)
+#### Jina AI CLIP (version 2)
 Frigate also supports the [V2 model from Jina](https://huggingface.co/jinaai/jina-clip-v2), which introduces multilingual support (89 languages). In contrast, the V1 model only supports English.
@@ -64,8 +68,8 @@ To use the V2 model, update the `model` parameter in your config:
 ```yaml
 semantic_search:
   enabled: True
-  model: "jinav2"
-  model_size: large
+  local_model: "jinav2"
+  local_model_size: large
 ```
 For most users, especially native English speakers, the V1 model remains the recommended choice.
@@ -76,6 +80,25 @@ Switching between V1 and V2 requires reindexing your embeddings. The embeddings
 :::
+### Remote Providers
+Frigate can be configured to use remote services for generating embeddings. This is done by setting the `provider` field to `openai` or `ollama`.
+For vision embeddings, remote providers use a two-step process:
+1. A text description of the image is generated using the configured GenAI provider.
+2. An embedding is created from that description using the configured remote embedding provider.
+This means that you must have a GenAI provider configured to use vision embeddings with a remote provider.
+```yaml
+semantic_search:
+  enabled: True
+  provider: openai
+  remote:
+    model: "text-embedding-3-small"
+    vision_model_prompt: "A detailed description of the image for semantic search."
+```
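An Ollama setup follows the same shape; a sketch, assuming the `url` field points at an Ollama host on its default port, and noting that the model name below is only an illustrative choice of an embedding-capable Ollama model, not one named by this commit:

```yaml
semantic_search:
  enabled: True
  provider: ollama
  remote:
    url: http://localhost:11434
    model: "nomic-embed-text"
```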
 ### GPU Acceleration
 The CLIP models are downloaded in ONNX format, and the `large` model can be accelerated using GPU hardware, when available. This depends on the Docker build that is used. You can also target a specific device in a multi-GPU installation.
@@ -83,7 +106,7 @@ The CLIP models are downloaded in ONNX format, and the `large` model can be acce
 ```yaml
 semantic_search:
   enabled: True
-  model_size: large
+  local_model_size: large
   # Optional, if using the 'large' model in a multi-GPU installation
   device: 0
 ```


@@ -114,6 +114,30 @@ class CustomClassificationConfig(FrigateBaseModel):
     state_config: CustomClassificationStateConfig | None = Field(default=None)
+class SemanticSearchProviderEnum(str, Enum):
+    local = "local"
+    openai = "openai"
+    ollama = "ollama"
+class RemoteSemanticSearchConfig(FrigateBaseModel):
+    """Config for remote semantic search providers."""
+    api_key: Optional[str] = Field(
+        default=None, title="API key for the remote embedding provider."
+    )
+    model: Optional[str] = Field(
+        default=None, title="The embedding model to use for semantic search."
+    )
+    url: Optional[str] = Field(
+        default=None, title="URL for the remote embedding provider."
+    )
+    vision_model_prompt: Optional[str] = Field(
+        default="A detailed description of the image for semantic search.",
+        title="Prompt for the vision model to describe the image for embedding. This uses the configured GenAI provider.",
+    )
 class ClassificationConfig(FrigateBaseModel):
     bird: BirdClassificationConfig = Field(
         default_factory=BirdClassificationConfig, title="Bird classification config."
@@ -124,22 +148,32 @@ class ClassificationConfig(FrigateBaseModel):
 class SemanticSearchConfig(FrigateBaseModel):
     """Config for semantic search."""
     enabled: bool = Field(default=False, title="Enable semantic search.")
     reindex: Optional[bool] = Field(
         default=False, title="Reindex all tracked objects on startup."
     )
-    model: Optional[SemanticSearchModelEnum] = Field(
-        default=SemanticSearchModelEnum.jinav1,
-        title="The CLIP model to use for semantic search.",
-    )
-    model_size: str = Field(
-        default="small", title="The size of the embeddings model used."
-    )
+    provider: SemanticSearchProviderEnum = Field(
+        default=SemanticSearchProviderEnum.local,
+        title="The semantic search provider to use.",
+    )
+    local_model: Optional[SemanticSearchModelEnum] = Field(
+        default=SemanticSearchModelEnum.jinav1,
+        title="The local CLIP model to use for semantic search.",
+    )
+    local_model_size: str = Field(
+        default="small", title="The size of the local embeddings model used."
+    )
     device: Optional[str] = Field(
         default=None,
         title="The device key to use for semantic search.",
         description="This is an override, to target a specific device. See https://onnxruntime.ai/docs/execution-providers/ for more information",
     )
+    remote: RemoteSemanticSearchConfig = Field(
+        default_factory=RemoteSemanticSearchConfig,
+        title="Remote semantic search provider config.",
+    )
 class TriggerConfig(FrigateBaseModel):


@@ -16,8 +16,7 @@ from frigate.comms.embeddings_updater import (
     EmbeddingsRequestEnum,
 )
 from frigate.comms.inter_process import InterProcessRequestor
-from frigate.config import FrigateConfig
-from frigate.config.classification import SemanticSearchModelEnum
+from frigate.config import FrigateConfig, SemanticSearchModelEnum, SemanticSearchProviderEnum
 from frigate.const import (
     CONFIG_DIR,
     TRIGGER_DIR,
@@ -26,6 +25,7 @@ from frigate.const import (
 )
 from frigate.data_processing.types import DataProcessorMetrics
 from frigate.db.sqlitevecq import SqliteVecQueueDatabase
+from frigate.embeddings.remote import get_embedding_client
 from frigate.models import Event, Trigger
 from frigate.types import ModelStatusTypesEnum
 from frigate.util.builtin import EventsPerSecond, InferenceSpeed, serialize
@@ -96,43 +96,48 @@ class Embeddings:
         # Create tables if they don't exist
         self.db.create_embeddings_tables()
-        models = self.get_model_definitions()
+        if self.config.semantic_search.provider == SemanticSearchProviderEnum.local:
+            models = self.get_model_definitions()
             for model in models:
                 self.requestor.send_data(
                     UPDATE_MODEL_STATE,
                     {
                         "model": model,
                         "state": ModelStatusTypesEnum.not_downloaded,
                     },
                 )
-        if self.config.semantic_search.model == SemanticSearchModelEnum.jinav2:
+            if self.config.semantic_search.local_model == SemanticSearchModelEnum.jinav2:
                 # Single JinaV2Embedding instance for both text and vision
                 self.embedding = JinaV2Embedding(
-                    model_size=self.config.semantic_search.model_size,
+                    model_size=self.config.semantic_search.local_model_size,
                     requestor=self.requestor,
                     device=config.semantic_search.device
-                    or ("GPU" if config.semantic_search.model_size == "large" else "CPU"),
+                    or ("GPU" if config.semantic_search.local_model_size == "large" else "CPU"),
                 )
                 self.text_embedding = lambda input_data: self.embedding(
                     input_data, embedding_type="text"
                 )
                 self.vision_embedding = lambda input_data: self.embedding(
                     input_data, embedding_type="vision"
                 )
             else:  # Default to jinav1
                 self.text_embedding = JinaV1TextEmbedding(
-                    model_size=config.semantic_search.model_size,
+                    model_size=config.semantic_search.local_model_size,
                     requestor=self.requestor,
                     device="CPU",
                 )
                 self.vision_embedding = JinaV1ImageEmbedding(
-                    model_size=config.semantic_search.model_size,
+                    model_size=config.semantic_search.local_model_size,
                     requestor=self.requestor,
                     device=config.semantic_search.device
-                    or ("GPU" if config.semantic_search.model_size == "large" else "CPU"),
+                    or ("GPU" if config.semantic_search.local_model_size == "large" else "CPU"),
                 )
+        else:
+            self.remote_embedding_client = get_embedding_client(self.config)
+            self.text_embedding = self.remote_embedding_client.embed_texts
+            self.vision_embedding = self.remote_embedding_client.embed_images
     def update_stats(self) -> None:
         self.metrics.image_embeddings_eps.value = self.image_eps.eps()
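Both branches of the `__init__` above end by exposing the same two callables, `text_embedding` and `vision_embedding`, so the rest of the pipeline never needs to know which backend is active. A minimal sketch of that strategy-style dispatch (class names here are toy stand-ins, not Frigate classes):

```python
class LocalModel:
    """Toy stand-in for a single model callable (like JinaV2Embedding)."""
    def __call__(self, data, embedding_type: str):
        return [1.0] if embedding_type == "text" else [2.0]

class RemoteClient:
    """Toy stand-in for a remote embedding client."""
    def embed_texts(self, data):
        return [3.0]

def build_text_embedding(use_local: bool):
    # Either branch hands back a plain callable with the same signature,
    # mirroring how Embeddings.__init__ assigns self.text_embedding.
    if use_local:
        model = LocalModel()
        return lambda data: model(data, embedding_type="text")
    return RemoteClient().embed_texts
```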


@@ -0,0 +1,67 @@
+"""Remote embedding clients for Frigate."""
+import importlib
+import logging
+import os
+from typing import Any, Optional
+from frigate.config import FrigateConfig, SemanticSearchConfig, SemanticSearchProviderEnum
+from frigate.genai import get_genai_client
+logger = logging.getLogger(__name__)
+PROVIDERS = {}
+def register_embedding_provider(key: SemanticSearchProviderEnum):
+    """Register a remote embedding provider."""
+    def decorator(cls):
+        PROVIDERS[key] = cls
+        return cls
+    return decorator
+class RemoteEmbeddingClient:
+    """Remote embedding client for Frigate."""
+    def __init__(self, config: FrigateConfig, timeout: int = 120) -> None:
+        self.config = config
+        self.timeout = timeout
+        self.provider = self._init_provider()
+        self.genai_client = get_genai_client(self.config)
+    def _init_provider(self):
+        """Initialize the client."""
+        return None
+    def embed_texts(self, texts: list[str]) -> Optional[list[list[float]]]:
+        """Get embeddings for a list of texts."""
+        return None
+    def embed_images(self, images: list[bytes]) -> Optional[list[list[float]]]:
+        """Get embeddings for a list of images."""
+        return None
+def get_embedding_client(config: FrigateConfig) -> Optional[RemoteEmbeddingClient]:
+    """Get the embedding client."""
+    if not config.semantic_search.provider or config.semantic_search.provider == SemanticSearchProviderEnum.local:
+        return None
+    load_providers()
+    provider = PROVIDERS.get(config.semantic_search.provider)
+    if provider:
+        return provider(config)
+    return None
+def load_providers():
+    package_dir = os.path.dirname(__file__)
+    for filename in os.listdir(package_dir):
+        if filename.endswith(".py") and filename != "__init__.py":
+            module_name = f"frigate.embeddings.remote.{filename[:-3]}"
+            importlib.import_module(module_name)
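The registry-and-decorator pattern in this file can be reduced to a self-contained sketch with no Frigate imports (names simplified; this is an illustration of the mechanism, not code from the commit):

```python
from enum import Enum

class Provider(str, Enum):
    local = "local"
    openai = "openai"
    ollama = "ollama"

# key -> client class, filled in by the decorator at class-definition time
PROVIDERS: dict[Provider, type] = {}

def register(key: Provider):
    """Return a class decorator that records cls under the given key."""
    def decorator(cls):
        PROVIDERS[key] = cls
        return cls
    return decorator

@register(Provider.ollama)
class OllamaClient:
    """Toy stand-in for OllamaEmbeddingClient."""

def get_client(provider: Provider):
    # Mirrors get_embedding_client(): "local" means no remote client,
    # and an unregistered provider yields None rather than an error.
    if provider == Provider.local:
        return None
    cls = PROVIDERS.get(provider)
    return cls() if cls else None
```

In the real module, `load_providers()` imports every sibling `.py` file so that each `@register_embedding_provider(...)` decorator runs and populates `PROVIDERS` before the lookup.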


@@ -0,0 +1,91 @@
+"""Ollama embedding client for Frigate."""
+import logging
+from typing import Optional
+from httpx import TimeoutException
+from ollama import Client as ApiClient
+from ollama import ResponseError
+from frigate.config import SemanticSearchProviderEnum
+from frigate.embeddings.remote import (
+    RemoteEmbeddingClient,
+    register_embedding_provider,
+)
+logger = logging.getLogger(__name__)
+@register_embedding_provider(SemanticSearchProviderEnum.ollama)
+class OllamaEmbeddingClient(RemoteEmbeddingClient):
+    """Remote embedding client for Frigate using Ollama."""
+    provider: ApiClient
+    def _init_provider(self):
+        """Initialize the client."""
+        try:
+            client = ApiClient(
+                host=self.config.semantic_search.remote.url, timeout=self.timeout
+            )
+            # ensure the model is available locally
+            response = client.show(self.config.semantic_search.remote.model)
+            if response.get("error"):
+                logger.error(
+                    "Ollama error: %s",
+                    response["error"],
+                )
+                return None
+            return client
+        except Exception as e:
+            logger.warning("Error initializing Ollama: %s", str(e))
+            return None
+    def embed_texts(self, texts: list[str]) -> Optional[list[list[float]]]:
+        """Get embeddings for a list of texts."""
+        if self.provider is None:
+            logger.warning(
+                "Ollama provider has not been initialized, embeddings will not be generated. Check your Ollama configuration."
+            )
+            return None
+        try:
+            embeddings = []
+            for text in texts:
+                result = self.provider.embeddings(
+                    model=self.config.semantic_search.remote.model,
+                    prompt=text,
+                )
+                embeddings.append(result["embedding"])
+            return embeddings
+        except (TimeoutException, ResponseError, ConnectionError) as e:
+            logger.warning("Ollama returned an error: %s", str(e))
+            return None
+    def embed_images(self, images: list[bytes]) -> Optional[list[list[float]]]:
+        """Get embeddings for a list of images.
+        This uses a two-step process:
+        1. Generate a text description of the image using the configured GenAI provider.
+        2. Create an embedding from the description using the text embedding model.
+        """
+        if not self.genai_client:
+            logger.warning(
+                "A GenAI provider is not configured. Cannot generate image descriptions."
+            )
+            return None
+        descriptions = []
+        for image in images:
+            description = self.genai_client.generate_image_description(
+                prompt=self.config.semantic_search.remote.vision_model_prompt,
+                images=[image],
+            )
+            if description:
+                descriptions.append(description)
+            else:
+                descriptions.append("")
+        if not descriptions:
+            return None
+        return self.embed_texts(descriptions)


@@ -0,0 +1,78 @@
+"""OpenAI embedding client for Frigate."""
+import base64
+import logging
+from typing import Optional
+from httpx import TimeoutException
+from openai import OpenAI
+from frigate.config import SemanticSearchProviderEnum
+from frigate.embeddings.remote import (
+    RemoteEmbeddingClient,
+    register_embedding_provider,
+)
+logger = logging.getLogger(__name__)
+@register_embedding_provider(SemanticSearchProviderEnum.openai)
+class OpenAIEmbeddingClient(RemoteEmbeddingClient):
+    """Remote embedding client for Frigate using OpenAI."""
+    provider: OpenAI
+    def _init_provider(self):
+        """Initialize the client."""
+        return OpenAI(
+            api_key=self.config.semantic_search.remote.api_key,
+            base_url=self.config.semantic_search.remote.url,
+        )
+    def embed_texts(self, texts: list[str]) -> Optional[list[list[float]]]:
+        """Get embeddings for a list of texts."""
+        try:
+            result = self.provider.embeddings.create(
+                model=self.config.semantic_search.remote.model,
+                input=texts,
+                timeout=self.timeout,
+            )
+            if (
+                result is not None
+                and hasattr(result, "data")
+                and len(result.data) > 0
+            ):
+                return [embedding.embedding for embedding in result.data]
+            return None
+        except (TimeoutException, Exception) as e:
+            logger.warning("OpenAI returned an error: %s", str(e))
+            return None
+    def embed_images(self, images: list[bytes]) -> Optional[list[list[float]]]:
+        """Get embeddings for a list of images.
+        This uses a two-step process:
+        1. Generate a text description of the image using the configured GenAI provider.
+        2. Create an embedding from the description using the text embedding model.
+        """
+        if not self.genai_client:
+            logger.warning(
+                "A GenAI provider is not configured. Cannot generate image descriptions."
+            )
+            return None
+        descriptions = []
+        for image in images:
+            description = self.genai_client.generate_image_description(
+                prompt=self.config.semantic_search.remote.vision_model_prompt,
+                images=[image],
+            )
+            if description:
+                descriptions.append(description)
+            else:
+                descriptions.append("")
+        if not descriptions:
+            return None
+        return self.embed_texts(descriptions)


@@ -291,6 +291,15 @@ Rules for the report:
         logger.debug(f"Sending images to genai provider with prompt: {prompt}")
         return self._send(prompt, thumbnails)
+    def generate_image_description(
+        self,
+        prompt: str,
+        images: list[bytes],
+    ) -> Optional[str]:
+        """Generate a description for an image."""
+        logger.debug(f"Sending images to genai provider with prompt: {prompt}")
+        return self._send(prompt, images)
     def _init_provider(self):
         """Initialize the client."""
         return None