3 Commits

Author SHA1 Message Date
lucaszhu-hue
d534e92347
Merge a2b92caab0 into d982b3a782 2026-06-21 05:03:00 -10:00
Daniel
d982b3a782
perf(util): use monotonic clock and bounded deque in EventsPerSecond (#23520)
* perf(util): use monotonic clock and bounded deque in EventsPerSecond

EventsPerSecond is updated on every captured frame, every detection and
every processed frame across all cameras and detectors. The previous
implementation derived timestamps from datetime.now().timestamp() (wall
clock), so an NTP or manual clock adjustment could skew the rolling-window
expiry; it also stored timestamps in a list and expired them with
del self._timestamps[0] (O(n) per removal) plus a periodic slice-copy to
cap growth.

Switch to time.monotonic() for the interval math (correct by construction
and immune to wall-clock jumps) and a collections.deque(maxlen=...) so
expiry is O(1) (popleft) and retention is bounded automatically. This
mirrors the deque-based expiry already used in video/ffmpeg.py and
watchdog.py. Observable output is unchanged.

Adds frigate/test/test_builtin.py covering rate calculation, window
expiry and the memory bound.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test: drop test_timestamps_are_memory_bounded

It only asserted that deque(maxlen=) caps length, which is stdlib behavior
rather than something this change needs to verify.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 07:38:41 -06:00
Lucas Zhu
a2b92caab0 feat(genai): add Atlas Cloud as an OpenAI-compatible GenAI provider
Add an `atlas` GenAI provider backed by Atlas Cloud, an OpenAI-compatible
inference platform serving vision-capable models. The provider subclasses
the existing OpenAIClient and only defaults the base_url to the Atlas
endpoint, reusing all vision, streaming, reasoning, and tool-calling logic.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 00:59:49 +08:00
8 changed files with 236 additions and 9 deletions

@ -12,6 +12,44 @@
\[English\] | [简体中文](https://github.com/blakeblackshear/frigate/blob/dev/README_CN.md)
---
<p align="center">
<a href="https://www.atlascloud.ai/?utm_source=github&utm_medium=link&utm_campaign=frigate">
<img src="docs/static/img/branding/atlas-cloud-logo.png" alt="Atlas Cloud" width="200">
</a>
</p>
<p align="center">
<b><a href="https://www.atlascloud.ai/?utm_source=github&utm_medium=link&utm_campaign=frigate">Atlas Cloud</a></b> is an OpenAI-compatible inference platform that can power Frigate's
<a href="https://docs.frigate.video/configuration/genai/">Generative AI</a> features as a drop-in multimodal LLM backend.
Point the <code>atlas</code> provider at Atlas Cloud and use a vision-capable model
(such as <code>qwen/qwen3-vl-235b-a22b-thinking</code> or <code>Qwen/Qwen3-VL-235B-A22B-Instruct</code>)
to generate natural-language object and review descriptions from detection frames —
no local GPU required. See the <a href="https://docs.frigate.video/configuration/genai/">GenAI configuration docs</a>
to get started, or grab a <a href="https://www.atlascloud.ai/console/coding-plan">coding plan</a>.
</p>
<details>
<summary>Vision-capable Atlas Cloud models for GenAI descriptions</summary>
Frigate's GenAI features require a **vision-capable** model. Good multimodal choices on Atlas Cloud include:
- `qwen/qwen3-vl-235b-a22b-thinking`
- `Qwen/Qwen3-VL-235B-A22B-Instruct`
- `qwen/qwen3-vl-30b-a3b-instruct`
- `qwen/qwen3-vl-30b-a3b-thinking`
- `qwen/qwen3-vl-8b-instruct`
- `google/gemini-3.5-flash`
- `google/gemini-3.1-pro-preview`
The full, always-current model catalog is available at the
[Atlas Cloud console](https://www.atlascloud.ai/console).
</details>
---
A complete and local NVR designed for [Home Assistant](https://www.home-assistant.io) with AI object detection. Uses OpenCV and Tensorflow to perform realtime object detection locally for IP cameras.
Use of a GPU or AI accelerator is highly recommended. AI accelerators will outperform even the best CPUs with very little overhead. See Frigate's supported [object detectors](https://docs.frigate.video/configuration/object_detectors/).

@ -12,6 +12,43 @@
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
---
<p align="center">
<a href="https://www.atlascloud.ai/?utm_source=github&utm_medium=link&utm_campaign=frigate">
<img src="docs/static/img/branding/atlas-cloud-logo.png" alt="Atlas Cloud" width="200">
</a>
</p>
<p align="center">
<b><a href="https://www.atlascloud.ai/?utm_source=github&utm_medium=link&utm_campaign=frigate">Atlas Cloud</a></b> is an OpenAI-compatible inference platform that can serve as a drop-in multimodal LLM backend,
powering Frigate's <a href="https://docs.frigate.video/configuration/genai/">Generative AI</a> features.
Point the <code>atlas</code> provider at Atlas Cloud and choose a vision-capable model
(for example <code>qwen/qwen3-vl-235b-a22b-thinking</code> or <code>Qwen/Qwen3-VL-235B-A22B-Instruct</code>)
to generate natural-language object descriptions and review summaries from detection frames, with no local GPU required.
See the <a href="https://docs.frigate.video/configuration/genai/">GenAI configuration docs</a> to get started,
or learn about the <a href="https://www.atlascloud.ai/console/coding-plan">coding plan</a>.
</p>
<details>
<summary>Atlas Cloud multimodal models suited to GenAI descriptions</summary>
Frigate's GenAI features require a **vision-capable** model. Recommended multimodal models on Atlas Cloud include:
- `qwen/qwen3-vl-235b-a22b-thinking`
- `Qwen/Qwen3-VL-235B-A22B-Instruct`
- `qwen/qwen3-vl-30b-a3b-instruct`
- `qwen/qwen3-vl-30b-a3b-thinking`
- `qwen/qwen3-vl-8b-instruct`
- `google/gemini-3.5-flash`
- `google/gemini-3.1-pro-preview`
The complete, continuously updated model list is available in the [Atlas Cloud console](https://www.atlascloud.ai/console).
</details>
---
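A complete, local network video recorder (NVR) designed for [Home Assistant](https://www.home-assistant.io), with AI object detection. Uses OpenCV and TensorFlow to perform real-time object detection locally for IP cameras.
Using a GPU or an AI accelerator (such as a [Google Coral accelerator](https://coral.ai/products/) or [Hailo](https://hailo.ai/)) is strongly recommended. They run far more efficiently than today's top CPUs and draw very little power.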

@ -386,3 +386,44 @@ genai:
</TabItem>
</ConfigTabs>
### Atlas Cloud
[Atlas Cloud](https://www.atlascloud.ai/?utm_source=github&utm_medium=link&utm_campaign=frigate) is an OpenAI-compatible inference platform that serves a range of vision-capable models, so it can act as a drop-in multimodal backend for Frigate's Generative AI features. The `atlas` provider defaults its base URL to the Atlas Cloud endpoint, so a minimal config only needs your API key and a model.
#### Supported Models
Frigate requires a vision-capable model. Recommended multimodal models on Atlas Cloud include `qwen/qwen3-vl-235b-a22b-thinking`, `Qwen/Qwen3-VL-235B-A22B-Instruct`, `qwen/qwen3-vl-30b-a3b-instruct`, and `google/gemini-3.5-flash`. The full, always-current catalog is available in the [Atlas Cloud console](https://www.atlascloud.ai/console).
#### Get API Key
To start using Atlas Cloud, create an API key from the [Atlas Cloud console](https://www.atlascloud.ai/console/api-keys).
#### Configuration
<ConfigTabs>
<TabItem value="ui">
1. Navigate to <NavPath path="Settings > Enrichments > Generative AI" />.
   - Set **Provider** to `atlas`
   - Set **API key** to your Atlas Cloud API key (or use an environment variable such as `{FRIGATE_ATLAS_API_KEY}`)
   - Set **Model** to a vision-capable model (e.g., `qwen/qwen3-vl-235b-a22b-thinking`)
</TabItem>
<TabItem value="yaml">
```yaml
genai:
  provider: atlas
  api_key: "{FRIGATE_ATLAS_API_KEY}"
  model: qwen/qwen3-vl-235b-a22b-thinking
```
</TabItem>
</ConfigTabs>
:::note
The `atlas` provider points to `https://api.atlascloud.ai/v1` by default. To target a different OpenAI-compatible endpoint, set `base_url` explicitly.
:::
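
The defaulting rule described in the note can be sketched in a few lines of Python. This is an illustration only, not Frigate's implementation: `resolve_base_url` and `chat_completions_url` are hypothetical helper names, and the `/chat/completions` path is an assumption taken from the OpenAI API convention that OpenAI-compatible servers generally follow.

```python
from typing import Optional

# Default endpoint quoted in the note above.
DEFAULT_BASE_URL = "https://api.atlascloud.ai/v1"


def resolve_base_url(configured: Optional[str]) -> str:
    """Return the user's base_url when set, otherwise the Atlas Cloud default."""
    return configured if configured else DEFAULT_BASE_URL


def chat_completions_url(configured: Optional[str] = None) -> str:
    """Hypothetical helper: append the OpenAI-style path (assumed) to the base URL."""
    return resolve_base_url(configured).rstrip("/") + "/chat/completions"
```

An unset or empty `base_url` resolves to the Atlas Cloud endpoint, while any explicit value is used unchanged.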

Binary file not shown. Size: 131 KiB

@ -12,6 +12,7 @@ __all__ = ["GenAIConfig", "GenAIProviderEnum", "GenAIRoleEnum"]
 class GenAIProviderEnum(str, Enum):
     openai = "openai"
     azure_openai = "azure_openai"
+    atlas = "atlas"
     gemini = "gemini"
     ollama = "ollama"
     llamacpp = "llamacpp"

@ -0,0 +1,71 @@
"""Atlas Cloud Provider for Frigate AI.

Atlas Cloud (https://www.atlascloud.ai) is an OpenAI-compatible inference
platform that serves a range of vision-capable models. Because its chat
completions API follows the OpenAI standard, this provider inherits all
transport, vision, streaming, reasoning, and tool-calling logic from
:class:`OpenAIClient` and only overrides what is Atlas-specific:

- Client construction: defaults ``base_url`` to the Atlas Cloud endpoint
  when the user has not set one explicitly, so a minimal config (provider +
  api_key + model) works out of the box. A user-supplied ``base_url`` still
  takes precedence.
- Context size: the Atlas ``/models`` endpoint does not reliably surface a
  per-model context window, so we fall back to a conservative default rather
  than the model-name heuristic used by OpenAI. It can be overridden via
  ``provider_options.context_size``.
"""

import logging
from typing import Optional

from openai import OpenAI

from frigate.config import GenAIProviderEnum
from frigate.genai import register_genai_provider
from frigate.genai.plugins.openai import OpenAIClient

logger = logging.getLogger(__name__)

DEFAULT_BASE_URL = "https://api.atlascloud.ai/v1"

# Atlas serves large-context models, but its model listing does not expose a
# per-model context window; default conservatively and let users override via
# provider_options.context_size when they know their model's window.
DEFAULT_CONTEXT_SIZE = 32000


@register_genai_provider(GenAIProviderEnum.atlas)
class AtlasClient(OpenAIClient):
    """Generative AI client for Frigate using Atlas Cloud."""

    def _init_provider(self) -> OpenAI:
        """Initialize the OpenAI client pointed at Atlas Cloud.

        Defaults ``base_url`` to the Atlas endpoint when the user has not set
        one, then defers to the OpenAI implementation for everything else.
        """
        if not self.genai_config.base_url:
            self.genai_config.base_url = DEFAULT_BASE_URL

        return super()._init_provider()

    def get_context_size(self) -> int:
        """Return the context window for Atlas models.

        A manually specified ``context_size`` in ``provider_options`` always
        wins; otherwise fall back to a conservative default since Atlas does
        not reliably surface per-model context windows.
        """
        if self.context_size is not None:
            return self.context_size

        provider_context_size: Optional[int] = self.genai_config.provider_options.get(
            "context_size"
        )
        if provider_context_size is not None:
            self.context_size = provider_context_size
            return self.context_size

        self.context_size = DEFAULT_CONTEXT_SIZE
        return self.context_size

@ -0,0 +1,41 @@
"""Tests for frigate.util.builtin helpers."""

import unittest
from unittest.mock import patch

from frigate.util.builtin import EventsPerSecond


class TestEventsPerSecond(unittest.TestCase):
    def test_eps_is_zero_before_any_events(self) -> None:
        eps = EventsPerSecond()
        with patch("frigate.util.builtin.time.monotonic", return_value=100.0):
            self.assertEqual(eps.eps(), 0.0)

    def test_eps_counts_events_in_window(self) -> None:
        eps = EventsPerSecond(last_n_seconds=10)
        clock = [1000.0]
        with patch("frigate.util.builtin.time.monotonic", side_effect=lambda: clock[0]):
            eps.start()
            # one event per second for five seconds
            for _ in range(5):
                clock[0] += 1.0
                eps.update()
            # five events over the five seconds since start
            self.assertAlmostEqual(eps.eps(), 1.0)

    def test_old_timestamps_expire_from_window(self) -> None:
        eps = EventsPerSecond(last_n_seconds=10)
        clock = [0.0]
        with patch("frigate.util.builtin.time.monotonic", side_effect=lambda: clock[0]):
            eps.start()
            for _ in range(10):
                clock[0] += 1.0
                eps.update()
            # jump well past the window so every timestamp ages out
            clock[0] += 100.0
            self.assertEqual(eps.eps(), 0.0)


if __name__ == "__main__":
    unittest.main()
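
The tests above drive time by patching `time.monotonic` with a mutable one-element list. A minimal standalone sketch of that technique follows; `elapsed_since` is a hypothetical helper written only for this illustration and is not part of Frigate.

```python
import time
from unittest.mock import patch


def elapsed_since(start: float) -> float:
    """Hypothetical helper: seconds elapsed on the monotonic clock since `start`."""
    return time.monotonic() - start


# A one-element list lets the lambda read whatever value the test last set.
clock = [50.0]
with patch("time.monotonic", side_effect=lambda: clock[0]):
    start = time.monotonic()          # reads the fake clock: 50.0
    clock[0] += 2.5                   # advance fake time without sleeping
    measured = elapsed_since(start)   # 52.5 - 50.0
```

Because the helper looks up `time.monotonic` at call time, the patch takes effect without any change to the code under test, and the real clock is restored when the `with` block exits.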

@ -2,7 +2,6 @@
 import ast
 import copy
-import datetime
 import logging
 import math
 import multiprocessing.queues
@ -10,7 +9,9 @@ import queue
 import re
 import shlex
 import struct
+import time
 import urllib.parse
+from collections import deque
 from collections.abc import Mapping
 from multiprocessing.managers import ValueProxy
 from pathlib import Path
@ -32,23 +33,20 @@ class EventsPerSecond:
         self._start = None
         self._max_events = max_events
         self._last_n_seconds = last_n_seconds
-        self._timestamps = []
+        self._timestamps: deque[float] = deque(maxlen=max_events)

     def start(self) -> None:
-        self._start = datetime.datetime.now().timestamp()
+        self._start = time.monotonic()

     def update(self) -> None:
-        now = datetime.datetime.now().timestamp()
+        now = time.monotonic()
         if self._start is None:
             self._start = now
         self._timestamps.append(now)
-        # truncate the list when it goes 100 over the max_size
-        if len(self._timestamps) > self._max_events + 100:
-            self._timestamps = self._timestamps[(1 - self._max_events) :]
         self.expire_timestamps(now)

     def eps(self) -> float:
-        now = datetime.datetime.now().timestamp()
+        now = time.monotonic()
         if self._start is None:
             self._start = now
         # compute the (approximate) events in the last n seconds
@ -63,7 +61,7 @@ class EventsPerSecond:
     def expire_timestamps(self, now: float) -> None:
         threshold = now - self._last_n_seconds
         while self._timestamps and self._timestamps[0] < threshold:
-            del self._timestamps[0]
+            self._timestamps.popleft()


 class InferenceSpeed:
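
For readers who want the pattern outside the diff, here is a self-contained sketch of a rolling-window rate counter built on `time.monotonic()` and a bounded `deque`. It is not Frigate's `EventsPerSecond`: the class name `RollingRate`, the injectable `clock` parameter, the default values, and the body of `rate()` (which the hunks above do not show) are all assumptions made for illustration.

```python
import time
from collections import deque
from typing import Callable, Optional


class RollingRate:
    """Hypothetical stand-in for a rolling events-per-second counter."""

    def __init__(
        self,
        max_events: int = 1000,
        last_n_seconds: float = 10.0,
        clock: Callable[[], float] = time.monotonic,  # injectable for tests (assumption)
    ) -> None:
        self._clock = clock
        self._last_n_seconds = last_n_seconds
        self._start: Optional[float] = None
        # maxlen bounds memory: the oldest entry is dropped when the deque is full
        self._timestamps: deque = deque(maxlen=max_events)

    def start(self) -> None:
        self._start = self._clock()

    def update(self) -> None:
        now = self._clock()
        if self._start is None:
            self._start = now
        self._timestamps.append(now)
        self._expire(now)

    def rate(self) -> float:
        now = self._clock()
        if self._start is None:
            self._start = now
        self._expire(now)
        # divide by the elapsed time, capped at the window length
        seconds = min(now - self._start, self._last_n_seconds)
        return len(self._timestamps) / seconds if seconds > 0 else 0.0

    def _expire(self, now: float) -> None:
        threshold = now - self._last_n_seconds
        # popleft() is O(1), unlike deleting index 0 of a list
        while self._timestamps and self._timestamps[0] < threshold:
            self._timestamps.popleft()


# Usage with a fake clock, mirroring the scenarios in the tests above.
fake_now = [0.0]
counter = RollingRate(last_n_seconds=10.0, clock=lambda: fake_now[0])
counter.start()
for _ in range(5):
    fake_now[0] += 1.0
    counter.update()
rate_in_window = counter.rate()   # five events over five seconds
fake_now[0] += 100.0
rate_after_gap = counter.rate()   # every timestamp has aged out
```

Passing the clock as a parameter is one way to make such a class testable without patching. The diff instead keeps the constructor signature and patches `frigate.util.builtin.time.monotonic` in the tests.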