Josh Casada a2c43ad8bb feat: ZMQ embedding runner for offloading ONNX inference to native host
Extends the ZMQ split-detector pattern (apple-silicon-detector) to cover
ONNX embedding models — ArcFace face recognition and Jina semantic search.

On macOS, Docker has no access to CoreML or the Apple Neural Engine, so
embedding inference is forced to CPU (~200ms/face for ArcFace). This adds
a ZmqEmbeddingRunner that sends preprocessed tensors to a native host
process over ZMQ TCP and receives embeddings back, enabling CoreML/ANE
acceleration outside the container.
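The container side can be sketched roughly as follows. Class name, endpoint, and wire format (a JSON header frame plus raw float32 bytes) are illustrative assumptions, not the actual ZmqEmbeddingRunner implementation:

```python
import numpy as np
import zmq


class ZmqEmbeddingClient:
    """Hypothetical sketch of the runner's client side (not the real class)."""

    def __init__(self, device: str = "zmq://host.docker.internal:5555"):
        # Assumed convention: a device string like "zmq://host:port" maps to a
        # ZMQ TCP endpoint (the real prefix handling lives in
        # get_optimized_runner()).
        endpoint = device.replace("zmq://", "tcp://", 1)
        self.sock = zmq.Context.instance().socket(zmq.REQ)
        self.sock.connect(endpoint)

    def embed(self, tensor: np.ndarray) -> np.ndarray:
        # Ship shape/dtype as a JSON header frame, then the raw tensor bytes.
        self.sock.send_json(
            {"shape": tensor.shape, "dtype": "float32"}, flags=zmq.SNDMORE
        )
        self.sock.send(np.ascontiguousarray(tensor, dtype=np.float32).tobytes())
        # The host replies with the embedding as raw float32 bytes.
        return np.frombuffer(self.sock.recv(), dtype=np.float32)
```

REQ/REP keeps the protocol lockstep (one request, one reply), which matches the synchronous per-face inference pattern; preprocessing stays in the container so only tensors cross the boundary.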

Files changed:
- frigate/detectors/detection_runners.py: add ZmqEmbeddingRunner class
  and hook into get_optimized_runner() via "zmq://" device prefix
- tools/zmq_embedding_server.py: new host-side server script
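The host-side server described above amounts to a REP loop in front of an ONNX session. A minimal sketch, assuming the same JSON-header-plus-raw-bytes wire format; the onnxruntime session is commented out (with a placeholder reduction in its place) so the sketch runs without a model file, and the real tools/zmq_embedding_server.py may differ:

```python
import numpy as np
import zmq


def serve(endpoint: str = "tcp://0.0.0.0:5555") -> None:
    """Hypothetical host-side loop: recv tensor, run model, send embedding."""
    # On the native host, onnxruntime can use the CoreML execution provider:
    # import onnxruntime as ort
    # sess = ort.InferenceSession(
    #     "arcface.onnx",
    #     providers=["CoreMLExecutionProvider", "CPUExecutionProvider"],
    # )
    sock = zmq.Context.instance().socket(zmq.REP)
    sock.bind(endpoint)
    while True:
        header = sock.recv_json()          # {"shape": [...], "dtype": "float32"}
        payload = sock.recv()              # raw tensor bytes
        tensor = np.frombuffer(payload, dtype=header["dtype"]).reshape(
            header["shape"]
        )
        # embedding = sess.run(None, {"input": tensor})[0]
        embedding = tensor.mean(axis=(2, 3)).ravel()  # placeholder for ONNX output
        sock.send(embedding.astype(np.float32).tobytes())
```

Running this outside Docker is what unlocks CoreML/ANE: the container only ever sees a TCP socket, while the acceleration happens in the native process.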

Tested on Mac Mini M4, 24h soak test, ~5000 object reindex.
2026-02-21 12:44:42 -05:00