Extends the ZMQ split-detector pattern (apple-silicon-detector) to cover
ONNX embedding models — ArcFace face recognition and Jina semantic search.
On macOS, Docker has no access to CoreML or the Apple Neural Engine, so
embedding inference is forced to CPU (~200ms/face for ArcFace). This adds
a ZmqEmbeddingRunner that sends preprocessed tensors to a native host
process over ZMQ TCP and receives embeddings back, enabling CoreML/ANE
acceleration outside the container.
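The request/reply exchange can be sketched as below. This is a minimal illustration of the pattern, not the actual ZmqEmbeddingRunner code: the framing (JSON header plus raw tensor bytes), the `serve_once` helper, and the zero-stub "model" are all assumptions for demonstration; the real host process runs the ONNX model under CoreML and binds a `tcp://` endpoint rather than `inproc://`.

```python
import json

import numpy as np
import zmq

EMBED_DIM = 512  # ArcFace embedding size


def serve_once(rep_sock) -> None:
    """Host side: receive [JSON header, raw tensor], reply with an embedding.

    The real server would run ONNX inference via CoreML here; a zero-filled
    stub stands in so the round-trip is self-contained.
    """
    header, payload = rep_sock.recv_multipart()
    meta = json.loads(header)
    tensor = np.frombuffer(payload, dtype=meta["dtype"]).reshape(meta["shape"])
    embedding = np.zeros((tensor.shape[0], EMBED_DIM), dtype=np.float32)  # model stub
    rep_sock.send(embedding.tobytes())


ctx = zmq.Context.instance()
rep = ctx.socket(zmq.REP)
rep.bind("inproc://embed")   # the real server binds tcp:// on the host
req = ctx.socket(zmq.REQ)
req.connect("inproc://embed")

# Container side: ship a preprocessed NCHW float32 face crop.
x = np.random.rand(1, 3, 112, 112).astype(np.float32)  # ArcFace input shape
meta = {"dtype": str(x.dtype), "shape": x.shape}
req.send_multipart([json.dumps(meta).encode(), x.tobytes()])

serve_once(rep)
emb = np.frombuffer(req.recv(), dtype=np.float32).reshape(1, EMBED_DIM)
print(emb.shape)  # (1, 512)
```

Because only preprocessed tensors and fixed-size embeddings cross the socket, the container never needs CoreML or model weights; the native host process owns both.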
Files changed:
- frigate/detectors/detection_runners.py: add ZmqEmbeddingRunner class
and hook into get_optimized_runner() via "zmq://" device prefix
- tools/zmq_embedding_server.py: new host-side server script
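The device-prefix hook can be sketched as follows. The function name, signature, and return values are illustrative only; the real `get_optimized_runner()` in `detection_runners.py` takes more arguments and returns runner instances.

```python
# Hypothetical sketch of the "zmq://" dispatch added to get_optimized_runner().
def select_runner(device: str) -> str:
    if device.startswith("zmq://"):
        # Proxy inference to the native host process at this endpoint.
        endpoint = "tcp://" + device[len("zmq://"):]
        return f"ZmqEmbeddingRunner({endpoint})"
    # Otherwise fall back to in-container CPU inference.
    return "OnnxRunner(CPUExecutionProvider)"


print(select_runner("zmq://host.docker.internal:5555"))
# ZmqEmbeddingRunner(tcp://host.docker.internal:5555)
print(select_runner("cpu"))
# OnnxRunner(CPUExecutionProvider)
```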
Tested on a Mac Mini M4: 24-hour soak test and a reindex of ~5000 objects.