clarify docs

This commit is contained in:
Josh Hawkins 2025-05-27 10:15:53 -05:00
parent 59a7c79b88
commit 7a1d5e018b


@@ -75,7 +75,7 @@ audio:
### Audio Transcription
Frigate supports fully local audio transcription using either `sherpa-onnx` or OpenAI's open-source Whisper models via `faster-whisper`. Because you likely won't want transcription on every camera, it is recommended to configure the feature at the global level and enable it for individual cameras at the camera level.
```yaml
audio_transcription:
@@ -84,7 +84,7 @@ audio_transcription:
  model_size: ...
```
Enable audio transcription for select cameras at the camera level:
```yaml
cameras:
@@ -100,8 +100,11 @@ Audio detection must be enabled and configured as described above in order to us
:::
The optional config parameters that can be set at the global level include:
- **`enabled`**: Enable or disable the audio transcription feature.
  - Default: `False`
  - It is recommended to configure the feature at the global level and enable it at the individual camera level.
- **`device`**: Device to use to run transcription and translation models.
  - Default: `CPU`
  - This can be `CPU` or `GPU`. The `sherpa-onnx` models are lightweight and run on the CPU only. The `whisper` models can run on GPU but are only supported on CUDA hardware.
@@ -116,6 +119,8 @@ Optional config parameters that can be set at the global level include:
- Transcriptions for `speech` events are translated.
- Live audio is translated only if you are using the `large` model. The `small` `sherpa-onnx` model is English-only.
The only field that is valid at the camera level is `enabled`.
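Putting the pieces above together, a hedged sketch of how the global and camera-level settings might combine (the camera name `back_yard` and the specific `device`/`model_size` values are illustrative, not defaults):

```yaml
# Global level: configure the feature here, leaving it disabled by default.
audio_transcription:
  enabled: False
  device: GPU          # illustrative; requires CUDA hardware for whisper models
  model_size: large

# Camera level: `enabled` is the only valid field here.
cameras:
  back_yard:           # hypothetical camera name
    audio_transcription:
      enabled: True
```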
#### Live transcription
The single camera Live view in the Frigate UI supports live transcription of audio for streams defined with the `audio` role. Use the Enable/Disable Live Audio Transcription button/switch to toggle transcription processing. When speech is heard, the UI will display a black box over the top of the camera stream with text. The MQTT topic `frigate/<camera_name>/audio/transcription` will also be updated in real-time with transcribed text.
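To illustrate consuming that MQTT topic programmatically, here is a minimal Python sketch. Only the topic format comes from the docs; the `transcription_topic` helper name is hypothetical, and the commented paho-mqtt subscription assumes a reachable broker:

```python
def transcription_topic(camera_name: str) -> str:
    """Build the per-camera topic Frigate publishes live transcriptions to.

    The topic format frigate/<camera_name>/audio/transcription is documented
    above; this helper is just a convenience wrapper for it.
    """
    return f"frigate/{camera_name}/audio/transcription"


if __name__ == "__main__":
    topic = transcription_topic("back_yard")  # hypothetical camera name
    print(topic)  # frigate/back_yard/audio/transcription
    # With paho-mqtt installed and a broker reachable, one could subscribe:
    #   import paho.mqtt.client as mqtt
    #   client = mqtt.Client()
    #   client.connect("localhost", 1883)
    #   client.subscribe(topic)
```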