clarify audio transcription docs

Josh Hawkins 2025-11-23 12:35:35 -06:00
parent bf4f63e50e
commit 42fdadecd9
2 changed files with 10 additions and 4 deletions


@@ -144,4 +144,10 @@ In order to use transcription and translation for past events, you must enable a
 The transcribed/translated speech will appear in the description box in the Tracked Object Details pane. If Semantic Search is enabled, embeddings are generated for the transcription text and are fully searchable using the description search type.
-Recorded `speech` events will always use a `whisper` model, regardless of the `model_size` config setting. Without a GPU, generating transcriptions for longer `speech` events may take a fair amount of time, so be patient.
+:::note
+Only one `speech` event may be transcribed at a time. Frigate does not automatically transcribe `speech` events or implement a queue for long-running transcription model inference.
+:::
+Recorded `speech` events will always use a `whisper` model, regardless of the `model_size` config setting. Without a supported Nvidia GPU, generating transcriptions for longer `speech` events may take a fair amount of time, so be patient.


@@ -700,11 +700,11 @@ genai:
 # Optional: Configuration for audio transcription
 # NOTE: only the enabled option can be overridden at the camera level
 audio_transcription:
-  # Optional: Enable audio transcription (default: shown below)
+  # Optional: Enable live and speech event audio transcription (default: shown below)
   enabled: False
-  # Optional: The device to run the models on. (default: shown below)
+  # Optional: The device to run the models on for live transcription. (default: shown below)
   device: CPU
-  # Optional: Set the model size used for transcription. (default: shown below)
+  # Optional: Set the model size used for live transcription. (default: shown below)
   model_size: small
   # Optional: Set the language used for transcription translation. (default: shown below)
   # List of language codes: https://github.com/openai/whisper/blob/main/whisper/tokenizer.py#L10