mirror of https://github.com/blakeblackshear/frigate.git
synced 2025-12-06 21:44:13 +03:00
Remove camera spatial context
parent c00772e82e
commit 323a6c7fef
@@ -68,36 +68,6 @@ The mere presence of an unidentified person in private areas during late night h
 
 </details>
 
-### Camera Spatial Context
-
-In addition to defining activity patterns, you can provide spatial context for specific cameras to help the LLM generate more accurate and descriptive titles and scene descriptions. The `camera_context` field allows you to describe physical features and locations that are outside the camera's field of view but are relevant for understanding the scene.
-
-**Important Guidelines:**
-
-- This context is used **only for descriptive purposes** to help the LLM write better titles and scene descriptions
-- It should describe **physical features and spatial relationships** (e.g., "front door is to the right", "driveway on the left")
-- It should **NOT** include subjective assessments or threat evaluations (e.g., "high-crime area")
-- Threat level determination remains based solely on observable actions defined in the activity patterns
-
-Example configuration:
-
-```yaml
-cameras:
-  front_door:
-    review:
-      genai:
-        enabled: true
-        camera_context: |
-          - Front door entrance is to the right of the frame
-          - Driveway and street are to the left
-          - Steps in the center lead from the sidewalk to the front door
-          - Garage is located beyond the left edge of the frame
-```
-
-This helps the LLM generate more natural descriptions like "Person approaching front door" instead of "Person walking toward right side of frame".
-
-The `camera_context` can be defined globally under `genai.review` and overridden per camera for specific spatial details.
-
 ### Image Source
 
 By default, review summaries use preview images (cached preview frames) which have a lower resolution but use fewer tokens per image. For better image quality and more detailed analysis, you can configure Frigate to extract frames directly from recordings at a higher resolution:
@@ -140,10 +140,6 @@ Evaluate in this order:
 The mere presence of an unidentified person in private areas during late night hours is inherently suspicious and warrants human review, regardless of what activity they appear to be doing or how brief the sequence is.""",
         title="Custom activity context prompt defining normal and suspicious activity patterns for this property.",
     )
-    camera_context: str = Field(
-        default="",
-        title="Spatial context about the camera's field of view to help with descriptive accuracy. Should describe physical features and locations outside the frame.",
-    )
 
 
 class ReviewConfig(FrigateBaseModel):
@@ -459,7 +459,6 @@ def run_analysis(
         genai_config.preferred_language,
         genai_config.debug_save_thumbnails,
         genai_config.activity_context_prompt,
-        genai_config.camera_context,
     )
     review_inference_speed.update(datetime.datetime.now().timestamp() - start)
 
@@ -45,7 +45,6 @@ class GenAIClient:
         preferred_language: str | None,
         debug_save: bool,
         activity_context_prompt: str,
-        camera_context: str = "",
     ) -> ReviewMetadata | None:
         """Generate a description for the review item activity."""
 
@@ -70,16 +69,6 @@ class GenAIClient:
             else:
                 return "\n- (No objects detected)"
 
-        def get_camera_context_section() -> str:
-            if camera_context:
-                return f"""## Camera Spatial Context
-
-Use this spatial information when writing the title and scene description to provide more accurate context about where activity is occurring or where people/objects are moving to/from.
-
-{camera_context}"""
-            return ""
-
-        camera_context_section = get_camera_context_section()
         context_prompt = f"""
 Your task is to analyze the sequence of images ({len(thumbnails)} total) taken in chronological order from the perspective of the {review_data["camera"].replace("_", " ")} security camera.
 
@@ -87,8 +76,6 @@ Your task is to analyze the sequence of images ({len(thumbnails)} total) taken i
 
 {activity_context_prompt}
 
-{camera_context_section}
-
 ## Task Instructions
 
 Your task is to provide a clear, accurate description of the scene that:
@@ -113,7 +100,7 @@ When forming your description:
 ## Response Format
 
 Your response MUST be a flat JSON object with:
-- `title` (string): A concise, direct title that describes the primary action or event in the sequence, not just what you literally see. {"Use spatial context when available to make titles more meaningful." if camera_context_section else ""} When multiple objects/actions are present, prioritize whichever is most prominent or occurs first. Use names from "Objects in Scene" based on what you visually observe. If you see both a name and an unidentified object of the same type but visually observe only one person/object, use ONLY the name. Examples: "Joe walking dog", "Person taking out trash", "Vehicle arriving in driveway", "Joe accessing vehicle", "Person leaving porch for driveway".
+- `title` (string): A concise, direct title that describes the primary action or event in the sequence, not just what you literally see. Use spatial context when available to make titles more meaningful. When multiple objects/actions are present, prioritize whichever is most prominent or occurs first. Use names from "Objects in Scene" based on what you visually observe. If you see both a name and an unidentified object of the same type but visually observe only one person/object, use ONLY the name. Examples: "Joe walking dog", "Person taking out trash", "Vehicle arriving in driveway", "Joe accessing vehicle", "Person leaving porch for driveway".
 - `scene` (string): A narrative description of what happens across the sequence from start to finish, in chronological order. Start by describing how the sequence begins, then describe the progression of events. **Describe all significant movements and actions in the order they occur.** For example, if a vehicle arrives and then a person exits, describe both actions sequentially. **Only describe actions you can actually observe happening in the frames provided.** Do not infer or assume actions that aren't visible (e.g., if you see someone walking but never see them sit, don't say they sat down). Include setting, detected objects, and their observable actions. Avoid speculation or filling in assumed behaviors. Your description should align with and support the threat level you assign.
 - `confidence` (float): 0-1 confidence in your analysis. Higher confidence when objects/actions are clearly visible and context is unambiguous. Lower confidence when the sequence is unclear, objects are partially obscured, or context is ambiguous.
 - `potential_threat_level` (integer): 0, 1, or 2 as defined in "Normal Activity Patterns for This Property" above. Your threat level must be consistent with your scene description and the guidance above.
@@ -67,9 +67,6 @@
         },
         "activity_context_prompt": {
           "label": "Custom activity context prompt defining normal activity patterns for this property."
-        },
-        "camera_context": {
-          "label": "Spatial context about the camera's field of view to help with descriptive accuracy. Should describe physical features and locations outside the frame. This is for spatial reference only and should NOT include subjective assessments."
         }
       }
     }
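As a reading aid for the `GenAIClient` hunk, the nested helper this commit deletes can be sketched as a standalone function. The name `build_camera_context_section` and its explicit parameter are assumptions made for illustration; in the diff the helper is called `get_camera_context_section`, takes no arguments, and reads `camera_context` from the enclosing method's scope. The sketch only shows the behaviour being removed: an empty string when no context is configured, otherwise a Markdown section appended to the prompt.

```python
def build_camera_context_section(camera_context: str) -> str:
    """Return the prompt section for a camera's spatial context.

    Hypothetical standalone form of the deleted nested helper; the
    section text is copied from the removed lines of the diff.
    """
    if camera_context:
        return (
            "## Camera Spatial Context\n"
            "\n"
            "Use this spatial information when writing the title and scene "
            "description to provide more accurate context about where activity "
            "is occurring or where people/objects are moving to/from.\n"
            "\n"
            f"{camera_context}"
        )
    # No context configured: contribute nothing to the prompt.
    return ""


# Example input taken from the removed documentation example.
example_context = "- Front door entrance is to the right of the frame"
example_section = build_camera_context_section(example_context)
empty_section = build_camera_context_section("")
```

With this helper gone, the prompt no longer carries a per-camera section, and the `title` instruction includes the sentence about spatial context unconditionally instead of only when a section was present.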