Review prompt adjustments (#20704)

* Make prompt more fair and reduce time extension
* Adjust naming of unrecognized objects
* Improve object naming behavior
* Add more context image levels
2026-01-22 20:18:30 +03:00 · 2025-10-28 07:28:36 -06:00 · 2025-10-28 07:28:36 -06:00 · 16e17e027d
commit 16e17e027d
parent c2cbb0fa87
3 changed files with 36 additions and 24 deletions
--- a/frigate/config/camera/review.py
+++ b/frigate/config/camera/review.py
@@ -108,20 +108,20 @@ class GenAIReviewConfig(FrigateBaseModel):
        default="""### Normal Activity Indicators (Level 0)
 - Known/verified people in any zone
 - People with pets in residential areas
-- Deliveries: carrying packages to porches/doors, placing packages, leaving
-- Access to private areas: entering back yards, garages, or homes
-- Brief movement through semi-public areas (driveways, front yards) with clear purpose (carrying items, going to/from vehicles)
+- Brief activity near vehicles: approaching vehicles, brief standing, then leaving or entering vehicle (unloading, loading, checking something)
+- Deliveries or services: brief approach to doors/porches, standing briefly, placing or retrieving items, then leaving
+- Access to private areas: entering back yards, garages, or homes (with or without visible purpose in frame)
+- Brief movement through semi-public areas (driveways, front yards) with items or approaching structure/vehicle
 - Activity on public areas only (sidewalks, streets) without entering property
-- Services/maintenance with visible indicators (tools, uniforms, work vehicles)
+- Services/maintenance workers with tools, uniforms, or vehicles

 ### Suspicious Activity Indicators (Level 1)
-- Testing doors or windows on vehicles or buildings
-- Standing near vehicles or in private zones without clear purpose or direct movement to destination
-- Taking items from property (packages, objects from porches/driveways)
-- Accessing areas at unusual hours without visible legitimate indicators (items, tools, purpose)
-- Climbing or jumping fences/barriers
-- Attempting to conceal actions or items
-- Person in semi-public areas (driveways, front yards) at unusual hours without clear purpose
+- Testing or attempting to open doors/windows on vehicles or buildings
+- Taking items that don't belong to them (stealing packages, objects from porches/driveways)
+- Climbing or jumping fences/barriers to access property
+- Attempting to conceal actions or items from view
+- Prolonged presence without purpose: remaining in same area (near vehicles, private zones) throughout most/all of the sequence without clear activity or task. Brief stops (a few seconds of standing) are normal; sustained presence (most of the duration) without interaction is concerning.
+- Activity at unusual hours (very late night/early morning) combined with suspicious behavior patterns

 ### Critical Threat Indicators (Level 2)
 - Holding break-in tools (crowbars, pry bars, bolt cutters)
@@ -131,7 +131,9 @@ class GenAIReviewConfig(FrigateBaseModel):
 - Active property damage or theft

 ### Assessment Guidance
-These patterns are guidance, not absolute rules. Context matters: time of day, visible items/tools, and apparent purpose help distinguish normal from suspicious. Not all cameras show full entry/exit paths - focus on observable behavior in frame. Use judgment based on the complete picture.""",
+When evaluating activity, first check if it matches Normal Activity Indicators. If it clearly matches normal patterns (brief vehicle access, delivery behavior, known people, pet activity), assign Level 0. Only consider Level 1 if the activity shows clear suspicious behaviors that don't fit normal patterns (testing access, stealing items, lingering across many frames without task, forced entry attempts).
+
+These patterns are guidance, not rigid rules. Consider the complete context: time, zone, objects, and sequence of actions. Brief activity with apparent purpose is generally normal. Sustained problematic behavior or clear security violations warrant elevation.""",
        title="Custom activity context prompt defining normal and suspicious activity patterns for this property.",
    )

--- a/frigate/data_processing/post/review_descriptions.py
+++ b/frigate/data_processing/post/review_descriptions.py
@@ -29,8 +29,7 @@ from ..types import DataProcessorMetrics

 logger = logging.getLogger(__name__)

-RECORDING_BUFFER_START_SECONDS = 5
-RECORDING_BUFFER_END_SECONDS = 10
+RECORDING_BUFFER_EXTENSION_PERCENT = 0.10


 class ReviewDescriptionProcessor(PostProcessorApi):
@@ -59,7 +58,11 @@ class ReviewDescriptionProcessor(PostProcessorApi):
            # With recordings at 480p resolution (480px height), each image uses ~200-300 tokens
            # This is ~2-3x more than preview images, so we reduce frame count accordingly
            # to avoid exceeding context limits and maintain reasonable inference times
-            if context_size > 10000:
+            if context_size > 14000:
+                return 16
+            elif context_size > 12000:
+                return 14
+            elif context_size > 10000:
                return 12
            elif context_size > 6000:
                return 10
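
Review note: the tier ladder in this hunk can be read as a simple threshold lookup. The sketch below is an illustration only. It reproduces just the four thresholds visible in the diff; the lower tiers are not shown in the hunk, so the sketch returns `None` for them rather than guessing. The names `VISIBLE_TIERS` and `frames_for_context` are hypothetical and do not appear in the commit.

```python
from typing import Optional

# (threshold, frame count) pairs exactly as they appear in the hunk, highest first.
VISIBLE_TIERS = [(14000, 16), (12000, 14), (10000, 12), (6000, 10)]


def frames_for_context(context_size: int) -> Optional[int]:
    """Return the frame count for the first tier whose threshold is exceeded.

    Returns None at or below 6000, because the diff does not show what the
    real method does for smaller context sizes.
    """
    for threshold, frames in VISIBLE_TIERS:
        if context_size > threshold:  # strict comparison, as in the diff
            return frames
    return None
```

Because the comparisons are strict, a context size of exactly 14000 falls into the 14-frame tier, not the 16-frame one.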
@@ -112,10 +115,13 @@ class ReviewDescriptionProcessor(PostProcessorApi):
            image_source = camera_config.review.genai.image_source

            if image_source == ImageSourceEnum.recordings:
+                duration = final_data["end_time"] - final_data["start_time"]
+                buffer_extension = duration * RECORDING_BUFFER_EXTENSION_PERCENT
+
                thumbs = self.get_recording_frames(
                    camera,
-                    final_data["start_time"] - RECORDING_BUFFER_START_SECONDS,
-                    final_data["end_time"] + RECORDING_BUFFER_END_SECONDS,
+                    final_data["start_time"],
+                    final_data["end_time"] + buffer_extension,
                    height=480,  # Use 480p for good balance between quality and token usage
                )
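
Review note: this hunk replaces the fixed 5 s / 10 s padding with an end-only extension proportional to the event duration. A minimal sketch of the two window calculations, for comparison, follows. The function names `old_recording_window` and `new_recording_window` are hypothetical; only the constants and the arithmetic come from the diff.

```python
from typing import Tuple

# Constants taken from the removed and added lines of the diff.
RECORDING_BUFFER_START_SECONDS = 5
RECORDING_BUFFER_END_SECONDS = 10
RECORDING_BUFFER_EXTENSION_PERCENT = 0.10


def old_recording_window(start_time: float, end_time: float) -> Tuple[float, float]:
    """Previous behavior: fixed padding before and after the event."""
    return (
        start_time - RECORDING_BUFFER_START_SECONDS,
        end_time + RECORDING_BUFFER_END_SECONDS,
    )


def new_recording_window(start_time: float, end_time: float) -> Tuple[float, float]:
    """New behavior: no padding at the start, 10% of the duration added at the end."""
    duration = end_time - start_time
    buffer_extension = duration * RECORDING_BUFFER_EXTENSION_PERCENT
    return start_time, end_time + buffer_extension
```

For a 30-second event the old window spans 45 seconds, while the new one spans about 33 seconds, so short events contribute far less unrelated footage to the sampled frames.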

@@ -415,13 +421,13 @@ def run_analysis(
        name = sub_labels_list[i].replace("_", " ").title()
        unified_objects.append(f"{name} ({object_type})")

-    # Add non-verified objects as "Unknown (type)"
+    # Add non-verified objects as "Unrecognized (type)"
    for label in objects_list:
        if "-verified" in label:
            continue
        elif label in labelmap_objects:
            object_type = label.replace("_", " ")
-            unified_objects.append(f"Unknown ({object_type})")
+            unified_objects.append(f"Unrecognized ({object_type})")

    analytics_data["unified_objects"] = unified_objects
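
Review note: the naming behavior after this change can be sketched as a standalone function. This is an illustration under assumptions, not the code in `run_analysis`: the hunk does not show how verified sub-labels are paired with their object types, so the sketch takes them as ready-made `(sub_label, object_type)` pairs. `build_unified_objects` and its parameter names are hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def build_unified_objects(
    verified: Iterable[Tuple[str, str]],  # assumed input shape: (sub_label, object_type)
    objects_list: Iterable[str],
    labelmap_objects: Set[str],
) -> List[str]:
    unified_objects: List[str] = []

    # Verified identities become "Name (type)", e.g. "Joe Smith (person)".
    for sub_label, object_type in verified:
        name = sub_label.replace("_", " ").title()
        unified_objects.append(f"{name} ({object_type})")

    # Non-verified labelmap objects become "Unrecognized (type)".
    for label in objects_list:
        if "-verified" in label:
            continue  # already covered by the verified entries above
        elif label in labelmap_objects:
            object_type = label.replace("_", " ")
            unified_objects.append(f"Unrecognized ({object_type})")

    return unified_objects
```

Labels that are neither verified nor present in the labelmap are dropped silently, which matches the `if`/`elif` structure in the hunk.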

--- a/frigate/genai/__init__.py
+++ b/frigate/genai/__init__.py
@@ -99,7 +99,7 @@ When forming your description:
 ## Response Format

 Your response MUST be a flat JSON object with:
-- `title` (string): A concise, one-sentence title that captures the main activity. Use the exact names from "Objects in Scene" below (e.g., if the list shows "Joe (person)" and "Unknown (person)", say "Joe and unknown person"). Examples: "Joe walking dog in backyard", "Unknown person testing car doors at night", "Joe and unknown person in driveway".
+- `title` (string): A concise, one-sentence title that captures the main activity. Use names from "Objects in Scene" based on what you visually observe. If you see both a recognized name and "Unrecognized" for the same type but visually observe only one person/object, use ONLY the recognized name. Examples: "Joe walking dog in backyard", "Britt near vehicle in driveway", "Joe and an unrecognized person on front porch".
 - `scene` (string): A narrative description of what happens across the sequence from start to finish. **Only describe actions you can actually observe happening in the frames provided.** Do not infer or assume actions that aren't visible (e.g., if you see someone walking but never see them sit, don't say they sat down). Include setting, detected objects, and their observable actions. Avoid speculation or filling in assumed behaviors. Your description should align with and support the threat level you assign.
 - `confidence` (float): 0-1 confidence in your analysis. Higher confidence when objects/actions are clearly visible and context is unambiguous. Lower confidence when the sequence is unclear, objects are partially obscured, or context is ambiguous.
 - `potential_threat_level` (integer): 0, 1, or 2 as defined below. Your threat level must be consistent with your scene description and the guidance above.
@@ -107,8 +107,8 @@ Your response MUST be a flat JSON object with:

 ## Threat Level Definitions

-- 0 — **Normal activity**: The observable activity aligns with the Normal Activity Patterns above. The evidence—considering zone, objects, time, and actions together—supports a benign explanation. **Use this level for routine activities even if minor ambiguous elements exist.**
-- 1 — **Potentially suspicious**: The observable activity aligns with the Suspicious Activity Indicators above, or shows behavior that raises genuine security concerns. The activity warrants human review. **Use this level when the evidence suggests concerning behavior, even if not an immediate threat.**
+- 0 — **Normal activity**: The observable activity matches Normal Activity Indicators (brief vehicle access, deliveries, known people, pet activity, services). The evidence supports a benign explanation when considering zone, objects, time, and actions together. **Brief activities with apparent legitimate purpose are generally Level 0.**
+- 1 — **Potentially suspicious**: The observable activity matches Suspicious Activity Indicators (testing access, stealing items, climbing barriers, lingering without interaction across multiple frames, unusual hours with suspicious behavior). The activity shows concerning patterns that warrant human review. **Requires clear suspicious behavior, not just ambiguity.**
 - 2 — **Immediate threat**: Clear evidence of active criminal activity, forced entry, break-in, vandalism, aggression, weapons, theft in progress, or property damage.

 ## Sequence Details
@@ -119,7 +119,11 @@ Your response MUST be a flat JSON object with:

 ## Objects in Scene

-Each line represents one object in the scene. Named objects are verified identities; "Unknown" indicates unverified objects of that type:
+Each line represents a detection state, not necessarily unique individuals. Named objects are recognized/verified identities; "Unrecognized" indicates objects detected but not identified.
+
+**CRITICAL: When you see both recognized and unrecognized entries of the same type (e.g., "Name (person)" and "Unrecognized (person)"), visually count how many distinct people/objects you actually see based on appearance and clothing. If you observe only ONE person throughout the sequence, use ONLY the recognized name (e.g., "Name"), not "Unrecognized". The same person may be recognized in some frames but not others. Only describe both recognized and unrecognized if you visually see MULTIPLE distinct people with clearly different appearances.**
+
+**Note: "Unrecognized" is NOT an indicator of suspicious activity—it simply means the system hasn't identified that object.**
 {get_objects_list()}

 ## Important Notes
@@ -161,7 +165,7 @@ Each line represents one object in the scene. Named objects are verified identit
                metadata = ReviewMetadata.model_validate_json(clean_json)

                if any(
-                    not obj.startswith("Unknown")
+                    not obj.startswith("Unrecognized")
                    for obj in review_data["unified_objects"]
                ):
                    metadata.potential_threat_level = 0
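
Review note: the check in this last hunk depends on the prefix string matching the rename made in `review_descriptions.py`, which is why both files change together. A minimal sketch of the visible condition follows. It models only the `any(...)` test shown in the hunk; any enclosing conditions in the real method are outside the diff and are not represented. `apply_recognized_override` is a hypothetical name.

```python
from typing import List

UNRECOGNIZED_PREFIX = "Unrecognized"  # must match the prefix used when building unified_objects


def apply_recognized_override(threat_level: int, unified_objects: List[str]) -> int:
    """Return 0 if at least one entry is a recognized (named) object, else the input level."""
    if any(not obj.startswith(UNRECOGNIZED_PREFIX) for obj in unified_objects):
        return 0
    return threat_level
```

Two properties are worth noting: a single recognized entry resets the level even when unrecognized entries are also present, and an empty list leaves the level unchanged because `any()` over an empty iterable is `False`.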