mirror of https://github.com/blakeblackshear/frigate.git (synced 2025-12-21)

Compare commits: cdd22168ea...52ab54c14f (1 commit: 52ab54c14f)
@@ -25,7 +25,7 @@ Examples of available modules are:

 - `frigate.app`
 - `frigate.mqtt`
-- `frigate.object_detection.base`
+- `frigate.object_detection`
 - `detector.<detector_name>`
 - `watchdog.<camera_name>`
 - `ffmpeg.<camera_name>.<sorted_roles>` NOTE: All FFmpeg logs are sent as `error` level.
@@ -35,15 +35,6 @@ For object classification:

 - Ideal when multiple attributes can coexist independently.
 - Example: Detecting if a `person` in a construction yard is wearing a helmet or not.

-## Assignment Requirements
-
-Sub labels and attributes are only assigned when both conditions are met:
-
-1. **Threshold**: Each classification attempt must have a confidence score that meets or exceeds the configured `threshold` (default: `0.8`).
-2. **Class Consensus**: After at least 3 classification attempts, 60% of attempts must agree on the same class label. If the consensus class is `none`, no assignment is made.
-
-This two-step verification prevents false positives by requiring consistent predictions across multiple frames before assigning a sub label or attribute.
-
 ## Example use cases

 ### Sub label
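Note: the "Assignment Requirements" removed in this hunk combine a per-attempt confidence gate with a majority vote across attempts. A minimal sketch of that rule, assuming attempts arrive as `(label, score)` pairs; the docs do not say whether the 60% is measured over all attempts or only the confident ones, so this sketch votes over the confident ones (names are illustrative):

```python
from collections import Counter

def consensus_label(attempts, threshold=0.8, min_attempts=3, agreement=0.6):
    """Return the agreed class label, or None if no assignment should be made.

    attempts: list of (label, score) classification attempts.
    Only attempts meeting the confidence threshold are counted.
    """
    confident = [label for label, score in attempts if score >= threshold]
    if len(confident) < min_attempts:
        return None
    label, count = Counter(confident).most_common(1)[0]
    # require 60% agreement, and never assign the "none" class
    if count / len(confident) >= agreement and label != "none":
        return label
    return None
```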
@@ -75,18 +66,14 @@ classification:

 ## Training the model

-Creating and training the model is done within the Frigate UI using the `Classification` page. The process consists of two steps:
+Creating and training the model is done within the Frigate UI using the `Classification` page.

-### Step 1: Name and Define
-
-Enter a name for your model, select the object label to classify (e.g., `person`, `dog`, `car`), choose the classification type (sub label or attribute), and define your classes. Include a `none` class for objects that don't fit any specific category.
-
-### Step 2: Assign Training Examples
-
-The system will automatically generate example images from detected objects matching your selected label. You'll be guided through each class one at a time to select which images represent that class. Any images not assigned to a specific class will automatically be assigned to `none` when you complete the last class. Once all images are processed, training will begin automatically.
+### Getting Started
+
+When choosing which objects to classify, start with a small number of visually distinct classes and ensure your training samples match camera viewpoints and distances typical for those objects.
+
+// TODO add this section once UI is implemented. Explain process of selecting objects and curating training examples.

 ### Improving the Model

 - **Problem framing**: Keep classes visually distinct and relevant to the chosen object types.
@@ -48,23 +48,13 @@ classification:

 ## Training the model

-Creating and training the model is done within the Frigate UI using the `Classification` page. The process consists of three steps:
+Creating and training the model is done within the Frigate UI using the `Classification` page.

-### Step 1: Name and Define
+### Getting Started

-Enter a name for your model and define at least 2 classes (states) that represent mutually exclusive states. For example, `open` and `closed` for a door, or `on` and `off` for lights.
+When choosing a portion of the camera frame for state classification, it is important to make the crop tight around the area of interest to avoid extra signals unrelated to what is being classified.

-### Step 2: Select the Crop Area
-
-Choose one or more cameras and draw a rectangle over the area of interest for each camera. The crop should be tight around the region you want to classify to avoid extra signals unrelated to what is being classified. You can drag and resize the rectangle to adjust the crop area.
-
-### Step 3: Assign Training Examples
-
-The system will automatically generate example images from your camera feeds. You'll be guided through each class one at a time to select which images represent that state.
-
-**Important**: All images must be assigned to a state before training can begin. This includes images that may not be optimal, such as when people temporarily block the view, sun glare is present, or other distractions occur. Assign these images to the state that is actually present (based on what you know the state to be), not based on the distraction. This training helps the model correctly identify the state even when such conditions occur during inference.
-
-Once all images are assigned, training will begin automatically.
+// TODO add this section once UI is implemented. Explain process of selecting a crop.

 ### Improving the Model
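Note: the tight-crop guidance in this hunk amounts to classifying only the pixels inside the drawn rectangle, so unrelated motion elsewhere in the frame cannot influence the model. A minimal sketch of that preprocessing step, assuming an OpenCV BGR frame and a pixel-space box; the function name and the 224x224 input size are illustrative, not Frigate's actual pipeline:

```python
import cv2

def crop_state_region(frame, box):
    """Crop a camera frame to the state-classification region.

    frame: BGR image as a numpy array (H, W, 3).
    box: (x, y, w, h) rectangle in pixels, drawn tightly around
         the area of interest (e.g., a door).
    """
    x, y, w, h = box
    crop = frame[y : y + h, x : x + w]
    # classifiers typically expect a fixed square input size
    return cv2.resize(crop, (224, 224))
```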
@@ -962,6 +962,7 @@ model:

   # path: /config/yolov9.zip
   # The .zip file must contain:
   # ├── yolov9.dfp (a file ending with .dfp)
+  # └── yolov9_post.onnx (optional; only if the model includes a cropped post-processing network)
 ```

 #### YOLOX
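Note: the zip layout documented in this hunk can be sanity-checked before loading. A minimal sketch, assuming only the layout shown above; the helper name is illustrative:

```python
import zipfile

def check_memryx_zip(path):
    """Check that a model zip matches the documented layout:
    exactly one .dfp file, plus an optional *_post.onnx network."""
    with zipfile.ZipFile(path) as zf:
        names = zf.namelist()
    dfp = [n for n in names if n.endswith(".dfp")]
    post = [n for n in names if n.endswith("_post.onnx")]
    if len(dfp) != 1:
        raise ValueError(f"expected exactly one .dfp file, found {dfp}")
    # the post-processing network is optional
    return dfp[0], (post[0] if post else None)
```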
@@ -849,7 +849,6 @@ async def vod_ts(camera_name: str, start_ts: float, end_ts: float):

     clips = []
     durations = []
-    min_duration_ms = 100  # Minimum 100ms to ensure at least one video frame
     max_duration_ms = MAX_SEGMENT_DURATION * 1000

     recording: Recordings
@@ -867,11 +866,11 @@ async def vod_ts(camera_name: str, start_ts: float, end_ts: float):
         if recording.end_time > end_ts:
             duration -= int((recording.end_time - end_ts) * 1000)

-        if duration < min_duration_ms:
-            # skip if the clip has no valid duration (too short to contain frames)
+        if duration <= 0:
+            # skip if the clip has no valid duration
             continue

-        if min_duration_ms <= duration < max_duration_ms:
+        if 0 < duration < max_duration_ms:
             clip["keyFrameDurations"] = [duration]
             clips.append(clip)
             durations.append(duration)
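Note: in `vod_ts`, each clip's duration is the segment length minus any overhang outside the requested window; this hunk reverts the cutoff from `min_duration_ms` back to zero. A simplified standalone restatement (not the actual endpoint code):

```python
def clip_duration_ms(seg_start, seg_end, start_ts, end_ts):
    """Overlap between one recording segment and the requested
    [start_ts, end_ts] window, in milliseconds (illustrative helper)."""
    duration = int((seg_end - seg_start) * 1000)
    if seg_start < start_ts:
        duration -= int((start_ts - seg_start) * 1000)
    if seg_end > end_ts:
        duration -= int((seg_end - end_ts) * 1000)
    return duration  # <= 0 means the segment lies entirely outside the window

# with the revert, a clip is kept when 0 < duration < max_duration_ms
```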
@@ -792,10 +792,6 @@ class FrigateConfig(FrigateBaseModel):
             # copy over auth and proxy config in case auth needs to be enforced
             safe_config["auth"] = config.get("auth", {})
             safe_config["proxy"] = config.get("proxy", {})

-            # copy over database config for auth and so a new db is not created
-            safe_config["database"] = config.get("database", {})
-
             return cls.parse_object(safe_config, **context)

         # Validate and return the config dict.
@@ -18,6 +18,7 @@ from frigate.detectors.detector_config import (
     ModelTypeEnum,
 )
 from frigate.util.file import FileLock
+from frigate.util.model import post_process_yolo

 logger = logging.getLogger(__name__)
@@ -177,6 +178,13 @@ class MemryXDetector(DetectionApi):
             logger.error(f"Failed to initialize MemryX model: {e}")
             raise

+    def load_yolo_constants(self):
+        base = f"{self.cache_dir}/{self.model_folder}"
+        # constants for yolov9 post-processing
+        self.const_A = np.load(f"{base}/_model_22_Constant_9_output_0.npy")
+        self.const_B = np.load(f"{base}/_model_22_Constant_10_output_0.npy")
+        self.const_C = np.load(f"{base}/_model_22_Constant_12_output_0.npy")
+
     def check_and_prepare_model(self):
         if not os.path.exists(self.cache_dir):
             os.makedirs(self.cache_dir, exist_ok=True)
@@ -228,6 +236,7 @@ class MemryXDetector(DetectionApi):

         # Handle post model requirements by model type
         if self.memx_model_type in [
+            ModelTypeEnum.yologeneric,
             ModelTypeEnum.yolonas,
             ModelTypeEnum.ssd,
         ]:
@@ -236,10 +245,7 @@ class MemryXDetector(DetectionApi):
                     f"No *_post.onnx file found in custom model zip for {self.memx_model_type.name}."
                 )
             self.memx_post_model = post_candidates[0]
-        elif self.memx_model_type in [
-            ModelTypeEnum.yolox,
-            ModelTypeEnum.yologeneric,
-        ]:
+        elif self.memx_model_type == ModelTypeEnum.yolox:
             # Explicitly ignore any post model even if present
             self.memx_post_model = None
         else:
@@ -267,6 +273,8 @@ class MemryXDetector(DetectionApi):
             logger.info("Using cached models.")
             self.memx_model_path = dfp_path
             self.memx_post_model = post_path
+            if self.memx_model_type == ModelTypeEnum.yologeneric:
+                self.load_yolo_constants()
             return

         # ---------- CASE 3: download MemryX model (no cache) ----------
@@ -295,6 +303,9 @@ class MemryXDetector(DetectionApi):
                 else None
             )

+            if self.memx_model_type == ModelTypeEnum.yologeneric:
+                self.load_yolo_constants()
+
         finally:
             if os.path.exists(zip_path):
                 try:
@@ -589,232 +600,127 @@ class MemryXDetector(DetectionApi):

         self.output_queue.put(final_detections)

-    def _generate_anchors(self, sizes=[80, 40, 20]):
-        """Generate anchor points for YOLOv9 style processing"""
-        yscales = []
-        xscales = []
-        for s in sizes:
-            r = np.arange(s) + 0.5
-            yscales.append(np.repeat(r, s))
-            xscales.append(np.repeat(r[None, ...], s, axis=0).flatten())
-
-        yscales = np.concatenate(yscales)
-        xscales = np.concatenate(xscales)
-        anchors = np.stack([xscales, yscales], axis=1)
-        return anchors
-
-    def _generate_scales(self, sizes=[80, 40, 20]):
-        """Generate scaling factors for each detection level"""
-        factors = [8, 16, 32]
-        s = np.concatenate([np.ones([int(s * s)]) * f for s, f in zip(sizes, factors)])
-        return s[:, None]
-
-    @staticmethod
-    def _softmax(x: np.ndarray, axis: int) -> np.ndarray:
-        """Efficient softmax implementation"""
-        x = x - np.max(x, axis=axis, keepdims=True)
-        np.exp(x, out=x)
-        x /= np.sum(x, axis=axis, keepdims=True)
-        return x
-
-    def dfl(self, x: np.ndarray) -> np.ndarray:
-        """Distribution Focal Loss decoding - YOLOv9 style"""
-        x = x.reshape(-1, 4, 16)
-        weights = np.arange(16, dtype=np.float32)
-        p = self._softmax(x, axis=2)
-        p = p * weights[None, None, :]
-        out = np.sum(p, axis=2, keepdims=False)
-        return out
-
-    def dist2bbox(
-        self, x: np.ndarray, anchors: np.ndarray, scales: np.ndarray
-    ) -> np.ndarray:
-        """Convert distances to bounding boxes - YOLOv9 style"""
-        lt = x[:, :2]
-        rb = x[:, 2:]
-
-        x1y1 = anchors - lt
-        x2y2 = anchors + rb
-
-        wh = x2y2 - x1y1
-        c_xy = (x1y1 + x2y2) / 2
-
-        out = np.concatenate([c_xy, wh], axis=1)
-        out = out * scales
-        return out
-
-    def post_process_yolo_optimized(self, outputs):
-        """
-        Custom YOLOv9 post-processing optimized for MemryX ONNX outputs.
-        Implements DFL decoding, confidence filtering, and NMS in pure NumPy.
-        """
-        # YOLOv9 outputs: 6 outputs (lbox, lcls, mbox, mcls, sbox, scls)
-        conv_out1, conv_out2, conv_out3, conv_out4, conv_out5, conv_out6 = outputs
-
-        # Determine grid sizes based on input resolution
-        # YOLOv9 uses 3 detection heads with strides [8, 16, 32]
-        # Grid sizes = input_size / stride
-        sizes = [
-            self.memx_model_height
-            // 8,  # Large objects (e.g., 80 for 640x640, 40 for 320x320)
-            self.memx_model_height
-            // 16,  # Medium objects (e.g., 40 for 640x640, 20 for 320x320)
-            self.memx_model_height
-            // 32,  # Small objects (e.g., 20 for 640x640, 10 for 320x320)
-        ]
-
-        # Generate anchors and scales if not already done
-        if not hasattr(self, "anchors"):
-            self.anchors = self._generate_anchors(sizes)
-            self.scales = self._generate_scales(sizes)
-
-        # Process outputs in YOLOv9 format: reshape and moveaxis for ONNX format
-        lbox = np.moveaxis(conv_out1, 1, -1)  # Large boxes
-        lcls = np.moveaxis(conv_out2, 1, -1)  # Large classes
-        mbox = np.moveaxis(conv_out3, 1, -1)  # Medium boxes
-        mcls = np.moveaxis(conv_out4, 1, -1)  # Medium classes
-        sbox = np.moveaxis(conv_out5, 1, -1)  # Small boxes
-        scls = np.moveaxis(conv_out6, 1, -1)  # Small classes
-
-        # Determine number of classes dynamically from the class output shape
-        # lcls shape should be (batch, height, width, num_classes)
-        num_classes = lcls.shape[-1]
-
-        # Validate that all class outputs have the same number of classes
-        if not (mcls.shape[-1] == num_classes and scls.shape[-1] == num_classes):
-            raise ValueError(
-                f"Class output shapes mismatch: lcls={lcls.shape}, mcls={mcls.shape}, scls={scls.shape}"
-            )
-
-        # Concatenate boxes and classes
-        boxes = np.concatenate(
-            [
-                lbox.reshape(-1, 64),  # 64 is for 4 bbox coords * 16 DFL bins
-                mbox.reshape(-1, 64),
-                sbox.reshape(-1, 64),
-            ],
-            axis=0,
-        )
-
-        classes = np.concatenate(
-            [
-                lcls.reshape(-1, num_classes),
-                mcls.reshape(-1, num_classes),
-                scls.reshape(-1, num_classes),
-            ],
-            axis=0,
-        )
-
-        # Apply sigmoid to classes
-        classes = self.sigmoid(classes)
-
-        # Apply DFL to box predictions
-        boxes = self.dfl(boxes)
-
-        # YOLOv9 postprocessing with confidence filtering and NMS
-        confidence_thres = 0.4
-        iou_thres = 0.6
-
-        # Find the class with the highest score for each detection
-        max_scores = np.max(classes, axis=1)  # Maximum class score for each detection
-        class_ids = np.argmax(classes, axis=1)  # Index of the best class
-
-        # Filter out detections with scores below the confidence threshold
-        valid_indices = np.where(max_scores >= confidence_thres)[0]
-        if len(valid_indices) == 0:
-            # Return empty detections array
-            final_detections = np.zeros((20, 6), np.float32)
-            return final_detections
-
-        # Select only valid detections
-        valid_boxes = boxes[valid_indices]
-        valid_class_ids = class_ids[valid_indices]
-        valid_scores = max_scores[valid_indices]
-
-        # Convert distances to actual bounding boxes using anchors and scales
-        valid_boxes = self.dist2bbox(
-            valid_boxes, self.anchors[valid_indices], self.scales[valid_indices]
-        )
-
-        # Convert bounding box coordinates from (x_center, y_center, w, h) to (x_min, y_min, x_max, y_max)
-        x_center, y_center, width, height = (
-            valid_boxes[:, 0],
-            valid_boxes[:, 1],
-            valid_boxes[:, 2],
-            valid_boxes[:, 3],
-        )
-        x_min = x_center - width / 2
-        y_min = y_center - height / 2
-        x_max = x_center + width / 2
-        y_max = y_center + height / 2
-
-        # Convert to format expected by cv2.dnn.NMSBoxes: [x, y, width, height]
-        boxes_for_nms = []
-        scores_for_nms = []
-
-        for i in range(len(valid_indices)):
-            # Ensure coordinates are within bounds and positive
-            x_min_clipped = max(0, x_min[i])
-            y_min_clipped = max(0, y_min[i])
-            x_max_clipped = min(self.memx_model_width, x_max[i])
-            y_max_clipped = min(self.memx_model_height, y_max[i])
-
-            width_clipped = x_max_clipped - x_min_clipped
-            height_clipped = y_max_clipped - y_min_clipped
-
-            if width_clipped > 0 and height_clipped > 0:
-                boxes_for_nms.append(
-                    [x_min_clipped, y_min_clipped, width_clipped, height_clipped]
-                )
-                scores_for_nms.append(float(valid_scores[i]))
-
-        final_detections = np.zeros((20, 6), np.float32)
-
-        if len(boxes_for_nms) == 0:
-            return final_detections
-
-        # Apply NMS using OpenCV
-        indices = cv2.dnn.NMSBoxes(
-            boxes_for_nms, scores_for_nms, confidence_thres, iou_thres
-        )
-
-        if len(indices) > 0:
-            # Flatten indices if they are returned as a list of arrays
-            if isinstance(indices[0], list) or isinstance(indices[0], np.ndarray):
-                indices = [i[0] for i in indices]
-
-            # Limit to top 20 detections
-            indices = indices[:20]
-
-            # Convert to Frigate format: [class_id, confidence, y_min, x_min, y_max, x_max] (normalized)
-            for i, idx in enumerate(indices):
-                class_id = valid_class_ids[idx]
-                confidence = valid_scores[idx]
-
-                # Get the box coordinates
-                box = boxes_for_nms[idx]
-                x_min_norm = box[0] / self.memx_model_width
-                y_min_norm = box[1] / self.memx_model_height
-                x_max_norm = (box[0] + box[2]) / self.memx_model_width
-                y_max_norm = (box[1] + box[3]) / self.memx_model_height
-
-                final_detections[i] = [
-                    class_id,
-                    confidence,
-                    y_min_norm,  # Frigate expects y_min first
-                    x_min_norm,
-                    y_max_norm,
-                    x_max_norm,
-                ]
-
-        return final_detections
+    def onnx_reshape_with_allowzero(
+        self, data: np.ndarray, shape: np.ndarray, allowzero: int = 0
+    ) -> np.ndarray:
+        shape = shape.astype(int)
+        input_shape = data.shape
+        output_shape = []
+
+        for i, dim in enumerate(shape):
+            if dim == 0 and allowzero == 0:
+                output_shape.append(input_shape[i])  # Copy dimension from input
+            else:
+                output_shape.append(dim)
+
+        # Now let NumPy infer any -1 if needed
+        reshaped = np.reshape(data, output_shape)
+
+        return reshaped

     def process_output(self, *outputs):
         """Output callback function -- receives frames from the MX3 and triggers post-processing"""
         if self.memx_model_type == ModelTypeEnum.yologeneric:
-            # Use complete YOLOv9-style postprocessing (includes NMS)
-            final_detections = self.post_process_yolo_optimized(outputs)
+            if not self.memx_post_model:
+                conv_out1 = outputs[0]
+                conv_out2 = outputs[1]
+                conv_out3 = outputs[2]
+                conv_out4 = outputs[3]
+                conv_out5 = outputs[4]
+                conv_out6 = outputs[5]
+
+                concat_1 = self.onnx_concat([conv_out1, conv_out2], axis=1)
+                concat_2 = self.onnx_concat([conv_out3, conv_out4], axis=1)
+                concat_3 = self.onnx_concat([conv_out5, conv_out6], axis=1)
+
+                shape = np.array([1, 144, -1], dtype=np.int64)
+
+                reshaped_1 = self.onnx_reshape_with_allowzero(
+                    concat_1, shape, allowzero=0
+                )
+                reshaped_2 = self.onnx_reshape_with_allowzero(
+                    concat_2, shape, allowzero=0
+                )
+                reshaped_3 = self.onnx_reshape_with_allowzero(
+                    concat_3, shape, allowzero=0
+                )
+
+                concat_4 = self.onnx_concat([reshaped_1, reshaped_2, reshaped_3], 2)
+
+                axis = 1
+                split_sizes = [64, 80]
+
+                # Calculate indices at which to split
+                indices = np.cumsum(split_sizes)[
+                    :-1
+                ]  # [64] — split before the second chunk
+
+                # Perform split along axis 1
+                split_0, split_1 = np.split(concat_4, indices, axis=axis)
+
+                num_boxes = 2100 if self.memx_model_height == 320 else 8400
+                shape1 = np.array([1, 4, 16, num_boxes])
+                reshape_4 = self.onnx_reshape_with_allowzero(
+                    split_0, shape1, allowzero=0
+                )
+
+                transpose_1 = reshape_4.transpose(0, 2, 1, 3)
+
+                axis = 1  # As per ONNX softmax node
+
+                # Subtract max for numerical stability
+                x_max = np.max(transpose_1, axis=axis, keepdims=True)
+                x_exp = np.exp(transpose_1 - x_max)
+                x_sum = np.sum(x_exp, axis=axis, keepdims=True)
+                softmax_output = x_exp / x_sum
+
+                # Weight W from the ONNX initializer (1, 16, 1, 1) with values 0 to 15
+                W = np.arange(16, dtype=np.float32).reshape(
+                    1, 16, 1, 1
+                )  # (1, 16, 1, 1)
+
+                # Apply 1x1 convolution: this is a weighted sum over channels
+                conv_output = np.sum(
+                    softmax_output * W, axis=1, keepdims=True
+                )  # shape: (1, 1, 4, 8400)
+
+                shape2 = np.array([1, 4, num_boxes])
+                reshape_5 = self.onnx_reshape_with_allowzero(
+                    conv_output, shape2, allowzero=0
+                )
+
+                # ONNX Slice — get first 2 channels: [0:2] along axis 1
+                slice_output1 = reshape_5[:, 0:2, :]  # Result: (1, 2, 8400)
+
+                # Slice channels 2 to 4 → axis = 1
+                slice_output2 = reshape_5[:, 2:4, :]
+
+                # Perform Subtraction
+                sub_output = self.const_A - slice_output1  # Equivalent to ONNX Sub
+
+                # Perform the ONNX-style Add
+                add_output = self.const_B + slice_output2
+
+                sub1 = add_output - sub_output
+
+                add1 = sub_output + add_output
+
+                div_output = add1 / 2.0
+
+                concat_5 = self.onnx_concat([div_output, sub1], axis=1)
+
+                # Expand B to (1, 1, 8400) so it can broadcast across axis=1 (4 channels)
+                const_C_expanded = self.const_C[:, np.newaxis, :]  # Shape: (1, 1, 8400)
+
+                # Perform ONNX-style element-wise multiplication
+                mul_output = concat_5 * const_C_expanded  # Result: (1, 4, 8400)
+
+                sigmoid_output = self.sigmoid(split_1)
+                outputs = self.onnx_concat([mul_output, sigmoid_output], axis=1)
+
+                final_detections = post_process_yolo(
+                    outputs, self.memx_model_width, self.memx_model_height
+                )
             self.output_queue.put(final_detections)

         elif self.memx_model_type == ModelTypeEnum.yolonas:
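Note: ONNX `Reshape` treats a `0` in the target shape as "copy that dimension from the input" when `allowzero=0` (the default), which is what the `onnx_reshape_with_allowzero` helper above mirrors in NumPy. A standalone illustration of those semantics:

```python
import numpy as np

data = np.zeros((1, 144, 8400))
target = np.array([0, 144, -1])  # ONNX Reshape target shape, allowzero=0

# a 0 entry copies the corresponding input dimension; -1 is inferred by NumPy
effective = [data.shape[i] if d == 0 else d for i, d in enumerate(target)]
print(np.reshape(data, effective).shape)  # (1, 144, 8400)
```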
@@ -76,12 +76,7 @@
        }
      },
      "npuUsage": "NPU Usage",
-     "npuMemory": "NPU Memory",
-     "intelGpuWarning": {
-       "title": "Intel GPU Stats Warning",
-       "message": "GPU stats unavailable",
-       "description": "This is a known bug in Intel's GPU stats reporting tools (intel_gpu_top) where it will break and repeatedly return a GPU usage of 0% even in cases where hardware acceleration and object detection are correctly running on the (i)GPU. This is not a Frigate bug. You can restart the host to temporarily fix the issue and confirm that the GPU is working correctly. This does not affect performance."
-     }
+     "npuMemory": "NPU Memory"
    },
    "otherProcesses": {
      "title": "Other Processes",
@@ -56,7 +56,6 @@ export function TrackingDetails({
   const apiHost = useApiHost();
   const imgRef = useRef<HTMLImageElement | null>(null);
   const [imgLoaded, setImgLoaded] = useState(false);
-  const [isVideoLoading, setIsVideoLoading] = useState(true);
   const [displaySource, _setDisplaySource] = useState<"video" | "image">(
     "video",
   );
@@ -71,10 +70,6 @@ export function TrackingDetails({
     (event.start_time ?? 0) + annotationOffset / 1000 - REVIEW_PADDING,
   );

-  useEffect(() => {
-    setIsVideoLoading(true);
-  }, [event.id]);
-
   const { data: eventSequence } = useSWR<TrackingDetailsSequence[]>([
     "timeline",
     {
@@ -532,28 +527,22 @@ export function TrackingDetails({
         )}
       >
         {displaySource == "video" && (
-          <>
-            <HlsVideoPlayer
-              videoRef={videoRef}
-              containerRef={containerRef}
-              visible={true}
-              currentSource={videoSource}
-              hotKeys={false}
-              supportsFullscreen={false}
-              fullscreen={false}
-              frigateControls={true}
-              onTimeUpdate={handleTimeUpdate}
-              onSeekToTime={handleSeekToTime}
-              onUploadFrame={onUploadFrameToPlus}
-              onPlaying={() => setIsVideoLoading(false)}
-              isDetailMode={true}
-              camera={event.camera}
-              currentTimeOverride={currentTime}
-            />
-            {isVideoLoading && (
-              <ActivityIndicator className="absolute left-1/2 top-1/2 -translate-x-1/2 -translate-y-1/2" />
-            )}
-          </>
+          <HlsVideoPlayer
+            videoRef={videoRef}
+            containerRef={containerRef}
+            visible={true}
+            currentSource={videoSource}
+            hotKeys={false}
+            supportsFullscreen={false}
+            fullscreen={false}
+            frigateControls={true}
+            onTimeUpdate={handleTimeUpdate}
+            onSeekToTime={handleSeekToTime}
+            onUploadFrame={onUploadFrameToPlus}
+            isDetailMode={true}
+            camera={event.camera}
+            currentTimeOverride={currentTime}
+          />
         )}
         {displaySource == "image" && (
           <>
@@ -130,8 +130,6 @@ export default function HlsVideoPlayer({
       return;
     }

-    setLoadedMetadata(false);
-
     const currentPlaybackRate = videoRef.current.playbackRate;

     if (!useHlsCompat) {
@@ -309,7 +309,6 @@ function PreviewVideoPlayer({
       playsInline
       muted
       disableRemotePlayback
-      disablePictureInPicture
       onSeeked={onPreviewSeeked}
       onLoadedData={() => {
         if (firstLoad) {
@@ -2,10 +2,7 @@ import { Recording } from "@/types/record";
 import { DynamicPlayback } from "@/types/playback";
 import { PreviewController } from "../PreviewPlayer";
 import { TimeRange, TrackingDetailsSequence } from "@/types/timeline";
-import {
-  calculateInpointOffset,
-  calculateSeekPosition,
-} from "@/utils/videoUtil";
+import { calculateInpointOffset } from "@/utils/videoUtil";

 type PlayerMode = "playback" | "scrubbing";
@@ -75,20 +72,38 @@ export class DynamicVideoController {
       return;
     }

+    if (
+      this.recordings.length == 0 ||
+      time < this.recordings[0].start_time ||
+      time > this.recordings[this.recordings.length - 1].end_time
+    ) {
+      this.setNoRecording(true);
+      return;
+    }
+
     if (this.playerMode != "playback") {
       this.playerMode = "playback";
     }

-    const seekSeconds = calculateSeekPosition(
-      time,
-      this.recordings,
-      this.inpointOffset,
-    );
-
-    if (seekSeconds === undefined) {
-      this.setNoRecording(true);
-      return;
-    }
+    let seekSeconds = 0;
+    (this.recordings || []).every((segment) => {
+      // if the next segment is past the desired time, stop calculating
+      if (segment.start_time > time) {
+        return false;
+      }
+
+      if (segment.end_time < time) {
+        seekSeconds += segment.end_time - segment.start_time;
+        return true;
+      }
+
+      seekSeconds +=
+        segment.end_time - segment.start_time - (segment.end_time - time);
+      return true;
+    });
+
+    // adjust for HLS inpoint offset
+    seekSeconds -= this.inpointOffset;

     if (seekSeconds != 0) {
       this.playerController.currentTime = seekSeconds;
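Note: the removed `calculateSeekPosition` helper and the restored inline loop implement the same mapping: player time is the summed duration of fully elapsed segments plus the offset into the containing segment, minus the HLS inpoint offset, so gaps between recordings are skipped. A standalone restatement in Python with made-up values:

```python
segments = [(100.0, 110.0), (115.0, 125.0)]  # (start_time, end_time)

def seek_seconds(time, segments, inpoint_offset=0.0):
    total = 0.0
    for start, end in segments:
        if start > time:
            break
        # full segment before the target, or partial segment containing it
        total += (end - start) if end < time else (time - start)
    return total - inpoint_offset

# a timestamp 5s into the second segment maps to 10s + 5s of video
print(seek_seconds(120.0, segments))  # 15.0
```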
@@ -14,10 +14,7 @@ import { VideoResolutionType } from "@/types/live";
 import axios from "axios";
 import { cn } from "@/lib/utils";
 import { useTranslation } from "react-i18next";
-import {
-  calculateInpointOffset,
-  calculateSeekPosition,
-} from "@/utils/videoUtil";
+import { calculateInpointOffset } from "@/utils/videoUtil";
 import { isFirefox } from "react-device-detect";

 /**
@@ -112,10 +109,10 @@ export default function DynamicVideoPlayer({
   const [isLoading, setIsLoading] = useState(false);
   const [isBuffering, setIsBuffering] = useState(false);
   const [loadingTimeout, setLoadingTimeout] = useState<NodeJS.Timeout>();

-  // Don't set source until recordings load - we need accurate startPosition
-  // to avoid hls.js clamping to video end when startPosition exceeds duration
-  const [source, setSource] = useState<HlsSource | undefined>(undefined);
+  const [source, setSource] = useState<HlsSource>({
+    playlist: `${apiHost}vod/${camera}/start/${timeRange.after}/end/${timeRange.before}/master.m3u8`,
+    startPosition: startTimestamp ? startTimestamp - timeRange.after : 0,
+  });

   // start at correct time
@@ -187,7 +184,7 @@ export default function DynamicVideoPlayer({
   );

   useEffect(() => {
-    if (!recordings?.length) {
+    if (!controller || !recordings?.length) {
       if (recordings?.length == 0) {
         setNoRecording(true);
       }
@@ -195,6 +192,10 @@ export default function DynamicVideoPlayer({
       return;
     }

+    if (playerRef.current) {
+      playerRef.current.autoplay = !isScrubbing;
+    }
+
     let startPosition = undefined;

     if (startTimestamp) {
@@ -202,12 +203,14 @@ export default function DynamicVideoPlayer({
         recordingParams.after,
         (recordings || [])[0],
       );

-      startPosition = calculateSeekPosition(
-        startTimestamp,
-        recordings,
-        inpointOffset,
-      );
+      const idealStartPosition = Math.max(
+        0,
+        startTimestamp - timeRange.after - inpointOffset,
+      );
+
+      if (idealStartPosition >= recordings[0].start_time - timeRange.after) {
+        startPosition = idealStartPosition;
+      }
     }

     setSource({
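Note: the restored logic computes the hls.js start position as a plain offset into the requested range, clamped at zero, and only uses it when it does not point before the first recorded segment. A compact restatement in Python; the helper name is illustrative:

```python
def start_position(start_ts, range_after, inpoint_offset, first_segment_start):
    """Player start position, or None to let the player decide."""
    ideal = max(0.0, start_ts - range_after - inpoint_offset)
    # only trust the offset when it does not point before the first recording
    if ideal >= first_segment_start - range_after:
        return ideal
    return None
```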
@@ -215,18 +218,6 @@ export default function DynamicVideoPlayer({
       startPosition,
     });

-    // eslint-disable-next-line react-hooks/exhaustive-deps
-  }, [recordings]);
-
-  useEffect(() => {
-    if (!controller || !recordings?.length) {
-      return;
-    }
-
-    if (playerRef.current) {
-      playerRef.current.autoplay = !isScrubbing;
-    }
-
     setLoadingTimeout(setTimeout(() => setIsLoading(true), 1000));

     controller.newPlayback({
@@ -234,7 +225,7 @@ export default function DynamicVideoPlayer({
       timeRange,
     });

-    // we only want this to change when controller or recordings update
+    // we only want this to change when recordings update
     // eslint-disable-next-line react-hooks/exhaustive-deps
   }, [controller, recordings]);
@@ -272,48 +263,46 @@ export default function DynamicVideoPlayer({

   return (
     <>
-      {source && (
-        <HlsVideoPlayer
-          videoRef={playerRef}
-          containerRef={containerRef}
-          visible={!(isScrubbing || isLoading)}
-          currentSource={source}
-          hotKeys={hotKeys}
-          supportsFullscreen={supportsFullscreen}
-          fullscreen={fullscreen}
-          inpointOffset={inpointOffset}
-          onTimeUpdate={onTimeUpdate}
-          onPlayerLoaded={onPlayerLoaded}
-          onClipEnded={onValidateClipEnd}
-          onSeekToTime={(timestamp, play) => {
-            if (onSeekToTime) {
-              onSeekToTime(timestamp, play);
-            }
-          }}
-          onPlaying={() => {
-            if (isScrubbing) {
-              playerRef.current?.pause();
-            }
-
-            if (loadingTimeout) {
-              clearTimeout(loadingTimeout);
-            }
-
-            setNoRecording(false);
-          }}
-          setFullResolution={setFullResolution}
-          onUploadFrame={onUploadFrameToPlus}
-          toggleFullscreen={toggleFullscreen}
-          onError={(error) => {
-            if (error == "stalled" && !isScrubbing) {
-              setIsBuffering(true);
-            }
-          }}
-          isDetailMode={isDetailMode}
-          camera={contextCamera || camera}
-          currentTimeOverride={currentTime}
-        />
-      )}
+      <HlsVideoPlayer
+        videoRef={playerRef}
+        containerRef={containerRef}
+        visible={!(isScrubbing || isLoading)}
+        currentSource={source}
+        hotKeys={hotKeys}
+        supportsFullscreen={supportsFullscreen}
+        fullscreen={fullscreen}
+        inpointOffset={inpointOffset}
+        onTimeUpdate={onTimeUpdate}
+        onPlayerLoaded={onPlayerLoaded}
+        onClipEnded={onValidateClipEnd}
+        onSeekToTime={(timestamp, play) => {
+          if (onSeekToTime) {
+            onSeekToTime(timestamp, play);
+          }
+        }}
+        onPlaying={() => {
+          if (isScrubbing) {
+            playerRef.current?.pause();
+          }
+
+          if (loadingTimeout) {
+            clearTimeout(loadingTimeout);
+          }
+
+          setNoRecording(false);
+        }}
+        setFullResolution={setFullResolution}
+        onUploadFrame={onUploadFrameToPlus}
+        toggleFullscreen={toggleFullscreen}
+        onError={(error) => {
+          if (error == "stalled" && !isScrubbing) {
+            setIsBuffering(true);
+          }
+        }}
+        isDetailMode={isDetailMode}
+        camera={contextCamera || camera}
+        currentTimeOverride={currentTime}
+      />
       <PreviewPlayer
         className={cn(
           className,
@@ -24,57 +24,3 @@ export function calculateInpointOffset(

   return 0;
 }
-
-/**
- * Calculates the video player time (in seconds) for a given timestamp
- * by iterating through recording segments and summing their durations.
- * This accounts for the fact that the video is a concatenation of segments,
- * not a single continuous stream.
- *
- * @param timestamp - The target timestamp to seek to
- * @param recordings - Array of recording segments
- * @param inpointOffset - HLS inpoint offset to subtract from the result
- * @returns The calculated seek position in seconds, or undefined if timestamp is out of range
- */
-export function calculateSeekPosition(
-  timestamp: number,
-  recordings: Recording[],
-  inpointOffset: number = 0,
-): number | undefined {
-  if (!recordings || recordings.length === 0) {
-    return undefined;
-  }
-
-  // Check if timestamp is within the recordings range
-  if (
-    timestamp < recordings[0].start_time ||
-    timestamp > recordings[recordings.length - 1].end_time
-  ) {
-    return undefined;
-  }
-
-  let seekSeconds = 0;
-
-  (recordings || []).every((segment) => {
-    // if the next segment is past the desired time, stop calculating
-    if (segment.start_time > timestamp) {
-      return false;
-    }
-
-    if (segment.end_time < timestamp) {
-      // Add the full duration of this segment
-      seekSeconds += segment.end_time - segment.start_time;
-      return true;
-    }
-
-    // We're in this segment - calculate position within it
-    seekSeconds +=
-      segment.end_time - segment.start_time - (segment.end_time - timestamp);
-    return true;
-  });
-
-  // Adjust for HLS inpoint offset
-  seekSeconds -= inpointOffset;
-
-  return seekSeconds >= 0 ? seekSeconds : undefined;
-}
@@ -375,50 +375,6 @@ export default function GeneralMetrics({
     return Object.keys(series).length > 0 ? Object.values(series) : undefined;
   }, [statsHistory]);

-  // Check if Intel GPU has all 0% usage values (known bug)
-  const showIntelGpuWarning = useMemo(() => {
-    if (!statsHistory || statsHistory.length < 3) {
-      return false;
-    }
-
-    const gpuKeys = Object.keys(statsHistory[0]?.gpu_usages ?? {});
-    const hasIntelGpu = gpuKeys.some(
-      (key) => key === "intel-vaapi" || key === "intel-qsv",
-    );
-
-    if (!hasIntelGpu) {
-      return false;
-    }
-
-    // Check if all GPU usage values are 0% across all stats
-    let allZero = true;
-    let hasDataPoints = false;
-
-    for (const stats of statsHistory) {
-      if (!stats) {
-        continue;
-      }
-
-      Object.entries(stats.gpu_usages || {}).forEach(([key, gpuStats]) => {
-        if (key === "intel-vaapi" || key === "intel-qsv") {
-          if (gpuStats.gpu) {
-            hasDataPoints = true;
-            const gpuValue = parseFloat(gpuStats.gpu.slice(0, -1));
-            if (!isNaN(gpuValue) && gpuValue > 0) {
-              allZero = false;
-            }
-          }
-        }
-      });
-
-      if (!allZero) {
-        break;
-      }
-    }
-
-    return hasDataPoints && allZero;
-  }, [statsHistory]);
-
   // npu stats

   const npuSeries = useMemo(() => {
@@ -683,46 +639,8 @@ export default function GeneralMetrics({
         <>
           {statsHistory.length != 0 ? (
             <div className="rounded-lg bg-background_alt p-2.5 md:rounded-2xl">
-              <div className="mb-5 flex flex-row items-center justify-between">
+              <div className="mb-5">
                 {t("general.hardwareInfo.gpuUsage")}
-                {showIntelGpuWarning && (
-                  <Popover>
-                    <PopoverTrigger asChild>
-                      <button
-                        className="flex flex-row items-center gap-1.5 text-yellow-600 focus:outline-none dark:text-yellow-500"
-                        aria-label={t(
-                          "general.hardwareInfo.intelGpuWarning.title",
-                        )}
-                      >
-                        <CiCircleAlert
-                          className="size-5"
-                          aria-label={t(
-                            "general.hardwareInfo.intelGpuWarning.title",
-                          )}
-                        />
-                        <span className="text-sm">
-                          {t(
-                            "general.hardwareInfo.intelGpuWarning.message",
-                          )}
-                        </span>
-                      </button>
-                    </PopoverTrigger>
-                    <PopoverContent className="w-80">
-                      <div className="space-y-2">
-                        <div className="font-semibold">
-                          {t(
-                            "general.hardwareInfo.intelGpuWarning.title",
-                          )}
-                        </div>
-                        <div>
-                          {t(
-                            "general.hardwareInfo.intelGpuWarning.description",
-                          )}
-                        </div>
-                      </div>
-                    </PopoverContent>
-                  </Popover>
-                )}
-              </div>
+              </div>
               {gpuSeries.map((series) => (
                 <ThresholdBarGraph