Add script and documentation for generating TRT Models

2026-02-03 09:45:22 +03:00 · 2022-12-28 12:07:14 -05:00
commit a16231e624
parent 111fdfbdbc
3 changed files with 98 additions and 13 deletions
--- a/docker/tensorrt_models.sh
+++ b/docker/tensorrt_models.sh
@@ -0,0 +1,34 @@
+#!/bin/bash
+
+set -euxo pipefail
+
+export CUDA_HOME=/usr/local/cuda
+export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64
+OUTPUT_FOLDER=/tensorrt_models
+echo "Generating the following TRT Models: ${YOLO_MODELS:="yolov4-tiny-288,yolov4-tiny-416,yolov7-tiny-416"}"
+
+# Create output folder
+mkdir -p ${OUTPUT_FOLDER}
+
+# Install packages
+pip install --upgrade pip && pip install onnx==1.9.0 protobuf==3.20.3
+
+# Clone tensorrt_demos repo
+git clone --depth 1 https://github.com/yeahme49/tensorrt_demos.git /tensorrt_demos
+
+# Build libyolo
+cd /tensorrt_demos/plugins && make all
+cp libyolo_layer.so ${OUTPUT_FOLDER}/libyolo_layer.so
+
+# Download yolo weights
+cd /tensorrt_demos/yolo && ./download_yolo.sh
+
+# Build trt engine
+cd /tensorrt_demos/yolo
+
+for model in ${YOLO_MODELS//,/ }
+do
+    python3 yolo_to_onnx.py -m "${model}"
+    python3 onnx_to_tensorrt.py -m "${model}"
+    cp "/tensorrt_demos/yolo/${model}.trt" "${OUTPUT_FOLDER}/${model}.trt"
+done
--- a/docs/docs/configuration/detectors.md
+++ b/docs/docs/configuration/detectors.md
@@ -3,11 +3,10 @@ id: detectors
 title: Detectors
 ---

-Frigate provides the following builtin detector types: `cpu`, `edgetpu`, and `openvino`. By default, Frigate will use a single CPU detector. Other detectors may require additional configuration as described below. When using multiple detectors they will run in dedicated processes, but pull from a common queue of detection requests from across all cameras.
-
-**Note**: There is not yet support for Nvidia GPUs to perform object detection with tensorflow. It can be used for ffmpeg decoding, but not object detection.
+Frigate provides the following built-in detector types: `cpu`, `edgetpu`, `openvino`, and `tensorrt`. By default, Frigate will use a single CPU detector. Other detectors may require additional configuration as described below. When multiple detectors are used, each runs in a dedicated process, but all of them pull from a common queue of detection requests from across all cameras.

 ## CPU Detector (not recommended)
+
 The CPU detector type runs a TensorFlow Lite model utilizing the CPU without hardware acceleration. It is recommended to use a hardware accelerated detector type instead for better performance. To configure a CPU based detector, set the `"type"` attribute to `"cpu"`.

 The number of threads used by the interpreter can be specified using the `"num_threads"` attribute, and defaults to `3.`
@@ -60,6 +59,7 @@ detectors:
 ```

 ### Native Coral (Dev Board)
+
 _warning: may have [compatibility issues](https://github.com/blakeblackshear/frigate/issues/1706) after `v0.9.x`_

 ```yaml
@@ -139,11 +139,66 @@ Additionally, the Frigate docker container needs to run with the following confi
 ```bash
 --device-cgroup-rule='c 189:\* rmw' -v /dev/bus/usb:/dev/bus/usb
 ```
+
 or in your compose file:

 ```yml
 device_cgroup_rules:
-  - 'c 189:* rmw'
+  - "c 189:* rmw"
 volumes:
  - /dev/bus/usb:/dev/bus/usb
 ```
+
+## NVIDIA TensorRT Detector
+
+NVIDIA GPUs may be used for object detection using the TensorRT libraries.
+
+### Minimum Hardware Support
+
+**TODO**
+
+### Generate Models
+
+The models used for TensorRT must be preprocessed on the same hardware platform that they will run on, so each user must generate these model files on their own system. A script is provided that will build several common models.
+
+To generate the model files, create a new folder to save the models, download the script, and launch a docker container that will run the script.
+
+```bash
+mkdir trt-models
+wget https://github.com/blakeblackshear/frigate/raw/master/docker/tensorrt_models.sh
+docker run --gpus=all --rm -it -v `pwd`/trt-models:/tensorrt_models -v `pwd`/tensorrt_models.sh:/tensorrt_models.sh nvcr.io/nvidia/tensorrt:22.07-py3 /tensorrt_models.sh
+```
+
+The `trt-models` folder can then be mapped into your Frigate container as `/trt-models` and the models referenced from the config.
+
+If your GPU does not support FP16 operations, you can disable FP16 by adding `-e USE_FP16=False` to the `docker run` command.
+
+Specific models can be selected by passing an environment variable to the `docker run` command. Use the form `-e YOLO_MODELS=yolov4-416,yolov4-tiny-416` to select one or more model names. The models available are shown below.
+
+```
+yolov3-288
+yolov3-416
+yolov3-608
+yolov3-spp-288
+yolov3-spp-416
+yolov3-spp-608
+yolov3-tiny-288
+yolov3-tiny-416
+yolov4-288
+yolov4-416
+yolov4-608
+yolov4-csp-256
+yolov4-csp-512
+yolov4-p5-448
+yolov4-p5-896
+yolov4-tiny-288
+yolov4-tiny-416
+yolov4x-mish-320
+yolov4x-mish-640
+yolov7-tiny-288
+yolov7-tiny-416
+```
+
+### Configuration Parameters
+
+**TODO**
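Reviewer note (not part of the commit): the documentation above describes selecting models through a comma-separated `YOLO_MODELS` value. The sketch below mirrors that selection logic in Python so a list can be checked before launching the container. The function name `parse_yolo_models` is hypothetical; the model names and the default list are copied from the docs and the script in this diff, and the validation step is an assumption, since the script itself does not validate names.

```python
import os

# Model names copied from the documentation list added in this commit.
AVAILABLE_MODELS = {
    "yolov3-288", "yolov3-416", "yolov3-608",
    "yolov3-spp-288", "yolov3-spp-416", "yolov3-spp-608",
    "yolov3-tiny-288", "yolov3-tiny-416",
    "yolov4-288", "yolov4-416", "yolov4-608",
    "yolov4-csp-256", "yolov4-csp-512",
    "yolov4-p5-448", "yolov4-p5-896",
    "yolov4-tiny-288", "yolov4-tiny-416",
    "yolov4x-mish-320", "yolov4x-mish-640",
    "yolov7-tiny-288", "yolov7-tiny-416",
}

# Default taken from the ${YOLO_MODELS:=...} expansion in tensorrt_models.sh.
DEFAULT_MODELS = "yolov4-tiny-288,yolov4-tiny-416,yolov7-tiny-416"


def parse_yolo_models(value=None):
    """Split a comma-separated model list and reject unknown names.

    Hypothetical helper: an unset or empty value falls back to the
    default, matching the shell's ${VAR:=default} behaviour.
    """
    raw = value if value else DEFAULT_MODELS
    names = [name.strip() for name in raw.split(",") if name.strip()]
    unknown = [name for name in names if name not in AVAILABLE_MODELS]
    if unknown:
        raise ValueError("unknown model name(s): " + ", ".join(unknown))
    return names


if __name__ == "__main__":
    # Reads the same environment variable the script uses.
    print(parse_yolo_models(os.environ.get("YOLO_MODELS")))
```

Running this with the `YOLO_MODELS` value you intend to pass to `docker run` can catch a misspelled name before the longer model build starts.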
--- a/frigate/detectors/plugins/tensorrt.py
+++ b/frigate/detectors/plugins/tensorrt.py
@@ -83,9 +83,7 @@ class TensorRtDetector(DetectionApi):
            )
            trt.init_libnvinfer_plugins(self.trt_logger, "")

-            ctypes.cdll.LoadLibrary(
-                "/media/frigate/models/tensorrt_demos/yolo/libyolo_layer.so"
-            )
+            ctypes.cdll.LoadLibrary("/trt-models/libyolo_layer.so")
        except OSError as e:
            logger.error(
                "ERROR: failed to load libraries. %s",
@@ -250,11 +248,9 @@ class TensorRtDetector(DetectionApi):
        # 1 - score
        # 2..5 - a value between 0 and 1 of the box: [top, left, bottom, right]

-        # transform [height, width, 3] into (3, H, W)
-        # tensor_input = tensor_input.transpose((2, 0, 1)).astype(np.float32)
-
        # normalize
-        # tensor_input /= 255.0
+        tensor_input = tensor_input.astype(np.float32)
+        tensor_input /= 255.0

        self.inputs[0].host = np.ascontiguousarray(tensor_input.astype(np.float32))
        trt_outputs = self._do_inference()
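
Reviewer note (not part of the commit): the last hunk enables input normalization by casting to `float32` before dividing by 255. A minimal standalone sketch of that step follows, assuming the incoming tensor is a `uint8` NumPy array. The function name `normalize_input` and the example shape are hypothetical. The cast has to come first because NumPy refuses an in-place true division on an integer array.

```python
import numpy as np


def normalize_input(tensor_input: np.ndarray) -> np.ndarray:
    """Scale pixel values from [0, 255] to [0.0, 1.0] as float32.

    Hypothetical helper mirroring the normalization in the hunk above.
    astype() returns a new array, so the caller's array is not modified.
    """
    scaled = tensor_input.astype(np.float32)
    scaled /= 255.0  # in-place division is valid on a float array
    # The detector passes a C-contiguous buffer to the inference host memory.
    return np.ascontiguousarray(scaled)


if __name__ == "__main__":
    # Assumed example shape; the real input shape depends on the model.
    frame = np.full((1, 3, 416, 416), 255, dtype=np.uint8)
    out = normalize_input(frame)
    print(out.dtype, float(out.min()), float(out.max()))
```

With the cast done once up front, the second `astype(np.float32)` inside the `np.ascontiguousarray(...)` call in the detector appears redundant, though it is harmless.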