Hi there,
we are evaluating the ZED SDK's built-in object detection capabilities for our use case: we want to count objects of a specific class within a room. For that, it is important that detected objects are spatially uniquely identified. From what the object detection docs say, this seems to be possible with the functionality provided by the SDK:
The objects can also be tracked within the environment over time, even if the camera is in motion, thanks to data from the positional tracking module.
To verify this, I ran the birds-eye viewer sample from the SDK's samples directory (Python version) with the object class ELECTRONICS and positional tracking enabled. As shown in the attached video, I point the camera at a laptop, which is recognized; but after rotating the camera away and then looking at the laptop again, another id is assigned.
Is this expected, or is there a way to keep a consistent id for an object that stays in the same position? Thanks in advance!
(I’m using a ZED X on a Jetson AGX Orin 64GB platform.)
Hello @birneamstiel, welcome to the StereoLabs forum!
The tracking is tuned to track objects over time, but since objects can move, the algorithm cannot guarantee that a new detection corresponds to an object detected a long time ago ("a long time" is subjective and specific to our tracking model, but the 10 s shown in your video is too long). To handle this, we use a dynamic timeout that drops objects which are no longer visible. This parameter is not settable in the object detection mode.
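The timeout behavior described above can be sketched in a few lines of plain C++. This is illustrative only, independent of the ZED SDK: the `TrackManager` type, its fields, and the id scheme are assumptions for the sketch, not the SDK's internals.

```cpp
#include <unordered_map>

// Minimal sketch of timeout-based track dropping: each track remembers when
// it was last matched to a detection; tracks unseen for longer than the
// timeout are pruned, so a later re-detection receives a fresh id.
struct TrackManager {
    double timeout;                             // seconds a track may stay unseen
    int next_id = 0;                            // next id to hand out
    std::unordered_map<int, double> last_seen;  // id -> last match time

    // A detection matched an existing track: refresh its timestamp.
    void refresh(int id, double now) { last_seen[id] = now; }

    // A detection matched no existing track: create a new id for it.
    int spawn(double now) {
        int id = next_id++;
        last_seen[id] = now;
        return id;
    }

    // Drop tracks that have not been seen within the timeout window.
    void prune(double now) {
        for (auto it = last_seen.begin(); it != last_seen.end();) {
            if (now - it->second > timeout) it = last_seen.erase(it);
            else ++it;
        }
    }

    bool alive(int id) const { return last_seen.count(id) != 0; }
};
```

With a small timeout, an object that leaves the frame longer than the window is pruned, which is why the laptop in the video comes back with a different id.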
However, to set some of the tracking parameters yourself, you can proceed with one of two methods (please find examples at zed-sdk/object detection/custom detector at master · stereolabs/zed-sdk · GitHub):
In both cases you will be able to set your own "tracking_timeout" value and other parameters which cannot be set in the object detection mode using our neural networks.
Note that for now, in version 4.2, only "static" objects are affected by this parameter. In the upcoming 5.0 release we will remove this restriction, so it will work for "dynamic" objects as well.
I hope it will help you,
Best regards,
Hey @hbeaumont,
thanks for this clarification and your helpful hints towards a solution!
I've been trying out the onnx_internal sample with a custom model as you proposed, and by increasing the tracking_timeout parameter I could indeed get close to what I want. However, I encountered another problem: as shown in the screen capture from running the sample (on the same SVO file as before), the notebook is still recognized as a new object (instead of being re-identified).
It seems that the drift of the localization system towards the end of the demo video leads to a mismatch between the (old) recognized bounding box and the new detection. This makes sense given the constraints of visual-inertial localization; however, it would be very useful for our application if we could fine-tune the proximity threshold, i.e. the distance between two detected objects of the same class below which the system treats them as the same object.
Is this already possible with the ZED SDK, or are there any plans to expose it?
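The kind of proximity-based re-identification I have in mind could be sketched outside the SDK as a post-processing step. This is a hypothetical sketch, not an existing ZED SDK feature: the `ReIdentifier` type, the centroid-distance criterion, and the threshold value are all assumptions.

```cpp
#include <cmath>
#include <vector>

// Hypothetical post-hoc re-identification: remember where each object of a
// class was last seen (world coordinates), and when a "new" detection appears
// within a distance threshold of a remembered position, reuse the old id
// instead of assigning a fresh one.
struct Remembered {
    int id;
    float x, y, z;  // last known centroid in world coordinates
};

struct ReIdentifier {
    float threshold;  // max centroid distance (meters) to treat as same object
    std::vector<Remembered> memory;
    int next_id = 0;

    // Return the id for a detection at (x, y, z): an old id if a remembered
    // object lies within the threshold, otherwise a freshly assigned one.
    int resolve(float x, float y, float z) {
        for (auto& m : memory) {
            float dx = m.x - x, dy = m.y - y, dz = m.z - z;
            if (std::sqrt(dx * dx + dy * dy + dz * dz) <= threshold) {
                m.x = x; m.y = y; m.z = z;  // update remembered position
                return m.id;
            }
        }
        memory.push_back({next_id, x, y, z});
        return next_id++;
    }
};
```

A scheme like this would of course inherit the localization drift problem: if the drift exceeds the threshold, the match fails, which is exactly why a tunable threshold would help.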
Hi @BenjaminV,
thanks for your reply. I did indeed set the is_static property to true. The only modifications I made to the file you linked are the following:
// Prepare SDK's output retrieval
const sl::Resolution display_resolution = zed.getCameraInformation().camera_configuration.resolution;
sl::Mat left_sl, point_cloud;
cv::Mat left_cv;
sl::CustomObjectDetectionRuntimeParameters objectTracker_parameters_rt;
// All classes parameters
objectTracker_parameters_rt.object_detection_properties.detection_confidence_threshold = 75.f;
objectTracker_parameters_rt.object_detection_properties.is_static = true;
objectTracker_parameters_rt.object_detection_properties.tracking_timeout = 100.f;
// // Per-class parameter overrides
// objectTracker_parameters_rt.object_class_detection_properties[0U].detection_confidence_threshold = 80.f;
// objectTracker_parameters_rt.object_class_detection_properties[1U].min_box_width_normalized = 0.01f;
// objectTracker_parameters_rt.object_class_detection_properties[1U].max_box_width_normalized = 0.5f;
// objectTracker_parameters_rt.object_class_detection_properties[1U].min_box_height_normalized = 0.01f;
// objectTracker_parameters_rt.object_class_detection_properties[1U].max_box_height_normalized = 0.5f;
I’m on SDK version 4.2.5.