Segmentation masks not available using custom Object Detection with YOLOv8

Hi there,

I’m using:

  • ZED SDK 4.1
  • ZED X camera

and trying to get the mask output from a YOLOv8 custom detector.

I’m using the ZED SDK custom detector example, with object mask computation enabled:

obj_param = sl.ObjectDetectionParameters()
obj_param.detection_model = sl.OBJECT_DETECTION_MODEL.CUSTOM_BOX_OBJECTS
obj_param.enable_tracking = True
obj_param.enable_segmentation = True  # masks enabled
zed.enable_object_detection(obj_param)

and a printout of the mask information:

for obj in objects.object_list:
    if obj.tracking_state == sl.OBJECT_TRACKING_STATE.OK:
        print(obj.mask.is_init())
        print(obj.mask.get_data())

However, the result is:

  1. obj.mask.is_init() always returns False
  2. obj.mask.get_data() always returns []

This post seems to suggest that segmentation masks are available with custom object detectors (e.g. YOLOv8) via the ZED SDK, whereas this other post suggests they aren’t.

Here’s the complete code:
custom_detector_w_masks.py (7.7 KB)

Is it possible to retrieve instance segmentation masks using the ZED SDK Object Detection model and a custom object detector, e.g. YOLOv8?

Thanks!

Hi @andrew.stringfield,

Currently, the ZED SDK only supports ingesting custom bounding boxes from a custom object detector model.
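
For reference, "ingesting custom bounding boxes" means converting each detection (e.g. YOLOv8's center-x, center-y, width, height output) into the four-corner 2D array that the SDK's `CustomBoxObjectData.bounding_box_2d` field expects. A minimal NumPy-only sketch of that conversion (the helper name `xywh_to_corners` and the clockwise-from-top-left corner order are my assumptions, based on the official custom detector sample):

```python
import numpy as np

def xywh_to_corners(xywh):
    """Convert a (cx, cy, w, h) box to a 4x2 corner array,
    clockwise from the top-left (A, B, C, D), as used for
    CustomBoxObjectData.bounding_box_2d in the SDK sample."""
    cx, cy, w, h = xywh
    x_min, x_max = cx - w / 2, cx + w / 2
    y_min, y_max = cy - h / 2, cy + h / 2
    return np.array([
        [x_min, y_min],  # A: top-left
        [x_max, y_min],  # B: top-right
        [x_max, y_max],  # C: bottom-right
        [x_min, y_max],  # D: bottom-left
    ])

corners = xywh_to_corners((100.0, 80.0, 40.0, 20.0))
print(corners)  # top-left at (80, 70), bottom-right at (120, 90)
```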

As object detection models increasingly support segmentation masks, we are working on ingesting custom instance segmentation masks in the ZED SDK so that our 3D tracking capabilities can be applied to them. This is planned for a future version of the SDK.

Both posts you’ve linked are correct: the SDK can ingest custom bounding boxes, and if segmentation is enabled, the SDK computes a segmentation mask based on the geometry of the object (not with AI).

As for the errors you’ve noticed: we have indeed identified an issue in how the masks are returned, and are preparing a fix for an upcoming SDK patch.

@mattrouss thanks for the update!

When you say:

the SDK computes a segmentation mask based on the geometry of the object

How is this done? Is the mask calculated as a square that’s the same area as the bounding box, or something else?

Thanks!

Hi @andrew.stringfield,

Indeed, this is a limitation of segmentation with custom object detection: since we have no additional information about the objects, it is difficult to refine an object’s mask in a generic way.

For comparison, with the ZED SDK’s built-in Object Detection classes, which are known, you can retrieve segmentation masks; these are computed from the geometry of the object, i.e. its 3D point cloud information.
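
To illustrate the idea of a geometry-based mask (this is only my sketch of the principle, not the SDK's actual algorithm): inside the 2D bounding box, pixels whose depth is close to the object's depth are kept and the rest rejected, yielding a mask that follows the object's silhouette rather than the full rectangle.

```python
import numpy as np

def geometric_mask(depth, box, tolerance=0.3):
    """Rough silhouette mask from a depth map: inside the box, keep
    pixels within `tolerance` meters of the median ROI depth.
    Illustrative only -- assumes the object covers most of the box."""
    x_min, y_min, x_max, y_max = box
    roi = depth[y_min:y_max, x_min:x_max]
    valid = np.isfinite(roi)
    object_depth = np.median(roi[valid])
    mask = np.zeros(depth.shape, dtype=bool)
    mask[y_min:y_max, x_min:x_max] = valid & (np.abs(roi - object_depth) < tolerance)
    return mask

# Synthetic scene: background at 5 m, a disc-shaped object at 2 m.
depth = np.full((120, 160), 5.0)
yy, xx = np.mgrid[0:120, 0:160]
depth[(yy - 60) ** 2 + (xx - 80) ** 2 < 30 ** 2] = 2.0

mask = geometric_mask(depth, (50, 30, 110, 90))
print(mask.sum())  # roughly the disc's area, not the 3600-pixel box
```

The point of the comparison: the mask depends entirely on depth quality and on the assumption that the object dominates the box, which is why it cannot match an AI-computed instance mask for arbitrary custom classes.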

So in your case, ingesting custom segmentation masks is probably the more interesting option; this feature is on our short-term roadmap.