We have a Python script that processes the frames of an SVO recording, running some OpenCV to determine the position (in pixel space) of a coloured ball in each frame. We draw a circle around the ball and save the image, and that part works well.
To get the 3D position (in camera space), we calculate bounding-box coordinates and feed them to the SDK's custom object detector, which computes the position.
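For reference, the conversion we expect to happen internally is the standard pinhole back-projection of a pixel plus depth into camera space. A minimal sketch, with made-up intrinsics for illustration (in practice `fx`, `fy`, `cx`, `cy` would come from the left camera's calibration as reported by the SDK):

```python
import numpy as np

def pixel_to_camera(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with metric depth into camera space.

    Standard pinhole model; assumes depth is Z along the optical axis
    (which is what the ZED depth map stores), not Euclidean range.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

# Hypothetical intrinsics for illustration only -- use the calibration
# values for the LEFT camera from the SDK in practice.
fx = fy = 700.0
cx, cy = 640.0, 360.0

# A pixel on the principal point maps straight down the optical axis.
p = pixel_to_camera(640.0, 360.0, 5.0, fx, fy, cx, cy)
```

This is only a sanity-check tool: comparing its output against the detector's reported position (using the same pixel and the depth-map value) helps separate a 2D-box problem from a depth problem.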
This appeared to be working, until we found that the results (the Z coordinate in camera space) were out by more than 5 m, whereas probing the same part of the image, using the same camera that shot the SVO, in the Depth Viewer program produced results with an error under 1 m.
The bounding box is given by

np.array([[x - radius, y - radius],
          [x + radius, y - radius],
          [x - radius, y + radius],
          [x + radius, y + radius]])

where x, y are pixels to the right and pixels down, respectively, from the top-left corner of the image, and radius is the radius of the ball as given by cv2.minEnclosingCircle, following the CustomBoxObjectData Class Reference (API Reference | Stereolabs) and https://github.com/stereolabs/zed-sdk/blob/ed3f068301fbdf3898f7c42de864dc578467e061/object%20detection/custom%20detector/python/pytorch_yolov5/detector.py
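For what it's worth, the SDK docs describe `bounding_box_2d` as four points starting at the top-left corner and going clockwise, i.e. TL, TR, BR, BL, whereas the array above is ordered TL, TR, BL, BR. We realise reordering alone may not explain a >5 m error, but for completeness, a sketch of the clockwise builder (pure NumPy; the actual ZED ingestion via `sl.CustomBoxObjectData` is omitted):

```python
import numpy as np

def circle_to_clockwise_box(x, y, radius):
    """Corners in the order the ZED custom detector samples use:
    top-left, top-right, bottom-right, bottom-left (clockwise),
    matching xywh2abcd in the SDK's pytorch_yolov5 detector.py.
    """
    return np.array([
        [x - radius, y - radius],  # top-left
        [x + radius, y - radius],  # top-right
        [x + radius, y + radius],  # bottom-right
        [x - radius, y + radius],  # bottom-left
    ])

box = circle_to_clockwise_box(100.0, 80.0, 10.0)
```

(It may also be worth clipping the corners to the image bounds before ingestion; boxes that extend past the image edge can behave unexpectedly.)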
Reordering the bounding-box corners and scaling the radius don't seem to help.
Are there any other possible sources of error to account for?
EDIT: We get the same results just loading a frame and extracting the depth at the ball's pixel coordinates manually using zed.retrieve_measure.
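To rule out single-pixel noise in that manual probe, we sample a small window rather than one pixel. A sketch, assuming the depth map has already been pulled into a NumPy array (e.g. via `zed.retrieve_measure(depth, sl.MEASURE.DEPTH)` followed by `depth.get_data()`): the ZED depth map is registered to the left image, and individual pixels can be NaN/±inf or can land on the background near the ball's edge, so we take the median of the finite values around the centre.

```python
import numpy as np

def median_depth_at(depth_map, x, y, half_window=3):
    """Median of the finite depth values in a (2*half_window+1)^2 patch
    centred on pixel (x, y). The ZED depth map is a float32 image
    registered to the LEFT camera; invalid pixels are NaN or +-inf.
    """
    h, w = depth_map.shape[:2]
    x0, x1 = max(0, x - half_window), min(w, x + half_window + 1)
    y0, y1 = max(0, y - half_window), min(h, y + half_window + 1)
    patch = depth_map[y0:y1, x0:x1]
    valid = patch[np.isfinite(patch)]
    return float(np.median(valid)) if valid.size else float("nan")

# Synthetic example: a 5 m plane with one invalid pixel at the centre.
dm = np.full((720, 1280), 5.0, dtype=np.float32)
dm[360, 640] = np.nan
d = median_depth_at(dm, 640, 360)  # still 5.0 despite the NaN
```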
EDIT2: Comparing the depth given in the depth map against the left image (as opposed to the right) reduces the error significantly (to 1-2 m), but not enough, and the values still don't match the readings given by the Depth Viewer app.