We have a Python script that processes the frames of an SVO recording, running some OpenCV to determine the position (in pixel space) of a coloured ball in each frame. We draw a circle around the ball and save the image, and that part works well.
To get the 3D position (in camera space), we calculate bounding-box coordinates and feed them to the SDK's custom object detector, which computes the position.
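For reference, the conversion we expect to happen internally is the standard pinhole back-projection of a pixel plus depth into camera space. A minimal sketch, with made-up intrinsics for illustration (in practice `fx`, `fy`, `cx`, `cy` would come from the left camera's calibration as reported by the SDK):

```python
import numpy as np

def pixel_to_camera(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with metric depth into camera space.

    Standard pinhole model; assumes depth is Z along the optical axis
    (which is what the ZED depth map stores), not Euclidean range.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

# Hypothetical intrinsics for illustration only -- use the calibration
# values for the LEFT camera from the SDK in practice.
fx = fy = 700.0
cx, cy = 640.0, 360.0

# A pixel on the principal point maps straight down the optical axis.
p = pixel_to_camera(640.0, 360.0, 5.0, fx, fy, cx, cy)
```

This is only a sanity-check tool: comparing its output against the detector's reported position (using the same pixel and the depth-map value) helps separate a 2D-box problem from a depth problem.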
This appeared to be working, until we found that the results (the Z coordinate in camera space) were out by more than 5 m, whereas probing the same part of the image, using the same camera that shot the SVO, in the Depth Viewer program produced results with an error under 1 m.
The bounding box is given by

np.array([[x - radius, y - radius],
          [x + radius, y - radius],
          [x - radius, y + radius],
          [x + radius, y + radius]])

where x, y are pixels to the right and pixels down, respectively, from the top-left corner of the image, and radius is the radius of the ball as given by cv2.minEnclosingCircle, following the CustomBoxObjectData Class Reference (API Reference | Stereolabs) and https://github.com/stereolabs/zed-sdk/blob/ed3f068301fbdf3898f7c42de864dc578467e061/object%20detection/custom%20detector/python/pytorch_yolov5/detector.py
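For what it's worth, the SDK docs describe `bounding_box_2d` as four points starting at the top-left corner and going clockwise, i.e. TL, TR, BR, BL, whereas the array above is ordered TL, TR, BL, BR. We realise reordering alone may not explain a >5 m error, but for completeness, a sketch of the clockwise builder (pure NumPy; the actual ZED ingestion via `sl.CustomBoxObjectData` is omitted):

```python
import numpy as np

def circle_to_clockwise_box(x, y, radius):
    """Corners in the order the ZED custom detector samples use:
    top-left, top-right, bottom-right, bottom-left (clockwise),
    matching xywh2abcd in the SDK's pytorch_yolov5 detector.py.
    """
    return np.array([
        [x - radius, y - radius],  # top-left
        [x + radius, y - radius],  # top-right
        [x + radius, y + radius],  # bottom-right
        [x - radius, y + radius],  # bottom-left
    ])

box = circle_to_clockwise_box(100.0, 80.0, 10.0)
```

(It may also be worth clipping the corners to the image bounds before ingestion; boxes that extend past the image edge can behave unexpectedly.)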
Reordering the bounding-box corners and scaling the radius don't seem to help.
Are there any other possible sources of error to account for?
EDIT: We get the same results just loading a frame and extracting the depth at the ball's pixel coordinates manually using zed.retrieve_measure.
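To rule out single-pixel noise in that manual probe, we sample a small window rather than one pixel. A sketch, assuming the depth map has already been pulled into a NumPy array (e.g. via `zed.retrieve_measure(depth, sl.MEASURE.DEPTH)` followed by `depth.get_data()`): the ZED depth map is registered to the left image, and individual pixels can be NaN/±inf or can land on the background near the ball's edge, so we take the median of the finite values around the centre.

```python
import numpy as np

def median_depth_at(depth_map, x, y, half_window=3):
    """Median of the finite depth values in a (2*half_window+1)^2 patch
    centred on pixel (x, y). The ZED depth map is a float32 image
    registered to the LEFT camera; invalid pixels are NaN or +-inf.
    """
    h, w = depth_map.shape[:2]
    x0, x1 = max(0, x - half_window), min(w, x + half_window + 1)
    y0, y1 = max(0, y - half_window), min(h, y + half_window + 1)
    patch = depth_map[y0:y1, x0:x1]
    valid = patch[np.isfinite(patch)]
    return float(np.median(valid)) if valid.size else float("nan")

# Synthetic example: a 5 m plane with one invalid pixel at the centre.
dm = np.full((720, 1280), 5.0, dtype=np.float32)
dm[360, 640] = np.nan
d = median_depth_at(dm, 640, 360)  # still 5.0 despite the NaN
```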
EDIT2: Comparing the depth given in the depth map against the left image (as opposed to the right) reduces the error significantly (to 1-2 m), but not enough, and the values still don't match the readings given by the Depth Viewer app.