Custom Object Detection Using ZED ROS 2 Wrapper

I am using the ZED ROS 2 wrapper, and I know there is no custom detector support right now. I have written custom code for object detection and for calculating depth using the depth map. How can I convert a 2D object detection to 3D and generate a grid map? My code is in Python and I cannot find any examples.

Hi @Akshay
Welcome to the Stereolabs community.

Unfortunately, the custom object detection is not yet available in the ZED ROS 2 Wrapper.
We have scheduled the development of this feature for the next few months.

I recommend you start from the zed-sdk examples and adapt them to your ROS 2 Python node.

Hi @Myzhar
I will be eagerly waiting for the custom object detection feature. I am following the example and have already got detection and distance calculation working. Is there any reference (maths) or blog post on converting a 2D bounding box to 3D using the point cloud?

You can find the conversion formulas in this support post:

Thanks for the post. How can I get f_x, f_y, c_x, and c_y in the ROS 2 wrapper?

You must subscribe to the camera_info topic corresponding to the image topic you subscribed to.
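For reference, in ROS 2 the `sensor_msgs/msg/CameraInfo` message stores the intrinsics in the lowercase field `k` (it was uppercase `K` in ROS 1), a row-major flattened 3x3 matrix. A minimal sketch of pulling the four pinhole parameters out of it (the helper name and the sample numbers are mine, for illustration only):

```python
# Sketch: extract pinhole intrinsics from the flattened 3x3 matrix in a
# sensor_msgs/msg/CameraInfo message. Row-major layout:
#   k = [fx, 0, cx,
#        0, fy, cy,
#        0,  0,  1]
def intrinsics_from_k(k):
    fx, cx = k[0], k[2]
    fy, cy = k[4], k[5]
    return fx, fy, cx, cy

# Example with made-up HD720-ish values (illustrative numbers only):
k = [700.0, 0.0, 640.0,
     0.0, 700.0, 360.0,
     0.0, 0.0, 1.0]
fx, fy, cx, cy = intrinsics_from_k(k)
```

In a node, you would call this from your camera_info subscription callback with `msg.k`.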

The blog posts you shared convert image bounding box values to 3D world coordinates, but how do we calculate the shape of the 3D bounding box? Is there any more documentation or mathematics?

@Akshay this is part of the point cloud (or depth map) processing.
You must extract the depth information for the data inside the bounding box and calculate the minimum/maximum coordinates (the extent) along each axis.
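That per-axis min/max step can be sketched as follows, assuming a NumPy depth map in meters and a pixel-space box; the function name and parameters are illustrative, not part of the wrapper:

```python
import numpy as np

def bbox_3d_extent(depth, box, fx, fy, cx, cy):
    """Back-project every valid depth pixel inside a 2D box
    (u_min, v_min, u_max, v_max) through the pinhole model and return
    the axis-aligned 3D extent (min corner, max corner) in the camera frame."""
    u_min, v_min, u_max, v_max = box

    # Crop the depth map to the box and build matching pixel-coordinate grids.
    z = depth[v_min:v_max, u_min:u_max]
    v, u = np.mgrid[v_min:v_max, u_min:u_max]

    # Drop NaN/inf (no-depth) and non-positive values before back-projecting.
    valid = np.isfinite(z) & (z > 0)
    z = z[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy

    pts = np.stack([x, y, z], axis=1)
    return pts.min(axis=0), pts.max(axis=0)  # opposite corners of the 3D box
```

The two returned corners give you the width, height, and depth of the axis-aligned 3D bounding box; a tighter oriented box would need something like PCA or Open3D's oriented-bounding-box routines on the same points.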

@Myzhar Do you have anything for reference on point cloud processing?
Let's talk about getting 3D world coordinates. I have the bounding box coordinates and the depth information at that point, which means I have (u, v) and Z. I am using the camera info of the compressed ZED image from which I took (u, v). I am sometimes getting negative values for X and Y. Is this correct, or is there something wrong with my code?

# Convert image (pixel) coordinates and depth to 3D camera-frame coordinates
# using the pinhole camera model.
def image_depth_to_world(x, y, depth, camera_info):
    # Intrinsics from the flattened 3x3 matrix in sensor_msgs/msg/CameraInfo.
    # Note: in ROS 2 the field is lowercase `k` (it was `K` in ROS 1).
    fx = camera_info.k[0]
    fy = camera_info.k[4]
    cx = camera_info.k[2]
    cy = camera_info.k[5]

    # Back-project the pixel at the given depth.
    Z = depth
    X = (x - cx) * Z / fx
    Y = (y - cy) * Z / fy

    return X, Y, Z

For point cloud processing you can use the Open3D or PCL libraries.
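Regarding the negative X/Y values: that is expected, not a bug. In the camera optical frame, X is negative for pixels to the left of the optical center (cx, cy) and Y is negative for pixels above it. A quick sanity check of the signs, using made-up intrinsics:

```python
# Sanity check of back-projection signs (dummy intrinsics, illustrative only).
fx, fy, cx, cy = 700.0, 700.0, 640.0, 360.0
Z = 2.0  # depth in meters

# A pixel left of and above the optical center:
u, v = 100, 100
X = (u - cx) * Z / fx  # negative: point lies to the left of the optical axis
Y = (v - cy) * Z / fy  # negative: point lies above the optical axis
assert X < 0 and Y < 0

# A pixel right of and below the center gives positive X and Y:
u, v = 1000, 600
assert (u - cx) * Z / fx > 0
assert (v - cy) * Z / fy > 0
```

So as long as the intrinsics come from the same camera_info topic as the image you sampled (u, v) from, negative values simply encode which side of the optical axis the point is on.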

Hey Akshay,

I am wondering if you have found a solution to this problem; it sounds as if you were getting close. I am currently working on getting a VTOL to land autonomously, and the last step is integrating my ZED custom object detection into my Isaac ROS dev container. I didn't realize how big a step this was going to be. If by any chance I could see how you approached this problem, or get any tips, it would be very appreciated. Thanks.