Custom Object Detection Using the ZED ROS 2 Wrapper

I am using the ZED ROS 2 wrapper, and I know it does not support custom detectors right now. I have written custom code for object detection and for calculating depth from the depth map. How can I convert the 2D object detections to 3D and generate a grid map? My code is in Python, and I cannot find any examples.

Hi @Akshay
Welcome to the Stereolabs community.

Unfortunately, custom object detection is not yet available in the ZED ROS 2 Wrapper.
We have scheduled the development of this feature for the next few months.

I recommend starting from the zed-sdk examples and adapting them to your ROS 2 Python node.

Hi @Myzhar
I will be eagerly waiting for the custom object detection feature. I am following the example and already have detection and distance calculation working. Is there any reference (maths) or blog post on converting a 2D bounding box to 3D using the point cloud?

You can find conversion formulas in the support post:

Thanks for the post. How can I get f_x, f_y, c_x, and c_y in the ROS 2 wrapper?

You must subscribe to the camera_info topic that corresponds to the image topic you subscribed to.
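
A minimal sketch of pulling the four intrinsics out of a CameraInfo message, assuming the standard sensor_msgs layout in which the 3x3 intrinsic matrix is stored row-major as nine values (field `k` in ROS 2). The `Intrinsics` type, the `intrinsics_from_k` helper, and the numbers are made up for illustration, and the topic name in the comment is a placeholder, not a real ZED topic.

```python
from typing import NamedTuple, Sequence


class Intrinsics(NamedTuple):
    fx: float
    fy: float
    cx: float
    cy: float


def intrinsics_from_k(k: Sequence[float]) -> Intrinsics:
    """Read fx, fy, cx, cy from a row-major 3x3 matrix [fx 0 cx; 0 fy cy; 0 0 1]."""
    if len(k) != 9:
        raise ValueError("expected 9 values for a 3x3 intrinsic matrix")
    return Intrinsics(fx=k[0], fy=k[4], cx=k[2], cy=k[5])


# In an rclpy node this would be called from the CameraInfo callback, e.g.:
#   self.create_subscription(CameraInfo, "<your camera_info topic>", self.on_info, 10)
#   def on_info(self, msg):
#       self.intrinsics = intrinsics_from_k(msg.k)

# Stand-in for msg.k with made-up values, so the sketch runs without ROS.
example_k = [700.0, 0.0, 640.0, 0.0, 700.0, 360.0, 0.0, 0.0, 1.0]
intr = intrinsics_from_k(example_k)
print(intr)
```

The intrinsics only match the image if both come from the same stream (same camera, same resolution, rectified or not), which is why the camera_info topic has to be the one paired with the image topic.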

@Myzhar
The posts you shared convert image bounding box values to 3D world coordinates, but how do we calculate the shape of the 3D bounding box? Is there any further documentation or mathematics?

@Akshay this is part of the point cloud (or depth map) processing.
You must extract the depth information for the pixels inside the bounding box and calculate the minimum/maximum coordinates (the extent) along each axis.
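
The step described above could be sketched with NumPy as follows. This is only an illustration under stated assumptions: the depth map is a 2D float array in metres aligned with the image, invalid pixels are NaN/inf or non-positive, and the box is given as (u_min, v_min, u_max, v_max) with exclusive upper bounds. The function name and the optional `depth_tol` gate around the median depth (to drop background pixels that fall inside the box) are hypothetical additions, not something from the ZED SDK.

```python
import numpy as np


def bbox_3d_extent(depth, bbox, fx, fy, cx, cy, depth_tol=None):
    """Return (min_xyz, max_xyz) of the valid pixels inside a 2D box, or None."""
    u_min, v_min, u_max, v_max = bbox
    patch = depth[v_min:v_max, u_min:u_max]

    # Pixel coordinates of every element of the patch (rows are v, columns are u).
    v, u = np.mgrid[v_min:v_max, u_min:u_max]

    # Keep only pixels with a usable depth value.
    valid = np.isfinite(patch) & (patch > 0)

    # Optional (assumption): keep depths close to the median to reject background.
    if depth_tol is not None and valid.any():
        z_med = np.median(patch[valid])
        valid &= np.abs(patch - z_med) <= depth_tol

    if not valid.any():
        return None

    # Pinhole back-projection of the remaining pixels.
    z = patch[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    points = np.stack([x, y, z], axis=1)

    return points.min(axis=0), points.max(axis=0)


# Synthetic 10x10 depth map: object at 2 m, one invalid pixel.
depth = np.full((10, 10), 2.0, dtype=np.float32)
depth[4, 4] = np.nan
lo, hi = bbox_3d_extent(depth, (3, 3, 7, 7), fx=100.0, fy=100.0, cx=5.0, cy=5.0)
print("min corner:", lo)
print("max corner:", hi)
```

The two returned corners define an axis-aligned box in the camera frame; its size is `hi - lo`. A plain min/max is sensitive to outliers, so some filtering of the depth values is usually worth trying.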

@Myzhar Do you have any reference for point cloud processing?
Let’s talk about getting 3D world coordinates. I have the bounding box coordinates and the depth at that point, which means I have (u, v) and Z. I am using the camera info of the ZED compressed image topic from which I took (u, v). Sometimes I get negative values for X and Y. Is this correct, or is there something wrong with my code?
Code:

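import math

# Function to convert image coordinates and depth to 3D camera-frame coordinates
def image_depth_to_world(x, y, depth, camera_info):
    # Extract camera parameters from CameraInfo. The 3x3 intrinsic matrix is
    # stored row-major; in ROS 2 the field is lowercase `k` (`K` is the ROS 1 name).
    fx = camera_info.k[0]
    fy = camera_info.k[4]
    cx = camera_info.k[2]
    cy = camera_info.k[5]

    # Invalid depth pixels (NaN or inf) cannot be back-projected
    if not math.isfinite(depth):
        return None

    # Back-project the pixel. The result is relative to the camera
    # (X right, Y down, Z forward), not to a fixed world frame.
    Z = depth
    X = (x - cx) * Z / fx
    Y = (y - cy) * Z / fy

    return X, Y, Z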
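
On the negative values: with this pinhole back-projection, X and Y are measured from the principal point (cx, cy), so a pixel to the left of cx gives a negative X and a pixel above cy gives a negative Y. Negative values are therefore expected and do not by themselves indicate a bug. A small worked example, using a hypothetical `back_project` helper and made-up intrinsics:

```python
def back_project(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) at the given depth."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return x, y, depth


# Made-up intrinsics for a 1280x720 image.
fx, fy, cx, cy = 700.0, 700.0, 640.0, 360.0

centre = back_project(640, 360, 2.0, fx, fy, cx, cy)       # on the principal point
upper_left = back_project(290, 185, 2.0, fx, fy, cx, cy)   # left of cx, above cy
lower_right = back_project(990, 535, 2.0, fx, fy, cx, cy)  # right of cx, below cy

print(centre)       # (0.0, 0.0, 2.0)
print(upper_left)   # (-1.0, -0.5, 2.0)
print(lower_right)  # (1.0, 0.5, 2.0)
```

Only Z should always be positive for valid depth. To express the point in another frame (for example a map frame), the result still has to be transformed with the camera pose.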

For point cloud processing, you can use the Open3D or PCL library.