How to save rendered 3D bounding box coordinates in image space (pixel coordinates) from the ZED SDK?

harishkool · January 13, 2022, 7:59pm

I am following the example shown in zed-examples/object detection/image viewer at master · stereolabs/zed-examples · GitHub. It uses OpenGL to draw objects on the image. By default, the 3D bounding box coordinates from ZED are normalized. I tried projecting all the 3D bounding box corners in the same way the example renders the ‘ID’ on the image, like below:

    bbox = objects.object_list[i].bounding_box
    # _cam_mat = np.array(_cam, np.float32).reshape(4,4)
    N = 8
    hom_obj_coords = np.c_[bbox, np.ones(N)]
    proj3D_cam = np.matmul(hom_obj_coords, _cam_mat) # 8 x 4
    # proj3D_cam[1] = proj3D_cam[1] + 0.25

    # proj2D = [((proj3D_cam[0] / pt4d[3]) * _wnd_size.width) / (2. * proj3D_cam[3]) + (_wnd_size.width * 0.5)
    # , ((proj3D_cam[1] / pt4d[3]) * _wnd_size.height) / (2. * proj3D_cam[3]) + (_wnd_size.height * 0.5)]

    proj2D = [((proj3D_cam[:, 0] / hom_obj_coords[:, 3]) * 1920) / (2. * proj3D_cam[:, 3]) + (1920 * 0.5)
            , ((proj3D_cam[:, 1] / hom_obj_coords[:, 3]) * 1172) / (2. * proj3D_cam[:, 3]) + ((1172 * 0.5) + ((1172 - 1080)*0.5))]
    proj2D_x = proj2D[0]
    proj2D_y = proj2D[1]

where `_cam_mat` is the projection matrix I got from the OpenGL code. However, the objects are not aligned properly. Is there any way to directly save the rendered 3D bounding box coordinates in image space, i.e., in pixel coordinates? OpenGL is already doing this projection, but I cannot work out how to get the final projected pixel coordinates.
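One possible way to get pixel coordinates without going through the OpenGL viewer is to project the eight box corners with the pinhole camera model. The sketch below is illustrative and is not the example's own code: `project_points_pinhole` is a hypothetical helper, and it assumes the corners are expressed in the camera frame in metric units with x to the right, y down, and z forward, and that `fx`, `fy`, `cx`, `cy` are the intrinsics of the rectified image the boxes should be drawn on. The intrinsic values and the box corners in the example are made up.

```python
import numpy as np

def project_points_pinhole(points_cam, fx, fy, cx, cy):
    """Project N x 3 camera-frame points to N x 2 pixel coordinates.

    Assumption: x right, y down, z forward, all points in front of the camera.
    """
    pts = np.asarray(points_cam, dtype=np.float64).reshape(-1, 3)
    z = pts[:, 2]
    if np.any(z <= 0):
        raise ValueError("all points must be in front of the camera (z > 0)")
    u = fx * pts[:, 0] / z + cx  # column (horizontal pixel coordinate)
    v = fy * pts[:, 1] / z + cy  # row (vertical pixel coordinate)
    return np.stack([u, v], axis=1)

# Made-up example: a 1 m cube whose near face is 2 m in front of the camera.
corners = np.array([[x, y, z]
                    for z in (2.0, 3.0)
                    for y in (-0.5, 0.5)
                    for x in (-0.5, 0.5)])
pixels = project_points_pinhole(corners, fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)
print(pixels.shape)  # (8, 2): one (u, v) pair per corner
```

If the boxes are reported in a different coordinate system or reference frame, the points would need to be transformed into the camera frame before this step.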

Myzhar · January 14, 2022, 7:25am

Follow the discussion on GitHub. Please do not cross-post:

github.com/stereolabs/zed-examples

Saving rendered 3d bounding box coordinates in the image space (pixel coordinates) from ZED SDK?

opened 09:05PM - 13 Jan 22 UTC

harishkool

feature_request

### Preliminary Checks

- [X] This issue is not a duplicate. Before opening a new issue, please search existing issues.
- [X] This issue is not a question, bug report, or anything other than a feature request directly related to this project.

### Proposal

I am following the example shown in zed-examples/object detection/image viewer at master · stereolabs/zed-examples · GitHub. It uses OpenGL to draw objects on the image. Right now the 3D bounding box coordinates from the ZED SDK are normalized. I think it would be great if ZED could return the 3D bounding box coordinates in pixel space, taking the projection matrix and image shape as input. I am doing the following to get the 3D bounding box coordinates in image space, i.e., pixel coordinates:

    bbox = objects.object_list[i].bounding_box
    # _cam_mat = np.array(_cam, np.float32).reshape(4,4)
    N = 8
    hom_obj_coords = np.c_[bbox, np.ones(N)]
    proj3D_cam = np.matmul(hom_obj_coords, _cam_mat) # 8 x 4
    # proj3D_cam[1] = proj3D_cam[1] + 0.25

    # proj2D = [((proj3D_cam[0] / pt4d[3]) * _wnd_size.width) / (2. * proj3D_cam[3]) + (_wnd_size.width * 0.5)
    # , ((proj3D_cam[1] / pt4d[3]) * _wnd_size.height) / (2. * proj3D_cam[3]) + (_wnd_size.height * 0.5)]

    proj2D = [((proj3D_cam[:, 0] / hom_obj_coords[:, 3]) * 1920) / (2. * proj3D_cam[:, 3]) + (1920 * 0.5)
            , ((proj3D_cam[:, 1] / hom_obj_coords[:, 3]) * 1172) / (2. * proj3D_cam[:, 3]) + ((1172 * 0.5) + ((1172 - 1080)*0.5))]
    proj2D_x = proj2D[0]
    proj2D_y = proj2D[1]

where `_cam_mat` is the projection matrix I got from the OpenGL code. However, the objects are not aligned properly. I think it would be great if the ZED SDK provided support for this.

### Use-Case

Saving the 3D bounding boxes in pixel space will help to train any custom 3D object detection network without any associated point clouds.

### Anything else?

_No response_
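The interface proposed in the issue (a projection matrix and an image shape as input) can be sketched as a standard clip-space projection: multiply by the 4 x 4 matrix, divide by w, then map normalized device coordinates to pixels. This is a minimal sketch under stated assumptions, not the SDK's or the sample's implementation: `project_with_matrix` is a hypothetical name, the matrix is assumed to act on column vectors (pass the transpose for a matrix that is used as `points @ matrix`), and NDC y is assumed to point up, so it is flipped to match image rows.

```python
import numpy as np

def project_with_matrix(points, proj, width, height):
    """Project N x 3 points to pixels with a 4 x 4 projection matrix.

    Returns (pixels, in_front): N x 2 pixel coordinates and a boolean mask
    that is True where the clip-space w is positive.
    Assumptions: column-vector convention (clip = proj @ [x, y, z, 1]),
    NDC range [-1, 1], NDC y pointing up.
    """
    pts = np.asarray(points, dtype=np.float64).reshape(-1, 3)
    hom = np.hstack([pts, np.ones((len(pts), 1))])       # N x 4 homogeneous
    clip = hom @ np.asarray(proj, dtype=np.float64).T    # row i = proj @ hom[i]
    w = clip[:, 3]
    ndc = clip[:, :2] / w[:, None]                       # perspective divide
    u = (ndc[:, 0] + 1.0) * 0.5 * width                  # [-1, 1] -> [0, width]
    v = (1.0 - ndc[:, 1]) * 0.5 * height                 # flip y, -> [0, height]
    return np.stack([u, v], axis=1), w > 0

# Made-up OpenGL-style matrix (camera looks down -z, so w = -z).
proj = np.array([[1.0, 0.0,  0.0,  0.0],
                 [0.0, 1.0,  0.0,  0.0],
                 [0.0, 0.0, -1.0, -0.2],
                 [0.0, 0.0, -1.0,  0.0]])
pixels, in_front = project_with_matrix([[1.0, 0.5, -2.0]], proj, 1920, 1080)
```

The width and height here should be those of the region the matrix was built for. If the window is larger than the image (for example 1172 rows versus 1080), the offset between the two may need to be handled separately.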