I am following the example in zed-examples/object detection/image viewer at master · stereolabs/zed-examples · GitHub. It uses OpenGL to draw objects on the image. By default, the 3D bounding box coordinates returned by ZED are normalized. I tried to project all of the 3D bounding box corners the same way the example projects the 'ID' label onto the image, as shown below:
import numpy as np

# The 8 corners of the 3D bounding box, one (x, y, z) row per corner
bbox = objects.object_list[i].bounding_box
# _cam_mat = np.array(_cam, np.float32).reshape(4,4)
N = 8
# Append w = 1 to every corner to get homogeneous coordinates (8 x 4)
hom_obj_coords = np.c_[bbox, np.ones(N)]
# Row vectors multiplied by the 4 x 4 projection matrix
proj3D_cam = np.matmul(hom_obj_coords, _cam_mat) # 8 x 4
# proj3D_cam[1] = proj3D_cam[1] + 0.25
# proj2D = [((proj3D_cam[0] / pt4d[3]) * _wnd_size.width) / (2. * proj3D_cam[3]) + (_wnd_size.width * 0.5)
# , ((proj3D_cam[1] / pt4d[3]) * _wnd_size.height) / (2. * proj3D_cam[3]) + (_wnd_size.height * 0.5)]
# 1920 and 1172 stand in for _wnd_size.width and _wnd_size.height; 1080 is presumably the image height
proj2D = [
    ((proj3D_cam[:, 0] / hom_obj_coords[:, 3]) * 1920) / (2. * proj3D_cam[:, 3]) + (1920 * 0.5),
    ((proj3D_cam[:, 1] / hom_obj_coords[:, 3]) * 1172) / (2. * proj3D_cam[:, 3]) + ((1172 * 0.5) + ((1172 - 1080) * 0.5)),
]
proj2D_x = proj2D[0]
proj2D_y = proj2D[1]
where `_cam_mat` is the projection matrix I got from the OpenGL code. However, the boxes are not aligned properly with the objects. Is there any way to directly save the rendered 3D bounding box coordinates in image space, i.e. in pixel coordinates? OpenGL computes them anyway, but I cannot work out how to get those final projected pixel coordinates.
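For reference, the generic OpenGL path from a 3D point to a pixel is: multiply by the projection matrix to get clip coordinates, divide by w to get normalized device coordinates, then apply the viewport transform. The sketch below shows that chain in NumPy. It is not the sample's code: the function name `clip_to_pixels`, the column-vector convention, and a viewport that covers the whole image with the pixel origin at the top-left are all assumptions, and any extra offsets the viewer applies are not modelled.

```python
import numpy as np

def clip_to_pixels(points_3d, proj_mat, width, height):
    """Map (N, 3) points to (N, 2) pixel coordinates with a 4x4 projection matrix.

    Assumes the column-vector convention (clip = proj_mat @ point) and a
    viewport spanning the full width x height image (both are assumptions).
    """
    points_3d = np.asarray(points_3d, dtype=np.float64)
    ones = np.ones((points_3d.shape[0], 1))
    hom = np.hstack([points_3d, ones])               # (N, 4) homogeneous, w = 1
    clip = hom @ np.asarray(proj_mat, dtype=np.float64).T  # (N, 4) clip coordinates
    ndc = clip[:, :3] / clip[:, 3:4]                 # perspective divide
    px = (ndc[:, 0] * 0.5 + 0.5) * width             # x: -1 maps to 0, +1 to width
    py = (1.0 - (ndc[:, 1] * 0.5 + 0.5)) * height    # y is flipped: +1 maps to the top row
    return np.stack([px, py], axis=1)
```

If the matrix was built for row vectors, as in `np.matmul(hom_obj_coords, _cam_mat)`, its transpose would be passed instead.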
Myzhar
January 14, 2022, 7:25am
Follow the discussion on GitHub. Please do not cross-post:
opened 09:05PM - 13 Jan 22 UTC
feature_request
### Preliminary Checks
- [X] This issue is not a duplicate. Before opening a new issue, please search existing issues.
- [X] This issue is not a question, bug report, or anything other than a feature request directly related to this project.
### Proposal
I am following the example in zed-examples/object detection/image viewer at master · stereolabs/zed-examples · GitHub. It uses OpenGL to draw objects on the image. Right now the 3D bounding box coordinates from the ZED SDK are normalized. It would be great if the ZED SDK could return the 3D bounding box coordinates in pixel space, taking the projection matrix and image shape as input. This is what I currently do to get the 3D bounding box coordinates in image space, i.e. pixel coordinates:
import numpy as np

# The 8 corners of the 3D bounding box, one (x, y, z) row per corner
bbox = objects.object_list[i].bounding_box
# _cam_mat = np.array(_cam, np.float32).reshape(4,4)
N = 8
# Append w = 1 to every corner to get homogeneous coordinates (8 x 4)
hom_obj_coords = np.c_[bbox, np.ones(N)]
# Row vectors multiplied by the 4 x 4 projection matrix
proj3D_cam = np.matmul(hom_obj_coords, _cam_mat) # 8 x 4
# proj3D_cam[1] = proj3D_cam[1] + 0.25
# proj2D = [((proj3D_cam[0] / pt4d[3]) * _wnd_size.width) / (2. * proj3D_cam[3]) + (_wnd_size.width * 0.5)
# , ((proj3D_cam[1] / pt4d[3]) * _wnd_size.height) / (2. * proj3D_cam[3]) + (_wnd_size.height * 0.5)]
# 1920 and 1172 stand in for _wnd_size.width and _wnd_size.height; 1080 is presumably the image height
proj2D = [
    ((proj3D_cam[:, 0] / hom_obj_coords[:, 3]) * 1920) / (2. * proj3D_cam[:, 3]) + (1920 * 0.5),
    ((proj3D_cam[:, 1] / hom_obj_coords[:, 3]) * 1172) / (2. * proj3D_cam[:, 3]) + ((1172 * 0.5) + ((1172 - 1080) * 0.5)),
]
proj2D_x = proj2D[0]
proj2D_y = proj2D[1]
where `_cam_mat` is the projection matrix I got from the OpenGL code. However, the boxes are not aligned properly with the objects, so it would be great if the ZED SDK provided support for this.
### Use-Case
Saving the 3D bounding boxes in pixel space would help train a custom 3D object detection network without any associated point clouds.
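One way pixel-space corners might be computed without going through OpenGL is a plain pinhole projection with the camera intrinsics. The sketch below assumes the corners are expressed in the camera frame with X right, Y down and Z forward, that the image is undistorted, and that `fx`, `fy`, `cx`, `cy` come from the camera calibration. The function name `project_pinhole` and the numbers in the example are hypothetical.

```python
import numpy as np

def project_pinhole(points_cam, fx, fy, cx, cy):
    """Project (N, 3) camera-frame points to (N, 2) pixel coordinates.

    Assumes X right, Y down, Z forward and an undistorted image.
    Points with Z <= 0 lie behind the camera and are returned as NaN.
    """
    pts = np.asarray(points_cam, dtype=np.float64)
    z = pts[:, 2]
    valid = z > 0
    uv = np.full((pts.shape[0], 2), np.nan)
    uv[valid, 0] = fx * pts[valid, 0] / z[valid] + cx   # u = fx * X / Z + cx
    uv[valid, 1] = fy * pts[valid, 1] / z[valid] + cy   # v = fy * Y / Z + cy
    return uv

# Hypothetical intrinsics for a 1920 x 1080 image and two made-up corners (metres)
corners = np.array([[0.0, 0.0, 2.0],
                    [1.0, 0.5, 2.0]])
pixels = project_pinhole(corners, fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)
```

Depending on the coordinate system configured in the SDK, the axes may need to be flipped or swapped before calling such a function.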
### Anything else?
_No response_