I am conducting research and attempting to re-project the results of a fused 3D pose obtained from the multi-camera Fusion API’s “retrieve_bodies” method. My objective is to compute the 2D pose from the perspective of each of the four ZED 2i stereo cameras in my setup (I don’t want to use the 2D pose detected by a single camera). However, I have encountered an issue where the re-projected 2D poses are not accurate.
I have used each camera’s extrinsic and intrinsic matrices to re-project the 3D poses onto the 2D image plane. However, the resulting 2D poses for each stereo camera are incorrect. Upon investigation, it appears that the original 3D pose results (which use a right-handed, Y-up coordinate system) might have been transformed to the perspective of a “super camera” for 3D visualization. This transformation might be affecting the accuracy of the re-projected 2D poses, making them appear incorrect.
If my assumption is correct, I believe the 3D pose results from the “retrieve_bodies” method need to be transformed back to their original coordinate system before being re-projected onto the 2D plane from each stereo camera’s perspective. I would like to confirm whether any transformation is applied after the multi-camera fusion process. If there is, how can I get the original 3D pose result without that transformation? Thanks!
Hello and welcome to the forum!
I assume you are using our latest 4.0.1 SDK with the Fusion API.
To use the Fusion API, you need to provide a calibration file, which ZED360 can compute for you. Did you do that?
The poses are then expressed relative to the room calibration that you made. What kind of world reference would you expect instead?
Yes, I am using the latest 4.0.1 SDK with the Fusion API, and I did provide the calibration file computed by ZED360 to the Fusion API.
Regarding the world reference, I am expecting the poses to be in reference to the room calibration that was provided. However, I am currently facing an issue where the output pose appears to be translated and not correctly referenced to the room’s calibration.
If the poses are in reference to the room calibration as you mentioned, there might be an issue with the calibration parameters computed by ZED360. Besides recalibrating, is there any additional information or are there troubleshooting steps you can suggest to help me resolve this issue and ensure that the output 3D pose is correctly referenced to the room’s calibration?
The Fusion world’s origin is defined by the first added camera, which is placed at position [0, H, 0], where H is its height if the calibration manages to find the floor, and 0 otherwise.
When you run the body fusion sample with the calibration file, do you notice any misalignment?
How do you notice that there is a difference between the calibration and what you get?
When I ran the body fusion sample with the calibration file, I did not observe any misalignment issues. Also, to verify that the calibration done by ZED360 is correct, I measured the real-world setup. The translation vectors of each camera in the calibration file closely match the actual measurements. However, the position of the first added camera, which is the fusion world’s origin, is [0, -H, 0], where H represents the camera’s height from the floor, rather than [0, H, 0] as you mentioned.
Below is a screenshot of the 2D projections from each camera’s perspective and the fused result.
To generate the 2D projections from each camera’s perspective and from the fused result, I used the projection matrix P and the 3D fused keypoints. The projection matrix is computed as:
P = intrinsic_matrix * world2cam
world2cam = [R_matrix | t]
t = -R_matrix * t_vec
where R_matrix is the rotation matrix and t_vec is the translation vector of each camera.
I also flipped the sign of the y-axis and z-axis rows of R_matrix before computing t and world2cam, to convert from the right-handed, Y-up coordinate system to the image coordinate system.
Based on the 2D projection results, it appears that the projected poses are translated along the y-axis for all the cameras. Is it possible that there is an issue with the coordinate system conversion, or that the 3D fused points need to be translated before projection? Is there anything I’m getting wrong?
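For reference, the projection pipeline described above can be sketched in NumPy as follows. All numeric values (intrinsics, camera pose, keypoint) are hypothetical placeholders; in practice, R_cam2world and t_vec would come from the ZED360 calibration file and K from the camera’s intrinsic parameters:

```python
import numpy as np

# Hypothetical placeholder values -- replace with your calibration data.
K = np.array([[700.0,   0.0, 640.0],
              [  0.0, 700.0, 360.0],
              [  0.0,   0.0,   1.0]])        # intrinsic matrix
R_cam2world = np.eye(3)                      # camera rotation in the world frame
t_vec = np.array([0.0, 1.5, 0.0])            # camera position in the world frame (meters)

# World -> camera extrinsics: R = R_cam2world^T, t = -R @ t_vec
R = R_cam2world.T
t = -R @ t_vec

# Flip the y and z axes to go from the right-handed, Y-up world convention
# (camera looks down -Z) to the image convention (x right, y down, z forward).
flip = np.diag([1.0, -1.0, -1.0])
world2cam = flip @ np.hstack([R, t.reshape(3, 1)])   # 3x4

P = K @ world2cam                                    # 3x4 projection matrix

def project(points_3d):
    """Project Nx3 world-space keypoints to Nx2 pixel coordinates."""
    pts_h = np.hstack([points_3d, np.ones((len(points_3d), 1))])  # homogeneous
    uv = (P @ pts_h.T).T
    return uv[:, :2] / uv[:, 2:3]   # perspective divide

kp_3d = np.array([[0.2, 1.0, -3.0]])   # one fused keypoint, 3 m in front of the camera
print(project(kp_3d))                  # approximately [[686.67, 476.67]]
```

Note that the flip is applied after transforming into the camera frame; applying it to the world-space points instead would move them rather than re-express them, which could produce exactly the kind of y-axis offset described above.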
Thank you for sharing the sample code. I have tested it with my setup, and it works perfectly. I will investigate my code further based on the shared code to figure out what could be causing the translation-scale issue. Thanks again for your help!
Yes, I did notice some latency/jitter issues in the 2D re-projection results, as shown in the video below. The result in the video is based on the shared code with body fitting enabled in “BodyTrackingFusionParameters”; when body fitting was disabled, the result was even more unstable.
Could this be a result of my setup’s calibration?