Before asking my question, I want to give some context on what we do with ZED cameras. We have a large cubic stand (3 m x 3 m x 3 m) with ZED 2i cameras mounted in its upper corners. In the middle of the stand there are scales on the floor. A logistics company uses the whole setup to assess the cargo type, damage to the cargo (we trained several ML models for this), and the cargo dimensions.
The whole system is controlled by ROS 2 (we have asked questions about it here before). We have a single 6 GB GPU, so our ML models only fit into memory if we use the ULTRA depth quality (not NEURAL).
Well, the part of the pipeline related to sizing is not working well for us. After we segment the cargo, we estimate the depth to the points of the cargo inside the segmented polygon and transform those points into world 3D coordinates. The transformation matrices come from calibrating each camera against ArUco markers (we know in advance where these markers are located in the world coordinate system). And, of course, we apply some post-processing after we get the point clouds from the different cameras.
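For clarity, the camera-to-world step we use is roughly the following (a minimal numpy sketch; `T_world_cam` stands for the 4x4 homogeneous camera-to-world transform we obtain from the ArUco calibration, and the intrinsics values are placeholders):

```python
import numpy as np

def backproject_to_world(u, v, depth_mm, K, T_world_cam):
    """Back-project a pixel (u, v) with measured depth into world coordinates.

    K           -- 3x3 camera intrinsics matrix
    T_world_cam -- 4x4 homogeneous camera-to-world transform (from ArUco calibration)
    """
    # Pixel + depth -> 3D point in the camera frame (pinhole model)
    x = (u - K[0, 2]) * depth_mm / K[0, 0]
    y = (v - K[1, 2]) * depth_mm / K[1, 1]
    p_cam = np.array([x, y, depth_mm, 1.0])
    # Camera frame -> world frame
    return (T_world_cam @ p_cam)[:3]

# Example with placeholder intrinsics and identity extrinsics:
K = np.array([[700.0,   0.0, 640.0],
              [  0.0, 700.0, 360.0],
              [  0.0,   0.0,   1.0]])
point = backproject_to_world(640, 360, 1000.0, K, np.eye(4))
# principal-point pixel at 1 m depth -> world point (0, 0, 1000) mm
```

This is why a 12% depth error propagates directly into a 12% error in the reconstructed world coordinates along the viewing ray.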
So, we found that the main problem is poor calibration: the ZED 2i is sometimes off by 12% when estimating the distance to an ArUco marker (for example, 4249 mm instead of 3803 mm). At the same time, there is no clear correlation that the farther away a marker is, the worse the distance estimate, as if the black-and-white pattern on the A4 sheet confuses the ZED …
In the attached photos the paper is laminated, but we also ran experiments with plain paper; the errors are roughly the same in both cases.
Actually, the questions arising from this story:
What can you advise to improve the accuracy of depth estimation to the ArUco markers? How much better would it be if I switched to ULTRA quality? (I couldn't find a quality comparison between the different modes.)
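For reference, in ZED SDK terms the depth mode is just an init parameter, so switching would look roughly like this (Python API sketch; the numeric values are ours and the parameter names should be checked against your SDK version):

```python
import pyzed.sl as sl

init_params = sl.InitParameters()
init_params.depth_mode = sl.DEPTH_MODE.ULTRA      # NEURAL does not fit our 6 GB GPU
init_params.coordinate_units = sl.UNIT.MILLIMETER
init_params.depth_minimum_distance = 300          # mm; markers are up to ~4 m away

runtime_params = sl.RuntimeParameters()
runtime_params.confidence_threshold = 50          # drop low-confidence depth pixels

zed = sl.Camera()
status = zed.open(init_params)
```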
Maybe I should tweak some ZED/ROS settings to improve the quality of the depth estimation? (The ArUco markers are always in the same place; maybe this can somehow help the ZED cameras?)
In the ideal case the ZED cameras are not fixed and can be moved (recalibration is automatic after a system reboot), but if we do fix them, can we help the depth estimates somehow by telling the cameras the exact distance to each marker? Will the camera be able to "calibrate" itself against this information?
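To illustrate what I mean by using the known distances: since the marker positions are surveyed, we could at least fit a simple per-camera linear depth correction over the markers as a post-processing step (a hypothetical sketch with least squares; the function name and the sample numbers below are mine, except the 4249/3803 pair from our measurement):

```python
import numpy as np

def fit_depth_correction(measured_mm, true_mm):
    """Least-squares fit of true ~= a * measured + b over the known markers."""
    A = np.stack([measured_mm, np.ones_like(measured_mm)], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, true_mm, rcond=None)
    return a, b

# Hypothetical per-marker measurements vs. surveyed ground-truth distances (mm)
measured = np.array([4249.0, 3100.0, 2550.0, 1980.0])
true     = np.array([3803.0, 2900.0, 2400.0, 1900.0])
a, b = fit_depth_correction(measured, true)
corrected = a * measured + b  # apply the same correction to cargo depths
```

Would such an external correction make sense, or is there a supported way to feed this ground truth to the camera itself?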
What are the prospects of solving this problem with the Multi-Camera Fusion API? Will we still need the ArUco markers to register the images with each other (obviously, with them it would be easier to find correspondences between the cameras)? In what coordinate system will the Multi-Camera Fusion API output point clouds? Is it fair to say that the cameras will calibrate themselves and we will be able to estimate the cargo size directly from the point cloud?