Enquiry about the meaning of ZED360 Calibration result

Dear ZED Team,

Recently, I have been working with the Fusion API, specifically the Livelink (Fusion) project, using three ZED 2i cameras connected to the same local computer.

As part of the workflow, the Fusion API requires the multi-camera calibration JSON file generated by ZED360 as input. I am particularly interested in understanding the pose field in this configuration file, which consists of 16 numerical values.

My questions are as follows:

  1. Could you please clarify what these 16 values in the pose field represent?

    • Are they a 4×4 transformation matrix?

    • If so, what is the order (row-major or column-major), and how are rotation and translation encoded?

  2. If some of these values correspond to camera position (translation), are the units expressed in meters (e.g., 1.39 m, 0.88 m, etc.)?

  3. I recall seeing a statement suggesting that the first camera connected to the computer is set as the world origin (x = 0, y = 0, z = 0).
    However, after inspecting the code and printing the camera translations at runtime, I noticed that Camera 0 does not have a (0, 0, 0) translation.

    • Could you clarify where the Fusion world coordinate origin is defined?

    • Is the origin tied to a specific camera, or is it a virtual coordinate frame derived from calibration?

I have attached:

  • The calibration JSON file generated by ZED360

    cali-demo.json (2.6 KB)

  • A code snippet showing the printed camera translations, which appears to be related to the values in the calibration file

    • Code:

      for (const auto& cfg : configurations) {
          std::cout << "serial: " << cfg.serial_number
                    << ", comm type: " << static_cast<int>(cfg.communication_parameters.getType())
                    << std::endl;
          auto t = cfg.pose.getTranslation();
          std::cout << "pose translation: " << t.x << ", " << t.y << ", " << t.z << std::endl;
      }

    • Output:

      serial: 31122074, comm type: 1
      pose translation: 0, 1.3984, 0
      serial: 33616805, comm type: 1
      pose translation: 0.384386, 1.34435, 6.95971
      serial: 37635958, comm type: 1
      pose translation: -2.87276, 1.27234, 5.40546
      

The motivation behind these questions is that, before proceeding to analyze fused skeleton joint data, I would like to have a clear understanding of the coordinate system and spatial reference frame used by the Fusion API.

Thank you very much for your time and support.

Best regards,

waystogetthere

Hi @waystogetthere
You can find all the relevant information to use the Fusion module on the ZED SDK API documentation:

Please do not hesitate to report any missing information or parts that are not clear.

Hi, thanks for your feedback. I would like to clarify the interpretation of the camera pose returned at runtime when using Fusion.

  1. Interpretation of the runtime camera translation
    When printing the camera pose returned by cfg.pose.getTranslation() in Fusion, I obtain a translation such as
    (x = 0, y = 1.3984, z = 0).
    Could you please confirm whether this means that the camera is located 1.3984 meters directly above the Fusion world origin (the coordinate system is set to LEFT_HANDED_Y_UP), i.e. the Fusion world origin lies directly below the camera along the Y axis?
  2. Difference between calibration JSON and runtime pose (Y sign only)
    I noticed that the Y component of the translation has the opposite sign compared to the value stored in the ZED360 calibration JSON file, while the X and Z components remain unchanged.
    In my code, the Fusion runtime is explicitly initialized with LEFT_HANDED_Y_UP, whereas the ZED360 calibration result appears to use a default Y-down coordinate convention.
    Could you please confirm whether this difference in the Y sign is expected and caused by the coordinate system conversion applied when loading the calibration file (i.e. Y-down in calibration vs Y-up in runtime)?
    Thank you for your clarification.

As you said, the pose available in the calibration file generated by ZED 360 is in COORDINATE_SYSTEM::IMAGE (with Y down).

The first camera to be loaded is defined as the world origin; its position will be (0, -H, 0), where H is its height.
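
The sign flip discussed above can be illustrated with a minimal sketch. It assumes that the only difference between the file's IMAGE frame and a LEFT_HANDED_Y_UP runtime is the direction of the Y axis, which matches the behaviour reported in this thread but is not a statement of what the SDK does internally. `Vec3` and `imageToLeftHandedYUp` are hypothetical names, not SDK types.

```cpp
// Plain 3D vector; a hypothetical stand-in for the SDK's translation type.
struct Vec3 {
    double x;
    double y;
    double z;
};

// Convert a translation from the calibration file's IMAGE frame (Y down)
// to a LEFT_HANDED_Y_UP frame. Assumption: X and Z axes coincide in both
// frames, so only the Y component changes sign.
Vec3 imageToLeftHandedYUp(const Vec3& t) {
    return Vec3{t.x, -t.y, t.z};
}
```

Under that assumption, a file translation of (0, -1.3984, 0) maps to (0, 1.3984, 0), which is the value printed at runtime for the first camera.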


Thanks! This is really helpful!

Thanks, that clarifies the coordinate system.

Following this, I noticed that in the calibration JSON from ZED360, the pose of the first loaded camera is essentially a pure translation, placing the camera origin at (0,−H,0) in IMAGE coordinates. However, the estimated height H does not match the physically measured camera height (I measured H′ in the real world).

Given this, would it be valid to apply a simple correction (a rigid translation) to the first camera pose, so that it maps to (0, −H′, 0) instead of (0, −H, 0)? If the calibration error only affects the height of the first loaded camera, and the relative poses between the reference camera and the other cameras are correct, we can then apply the same correction consistently to all other camera poses in the calibration file.

From a geometric point of view, this would correct the world-frame vertical offset while preserving all inter-camera relative poses.

For example, the calibration result places the first loaded camera at (0, 1.38, 0) in the Fusion world (Y up), i.e. the Fusion world origin is 1.38 m directly below the first loaded camera.

The pose matrix is:

$$
T1 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1.38 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
$$

However, the true height is 1.50m, so the true spatial mapping matrix is:

$$
T1_{true} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1.50 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
$$

Note that we can find a matrix C such that T1_{true} = C · T1.

We would then apply the same C to both T2 and T3, i.e. use C · T2 and C · T3.
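
For the numbers above, C can be written out explicitly. This is only a worked check of the proposal, using the two matrices given in this post; the symbols R_i and t_i (rotation and translation of another camera i) are introduced here for illustration. Because T1 and T1_{true} are both pure translations, so is C:

```latex
C = T1_{true}\, T1^{-1}
  = \begin{bmatrix} 1&0&0&0 \\ 0&1&0&-1.50 \\ 0&0&1&0 \\ 0&0&0&1 \end{bmatrix}
    \begin{bmatrix} 1&0&0&0 \\ 0&1&0&+1.38 \\ 0&0&1&0 \\ 0&0&0&1 \end{bmatrix}
  = \begin{bmatrix} 1&0&0&0 \\ 0&1&0&-0.12 \\ 0&0&1&0 \\ 0&0&0&1 \end{bmatrix}

% For any other camera i with rotation R_i and translation t_i:
C\, T_i = \begin{bmatrix} R_i & t_i + (0,\, -0.12,\, 0)^{\top} \\ 0 & 1 \end{bmatrix}
```

Left-multiplying by C therefore leaves every rotation untouched and shifts every translation by the same −0.12 m along Y. Relative poses are preserved, since (C · T_i)^{-1} (C · T_j) = T_i^{-1} · T_j.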

hi,
As the system is rigid between the cameras, If you want to ‘move’ one, you should apply the same transformation to each camera of the system (Apply only translations, as the rotation is given by the camera IMU).
The height of the first camera (H) is computed by ZED360 only if your anckles are being seen during the calibration process, otherwise it remains at 0. The wrong measurement you get may came from a poor detection of this.


Thanks for the clarification — that makes sense.
I’ll re-run ZED360 calibration ensuring the ankles are clearly visible throughout the process (to avoid an incorrect H estimate).
In the meantime, since the camera rig is rigid, I’ll apply the same global vertical offset to all camera poses in the calibration JSON (translation only, no rotation changes), so the inter-camera geometry remains unchanged. Because the file uses the Y-down IMAGE frame, that offset is Δy = −(H′ − H), which raises every camera by H′ − H.
