We are using the ZED SDK for a multi-camera body tracking setup (3 ZED X) in a biomechanics research context. We have several technical questions that we couldn’t find clear answers to in the documentation or existing issues.
1. Anatomical definition of keypoints
What is the exact anatomical landmark used for the shoulder keypoint in BODY_34 and BODY_38? Is it the glenohumeral joint center, the acromion, or something else? This matters significantly for biomechanical analysis, where joint center accuracy is critical. Any information about the anatomical definition of the other keypoints is also welcome.
2. Body fitting at camera level vs fusion level
In your official fused_cameras.py example, enable_body_fitting is set to False at the per-camera level and True at the fusion level. Is this the recommended configuration? Does enabling body fitting at both levels improve accuracy or introduce redundancy? What is the interaction between the two?
3. BODY_34 vs BODY_38 accuracy and use case
Is there a recommended format for research applications requiring high joint position accuracy? Does BODY_38 provide better 3D accuracy or is it mainly adding distal joints (fingers, toes) on top of BODY_34?
4. depth_stabilization and body tracking quality
Does the depth_stabilization parameter in InitParameters meaningfully affect body tracking quality, particularly joint position stability across frames? Is there a recommended value for static camera setups?
5. Joint orientations with body fitting option
The global_root_orientation field provides a quaternion for the pelvis joint, and local_orientation_per_joint provides one quaternion per joint. What is the reference/neutral pose these rotations are relative to? Is it a T-pose aligned with the world axes? Any additional information on how joint orientations are defined in the SDK would be welcome.
Thanks for the detailed and well-scoped questions. Answers are below, in your order, with references where we have them. I’ll also flag the two points where our public documentation doesn’t cover what you asked for.
Anatomical definition of the shoulder keypoint (BODY_34 / BODY_38)
This is a genuine gap in our published documentation: we do not publish the precise anatomical landmark (e.g. acromion vs glenohumeral joint center) that each keypoint maps to. The keypoints are produced by a neural network trained on human-pose datasets, so the shoulder keypoint is a learned approximation that sits near the glenohumeral region rather than a metrologically defined joint center. For biomechanics work where joint-center accuracy is critical, I’d treat the SDK keypoints as a kinematic skeleton, not as anatomically calibrated landmarks. I’ve logged this internally.
Body fitting: per-camera vs Fusion level
Enable body fitting only at the Fusion level (BodyTrackingFusionParameters.enable_body_fitting = true) and leave it off in each per-camera BodyTrackingParameters. That is the recommended setup. The reason it works: in the Fusion pipeline the fitting is applied to the single fused skeleton built from all viewpoints, and per the SDK, enabling it at the Fusion level also promotes the fused output to BODY_34 (with local rotations) even if the cameras feed BODY_18. Enabling fitting at the per-camera level as well is redundant: it adds compute without improving the fused result. (Note: the stock fused_cameras.py ships with BODY_18 and fitting disabled everywhere, so it outputs positions only, with no orientations. You’ll need the change above to get fitted skeletons.)
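As a minimal sketch of the split described above (fitting off per camera, on at the Fusion level), assuming the pyzed Python bindings for SDK 4.x. The function name make_body_tracking_configs is hypothetical, and setting enable_tracking on the Fusion side is my assumption rather than something stated above, so please check it against your SDK version:

```python
def make_body_tracking_configs():
    """Build per-camera and Fusion body tracking parameters (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded on a machine without the ZED SDK.
    import pyzed.sl as sl

    # Per-camera side: detection only, body fitting left off.
    camera_params = sl.BodyTrackingParameters()
    camera_params.detection_model = sl.BODY_TRACKING_MODEL.HUMAN_BODY_ACCURATE
    camera_params.body_format = sl.BODY_FORMAT.BODY_18
    camera_params.enable_body_fitting = False

    # Fusion side: fitting is applied once, to the fused skeleton.
    fusion_params = sl.BodyTrackingFusionParameters()
    fusion_params.enable_tracking = True       # assumption: tracking enabled alongside fitting
    fusion_params.enable_body_fitting = True
    return camera_params, fusion_params
```

The per-camera object would be passed to each camera’s enable_body_tracking call and the Fusion object to the Fusion module’s equivalent, as in the fused_cameras.py sample.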
BODY_34 vs BODY_38 — which for high joint accuracy
BODY_38 is primarily BODY_34 plus distal joints (hands/feet) and some additional spine/head detail. Both run on the same detection backbone, so BODY_38 does not give you fundamentally better 3D position accuracy on the joints the two formats share; it gives you more joints and better hand/foot orientation. For your case I’d choose based on which joints you need: if you don’t need fingers/toes, BODY_34 is sufficient and lighter; if you want feet/hands, use BODY_38. In both cases use the ACCURATE model (HUMAN_BODY_ACCURATE) for best quality, and note that you must set enable_body_fitting = true to get local joint rotations at all.
depth_stabilization for a static rig
depth_stabilization (0–100, default 30) is a temporal filter over the depth map: higher = smoother/less jitter but more latency and ghosting on fast-moving regions; lower = more reactive. Your cameras are static but your subjects move, so I’d keep it near the default (≈30) rather than maxing it — pushing it high will smear the moving body. The bigger win for a fixed rig is to tell the SDK the camera doesn’t move: set PositionalTrackingParameters::set_as_static = true. That stabilizes depth/tracking and reduces load without blurring the subject.
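A minimal sketch of those two settings for a fixed rig, assuming the pyzed Python bindings for SDK 4.x. The function name make_static_rig_params is hypothetical, and the value 30 simply mirrors the default mentioned above:

```python
def make_static_rig_params():
    """Build InitParameters and PositionalTrackingParameters for a fixed camera (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded on a machine without the ZED SDK.
    import pyzed.sl as sl

    init_params = sl.InitParameters()
    # Keep temporal depth filtering near the default so moving subjects are not smeared.
    init_params.depth_stabilization = 30

    tracking_params = sl.PositionalTrackingParameters()
    # Declare that the camera itself does not move.
    tracking_params.set_as_static = True
    return init_params, tracking_params
```

The first object would go to the camera’s open call and the second to enable_positional_tracking, once per camera in the rig.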
Joint orientation reference pose
Reference/neutral pose: the identity orientation corresponds to a T-pose, i.e. a local_orientation_per_joint of identity means that bone is in its T-pose orientation relative to its parent. (This is the fitting template’s neutral pose; your subject does not need to stand in a T-pose to capture.)
local_orientation_per_joint is expressed relative to the parent bone in the kinematic chain, not relative to the world or the root. To reconstruct a joint’s world-space orientation you must compose the full chain of parent transforms down from the root, not just apply the root transform.
global_root_orientation places the root (pelvis) in the world frame.
Honest gap: the exact world-axis alignment of the identity root orientation is not explicitly documented in the BodyData API reference. I’ve logged this alongside the keypoint-definition gap from the first answer.
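The parent-chain composition described above can be sketched in plain Python. This is an illustration under stated assumptions, not SDK code: quaternions are taken as (x, y, z, w), the root’s world orientation is taken directly from global_root_orientation, and the three-joint parents table is a hypothetical toy chain rather than the real BODY_34 or BODY_38 hierarchy.

```python
import math

def quat_mul(a, b):
    """Hamilton product a * b for quaternions stored as (x, y, z, w)."""
    ax, ay, az, aw = a
    bx, by, bz, bw = b
    return (
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
        aw * bw - ax * bx - ay * by - az * bz,
    )

def world_orientations(global_root, local_quats, parents):
    """Compose local (parent-relative) rotations into world-space rotations.

    parents[j] is the index of joint j's parent, or -1 for the root.
    Parents must appear before their children in the list.
    """
    world = [None] * len(parents)
    for j, p in enumerate(parents):
        if p < 0:
            # Assumption: the root's world orientation is global_root_orientation.
            world[j] = tuple(global_root)
        else:
            # World rotation of a joint = parent's world rotation * own local rotation.
            world[j] = quat_mul(world[p], local_quats[j])
    return world

def quat_about_z(degrees):
    """Quaternion (x, y, z, w) for a rotation about the z axis."""
    half = math.radians(degrees) / 2.0
    return (0.0, 0.0, math.sin(half), math.cos(half))

# Hypothetical toy chain: joint 0 (root) -> joint 1 -> joint 2.
parents = [-1, 0, 1]
identity = (0.0, 0.0, 0.0, 1.0)
local_quats = [identity, quat_about_z(30), quat_about_z(30)]
world = world_orientations(quat_about_z(30), local_quats, parents)
```

In this toy chain the root and both child joints each contribute 30 degrees about z, so the last joint ends up rotated 90 degrees in the world frame, which applying the root transform alone would not give you.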
To make sure this is exactly right for your build, could you confirm your ZED SDK version (e.g. 4.2.x), your host platform (Jetson model + JetPack/L4T, or x86 + GPU + driver), and the format/model you’re currently running? Orientation conventions and keypoint sets are version-specific.