Help with joint orientation

peppersaltyy · June 18, 2025, 8:57am

Can someone give a clear definition of what the ZED SDK provides regarding the orientation of each joint? My goal is to get the 3 DOF (roll, pitch, yaw) of the wrist joint. However, the documentation only says that local_orientation_per_joint gives [x, y, z, w], and I do not really know what that is.
Besides, I also saw someone post a picture said to show what the orientation is based on. Can someone explain what this image really means? What are these green circles?

BenjaminV · June 18, 2025, 1:00pm

Hi,

The ZED SDK allows you to choose between two body tracking formats called “Body 34” and “Body 38,” which refer to the number of joints it tracks.

What you said is true for the Body 34 model: the SDK does not provide orientation information for every joint, only for the ones highlighted in green in your image.
However, that’s not the case for the Body 38 model. Every orientation is available (except for the limb ends, such as the toes or fingers).

Best,

Stereolabs Support

peppersaltyy · June 18, 2025, 6:57pm

Thank you so much. I successfully got the orientation of the wrist. I see four numbers. As stated in the doc, they are [x, y, z, w]. Are they the quaternion for the detected joint?
Besides, I also noticed that for the same video, switching from BODY34 to BODY38 gives one more body tracking id. It looks like two people are being detected (the video only has one person moving).
Like this:
"1749585303466": {
    "is_new": true,
    "is_tracked": true,
    "timestamp": 1749585303466475000,
    "body_list": [
        {
            "id": 1,
            "head": [
                -0.21899113059043884,
                0.44649237394332886,
                -2.880984306335449
            ],
            "right_wrist": [
                0.09812593460083008,
                0.17842966318130493,
                -2.8459103107452393
            ],
            "right_wrist_orientation": [
                0.09469082951545715,
                -0.12142148613929749,
                -0.06885623186826706,
                0.9856719970703125
            ]
        },
        {
            "id": 0,
            "head": [
                1.061621069908142,
                -0.12769785523414612,
                -3.014038324356079
            ],
            "right_wrist": [
                1.2440309524536133,
                -0.8990139365196228,
                -2.895113945007324
            ],
            "right_wrist_orientation": [
                0.0,
                0.0,
                0.0,
                1.0
            ]
        }
    ]
},
As you can see, id 0 still gives an orientation of [0,0,0,1], like BODY34 (for all time steps), while id 1 gives some meaningful numbers. May I know what exactly is happening behind this, and what I should do to avoid it?
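For reference, four numbers in [x, y, z, w] order can be read as a unit quaternion, and [0, 0, 0, 1] is the identity quaternion, i.e. no rotation at all. Below is a minimal sketch of turning such a quaternion into three angles in degrees. The function name is hypothetical, and the angle convention (rotations about X, Y, Z composed in Z-Y-X order) is an assumption: which of these corresponds to roll, pitch, or yaw of the wrist depends on the coordinate system chosen in the SDK and on the joint's local frame, so check it against a known pose before relying on it.

```python
import math


def quat_xyzw_to_euler_deg(q):
    """Convert a quaternion [x, y, z, w] to angles about X, Y, Z in degrees.

    Assumption: Z-Y-X (yaw-pitch-roll style) convention. The mapping to the
    wrist's roll/pitch/yaw depends on the coordinate frame in use.
    """
    x, y, z, w = q
    norm = math.sqrt(x * x + y * y + z * z + w * w)
    if norm == 0.0:
        raise ValueError("zero-length quaternion")
    # Normalize to guard against small numerical drift.
    x, y, z, w = x / norm, y / norm, z / norm, w / norm

    # Rotation about X.
    rot_x = math.atan2(2.0 * (w * x + y * z), 1.0 - 2.0 * (x * x + y * y))
    # Rotation about Y; clamp so asin never sees a value outside [-1, 1].
    sin_y = max(-1.0, min(1.0, 2.0 * (w * y - z * x)))
    rot_y = math.asin(sin_y)
    # Rotation about Z.
    rot_z = math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))

    return tuple(math.degrees(a) for a in (rot_x, rot_y, rot_z))


if __name__ == "__main__":
    # Wrist orientation of id 1 from the JSON excerpt above.
    wrist_q = [0.09469082951545715, -0.12142148613929749,
               -0.06885623186826706, 0.9856719970703125]
    print(quat_xyzw_to_euler_deg(wrist_q))
    # The identity quaternion reported for id 0 gives zero rotation on every axis.
    print(quat_xyzw_to_euler_deg([0.0, 0.0, 0.0, 1.0]))
```

Since the field is named local_orientation_per_joint, these angles presumably describe the wrist relative to its parent joint rather than to the camera or world frame.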

BenjaminV · June 19, 2025, 6:29am

Can you share the code sample you are using please?

Stereolabs Support

peppersaltyy · June 19, 2025, 7:06am

Thank you. Below is my code. With BODY_34, I cannot get the orientation of the wrist, but only one person is detected. I only changed BODY_34 to BODY_38 in the code, and now it gives me two persons, one of which always has a wrist orientation of [0,0,0,1]. I tried filtering by confidence. That succeeds for some timestamps, since most of the time the person with wrist orientation [0,0,0,1] has a low confidence. However, there are still some timestamps where both ids get a high confidence, above 80.


import sys
import json

import numpy as np
import pyzed.sl as sl

def addIntoOutput(out, identifier, tab):
    out[identifier] = []
    for element in tab:
        out[identifier].append(element)
    return out

def serializeBodyData(body_data):
    """Serialize one BodyData object into a JSON-like dict."""
    # Index 17 is RIGHT_WRIST in the BODY_38 format; other body formats
    # number their joints differently.
    right_wrist = 17
    kp = body_data.keypoint
    # One [x, y, z, w] quaternion per joint.
    orient = body_data.local_orientation_per_joint
    kp_conf = body_data.keypoint_confidence

    return {
        "id": body_data.id,
        "detection_confidence": int(body_data.confidence),
        "head": list(body_data.head_position),
        "right_wrist": kp[right_wrist].tolist(),
        "right_wrist_orientation": orient[right_wrist].tolist(),
        "right_wrist_confidence": float(kp_conf[right_wrist])
    }


def serializeBodies(bodies):
    """Serialize Bodies objects into a JSON like structure"""
    out = {}
    out["is_new"] = bodies.is_new
    out["is_tracked"] = bodies.is_tracked
    out["timestamp"] = bodies.timestamp.data_ns
    out["body_list"] = []
    for sk in bodies.body_list:
        out["body_list"].append(serializeBodyData(sk))
    return out

class NumpyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return json.JSONEncoder.default(self, obj)


if __name__ == "__main__":

    # common parameters
    init_params = sl.InitParameters()
    init_params.coordinate_system = sl.COORDINATE_SYSTEM.RIGHT_HANDED_Y_UP
    init_params.coordinate_units = sl.UNIT.METER
    init_params.depth_mode = sl.DEPTH_MODE.NEURAL
    init_params.svo_real_time_mode = False
    init_params.set_from_svo_file("/home/xijie/Desktop/workspace/zed_records/HD2K_SN35358357_12-54-38.svo2")
    zed = sl.Camera()
    error_code = zed.open(init_params)
    if error_code != sl.ERROR_CODE.SUCCESS:
        print("Can't open camera: ", error_code)
        sys.exit(1)  # nothing below can work without an opened camera/SVO

    positional_tracking_parameters = sl.PositionalTrackingParameters()
    error_code = zed.enable_positional_tracking(positional_tracking_parameters)
    if error_code != sl.ERROR_CODE.SUCCESS:
        print("Can't enable positional tracking: ", error_code)

    body_tracking_parameters = sl.BodyTrackingParameters()
    body_tracking_parameters.detection_model = sl.BODY_TRACKING_MODEL.HUMAN_BODY_ACCURATE
    body_tracking_parameters.body_format = sl.BODY_FORMAT.BODY_38
    body_tracking_parameters.enable_body_fitting = True
    body_tracking_parameters.enable_tracking = True

    body_tracking_parameters_rt = sl.BodyTrackingRuntimeParameters()
    body_tracking_parameters_rt.detection_confidence_threshold = 40
    error_code = zed.enable_body_tracking(body_tracking_parameters)
    if error_code != sl.ERROR_CODE.SUCCESS:
        print("Can't enable body tracking: ", error_code)

    bodies = sl.Bodies()

    skeleton_file_data = {}
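    while zed.grab() == sl.ERROR_CODE.SUCCESS:
        # Pass the runtime parameters explicitly; otherwise the
        # detection_confidence_threshold set above is never applied.
        zed.retrieve_bodies(bodies, body_tracking_parameters_rt)
        skeleton_file_data[str(bodies.timestamp.get_milliseconds())] = serializeBodies(bodies)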

    # Save data into JSON file:
    with open("/home/xijie/Desktop/workspace/bodies_rotation.json", 'w') as file_sk:
        file_sk.write(json.dumps(skeleton_file_data, cls=NumpyEncoder, indent=4))

    zed.close()

BenjaminV · June 19, 2025, 7:18am

As I explained previously, not having the wrist orientation using the body 34 model is expected as it does not compute the orientations of all the joints.
However, it’s quite surprising the body 38 model detects two people.
Do you have any display to show what’s being detected in the image?
You can also try to increase the detection_confidence_threshold value to avoid false-positive detections.
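Besides raising the threshold in the SDK, the saved JSON could also be filtered after the fact. The sketch below is only an illustration, not an SDK feature: the helper names and the `min_confidence` default are hypothetical, and it assumes each entry of `body_list` has the `detection_confidence` and `right_wrist_orientation` keys written by the script above. It drops bodies below a confidence cut-off and bodies whose wrist orientation is exactly the identity quaternion [0, 0, 0, 1].

```python
def is_identity_quat(q, tol=1e-6):
    """Return True if quaternion [x, y, z, w] represents no rotation."""
    x, y, z, w = q
    return (abs(x) <= tol and abs(y) <= tol and abs(z) <= tol
            and abs(abs(w) - 1.0) <= tol)


def filter_bodies(frame, min_confidence=60):
    """Keep bodies that are confident enough and have a non-identity wrist orientation.

    `frame` is one per-timestamp dict as produced by serializeBodies().
    `min_confidence` is a hypothetical cut-off; tune it on your own data.
    """
    kept = []
    for body in frame["body_list"]:
        # Treat a missing confidence as 0 so such entries are dropped.
        if body.get("detection_confidence", 0) < min_confidence:
            continue
        if is_identity_quat(body["right_wrist_orientation"]):
            continue
        kept.append(body)
    return kept


if __name__ == "__main__":
    # Confidence values here are made up for illustration only.
    frame = {
        "body_list": [
            {"id": 1, "detection_confidence": 85,
             "right_wrist_orientation": [0.0947, -0.1214, -0.0689, 0.9857]},
            {"id": 0, "detection_confidence": 82,
             "right_wrist_orientation": [0.0, 0.0, 0.0, 1.0]},
        ]
    }
    print([b["id"] for b in filter_bodies(frame)])
```

Note that this only hides the second detection in the output. It does not explain why a second body is detected in the first place, so a visual check of the detections is still worthwhile.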