How are joint transformations and scaling performed in livelink for Unity and Unreal?

The livelink for Unity project (GitHub - stereolabs/zed-unity-livelink: ZED Livelink plugin for Unity) uses a standard humanoid avatar (Unity - Manual: Humanoid Avatars) to animate the skeleton data coming from the SDK. The Body38 format of the SDK looks a lot like the skeleton of a humanoid avatar, but when using Body34, how are the joints fitted to match the humanoid avatar? For example, the pelvis bone of a Body34 skeleton is lower than that of the Body38 (and humanoid avatar) skeleton. Do you have a formula for calculating the location and rotation of the pelvis bone of the humanoid avatar when using Body34?

Also, how are the body parts scaled to the appropriate length? When running the livelink in real time, animating people with different heights, is the height of the avatar scaled accordingly, or is the skeleton data fitted to an avatar of a set height of, for example, 1.80 meters?

Hello @haakonflaar,

  • We’re indeed going forward with Body38 for this kind of reason, as the Body34 pelvis being in the middle of the hips is not standard in many implementations. Currently, no offset is applied when using Body34 in Unity. What you could do is add an offset to this line based on the height difference between the hips and the pelvis in the Unity avatar, something like `animator.GetBoneTransform(HumanBodyBones.Hips).position.y - animator.GetBoneTransform(HumanBodyBones.LeftUpperLeg).position.y` (see the sketch after this list). We’ll add this offset in the next version for Body34, but in the meantime you can use this.
  • The body parts are currently not scaled to the tracked skeleton’s lengths. We’re studying the best solution for this, as we know there can be a mismatch while the animation is driven only by rotations, as it is now.
  • The height of the avatar is not scaled; the skeleton animation data (the local rotations and the root position) is applied directly to the avatar.
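
For illustration only, here is a minimal Unity C# sketch of the offset idea from the first bullet; the component name and the `sdkRootPosition` parameter are placeholders, not part of the plugin:

```csharp
using UnityEngine;

// Hypothetical helper: shift the Body34 root so it lines up with the
// Unity humanoid avatar's hips.
public class Body34PelvisOffset : MonoBehaviour
{
    public Animator animator; // humanoid avatar animator

    public Vector3 OffsetRootPosition(Vector3 sdkRootPosition)
    {
        // Vertical gap between the avatar's hips and upper leg, measured on the
        // avatar itself (approximates the Body34 pelvis-to-hips difference).
        float hipOffset = animator.GetBoneTransform(HumanBodyBones.Hips).position.y
                        - animator.GetBoneTransform(HumanBodyBones.LeftUpperLeg).position.y;

        return sdkRootPosition + Vector3.up * hipOffset;
    }
}
```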

Thanks for reaching out; these are indeed issues with our plugins, and you can expect to hear about a solution in a future release.

May I ask what kind of project you need this for?

Best,
Jean-Loup

Thanks for the reply, @JPlou ! :slight_smile:

Did I understand you correctly that for animating characters in Unity and Unreal, the only raw position (x, y, z) data you use from the camera is for the pelvis bone? The locations of the other joints are calculated using the local rotation data of those joints, which comes from the SDK, and a set length between the joints?

Could you point to the place(s) in the zed unity livelink project where this transformation is done? I would like to do it manually, as I need to transform the skeleton data for use in programs other than Unity and Unreal (we are doing computational fluid dynamics simulations).

Yes, the pelvis is the only position from the SDK that is used. :ok_hand:
We don’t directly manipulate the other positions; we feed the local rotations (from the SDK) to the animator (Unity avatar). The limb lengths used are the Unity avatar’s.

The keypoints (raw positions of the bones) and other data from the SDK are available, alongside the local rotations, in the BodyData structures that come within the Bodies structure sent by the sender. If you want to access them in the C++ sender, take a look at the main.cpp file; if you want them in the Unity project, take a look at the ZEDBodyTrackingManager script.

I think the senders could be an appropriate first step if you need more than just Unity or Unreal, as they are an adaptation of a C++ body tracking sample that extracts all the body tracking data from the SDK, with an extra layer of UDP communication.

Do not hesitate if you have more questions.

Jean-Loup

@JPlou Do you know the math behind scaling of the avatars? Basically, given the rotation data for each joint coming from the cameras, the position of the pelvis joint, and fixed body part lengths, how can I calculate the position (x,y,z) of each of the joints? I tried to illustrate the problem here: python - Find the coordinates of a 3D point given the euclidean distance and rotation to another known point - Stack Overflow

I can’t help with Python @haakonflaar, but I’ll try to explain what we have on our side.

If I understand correctly, you want to obtain the 3D positions of each joint based on the root’s position/orientation, the local joints’ orientations, and the length of the bones.

It is important to note that the local rotations provided by the SDK are relative to the T-pose: each bone is assumed to have an identity quaternion as its local rotation when in T-pose.

You must recursively apply the local orientation and translation provided by the SDK throughout the kinematic chain, using the equation

$$(R, T)_{globalJoint} = (R, T)_{globalParent} \cdot (R(q), T)_{localJoint}$$

where $(R, T)_{globalJoint}$ is the global pose (rotation and translation) of the current joint, $(R, T)_{globalParent}$ is the global pose of the current joint’s parent, and $(R(q), T)_{localJoint}$ is the local pose provided by the SDK.

The global pose of the root, provided by the SDK, serves as the initialization for the recursion.

$(R, T)$ is a 4x4 SE(3) matrix, and $R(q)$ is the rotation matrix obtained from the quaternion $q$.

The SE(3) matrix is built from the rotation matrix and the translation vector; it looks like this:

$$(R, T) = \begin{pmatrix} R & T \\ 0_{1 \times 3} & 1 \end{pmatrix}$$

where $R$ is the 3x3 rotation matrix and $T$ the 3x1 translation vector.
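
If it helps, here is a minimal Unity C# sketch of that recursion (an illustration, not the plugin’s actual code); the parent indices, local rotations, and fixed bone translations are placeholders you would fill with your own data:

```csharp
using UnityEngine;

public static class SkeletonForwardKinematics
{
    // Sketch of the forward-kinematics recursion described above.
    // Assumes joints are ordered so every parent appears before its children,
    // and that the root joint has parent index -1.
    public static Vector3[] ComputeGlobalPositions(
        Vector3 rootPosition,          // root position from the SDK
        Quaternion rootRotation,       // root orientation from the SDK
        Quaternion[] localRotations,   // per-joint local rotations (relative to T-pose)
        Vector3[] boneTranslations,    // fixed parent-to-child vectors, in the parent's frame
        int[] parentIndices)
    {
        int count = localRotations.Length;
        var globalPoses = new Matrix4x4[count];
        var positions = new Vector3[count];

        for (int i = 0; i < count; i++)
        {
            if (parentIndices[i] < 0)
            {
                // The root's global pose, provided by the SDK, initializes the recursion.
                globalPoses[i] = Matrix4x4.TRS(rootPosition, rootRotation, Vector3.one);
            }
            else
            {
                // Local SE(3) pose (R(q), T) of the current joint...
                Matrix4x4 local = Matrix4x4.TRS(boneTranslations[i], localRotations[i], Vector3.one);
                // ...chained onto the parent: (R,T)_globalJoint = (R,T)_globalParent * (R(q),T)_localJoint
                globalPoses[i] = globalPoses[parentIndices[i]] * local;
            }

            // The translation column of the 4x4 SE(3) matrix is the joint's global position.
            positions[i] = (Vector3)globalPoses[i].GetColumn(3);
        }
        return positions;
    }
}
```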

Also, you may want to check out this topic on StackExchange about applying quaternions to vectors.

By the way, in Unity, you can directly grab the positional data from the animator in the LateUpdate loop, after the engine has done this kind of calculation.
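For instance, something along these lines (a sketch, assuming the avatar is already being animated):

```csharp
using UnityEngine;

public class BonePositionLogger : MonoBehaviour
{
    public Animator animator; // humanoid avatar already driven by the livelink data

    void LateUpdate()
    {
        // After the animation pass, the engine has already solved the global poses.
        Vector3 leftHand = animator.GetBoneTransform(HumanBodyBones.LeftHand).position;
        Debug.Log($"Left hand world position: {leftHand}");
    }
}
```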

I think that’s all I have about this, I hope it helps a bit.
Jean-Loup

@JPlou Thanks for the response!

What is the translation part of (R(q),T)_localJoint? Is it the local_position_per_joint from the cameras? Is that also relative to the T-pose, meaning you will get a different position value than if you used the “keypoint” list, which from my understanding is the camera’s prediction of joint positions?

I have T-pose measurements of an avatar I want to scale the data to. Should the translation part of (R(q),T)_localJoint instead be my pre-calculated x,y,z distances to the previous joint?

@haakonflaar This was indeed not the clearest part.

Let’s start from the end:

I have T-pose measurements of an avatar I want to scale the data to. Should the translation part of (R(q),T)_localJoint instead be my pre-calculated x,y,z distances to the previous joint?

Yes, that’s it! :smiley:

And now let’s go more in depth:

The translation T is the local vector representing your bone of defined length, so it should be constant. You could apply a global or local scaling to it during execution, though, but in any case it should be the vector from the parent to the child, expressed in the parent’s coordinate frame (as in the illustration from your Stack Overflow post).

The local_position_per_joint is the position of the keypoint in its parent’s coordinate frame (so, relative to its parent’s translation and orientation), and the root (index 0) is either in the world or camera reference frame, depending on your RuntimeParameters.

The keypoints list is indeed the joints’ positions, computed by combining computer vision detection and depth sensing. Thus you will not get the same positions as with your fixed-length bones.
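
To make the fixed-length translation T concrete, here is one way (a sketch, not part of the plugin) to precompute those vectors from an avatar posed in T-pose, expressing each child bone in its parent’s frame; the class and method names are hypothetical:

```csharp
using UnityEngine;

public static class BoneLengthExtractor
{
    // Returns the child bone's offset expressed in the parent's coordinate frame,
    // measured on an avatar currently standing in T-pose. This is the constant T
    // used as the translation part of (R(q), T)_localJoint.
    public static Vector3 LocalBoneOffset(Animator animator,
        HumanBodyBones parent, HumanBodyBones child)
    {
        Transform parentBone = animator.GetBoneTransform(parent);
        Transform childBone = animator.GetBoneTransform(child);

        // World-space vector from parent to child, rotated into the parent's frame.
        return Quaternion.Inverse(parentBone.rotation)
             * (childBone.position - parentBone.position);
    }
}

// Example: the fixed translation for the left lower leg relative to the left upper leg.
// Vector3 leftShinOffset = BoneLengthExtractor.LocalBoneOffset(
//     animator, HumanBodyBones.LeftUpperLeg, HumanBodyBones.LeftLowerLeg);
```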


This discussion brings a lot to the table; we'll update our documentation accordingly. Thanks for asking all this!

Best,
Jean-Loup