Improve Body Tracking Accuracy

Description
I am using the Body Tracking module to track the movement of a single person. The camera is mounted on a video wall that is 3m high, and the tracked person stands 1-2m away from the wall. When people standing behind the tracked person overlap with the tracked person's arm, the arm tracking becomes inaccurate.

The program uses SDK version 4.0.3 with the following parameters:

InitParameters

{
    resolution = RESOLUTION.AUTO,
    cameraFPS = 60,
    depthMode = DEPTH_MODE.NEURAL,
    coordinateUnits = UNIT.METER,
    coordinateSystem = COORDINATE_SYSTEM.LEFT_HANDED_Y_UP,
};

Positional Tracking Parameter

{
     setAsStatic = true
};

Body Tracking Parameter

{
     enableObjectTracking = true,
     enableSegmentation = false,
     enableBodyFitting = true,
     imageSync = true,
     detectionModel = BODY_TRACKING_MODEL.HUMAN_BODY_ACCURATE,
     bodyFormat = BODY_FORMAT.BODY_18,
     maxRange = 3.5
 };

Body Tracking Runtime Parameter

{
    detectionConfidenceThreshold = 40
};

Is there anything I can do to improve the tracking accuracy when someone is standing behind the tracked person?

Hi @iamdickyjai, welcome to the forums! :wave:

First, if possible, I invite you to update the SDK. We’ve made improvements to the depth since 4.0.3 was released that could help.

Using NEURAL, as you already do, is the first step. Using a higher resolution should help too (1080p instead of AUTO, which would be 720p).
Playing around with the confidence threshold may also give you better results.

At this distance and in this configuration, you should not encounter that many issues. I invite you to record an SVO of the problematic configuration and test settings on it. You can also send it either here or to support@stereolabs.com if you want us to take a deeper look (please reference this ticket in your email).

Hi,

I have updated the SDK to the latest version. Since I cannot access the video wall at the moment, I tried it in the office instead, and I find that the problem persists. Images 1 and 2 show the arm overlapping with the people in the background; the circled point is the wrist position detected by the camera.

The configuration is as below:
InitParameters
init_params.resolution = RESOLUTION.HD1080;
init_params.depthMode = DEPTH_MODE.NEURAL;
init_params.coordinateUnits = UNIT.METER;
init_params.coordinateSystem = COORDINATE_SYSTEM.LEFT_HANDED_Y_UP;
//init_params.depthStabilization = 50;

PositionalTrackingParameters
setAsStatic = true,
enablePoseSmothing = true,
setFloorAsOrigin = true,
mode = sl.POSITIONAL_TRACKING_MODE.QUALITY

BodyTrackingParameters
enableObjectTracking = true,
enableSegmentation = false,
enableBodyFitting = true,
imageSync = true,
detectionModel = sl.BODY_TRACKING_MODEL.HUMAN_BODY_ACCURATE,
bodyFormat = sl.BODY_FORMAT.BODY_18;

BodyTrackingRuntimeParameters
detectionConfidenceThreshold = 50;
skeletonSmoothing = 0.5f;

I tried to record an SVO file, but it returns a CAMERA_NOT_DETECTED error.

Thank you for your time and assistance.

Regards,

Kelvin

Hi @iamdickyjai
An SVO would definitely help investigate this.

To record an SVO easily, close all applications that use the ZED, open ZED Explorer, and record a sequence from it.

My first idea, seeing the images, would be that the uniform, dark, loose clothing doesn’t help the tracking, but I’m surprised it would be to this extent.

Hi,

You mentioned that dark clothing doesn’t help the tracking. Can I use tools like OpenCV to perform image processing on the grabbed image and feed it back to the tracking model?

I have attached the SVO file. I hope it can help the investigation.

Regards,
Kelvin

Hi Kelvin,

You won’t be able to send images into our SDK; it only works with direct input from our camera (or an SVO file). We have not implemented a “Custom Body Tracking” like the custom detector we have for Object Detection.

On the SVO you sent, the main issue is the motion blur, which results in very poor depth. Increasing the fps is important if you have high-speed movement.
There is also the overlap between the elbow of the person in the back and the elbow of the person in the front, which may decrease confidence. You can see the forearms switching skeletons because the 2D detection hesitates over whose they are.
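
To get a rough feel for why the frame rate matters for fast movement, the sketch below computes an upper bound on how far a hand can travel while one frame is exposed. The 2 m/s hand speed is an assumed illustrative value, not something measured from the SVO, and the real blur depends on the camera's exposure time, which is at most one frame interval.

```python
def max_travel_per_frame_m(speed_m_per_s: float, fps: float) -> float:
    """Upper bound on the distance moved while one frame is exposed.

    Assumes the exposure lasts at most one full frame interval (1 / fps).
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return speed_m_per_s / fps


ASSUMED_HAND_SPEED = 2.0  # m/s, hypothetical value for a fast arm movement

for fps in (15, 30, 60):
    travel_cm = 100 * max_travel_per_frame_m(ASSUMED_HAND_SPEED, fps)
    print(f"{fps:>2} fps: up to {travel_cm:.1f} cm per frame")
```

Doubling the frame rate halves this bound, which is why a higher fps tends to give sharper images and cleaner depth on moving limbs.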

I used your exact parameters and tried to reproduce the issue on my camera with a setup resembling yours, but couldn’t. It happened only when I hid my hand in my sleeve, which is not the case in the pictures you sent.

If you can provide me with an SVO and a code sample reproducing your first issue, I would be very happy to continue investigating this, but my current conclusion is:

  • increase the fps, which will improve the tracking (1080p at 30 fps should work wonders)
  • try body models other than accurate; in some situations I got more stable results with fast than with accurate on the SVO you sent
  • if the issue still arises, please try disabling fitting, and then disabling both fitting and tracking, in the body tracking parameters

Also, one question:

  • Can you also provide a full code sample reproducing your issue?

Hi,

Thank you for the detailed explanation on my issue.

I cannot provide the exact SVO that reproduces the first issue, but I have attached a text file with some code fragments from my program. It keeps tracking the right hand from bodyData and checks whether the hand goes above the neck.
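
For readers without the attachment, a check of this kind can be sketched as follows. This is a minimal illustration, not the code from the attached file. It assumes BODY_18 indices of 1 for the neck and 4 for the right wrist, that missing keypoints arrive as non-finite values, and that Y points up as in LEFT_HANDED_Y_UP. The `HandRaiseDetector` class and its margins are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple

# Assumed BODY_18 keypoint indices; verify against your SDK version.
NECK = 1
RIGHT_WRIST = 4

Point3 = Tuple[float, float, float]


def is_valid(p: Point3) -> bool:
    """A keypoint is usable only if all of its coordinates are finite."""
    return all(math.isfinite(c) for c in p)


@dataclass
class HandRaiseDetector:
    """Reports 'raised' with hysteresis so small jitter does not toggle it."""
    raise_margin_m: float = 0.05  # wrist must be this far above the neck to switch on
    lower_margin_m: float = 0.0   # wrist must drop below neck + this to switch off
    raised: bool = False

    def update(self, keypoints: Sequence[Point3]) -> bool:
        wrist, neck = keypoints[RIGHT_WRIST], keypoints[NECK]
        if not (is_valid(wrist) and is_valid(neck)):
            return self.raised  # keep the last state when a joint is missing
        height = wrist[1] - neck[1]  # Y is up in LEFT_HANDED_Y_UP
        if not self.raised and height > self.raise_margin_m:
            self.raised = True
        elif self.raised and height < self.lower_margin_m:
            self.raised = False
        return self.raised
```

Holding the previous state when a joint is missing is a design choice that avoids flicker during brief occlusions; it does not correct a wrist that the SDK places at the wrong position.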

I also tried the Body Tracking example from GitHub. In that example, the skeleton's hand did not rise when the user raised their hand.

Thanks again for looking into this. Looking forward to hearing your thoughts.

Regards,
Kelvin

Hi Kelvin,

Unfortunately, without an SVO with your particular setup, it’s difficult to go further than my previous advice.

I invite you to try other combos of resolution and body models (FAST, MEDIUM, ACCURATE) to see if the results can improve.
Please also check the quality of the depth in your setting. The lighting setup seems difficult, which could complicate the depth calculation and maybe even the 2D skeleton detection.
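
One way to organize such a comparison is to enumerate the combinations up front and run each against the same scene. The sketch below only builds the list of settings; it does not call the SDK. The enum names are kept as plain strings, and `build_test_matrix` is a hypothetical helper. Note that an SVO is recorded at a fixed resolution, so the resolution axis needs one recording per resolution.

```python
from itertools import product

# Enum member names kept as plain strings; nothing here talks to the camera.
RESOLUTIONS = ["HD720", "HD1080"]
BODY_MODELS = ["HUMAN_BODY_FAST", "HUMAN_BODY_MEDIUM", "HUMAN_BODY_ACCURATE"]
# (enableBodyFitting, enableObjectTracking): default, fitting off, both off.
FITTING_TRACKING = [(True, True), (False, True), (False, False)]


def build_test_matrix():
    """Return one settings dict per combination to try on the same scene."""
    return [
        {
            "resolution": res,
            "detectionModel": model,
            "enableBodyFitting": fitting,
            "enableObjectTracking": tracking,
        }
        for res, model, (fitting, tracking) in product(
            RESOLUTIONS, BODY_MODELS, FITTING_TRACKING
        )
    ]


for i, cfg in enumerate(build_test_matrix(), start=1):
    print(i, cfg)
```

Changing one axis at a time against a fixed recording makes it easier to see which setting actually affects the arm tracking.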

Hi,

Thanks for your help, I will try your suggestions.

Regards,
Kelvin