3D Object tracking of dancers

Hi, I’m looking for some general information about how well I would be able to track one particular dancer in a group of 5 dancers. Their outfits would be similar and there would be frequent occlusions of the target by other dancers.

The setup would be an Orin NX or a Jetson Nano with either a ZED Mini or a ZED 2i. Ideally, the smaller the better.

Would the 3D object tracking be able to maintain a robust track in this situation?

Hi @nanashi
Welcome to the Stereolabs community.

What I can say is that it’s almost impossible to do what you described using a single camera.
More than one camera is recommended to improve detection under occlusions.
Furthermore, the Jetson Nano is definitely not recommended because it is not powerful enough to run precise skeleton-tracking models in real time.

I recommend you consider a multi-camera approach using Orin NX for local skeleton tracking and a central server PC to fuse all the information together.
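To make the multi-camera idea concrete, here is a minimal sketch of what the fusion server conceptually does: each camera reports the target's 3D position in a shared world frame plus a detection confidence, and the server combines them with a confidence-weighted average. The function name and data shapes are illustrative assumptions, not the actual ZED SDK Fusion API.

```python
# Hypothetical fusion step: combine per-camera target positions
# (in a shared world frame) weighted by detection confidence.
# This is NOT the Stereolabs Fusion API, just the underlying idea.

def fuse_positions(observations):
    """observations: list of ((x, y, z), confidence) tuples."""
    total = sum(conf for _, conf in observations)
    if total == 0:
        return None  # no camera currently sees the target
    return tuple(
        sum(pos[i] * conf for pos, conf in observations) / total
        for i in range(3)
    )

# Camera A sees the dancer clearly; camera B's view is partially occluded.
fused = fuse_positions([
    ((1.0, 0.0, 3.0), 0.9),   # camera A: high confidence
    ((1.2, 0.1, 3.4), 0.3),   # camera B: occluded, low confidence
])
```

The occluded camera still contributes, but the confident view dominates, which is why a second viewpoint keeps the track alive through occlusions.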

Read more here:

Thanks for the response, and sorry, I meant the Jetson Orin Nano Super, not the old regular Jetson Nano. Would that be powerful enough for the tracking you are referring to? Currently I have:

1x Orin NX

1x Orin Nano super

1x ZED 2i

1x ZED Mini

My goal is to keep a regular, non-stereo camera on a moving platform oriented toward the selected dancer. I could have one of the cameras fixed to something high up, near or over the dance area, and pointed downward.

It then seems like this would require the mobile camera attached to the platform to locate itself via SLAM in a unified coordinate space with the fixed camera. Fusion sounds like it could accomplish this, but do you see anything that could be problematic in this scenario? For example, drift in the coordinate-space synchronization over time.
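The reorientation step being described reduces to simple geometry once both devices share a world frame: the platform's own pose comes from SLAM, the target position comes from the fixed camera's tracking, and the pan correction follows. A minimal 2D sketch, with made-up function names (not any Stereolabs API):

```python
import math

# Sketch: compute how far the platform must pan to face the target.
# platform pose is assumed to come from SLAM, target position from
# the fixed camera's tracking, both in the shared world frame.

def pan_to_target(platform_xy, platform_yaw, target_xy):
    """Return the pan correction (radians) to face the target."""
    dx = target_xy[0] - platform_xy[0]
    dy = target_xy[1] - platform_xy[1]
    bearing = math.atan2(dy, dx)          # world-frame direction to target
    error = bearing - platform_yaw        # how far off our heading is
    return math.atan2(math.sin(error), math.cos(error))  # wrap to [-pi, pi]

# Platform at the origin facing +x; target 2 m ahead and 2 m to the left.
pan = pan_to_target((0.0, 0.0), 0.0, (2.0, 2.0))  # -> pi/4
```

Note that any drift in either pose estimate shows up directly as a pointing error here, which is exactly why the coordinate-space synchronization question matters.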

Would a higher camera angle affect human detection and tracking?

A higher angle would show more of the head and shoulders and less of the lower body, but would also suffer much less occlusion. Skeleton tracking is not necessary; only position is needed. What would be the ideal angle for the fixed camera?
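The intuition that a higher camera sees over nearer dancers can be quantified with basic line-of-sight geometry. The sketch below uses made-up example heights and distances and ignores lens/tilt effects; it just computes what fraction of the target remains visible past an occluding dancer.

```python
# Rough 2D line-of-sight model: how much of a target dancer is visible
# past a nearer dancer, for a given camera mounting height.
# All values are illustrative assumptions.

def visible_fraction(cam_h, dancer_h, near_d, target_d):
    """Fraction (0..1) of the target dancer visible past a nearer one.

    cam_h: camera mounting height (m); dancer_h: dancer height (m);
    near_d / target_d: horizontal distances (m), with target_d > near_d.
    """
    # Height at which the sight line grazing the near dancer's head
    # crosses the target's distance.
    ray_h = cam_h - (cam_h - dancer_h) * target_d / near_d
    visible = (dancer_h - ray_h) / dancer_h
    return max(0.0, min(1.0, visible))

# Occluder 3 m out, target 4 m out, dancers 1.8 m tall.
low = visible_fraction(3.0, 1.8, 3.0, 4.0)   # camera at 3 m: partly hidden
high = visible_fraction(8.0, 1.8, 3.0, 4.0)  # camera at 8 m: fully visible
```

So for position-only tracking, higher mounting trades a more foreshortened view of the body for substantially fewer occlusions.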

Should the mobile camera also contribute to detection and tracking or solely do SLAM to properly locate itself and just reorient the platform to target coordinates provided by the fixed camera’s tracking?

Thank you!

You can use it to perform single-camera Skeleton Tracking locally and send the result to the fusion server.

Simultaneously, each camera can be used to perform Positional Tracking.

Please note that there’s no processing onboard the cameras; the host device performs detection and localization using the cameras’ streams.
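The localization step described above boils down to composing two transforms on the host: the detection gives the target's position in the camera frame, Positional Tracking gives the camera's pose in the world frame, and chaining them places the target in the shared world frame. A minimal sketch with a yaw-only rotation for brevity (the real pose is a full 6-DoF transform; names are illustrative, not the ZED SDK API):

```python
import math

# Sketch: camera-frame detection + camera world pose -> world-frame target.
# A yaw-only rotation keeps the math short; a real pipeline would use the
# full rotation matrix or quaternion from positional tracking.

def camera_to_world(cam_pos, cam_yaw, point_cam):
    """Transform a point (x, y, z) from camera frame to world frame."""
    c, s = math.cos(cam_yaw), math.sin(cam_yaw)
    x, y, z = point_cam
    return (
        cam_pos[0] + c * x - s * y,  # rotate about the vertical axis,
        cam_pos[1] + s * x + c * y,  # then translate by the camera pose
        cam_pos[2] + z,
    )

# Camera mounted at (10, 0, 3) in the world, yawed 90 degrees;
# dancer detected 2 m straight ahead, 3 m below the camera.
world = camera_to_world((10.0, 0.0, 3.0), math.pi / 2, (2.0, 0.0, -3.0))
```

This is the piece that lets the fixed camera's detections and the mobile camera's SLAM pose live in one coordinate space.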