Human Step Detection - Best Approach?

I’m trying to work out the best way to determine that a tracked skeleton is taking a step. I’ve tried using changes in the distance from the knee to the hip and so on, but in my experience leg tracking isn’t very accurate with the Zed2, even at the highest accuracy settings (arms are fine). I’ve considered the height of the feet from the floor, but there’s too much variability there: the avatar’s feet don’t always land on the floor and can float slightly above or below it, and again the accuracy isn’t great. Maybe it’s related to our black floors? I also thought I could look at changes in the bend of the knees themselves. Basically I just want to count the number of steps when someone is walking or jogging on the spot, and I’m not having much luck. Any suggestions would be appreciated! Detecting arm raises is another story: that works great.

Hi Chuck,

That’s an interesting question.

Even if the feet are slightly above or below the floor, the offset should be relatively constant. Most of the time, this error comes from the camera’s inability to correctly estimate its height (the black floor may impact that estimation).
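If the offset really is constant, you can calibrate it away. As a minimal plain-Python sketch (not ZED SDK code; the foot-height samples are assumed to come from your own skeleton data), take the median foot height over a short window while the player stands still, then subtract it from later readings:

```python
from statistics import median

def estimate_floor_offset(standing_foot_heights):
    """Estimate the constant floor offset from foot-height samples
    taken while the player stands still; the median shrugs off
    occasional tracking jitter."""
    return median(standing_foot_heights)

def height_above_floor(raw_height, offset):
    """Foot height relative to the calibrated floor plane."""
    return raw_height - offset

# Feet reported as floating ~3 cm above the floor, with one jitter spike.
samples = [0.031, 0.029, 0.030, 0.095, 0.028]
offset = estimate_floor_offset(samples)   # -> 0.030
print(height_above_floor(0.030, offset))  # -> 0.0
```

Re-running the calibration whenever the players reset keeps the threshold meaningful even if the camera’s floor estimate drifts between sessions.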

If latency is not that important, I’d smooth the data to avoid all the jitter and keep only the “real” movements.
Instead of the position, you might try using the velocity of the feet to detect when each foot reaches the floor.
Another alternative is to compute the distance between the left and right feet.
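To illustrate the smoothing plus left/right-feet idea, here is a rough plain-Python sketch (not ZED SDK code; the window size, threshold, and per-frame foot heights are assumptions you would replace with your own skeleton data). It smooths the signed left-minus-right foot-height difference with a moving average, then counts one step each time the smoothed signal crosses a hysteresis threshold in a new direction:

```python
from collections import deque

def count_steps(left_y, right_y, smooth_n=5, threshold=0.08):
    """Count steps from per-frame left/right foot heights (metres).

    The difference left - right swings positive when the left foot is
    up and negative when the right foot is up.  After a moving-average
    smooth, each crossing of +/- threshold in a new direction counts as
    one step; the threshold acts as hysteresis so jitter near zero is
    ignored.  Sketch only -- tune smooth_n/threshold to your data.
    """
    window = deque(maxlen=smooth_n)
    steps = 0
    state = 0  # +1: left foot up, -1: right foot up, 0: unknown
    for l, r in zip(left_y, right_y):
        window.append(l - r)
        d = sum(window) / len(window)
        if d > threshold and state != 1:
            state = 1
            steps += 1
        elif d < -threshold and state != -1:
            state = -1
            steps += 1
    return steps

# Simulated jog on the spot: each foot lifted 20 cm for 10 frames, twice.
left = ([0.2] * 10 + [0.0] * 10) * 2
right = ([0.0] * 10 + [0.2] * 10) * 2
print(count_steps(left, right))  # -> 4
```

Because the detector looks at the *difference* between the feet, any constant floor offset cancels out, which sidesteps the floating-feet problem you mentioned.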

Also, can I ask you how the camera is positioned? It might be looking too downward if the leg accuracy is significantly worse than the upper body detection.

Stereolabs Support

The camera is mounted at the top center of the back wall of my 40x40-foot room, 9 feet above the ground. The ceiling and floor are black, and this 40-foot wall acts as a giant projection screen. There are 8 skeletons being tracked in 2 rows. The ones in the front row, roughly 15 feet from the camera, obviously track better than the ones 28 feet from the camera, since they appear smaller. I’ve got the resolution set to 1080p, and I wish that supported 60 fps, but I digress :).

Anyway, as for step tracking, even with the most accurate models selected, the feet and legs don’t track that well. I know I could probably get better results with a second or third Zed2i, but I’m worried that would add too much latency, and I want real-time performance; for this project accuracy isn’t “too” important as long as I can at least detect that someone is walking on the spot or raising a knee. When I say the legs don’t track well, I mean it looks like the person is moonwalking or the feet are sliding, that sort of thing. The problem could be the lighting: since I don’t want ambient light on my projection wall, I’m using LED floodlights, and in ZED Explorer I can see the room isn’t as bright as it probably should be for good tracking.

I’ve tried measuring foot distance from the floor and knee height relative to the pelvis, but no matter what thresholds I use, the results aren’t good enough. Sometimes the player has to really kick their legs up high (which I, for one, physically can’t do) to get it to register, especially in the back row furthest from the camera. However, from the animations in the body tracking demo I can see there is enough good tracking to work with, if only I could find the best way to use it.

Do you think scanning the room and turning on spatial mapping would help with foot placement accuracy?

I really like your idea of foot separation, I’ll try that next and see how it goes.

Hi,

From my experience, the quality decreases significantly after reaching 8m (~26 feet). Moreover, the person in the second row might get occluded by those in the first row, which does not help either.

Adding a second camera, looking at the back of the area, would help for sure but this would slightly increase the latency as we need to synchronise the data from both cameras. This adds around 1 frame of latency.

Stereolabs Support

Today, during focus-group testing of my Zed2 educational game, body tracking didn’t register the smaller kids in the back row when they raised their arms. It was almost always kids shorter than 48" (typically around 6 years old); this hadn’t surfaced before because testing had always been done with teens and adult employees. It happened whether or not someone in the front row was potentially blocking part of the camera view. The front row always worked well, but the back row, roughly 8 feet further back, had the problems. I’d say they were about 6 meters from the camera.

I thought maybe it was the resolution, so I increased from 1080p to 2K, but the problem remained. I also tried increasing the lighting, and that didn’t help either. So I’m wondering if maybe the model wasn’t trained on kids :) or maybe the combination of their small stature and the distance just didn’t give the model enough pixels to work with? I have it set to ULTRA depth mode, HD2K, 30 FPS, static tracking. Any suggestions for other settings I could try?

Here is a video, the little girl on the right was one of the ones having the problems:

Zed2 Game Testing

Hi,

Yes, it’s possible our models did not see as many kids as adults and are not detecting them as accurately. I’ll ask the team for more details and whether we can improve this without retraining our model.

You might get better results if you reduce the detection_confidence_threshold value. The model is probably less confident with kids and is filtering them out.
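To illustrate what the threshold does (a hypothetical plain-Python sketch, not the SDK’s internal logic; the IDs and confidence scores are made up), detections whose confidence falls below the threshold are simply dropped, so lowering it keeps the less-confident child detections:

```python
def keep_detections(detections, threshold):
    """Keep detections whose confidence (0-100) meets the threshold,
    mirroring how a confidence-threshold filter behaves."""
    return [d for d in detections if d["confidence"] >= threshold]

# Hypothetical per-frame scores: adults score high, the small child
# in the back row scores low.
frame = [
    {"id": "adult_front", "confidence": 92},
    {"id": "adult_back", "confidence": 71},
    {"id": "child_back", "confidence": 45},
]
print([d["id"] for d in keep_detections(frame, 60)])  # child dropped
print([d["id"] for d in keep_detections(frame, 40)])  # child kept
```

The trade-off is more false positives, so you may need to discard implausible skeletons (e.g. ones that flicker in for a single frame) yourself.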

Increasing the resolution is a good idea, but doing so also reduces the frame rate, which may hurt tracking quality during fast movements. I’d stick with 1080p/30fps; it’s probably the best compromise.