ZED X Depth Noise on moving ball (~10 meters)

Hi,

I’m experimenting with a ZED X camera (2.2 mm lens) to track moving balls (quite slow ones).

Here the ball comes from high up, bounces on the ground and then on the wall.
The camera has approximately the view shown in the plot.
We would expect a roughly parabolic flight, but instead we get a sawtooth pattern oscillating along the depth axis, with an error of more than 1 meter, which is huge. The approximate distance is ~8 meters.

The depth is calculated using the NEURAL_PLUS model.

Does that ring a bell? Is there some way I could improve the depth? I thought about some kind of smoothing, but that’s not entirely satisfactory, and I feel like we’re too far from the truth.

I made a viewer to see the RGB and depth images, both zoomed around the cursor.

It shows me that the ball is “seen” at a consistent depth level (blue pixels = low difference w.r.t. the aimed pixel = area of the same depth).

Hi,

Thank you for reaching out.

Can you give more details about your SDK version and the parameters you use? The SDK API contains a depth stabilization parameter that could be increased for more accuracy on moving objects: https://www.stereolabs.com/docs/api/structsl_1_1InitParameters.html#a5fedf3f4c6eb7e39739b459903b51549.
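For example, the parameter can be set when opening the camera like this (a minimal configuration sketch; the stabilization value is illustrative and should be tuned for your scene):

```python
import pyzed.sl as sl

init_params = sl.InitParameters()
init_params.depth_mode = sl.DEPTH_MODE.NEURAL_PLUS
# Temporal depth stabilization: higher values smooth more over time
# (illustrative value, tune for your scene).
init_params.depth_stabilization = 50

zed = sl.Camera()
if zed.open(init_params) != sl.ERROR_CODE.SUCCESS:
    raise RuntimeError("Failed to open camera")
```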

If your use case is tracking a ball, you can also use our 3D Object Detection module with custom YOLO models to help get a better estimate of your ball position: https://www.stereolabs.com/docs/object-detection/custom-od

best

Stereolabs Support

Hi @RodolphePerrin thanks for your support.

  1. SDK:
python -c "import pyzed.sl as sl; print(sl.Camera().get_sdk_version())"
5.1.1
  2. Recording parameters: HD1200, 60 fps, NEURAL_PLUS, H264_LOSSLESS
  3. I’ve been setting the camera as static and tweaking stabilization without much success. I see some changes, but the improvements, if any, are an order of magnitude smaller than the depth error.
  4. Ball position is annotated manually for now by clicking, then aggregating the positions of the neighborhood (depth contour) => we see quite a local consensus over the depth / confidence of the ball contour.
  5. It still results in poor-looking trajectories.

Another ZED X camera has been set up, so I’m curious whether I could improve the depth cloud using two ZED X cameras?

Hi @abal
First of all, I recommend you install the latest ZED SDK v5.2.3.

Does this mean that you are analyzing recorded SVO data?
Please note that depth is not recorded in SVO, but only raw stereo and inertial sensor streams.

How do you extract the depth information? Do you get a single point value, or do you perform some kind of average?
We normally recommend using a median average, rejecting INF and NAN values.
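That recommendation can be sketched in a few lines of NumPy (the function name is ours, not an SDK call; it assumes the depth ROI has already been retrieved as an array):

```python
import numpy as np

def robust_patch_depth(depth_patch):
    """Median of a depth ROI, rejecting INF and NaN values."""
    valid = depth_patch[np.isfinite(depth_patch)]  # drops NaN and +/-inf in one pass
    if valid.size == 0:
        return float("nan")  # no usable depth in the patch
    return float(np.median(valid))
```

The median is much less sensitive than the mean to the occasional background pixel that leaks into the ROI around the ball.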

What’s the size of the ball?
The depth accuracy at 10 m is ~4% with respect to the depth value, meaning +/- 40 cm of depth precision…
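As a quick sanity check on that figure (the 4% number comes from the message above):

```python
# Stated relative depth accuracy at this range (~4% of the depth value).
relative_accuracy = 0.04
depth_m = 10.0

# Expected depth uncertainty at 10 m: 4% of 10 m = 0.4 m, i.e. +/- 40 cm.
uncertainty_m = relative_accuracy * depth_m
print(uncertainty_m)  # 0.4
```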

  1. Indeed, I’m analyzing recorded SVO
  2. Depth is computed on the fly during analysis using the NEURAL_PLUS model (I open the camera with the SVO and NEURAL_PLUS, then just grab()).
  3. The depth is averaged considering a region of interest (previous screenshot).
  4. The ball is around 6.5 cm.

Interestingly, the depth confidence in the ROI is pretty strong. The variance is mostly temporal, with big changes from one frame to the next.

Using two cameras shows poor agreement between the two world positions, making it unclear how to actually combine / fuse them.
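Once both cameras are calibrated into a shared world frame (which this assumes), a common way to fuse two noisy position estimates is per-axis inverse-variance weighting, so each camera contributes most where it is most reliable. A hedged sketch, with all variances as placeholders you would estimate from your own data:

```python
import numpy as np

def fuse_positions(p1, var1, p2, var2):
    """Per-axis inverse-variance weighted fusion of two 3D estimates.

    p1, p2: (3,) position estimates from the two cameras.
    var1, var2: (3,) per-axis measurement variances (e.g. larger along
    each camera's own depth axis). Returns fused position and variance.
    """
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    w1 = 1.0 / np.asarray(var1, float)
    w2 = 1.0 / np.asarray(var2, float)
    fused = (w1 * p1 + w2 * p2) / (w1 + w2)
    fused_var = 1.0 / (w1 + w2)  # fused estimate is tighter than either input
    return fused, fused_var
```

With equal variances this reduces to the plain average; when one camera's depth axis is noisier, its contribution along that axis shrinks accordingly.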

Please note that the SVO does not store depth information, but only the raw stereo stream and the IMU information.

You can change the depth model when opening the SVO file to test different configurations.
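For example (a configuration sketch; the file path is a placeholder):

```python
import pyzed.sl as sl

init_params = sl.InitParameters()
init_params.set_from_svo_file("recording.svo2")      # placeholder path
init_params.depth_mode = sl.DEPTH_MODE.NEURAL_PLUS   # or another mode to compare
init_params.svo_real_time_mode = False               # process every frame, no drops

zed = sl.Camera()
if zed.open(init_params) == sl.ERROR_CODE.SUCCESS:
    depth = sl.Mat()
    while zed.grab() == sl.ERROR_CODE.SUCCESS:
        zed.retrieve_measure(depth, sl.MEASURE.DEPTH)  # depth computed on the fly
```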

Yes, but then I’m running the cameras from the SVO2 + depth configuration, so it’s computed on the fly. Right?!

Yes, exactly.
What SVO mode are you using, “real-time” or not?

If you enable svo_real_time_mode and the processing power is not enough, then the SDK will drop frames to keep grab frequency.

By the way, to obtain good tracking, I recommend implementing a Kalman Filter to take the depth noise into account.

I’m not in real time, and I’m running on another machine with an RTX 3080 Ti. I’m annotating the video frame by frame by clicking on the ball, and the interface waits on grab() to actually display the image, so by design we wait for each frame to be available.

I’ve been wondering about applying a Kalman filter, but then I thought this might also make the trajectory “rounder”, e.g. turning sharp direction changes into smoother curves, which is not really wanted. We would like to identify which axis / 3D vector carries the most error, so we can apply more smoothing in that direction. Does that make sense?
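That idea maps directly onto a standard Kalman filter: rather than smoothing all axes equally, give the measurement-noise covariance R a much larger variance along the depth axis, so the filter trusts the lateral measurements and corrects mostly the depth. A minimal constant-velocity sketch in NumPy (all noise values are illustrative assumptions, not measured):

```python
import numpy as np

def make_cv_filter(dt, depth_sigma=0.4, lateral_sigma=0.02, accel_sigma=5.0):
    """Constant-velocity Kalman matrices with anisotropic measurement noise.

    State: [x, y, z, vx, vy, vz]; measurement: [x, y, z].
    depth_sigma >> lateral_sigma encodes that most error lies on the
    depth (z) axis; all sigmas here are illustrative placeholders.
    """
    F = np.eye(6)
    F[:3, 3:] = dt * np.eye(3)                    # position += velocity * dt
    H = np.hstack([np.eye(3), np.zeros((3, 3))])  # we only measure position
    R = np.diag([lateral_sigma**2, lateral_sigma**2, depth_sigma**2])
    Q = accel_sigma**2 * dt**2 * np.eye(6)        # crude process noise
    return F, H, R, Q

def kalman_step(x, P, z, F, H, R, Q):
    """One predict + update step."""
    x = F @ x
    P = F @ P @ F.T + Q
    y = z - H @ x                                 # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)                # Kalman gain
    x = x + K @ y
    P = (np.eye(6) - K @ H) @ P
    return x, P
```

A large process noise (accel_sigma) keeps the filter responsive at bounces, so sharp direction changes are not rounded off, while the large depth variance still suppresses the sawtooth along that axis.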