Hi Team,
I’m working with the ZED 2i camera and have set the capture FPS to 60, but in practice, I’m observing a significantly reduced frame rate of around 3 FPS.
- I’m using grab() followed by fusion operations (IMU, magnetometer, GPS) and running YOLO-based inference on each frame.
- These downstream processing steps (fusion + inference) are taking longer than the camera’s frame interval, resulting in dropped frames.
- I’m aware that grab() drops frames if it is not called in real time, as per the documentation.
- The camera is mounted on a moving vehicle, so the low frame rate results in loss of critical spatial and temporal data, which affects downstream geolocation and asset detection accuracy.
My Questions:
- What are the best practices to avoid losing frames while performing heavy computation (fusion + inference)?
- Is there a recommended multi-threading or queue-based strategy to decouple grab() from post-processing?
- Can we access raw camera buffers asynchronously or cache them before processing to maintain real-time behavior?
- Would using an asynchronous pipeline with grab() in a producer thread and fusion/inference in consumer threads be viable with the SDK?
Any guidance on maintaining high frame rate while still running complex inference would be greatly appreciated.
Thanks!
Hi @karthikreddy157, may I ask how you were able to observe the frame rate at runtime? Sorry that I cannot help out, by the way.
Hi @immanueln98
No worries at all, appreciate your interest!
I actually follow two approaches to monitor the frame rate during runtime:
- Logging timestamps in my main loop – I log the current second and count how many frames are processed per second (a minimal sketch of this is below the list).
- Enabling recording – I enable recording and then use ZED Explorer to play back the .svo file and check the actual frame rate visually.
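For illustration, here is a minimal sketch of the first approach: it simply counts successful grab() calls per one-second window (the fusion and inference steps are elided, and the variable names are just illustrative):

```cpp
#include <sl/Camera.hpp>
#include <chrono>
#include <iostream>

int main() {
    sl::Camera zed;
    sl::InitParameters init_params;
    init_params.camera_fps = 60;  // requested capture rate
    if (zed.open(init_params) != sl::ERROR_CODE::SUCCESS) return 1;

    int frames = 0;
    auto window_start = std::chrono::steady_clock::now();

    while (true) {
        if (zed.grab() == sl::ERROR_CODE::SUCCESS) {
            ++frames;
            // ... fusion + inference run here in my pipeline ...
        }
        auto now = std::chrono::steady_clock::now();
        if (now - window_start >= std::chrono::seconds(1)) {
            std::cout << "effective FPS: " << frames << std::endl;
            frames = 0;
            window_start = now;
        }
    }
}
```

Depending on the SDK version, Camera::getCurrentFPS() may also report this value directly.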
Thanks for your response, @karthikreddy157.
Would monitoring the image topic's publishing rate (i.e., running ros2 topic hz on the image topic) while using the wrapper also suffice as a frame rate check, if I may ask?
Hi Karthik,
Thank you for your questions. Firstly, are you using C++ or Python? For better performance with multi-threading, we strongly recommend using C++.
About your use case, I understand that you are using Camera::grab() to get the frame, but you do not need the depth to run your YOLO-based inference. If I am right, I recommend you use Camera::read(), which reads the latest images and IMU data from the camera and rectifies the images, and then use Camera::retrieveImage() to get the image before calling Camera::grab(). Once you have the image, you can run your inference on it in another thread and continue your main thread with Camera::grab(), which provides more data such as the depth. Note that if Camera::read() has already been called, the Camera::grab() method knows it, so there is no additional runtime cost in doing this. Please find an example of an asynchronous pipeline in the C++ sample here: zed-sdk/object detection/custom detector/cpp/tensorrt_yolov5-v6-v8_onnx_async/src/main.cpp at master · stereolabs/zed-sdk · GitHub
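To make the structure concrete, here is a minimal sketch of that pipeline. A few assumptions: Camera::read() is assumed to take no parameters and return an sl::ERROR_CODE like grab() does (please check the linked sample and your SDK version for the exact call), sl::Mat::copyTo is used for the deep copy handed to the worker, and runInference() is just a placeholder for your YOLO model:

```cpp
#include <sl/Camera.hpp>
#include <chrono>
#include <future>

// Placeholder standing in for the YOLO-based inference step (not an SDK call).
void runInference(const sl::Mat& image) { /* run the detector on the image */ }

int main() {
    sl::Camera zed;
    sl::InitParameters init_params;
    init_params.camera_fps = 60;
    if (zed.open(init_params) != sl::ERROR_CODE::SUCCESS) return 1;

    sl::Mat left, inference_input;
    std::future<void> inference;

    while (true) {
        // read(): latest rectified images + IMU, without running the depth pipeline.
        if (zed.read() == sl::ERROR_CODE::SUCCESS) {
            // Hand a new frame to the worker only when the previous inference has
            // finished, so a slow detector never blocks the capture loop.
            bool worker_idle = !inference.valid() ||
                inference.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
            if (worker_idle) {
                zed.retrieveImage(left, sl::VIEW::LEFT);
                left.copyTo(inference_input);  // deep copy for the worker (copyTo assumed available)
                inference = std::async(std::launch::async, runInference,
                                       std::cref(inference_input));
            }
        }

        // grab() reuses the frame already fetched by read() and adds the depth etc.,
        // so fusion keeps running in the main thread at camera rate.
        if (zed.grab() == sl::ERROR_CODE::SUCCESS) {
            // ... fusion (IMU, magnetometer, GPS) on the grabbed data ...
        }
    }
}
```

In this sketch the detector always receives the most recent frame once it becomes free, which corresponds to the second option described just below.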
About best practices, it depends on your application, but there are two ways: you can design your code to run an inference as soon as a frame is available, or to retrieve a frame as soon as your latest inference ends. Please find another example of code here: zed-sdk/object detection/multi-camera/cpp/src/main.cpp at master · stereolabs/zed-sdk · GitHub, in which the grab is triggered by the main thread.
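Here is a minimal sketch of the first option, using a single-slot latest-frame buffer so that inference starts as soon as a new frame is published, while grab() and fusion keep running at camera rate in the main thread. Again, runInference() is a placeholder and sl::Mat::copyTo is assumed for the deep copies:

```cpp
#include <sl/Camera.hpp>
#include <condition_variable>
#include <mutex>
#include <thread>

// Placeholder standing in for the YOLO-based inference step.
void runInference(const sl::Mat& image) { /* run the detector on the image */ }

int main() {
    sl::Camera zed;
    sl::InitParameters init_params;
    init_params.camera_fps = 60;
    if (zed.open(init_params) != sl::ERROR_CODE::SUCCESS) return 1;

    std::mutex m;
    std::condition_variable cv;
    sl::Mat latest;             // single-slot "queue": always holds the newest frame
    bool frame_ready = false;

    // Consumer: woken as soon as a new frame is available; frames arriving while it
    // is busy are simply overwritten by newer ones in the slot.
    std::thread consumer([&] {
        sl::Mat local;
        while (true) {
            {
                std::unique_lock<std::mutex> lock(m);
                cv.wait(lock, [&] { return frame_ready; });
                latest.copyTo(local);   // deep copy, then release the lock before inference
                frame_ready = false;
            }
            runInference(local);
        }
    });

    // Producer: the grab/fusion loop stays at camera rate; publishing never blocks.
    sl::Mat grabbed;
    while (true) {
        if (zed.grab() == sl::ERROR_CODE::SUCCESS) {
            zed.retrieveImage(grabbed, sl::VIEW::LEFT);
            {
                std::lock_guard<std::mutex> lock(m);
                grabbed.copyTo(latest);  // overwrite the slot with the newest frame
                frame_ready = true;
            }
            cv.notify_one();
            // ... fusion (IMU, magnetometer, GPS) here ...
        }
    }
}
```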
Best regards,