Zed2 Crashing in Docker

System Info:

  • Host - Jetson Xavier NX (Ubuntu 20.04)
  • Camera - Zed 2
  • Docker - Ubuntu 20.04 + ROS2 Foxy

Hi all, I’m using a Zed 2 camera on the system above, which runs on our robot car. I’m able to run the Zed 2 camera inside Docker using the command in the docs. I’m also able to open RViz using the example. However, when I open both the Zed in RViz (using that command) and my Lidar sensor, the Zed camera briefly works before crashing and disconnecting. The only fix is to restart the container and unplug and replug the camera.

After digging into this, I changed the Jetson’s power mode from “10W Desktop” to “20W 6 Core”. This solved the crashing issue and allowed both RViz windows to run (I assume there’s some hardware limitation here). However, my Jetson keeps producing the warning “System throttled due to Over-current”. This happens exactly when I open the Zed RViz window.

Was the issue initially due to the hardware limitation, or were the Zed and Lidar devices clashing? Regarding power mode, is there any way to work around this?

About the Power mode, I guess it’s simply that 10W is not enough to power both the ZED and the Lidar. I don’t think they are clashing.
The over-current throttling is not a problem, but your Jetson might get warm :slight_smile:

@alassagne thanks for the reply!

I thought that power mode was the cause, but I think I just killed the Zed + Lidar ROS2 programs too quickly :frowning_face:. When I let them run for a little longer, the Zed 2 camera crashes/disconnects once again.

Any suggestions on how to troubleshoot this issue? All components (i.e., Zed, Lidar, etc) seem to work fine separately, but the moment I open another terminal(s) to run both of them, Zed stops working. Either they are clashing somehow…or the Jetson NX is unable to handle the large monitor screen (NoMachine) from which I’m accessing it.

You should definitely try without NoMachine first to eliminate this possibility.
Even with the right power mode, it’s possible that your host cannot handle a ZED2i and a Lidar. The bandwidth can be an issue, for example. Maybe try reducing the framerate and/or resolution?

Sure @alassagne, I’ll give that a try. What would you recommend instead of NoMachine? And do you think there could be any other factors involved with this issue?

I’d recommend first getting your program to run natively, without any remote software, and then adding the remote feature. Remote software packages all handle GPUs differently, and some of them are limited. For example, using OpenGL without an actual screen can be tricky.

@alassagne I’m guessing by “natively” you mean connect the Jetson NX directly to a monitor (instead of accessing via SSH / remote desktops like NoMachine). I can give it a try to see if that’s the issue.

But I think long-term we would need the remote-connection option. I’m surprised that remote software could be causing the issue with camera disconnects / system overload. Do you suggest ways to accomplish that?

It’s probably not the issue; it’s just good practice not to go remote too early. What about lowering the resolution and framerate? Did you try it?
We usually advise using one ZED per USB controller, to be sure that it does not get bullied by another device. But I guess you cannot do that in this case, because you only have one USB controller. Reducing the bandwidth is the first thing to try. If that works, you can try adding an external USB controller in a PCI Express slot.
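
To put rough numbers on the bandwidth point, here is a small back-of-the-envelope sketch. It assumes the camera sends uncompressed side-by-side frames at 2 bytes per pixel (YUV 4:2:2); that pixel format is an assumption, and the function name is made up for illustration. USB 3.0 signals at 5 Gbit/s, and the usable payload is noticeably lower than that.

```python
# Rough USB payload estimate for an uncompressed side-by-side stereo stream.
# Assumption: 2 bytes per pixel (YUV 4:2:2); real traffic also has protocol overhead.
BYTES_PER_PIXEL = 2

# Side-by-side frame sizes (the width covers both eyes) for the ZED 2 video modes.
MODES = {
    "HD2K": (4416, 1242),
    "HD1080": (3840, 1080),
    "HD720": (2560, 720),
    "VGA": (1344, 376),
}


def stream_mb_per_s(mode: str, fps: int) -> float:
    """Return the raw video payload in megabytes per second (1 MB = 1e6 bytes)."""
    width, height = MODES[mode]
    return width * height * BYTES_PER_PIXEL * fps / 1e6


if __name__ == "__main__":
    for mode, fps in [("HD1080", 30), ("HD1080", 15), ("HD720", 30), ("HD720", 15)]:
        print(f"{mode} @ {fps} fps: {stream_mb_per_s(mode, fps):6.1f} MB/s")
```

Under these assumptions, 1080p at 30 fps comes to roughly 249 MB/s, and dropping to 15 fps halves that to roughly 124 MB/s, which leaves much more headroom for other devices on the same controller.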

Thanks @alassagne for the timely responses, much appreciated.

We are indeed only using one Zed2 camera in our Robot’s Jetson NX. I’m not too familiar with the PCIExpress idea - is this a way of expanding the USB capabilities of the Jetson?

As for resolution and frame rate, is this modified in some kind of config file? I know we definitely need 1080p for our YOLO model but we could change frame rate.

Yes, it’s a way to expand the USB capabilities on pretty much any computer.
It just occurred to me to ask: are you using the ZED with a wrapper like ROS or ROS2?

Yes @alassagne. We used dustynv’s L4T 35.4.1 Jetson Docker container, re-installed SDK to have v4.6, and then installed the ROS2 wrapper (Foxy, because that’s what the rest of the robot packages are in). We also followed docs to fix the known issues with image_transport. We also installed zed-ros2-examples.

As mentioned, individual components work fine. Tested using ZED_Explorer and also ros2 launch zed_display_rviz2 display_zed2.launch.py (although this produces the system throttled warning on 20W 6 Core power mode, but not 10W Desktop).

Our robot’s lidar sensor also has a ros2 launch command to start the sensor and/or open in Rviz. It is when running both of them in 2 or more terminals that we run into this disconnect/crash issue. Whether that’s because of the power mode, other hardware limitations, or if it’s due to ROS2 packages clashing is also something we don’t know.

@kingrio are you running rviz2 on the Jetson or on a remote PC?
I wouldn’t recommend running rviz2 on the Jetson, as it is resource-hungry and can cause unpredictable behavior.

Can you confirm that the problem mostly happens when you run the lidar and the ZED simultaneously?

@Myzhar The Jetson NX is mounted on the robot car, and I’m accessing/working via Remote Desktop (NoMachine). i.e., rviz2 and everything else is running on the Jetson.

I had a suspicion that could be the issue. I’ll give it a try by just launching the Lidar and Zed with ros2 launch and no rviz2, and report back shortly.

If you are working in headless mode, you must make sure that the GPU is correctly started.
I advise you to use a dummy HDMI or dummy DisplayPort device to simulate a connected display and activate the GPU modules of the Jetson.

Read more here

Apologies for the delay. Thank you for that link. Right now we have a small HDMI screen mounted on the robot and connected to the Jetson, but we were planning to get rid of that screen soon, so this helps.

Also wanted to provide an update: as @Myzhar suggested, we tried running just Lidar and Zed (no rviz2) but ran into the same Zed crash issue. Looking closer into the error log, when the disconnect happened, a negative serial number was being printed. So we thought that the USB connection of the Zed inside Docker is somehow getting disrupted, which made me think about what @alassagne mentioned regarding USB controllers.

Long story short, our robot has a USB serial expansion board that’s connected to the Jetson and the camera was actually connected to that board instead of directly to the Jetson. So we connected the camera directly to the Jetson and tried again - Zed and Lidar worked together! Haven’t tested with rviz2 yet, but so far everything seems to work fine.

Not sure why this crash happened previously, but I believe the issue is finally resolved. Thank you both for being so fast and responsive in diagnosing this problem!

When you share different sensors on the same USB3 channel, you typically face two kinds of problems:

  • USB3 bandwidth
  • USB3 power

If the Lidar is powered by the USB port, the second option is more likely. Otherwise, it’s the first.
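
One way to check what is sharing a bus is to read the kernel's USB entries in sysfs. The following is a minimal sketch, assuming a Linux host with sysfs mounted at /sys; the helper names are made up for illustration. It lists each USB device with its bus number, port path, negotiated speed, and declared maximum current draw.

```python
from pathlib import Path


def read_attr(dev: Path, name: str) -> str:
    """Return a sysfs attribute as text, or '' if the device does not expose it."""
    try:
        return (dev / name).read_text().strip()
    except OSError:
        return ""


def list_usb_devices(root: str = "/sys/bus/usb/devices") -> list:
    """Collect basic information about every USB device found under `root`."""
    devices = []
    base = Path(root)
    if not base.is_dir():
        return devices
    for dev in sorted(base.iterdir()):
        # Names containing ':' are interfaces (e.g. 1-1:1.0), not devices.
        if ":" in dev.name:
            continue
        devices.append({
            "port": dev.name,                          # e.g. 2-1.3 = bus 2, port 1, hub port 3
            "bus": read_attr(dev, "busnum"),
            "speed_mbps": read_attr(dev, "speed"),     # 5000 means a USB 3.0 link
            "max_power": read_attr(dev, "bMaxPower"),  # declared maximum, e.g. 500mA
            "product": read_attr(dev, "product"),
        })
    return devices


if __name__ == "__main__":
    for d in list_usb_devices():
        print(f"bus {d['bus']:>2}  port {d['port']:<10} {d['speed_mbps']:>6} Mb/s  "
              f"{d['max_power']:>6}  {d['product']}")
```

Devices whose port paths share a prefix sit behind the same hub or expansion board, so a camera and a Lidar listed on the same bus are candidates for both the bandwidth and the power problem. A camera reporting a speed of 480 instead of 5000 has fallen back to a USB 2.0 link.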