ZED Area File Corruption on Long Runtimes

lstrichashl · August 28, 2025, 4:44pm

We are experiencing an issue with our ZED camera where the area memory file becomes corrupted after extended periods of operation. Once the file is corrupted, positional tracking cannot start and the camera node fails to load at all.

Problem Description:

The core issue seems to be related to the runtime duration before saving the area memory file.

Short Runtimes (e.g., under 30 seconds): If we run the camera, map an area, and save the .area file within a short timeframe (approx. 20-30 seconds), the file saves correctly. We can then successfully reload this file and initialize positional tracking without any issues.
Long Runtimes (e.g., over 2 minutes): When the camera operates for an extended period (2 minutes or more), the node fails to shut down gracefully. We initiate the shutdown process with a single SIGINT (Ctrl+C), which should trigger the saving of the area file. As shown in the logs, the system escalates to SIGTERM and finally SIGKILL to force termination (not by our code).
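The escalation described above can be reproduced outside ROS. The following is a minimal Python sketch, not the launch system's implementation: it sends SIGINT to a child process and escalates to SIGTERM and SIGKILL only after configurable grace periods. The names `stop_gracefully`, `run_demo` and `SLOW_CHILD`, and all timeout values, are hypothetical. The child is a stand-in for a node whose shutdown handler needs time to finish writing a file.

```python
import signal
import subprocess
import sys


def stop_gracefully(proc, sigint_grace=30.0, sigterm_grace=10.0):
    """Stop proc with SIGINT, escalating only after each grace period expires.

    Returns the name of the last signal that had to be sent.
    """
    proc.send_signal(signal.SIGINT)
    try:
        proc.wait(timeout=sigint_grace)
        return "SIGINT"
    except subprocess.TimeoutExpired:
        pass
    proc.terminate()  # SIGTERM
    try:
        proc.wait(timeout=sigterm_grace)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        pass
    proc.kill()  # SIGKILL cannot be caught, so any write in progress is cut off
    proc.wait()
    return "SIGKILL"


# Stand-in child: its SIGINT handler needs about one second to "save".
SLOW_CHILD = r"""
import signal, sys, time

def on_sigint(signum, frame):
    time.sleep(1.0)  # pretend to write a large file
    sys.exit(0)

signal.signal(signal.SIGINT, on_sigint)
print("ready", flush=True)
while True:
    time.sleep(0.1)
"""


def run_demo(sigint_grace):
    """Start the slow child, stop it, and report how the stop went."""
    proc = subprocess.Popen(
        [sys.executable, "-c", SLOW_CHILD], stdout=subprocess.PIPE, text=True
    )
    proc.stdout.readline()  # wait until the child has installed its handler
    last_signal = stop_gracefully(proc, sigint_grace=sigint_grace, sigterm_grace=5.0)
    proc.stdout.close()
    return last_signal, proc.returncode
```

With a grace period longer than the simulated save, the child exits cleanly after SIGINT. With a shorter one, it is terminated mid-save, which is the situation the logs below show for the real node.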

Logs:

Here are the relevant logs that illustrate the failure sequence.

1. Error During Area File Save After a Long Run:

[component_container_isolated-2] [INFO] [1756372570.245063866] [zed_upper.zed_upper_node]: Saving area memory to: '/home/test_area_memory.area' ...
[ERROR] [component_container_isolated-2]: process[component_container_isolated-2] failed to terminate '5' seconds after receiving 'SIGINT', escalating to 'SIGTERM'
[INFO] [component_container_isolated-2]: sending signal 'SIGTERM' to process[component_container_isolated-2]
[component_container_isolated-2] [INFO] [1756372575.487479308] [rclcpp]: signal_handler(signum=15)
[ERROR] [component_container_isolated-2]: process[component_container_isolated-2] failed to terminate '10.0' seconds after receiving 'SIGTERM', escalating to 'SIGKILL'
[INFO] [component_container_isolated-2]: sending signal 'SIGKILL' to process[component_container_isolated-2]
[ERROR] [component_container_isolated-2]: process has died [pid 73563, exit code -9, cmd '/opt/ros/humble/lib/rclcpp_components/component_container_isolated --use_multi_threaded_executor --ros-args --log-level info --ros-args -r __node:=zed_container -r __ns:=/zed_upper'].

2. Failure to Start with Corrupted Area File:

[component_container_isolated-2] [2025-08-28 10:32:56 UTC][ZED][ERROR] Unable to load area file
[component_container_isolated-2] [WARN] [1756377176.803425038] [zed_upper.zed_upper_node]: Pos. Tracking not started: INVALID AREA FILE
...
[component_container_isolated-2] [2025-08-28 10:33:00 UTC][ZED][ERROR] Unable to load area file
[component_container_isolated-2] [WARN] [1756377180.201687141] [zed_upper.zed_upper_node]: Pos. Tracking not started: INVALID AREA FILE
[component_container_isolated-2] [FATAL] [1756377180.201800834] [zed_upper.zed_upper_node]: It's not possible to enable the required Positional Tracking module.
[component_container_isolated-2] (Argus) Error InvalidState: Argus client is exiting with 1 outstanding client threads (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 366)
[component_container_isolated-2] WARNING Argus: 5 client objects still exist during shutdown:
[component_container_isolated-2] 	281473505878704 (0xffff0c0f3d30)
[component_container_isolated-2] 	281473506010912 (0xffff0c0f3b30)
[component_container_isolated-2] 	281473509866680 (0xffff0c1a9898)
[component_container_isolated-2] 	281473530845216 (0xffff0c1a8b30)
[component_container_isolated-2] 	281473534906912 (0xffff0c0f24b0)
[ERROR] [component_container_isolated-2]: process has died [pid 118715, exit code 1, cmd '/opt/ros/humble/lib/rclcpp_components/component_container_isolated --use_multi_threaded_executor --ros-args --log-level info --ros-args -r __node:=zed_container -r __ns:=/zed_upper'].
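One way to limit the damage from an interrupted save is to keep a copy of the last area file that loaded correctly, so a corrupted file can be rolled back instead of blocking startup. The following is a minimal sketch of that idea using only the Python standard library. It is a workaround under assumptions, not a Stereolabs recommendation: the helper names `backup_area_file` and `restore_area_file` and the `.bak` suffix are hypothetical, and the sketch cannot tell a valid area file from a corrupted one.

```python
import shutil
from pathlib import Path


def backup_area_file(area_path, suffix=".bak"):
    """Copy an existing, non-empty area file aside before it is overwritten.

    Returns the backup path, or None if there was nothing to back up.
    """
    src = Path(area_path)
    if not src.is_file() or src.stat().st_size == 0:
        return None
    dst = src.with_name(src.name + suffix)
    tmp = dst.with_name(dst.name + ".tmp")
    shutil.copy2(src, tmp)  # write the copy under a temporary name first
    tmp.replace(dst)        # then rename, so the backup is never half-written
    return dst


def restore_area_file(area_path, suffix=".bak"):
    """Put the last backup back in place. Returns True if one was restored."""
    dst = Path(area_path)
    bak = dst.with_name(dst.name + suffix)
    if not bak.is_file():
        return False
    shutil.copy2(bak, dst)
    return True
```

The backup should be taken only after the node has loaded the file successfully, for example just before the next save. Backing up a file that is already corrupted would overwrite the last good copy.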

System Environment:

ROS Version: ROS 2 Humble
Platform: NVIDIA Jetson
JetPack Version: 6.2
Camera: ZEDXM
ZED SDK Version: 5.0.5
Docker container image: stereolabs/zed:5.0-runtime-jetson-jp6.1.0

Myzhar · August 29, 2025, 10:20am

Hi @lstrichashl

Welcome to the Stereolabs community.

Does the problem persist if you trigger area file saving by calling the save_area_memory service?

Unfortunately, saving the area memory can take a long time, and the ROS 2 launch system escalates to SIGTERM (and eventually SIGKILL) if the node has not exited within 5 seconds of receiving SIGINT.
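The suggestion above amounts to saving first and shutting down afterwards, so the save is not racing the 5-second limit. A minimal Python sketch of that ordering follows. It assumes the `ros2` CLI is on the PATH; the function names, the injected `stop_node` callback and the 120-second timeout are hypothetical, and the service name and type are deliberately left as parameters because they should be checked against the output of `ros2 service list -t` on the running system.

```python
import subprocess


def build_save_call(service_name, service_type, request_yaml="{}"):
    """Build the `ros2 service call` command line as an argument list."""
    return ["ros2", "service", "call", service_name, service_type, request_yaml]


def save_then_stop(service_name, service_type, stop_node,
                   save_timeout=120.0, run=subprocess.run):
    """Call the save service and stop the node only if the call completed.

    Returns True if the call exited cleanly within save_timeout. On a
    timeout or error the node is left running so the save can be retried.
    """
    cmd = build_save_call(service_name, service_type)
    try:
        result = run(cmd, capture_output=True, text=True, timeout=save_timeout)
    except subprocess.TimeoutExpired:
        return False
    if result.returncode != 0:
        return False
    # A clean exit only shows that the call returned. The node log should
    # still be checked for the message confirming the save succeeded.
    stop_node()
    return True
```

Passing `run` as a parameter keeps the sketch testable without a ROS installation. In real use the default `subprocess.run` would be used and `stop_node` would send SIGINT to the launch process.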

lstrichashl · September 1, 2025, 11:43am

Thank you for your attention to this issue. I’ve been doing further testing specifically with the save_area_memory service and have encountered several related issues.

After a few consecutive successful runs of ros2 service call ~/save_area_memory zed_msgs/srv/SetROI "{}", I get the log message Failed to save area memory: 'FAILURE'. The full log:

[component_container_isolated-2] [INFO] [1756725436.722152954] [zed_upper.zed_upper_node]: ** Save Area Memory_Response service called **
[component_container_isolated-2] [INFO] [1756725436.722295063] [zed_upper.zed_upper_node]: Saving area memory to: '/mtb_robot_config/test.area' ...
[component_container_isolated-2] [INFO] [1756725438.721870166] [zed_upper.zed_upper_node]: ... Area memory saved successfully
[component_container_isolated-2] [INFO] [1756725446.259978890] [zed_upper.zed_upper_node]: ** Save Area Memory_Response service called **
[component_container_isolated-2] [INFO] [1756725446.260141479] [zed_upper.zed_upper_node]: Saving area memory to: '/mtb_robot_config/test.area' ...
[component_container_isolated-2] [INFO] [1756725446.265265467] [zed_upper.zed_upper_node]: ... Area memory saved successfully
[component_container_isolated-2] [INFO] [1756725454.941491917] [zed_upper.zed_upper_node]: ** Save Area Memory_Response service called **
[component_container_isolated-2] [INFO] [1756725454.941621770] [zed_upper.zed_upper_node]: Saving area memory to: '/mtb_robot_config/test.area' ...
[component_container_isolated-2] [INFO] [1756725454.946681440] [zed_upper.zed_upper_node]: ... Area memory saved successfully
[component_container_isolated-2] [INFO] [1756725479.763997388] [zed_upper.zed_upper_node]: ** Save Area Memory_Response service called **
[component_container_isolated-2] [INFO] [1756725479.764169193] [zed_upper.zed_upper_node]: Saving area memory to: '/mtb_robot_config/test.area' ...
[component_container_isolated-2] [INFO] [1756725479.769296925] [zed_upper.zed_upper_node]: ... Area memory saved successfully
[component_container_isolated-2] [INFO] [1756725484.647462211] [zed_upper.zed_upper_node]: ** Save Area Memory_Response service called **
[component_container_isolated-2] [INFO] [1756725484.647618943] [zed_upper.zed_upper_node]: Saving area memory to: '/mtb_robot_config/test.area' ...
[component_container_isolated-2] [WARN] [1756725484.647679454] [zed_upper.zed_upper_node]: Failed to save area memory: 'FAILURE'
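The timestamps in the log above show how long each save attempt took: about 2.0 s for the first one, about 5 ms for each of the next three, and under 0.1 ms for the failed one. The following Python sketch extracts those durations from node log text. The function name `save_attempts` is hypothetical, and the regular expression only assumes the line layout visible in this thread.

```python
import re
from decimal import Decimal

# Matches "[LEVEL] [seconds.nanoseconds] [logger]: message" anywhere in a line.
LINE = re.compile(
    r"\[(?P<level>INFO|WARN|ERROR)\] \[(?P<stamp>\d+\.\d+)\] \[[^\]]+\]: (?P<msg>.*)"
)


def save_attempts(log_text):
    """Return a list of (start_stamp, duration_seconds, outcome) tuples.

    Each "Saving area memory to:" line is paired with the next success or
    failure line. Decimal keeps the nanosecond timestamps exact.
    """
    attempts = []
    started = None
    for line in log_text.splitlines():
        match = LINE.search(line)
        if not match:
            continue
        stamp = Decimal(match["stamp"])
        msg = match["msg"]
        if msg.startswith("Saving area memory to:"):
            started = stamp
        elif started is not None:
            if "Area memory saved successfully" in msg:
                attempts.append((started, stamp - started, "ok"))
                started = None
            elif msg.startswith("Failed to save area memory"):
                attempts.append((started, stamp - started, "failed"))
                started = None
    return attempts
```

Logging these durations alongside the area file size may help show whether a save that returns within a few milliseconds actually rewrote the file.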

Sometimes loading the area file takes a long time. The area file in this case is 54 MB. This often forces me to stop the node manually, which then triggers the same unclean shutdown and file corruption issue detailed in my original post.
On one occasion, I was able to load a previously saved area file, but the node crashed shortly after with a fatal graph optimization error:

[component_container_isolated-2] [2025-09-01 11:31:22 UTC][ZED][INFO] Area file loading...
[component_container_isolated-2] [INFO] [1756726297.187301703] [zed_upper.zed_upper_node]: === Odometry reset for LOOP CLOSURE event ===
[component_container_isolated-2] addVertex: FATAL, a vertex with ID 59 has already been registered with this graph
[component_container_isolated-2] bool slg2o::OptimizableGraph::Edge::resolveParameters(): edge not registered with a graph
[component_container_isolated-2] addEdge: FATAL, cannot resolve parameters for edge 0xfffe70005e20
[component_container_isolated-2] addVertex: FATAL, a vertex with ID 65 has already been registered with this graph
[component_container_isolated-2] bool slg2o::OptimizableGraph::Edge::resolveParameters(): edge not registered with a graph
[component_container_isolated-2] addEdge: FATAL, cannot resolve parameters for edge 0xfffe7010f690
[ERROR] [component_container_isolated-2]: process has died [pid 1644367, exit code -11, cmd '/opt/ros/humble/lib/rclcpp_components/component_container_isolated --use_multi_threaded_executor --ros-args --log-level info --ros-args -r __node:=zed_container -r __ns:=/zed_upper'].

Current Positional Tracking Configuration

For context, here is the pos_tracking configuration I am using:

        pos_tracking:
            pos_tracking_enabled: true # True to enable positional tracking from start
            pos_tracking_mode: 'GEN_1' # Matches the ZED SDK setting: 'GEN_1', 'GEN_2', 'GEN_3'
            imu_fusion: true # enable/disable IMU fusion. When set to false, only the optical odometry will be used.
            publish_tf: true # [overwritten by launch file options] publish `odom -> camera_link` TF
            publish_map_tf: true # [overwritten by launch file options] publish `map -> odom` TF
            map_frame: 'map'
            odometry_frame: 'odom'
            area_memory: true # Enable to detect loop closure
            area_file_path: '/mtb_robot_config/test.area' # Path to the area memory file for relocalization and loop closure in a previously explored environment. 
            save_area_memory_on_closing: true # Save Area memory before closing the camera if `area_file_path` is not empty. You can also use the `save_area_memory` service to save the area memory at any time.
            reset_odom_with_loop_closure: true # Re-initialize odometry to the last valid pose when loop closure happens (reset camera odometry drift)
            depth_min_range: 0.0 # Set this value to remove fixed zones of the robot in the FoV of the camera from the visual odometry evaluation
            set_as_static: false # If 'true' the camera will be static and not move in the environment
            set_gravity_as_origin: true # If 'true' align the positional tracking world to imu gravity measurement. Keep the yaw from the user initial pose.
            floor_alignment: false # Enable to automatically calculate camera/floor offset
            initial_base_pose: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0] # Initial position of the `camera_link` frame in the map -> [X, Y, Z, R, P, Y]
            path_pub_rate: 2.0 # [DYNAMIC] - Camera trajectory publishing frequency
            path_max_count: -1 # use '-1' for unlimited path size
            two_d_mode: false # Force navigation on a plane. If true the Z value will be fixed to 'fixed_z_value', roll and pitch to zero
            fixed_z_value: 0.0 # Value to be used for Z coordinate if `two_d_mode` is true
            transform_time_offset: 0.0 # The value added to the timestamp of `map->odom` and `odom->camera_link` transform being generated
            reset_pose_with_svo_loop: true # Reset the camera pose to the `initial_base_pose` when the SVO loop is enabled and the SVO playback reaches the end of the file.

Questions

Given these issues, I have a few broader questions:

Based on my configuration, am I missing a key parameter or using an incorrect setting?
Could you outline the best practices for managing area files, especially regarding when and how often to call the save_area_memory service to avoid failures?
Our primary goal is to have the robot reliably relocalize itself in a previously mapped environment. Is using the area file as we are the correct approach for this functionality?

Any guidance on these points would be greatly appreciated. Thank you.