nvargus-daemon crashes after a few hours of streaming

Hello,

I am facing an issue where nvargus-daemon crashes after a few hours of continuous streaming.

Here is my setup:

  • Jetson AGX Orin
  • Yocto build with 5.15.148 kernel, L4T R36.4.3
  • Stereolabs Quad Link
  • 4 ZEDX Mono + 2 ZEDX
  • zedx-driver (built from source, bsp_master branch)
  • zed_x_daemon runs with sync_mode=1, and the corresponding jumper is set on the Quad Link board

After 8 to 30 hours of continuous streaming, nvargus-daemon crashes with the following logs:

Logs
Feb 02 20:32:27 media nvargus-daemon[2048]: SCF: Error InvalidState:  Corr Error 8 Received for sensor 22 .. Continuing!
Feb 02 20:32:27 media nvargus-daemon[2048]:  (in src/services/capture/FusaCaptureViCsiHw.cpp, function waitCsiFrameEnd(), line 654)
Feb 02 20:32:27 media nvargus-daemon[2048]: assignAllBuffersFromStream if ii 0 stream 0xfffeec001310 and buffer (nil) 281473483379072meout 4af0000077f
Feb 02 20:32:27 media nvargus-daemon[2048]: assignAllBuffersFromStream if ii 0 stream 0xfffee0001310 and buffer (nil) 281473474924928meout 4af0000077f
Feb 02 20:32:27 media nvargus-daemon[2048]: assignAllBuffersFromStream if ii 0 stream 0xfffe98001310 and buffer (nil) 281473583124864meout 4af0000077f
Feb 02 20:32:27 media nvargus-daemon[2048]: assignAllBuffersFromStream if ii 0 stream 0xfffe50001310 and buffer (nil) 281473591579008meout 4af0000077f
Feb 02 20:32:27 media nvargus-daemon[2048]: SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/FusaCaptureViCsiHw.cpp, function startCaptureInternal(), line 877)
Feb 02 20:32:27 media nvargus-daemon[2048]: SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureRecord.cpp, function doCSItoMemCapture(), line 547)
Feb 02 20:32:27 media nvargus-daemon[2048]: SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureRecord.cpp, function issueCapture(), line 490)
Feb 02 20:32:27 media nvargus-daemon[2048]: SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureServiceDevice.cpp, function issueCaptures(), line 1565)
Feb 02 20:32:27 media nvargus-daemon[2048]: SCF: Error ResourceAlreadyInUse:  (propagating from src/services/capture/CaptureServiceDevice.cpp, function issueCaptures(), line 1394)
Feb 02 20:32:27 media nvargus-daemon[2048]: assignAllBuffersFromStream if ii 0 stream 0xfffe58001310 and buffer (nil) 281473625395584meout 4af0000077f
Feb 02 20:32:27 media nvargus-daemon[2048]: SCF: Error ResourceAlreadyInUse:  (propagating from src/common/Utils.cpp, function workerThread(), line 125)
Feb 02 20:32:27 media nvargus-daemon[2048]: SCF: Error ResourceAlreadyInUse: Worker thread CaptureScheduler frameStart failed (in src/common/Utils.cpp, function workerThread(), line 144)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error Timeout:  (propagating from src/api/Buffer.cpp, function waitForUnlock(), line 644)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error Timeout:  (propagating from src/components/CaptureContainerImpl.cpp, function returnBuffer(), line 430)
Feb 02 20:32:28 media nvargus-daemon[2048]: assignAllBuffersFromStream if ii 0 stream 0xfffea4001310 and buffer (nil) 281473500287360meout 4af0000077f
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState: Sending critical error event for Session 1
Feb 02 20:32:28 media nvargus-daemon[2048]:  (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
Feb 02 20:32:28 media nvargus-daemon[2048]: assignAllBuffersFromStream if ii 0 stream 0xffff68001360 and buffer (nil) 281473474924928meout 4af0000077f
Feb 02 20:32:28 media nvargus-daemon[2048]: assignAllBuffersFromStream if ii 0 stream 0xfffe50001310 and buffer (nil) 281473491833216meout 4af0000077f
Feb 02 20:32:28 media nvargus-daemon[2048]: assignAllBuffersFromStream if ii 0 stream 0xfffe98001310 and buffer (nil) 281473616941440meout 4af0000077f
Feb 02 20:32:28 media nvargus-daemon[2048]: assignAllBuffersFromStream if ii 0 stream 0xfffe58001310 and buffer (nil) 281473600033152meout 4af0000077f
Feb 02 20:32:28 media nvargus-daemon[2048]: assignAllBuffersFromStream if ii 0 stream 0xfffee0001310 and buffer (nil) 281473491833216meout 4af0000077f
Feb 02 20:32:28 media nvargus-daemon[2048]: assignAllBuffersFromStream if ii 0 stream 0xfffeec001310 and buffer (nil) 281473608487296meout 4af0000077f
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState: Sending critical error event for Session 13
Feb 02 20:32:28 media nvargus-daemon[2048]:  (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState: Sending critical error event for Session 14
Feb 02 20:32:28 media nvargus-daemon[2048]:  (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/components/stages/MemoryToISPCaptureStage.cpp, function doHandleRequest(), line 148)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState: Sending critical error event for Session 23
Feb 02 20:32:28 media nvargus-daemon[2048]:  (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState: Sending critical error event for Session 23
Feb 02 20:32:28 media nvargus-daemon[2048]:  (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState: Sending critical error event for Session 15
Feb 02 20:32:28 media nvargus-daemon[2048]:  (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState: Capture Scheduler not running (in src/services/capture/CaptureServiceDevice.cpp, function addNewItemToSchedule(), line 1039)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/services/capture/CaptureService.cpp, function addRequest(), line 413)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/components/stages/SensorCaptureStage.cpp, function doHandleRequest(), line 91)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState:  (propagating from src/components/stages/OrderedStage.cpp, function doExecute(), line 158)
Feb 02 20:32:28 media nvargus-daemon[2048]: SCF: Error InvalidState: Sending critical error event for Session 0
Feb 02 20:32:28 media nvargus-daemon[2048]:  (in src/api/Session.cpp, function sendErrorEvent(), line 1039)
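Since most of the lines above are the same error propagating up the call stack, it can help to reduce a log like this to the first SCF error and a count per error kind. Below is a minimal sketch of that, assuming the journal line format shown above; the function name `summarize_scf_errors` and the regular expression are illustrative only and are not part of any NVIDIA or Stereolabs tool.

```python
import re
from collections import Counter

# Matches journal lines such as:
#   "... nvargus-daemon[2048]: SCF: Error InvalidState: Capture Scheduler not running (in ...)"
# The pattern is an assumption based on the log excerpt in this post.
SCF_ERROR = re.compile(r"nvargus-daemon\[\d+\]: SCF: Error (\w+):")


def summarize_scf_errors(lines):
    """Return (first SCF error line, Counter of SCF error kinds).

    `lines` is any iterable of log lines (a list, an open file, ...).
    Lines without an "SCF: Error <Kind>:" marker are ignored.
    """
    first = None
    counts = Counter()
    for line in lines:
        match = SCF_ERROR.search(line)
        if match is None:
            continue
        if first is None:
            first = line.rstrip("\n")
        counts[match.group(1)] += 1
    return first, counts
```

For example, after saving the journal output to a file (the name `nvargus.log` is hypothetical), `first, counts = summarize_scf_errors(open("nvargus.log"))` gives the earliest SCF error in `first` and the per-kind totals in `counts`. For the excerpt above, the earliest SCF error is the `Corr Error 8 Received for sensor 22` line.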

However, when I run the exact same setup on JetPack 6.2 Linux with the compiled 1.3.1 zedx drivers, it works perfectly.

Moreover, I tried downgrading to the bsp_1.3_alpha zedx-driver branch, and it seems to help: I had 50+ hours of streaming with no crashes. However, that branch lacks the sync features, which I really need.

Here is the full discussion thread on the NVIDIA forum, but since I suspect this is a zedx driver issue, I am posting the question here.

Could you please help me with this? I would appreciate any help.

Hi @ironyoid,
If you are using custom ZED X drivers, please open an issue on the ZED X Driver GitHub repository.