Log flooding due to unpin memory in Python SDK

Issue

The log is being flooded with unpin memory messages.

Reproduce

Hardware

  • ZED Box with ZED X

Code

  • Run as a systemd service:
import pyzed.sl as sl
# ... existing code ...

    init = sl.InitParameters()
    # NOTE: disable SDK verbose output to limit log flooding
    init.sdk_verbose = False
    init.camera_resolution = sl.RESOLUTION.HD1200
    init.depth_mode = sl.DEPTH_MODE.NONE
    cam = sl.Camera()
    status = cam.open(init)
    if status != sl.ERROR_CODE.SUCCESS:
        logger.error("Failed to open camera: {status}", status=repr(status))
        exit(1)
    else:
        logger.info("Camera opened")

    runtime = sl.RuntimeParameters()

    stream = sl.StreamingParameters()
    stream.codec = sl.STREAMING_CODEC.H264
    stream.bitrate = 8000
    status = cam.enable_streaming(stream)
    if status != sl.ERROR_CODE.SUCCESS:
        logger.error("Failed to enable streaming: {status}", status=repr(status))
        exit(1)
    else:
        logger.info("Camera Streaming enabled")

    logger.info("Started streaming")
    logger.info("Press Ctrl+C to quit")

    signal.alarm(100)  # Set the timeout again
    err = cam.grab(runtime)
    signal.alarm(0)  # Disable the timeout

# ... existing code ...

            signal.alarm(TIMEOUT)  # Set the timeout again
            err = cam.grab(runtime)
            signal.alarm(0)  # Disable the timeout
            if restart_signal:
                cam.disable_streaming()
                cam.close()
                exit(0)
        except Exception:
            # Clean up on any exception (a grab timeout included, if the alarm handler raises), then re-raise
            cam.disable_streaming()
            cam.close()
            logger.exception("Exception occurred")
            raise

    cam.disable_streaming()
    cam.close()

# ... existing code ...
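
The elided parts of the script presumably install a SIGALRM handler so that the signal.alarm() calls act as a watchdog around cam.grab(). A minimal sketch of that pattern follows, assuming a handler that raises TimeoutError. The names _on_alarm and call_with_timeout are hypothetical and do not come from the original script. The pattern works only on Unix and only in the main thread.

```python
import signal


def _on_alarm(signum, frame):
    # Hypothetical handler: turn SIGALRM into a Python exception.
    raise TimeoutError("call timed out")


def call_with_timeout(func, seconds, *args, **kwargs):
    """Run func(*args, **kwargs); raise TimeoutError after `seconds` (whole seconds)."""
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)  # arm the watchdog
    try:
        return func(*args, **kwargs)
    finally:
        signal.alarm(0)  # disarm the watchdog
        signal.signal(signal.SIGALRM, previous)  # restore the old handler
```

With this helper, the grab call would read err = call_with_timeout(cam.grab, TIMEOUT, runtime). Note that Python runs signal handlers only when the interpreter regains control, so if a native call blocks without returning, the exception may be delayed until that call comes back.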

Asking for

How can I mute these logs?
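
One blunt way to hide output written by native code is to point the process's stderr file descriptor at /dev/null for the duration of a call. The sketch below assumes the messages are written to stderr (file descriptor 2), which this thread does not confirm. silence_fd is a hypothetical helper and not part of pyzed. It hides the symptom only and does not address whatever triggers the messages.

```python
import contextlib
import os


@contextlib.contextmanager
def silence_fd(fd=2):
    """Temporarily redirect file descriptor `fd` (default 2, stderr) to /dev/null."""
    saved = os.dup(fd)  # keep a handle on the original target
    devnull = os.open(os.devnull, os.O_WRONLY)
    try:
        os.dup2(devnull, fd)  # writes to fd are discarded from here on
        yield
    finally:
        os.dup2(saved, fd)  # restore the original target
        os.close(saved)
        os.close(devnull)
```

Usage would look like: with silence_fd(2): err = cam.grab(runtime). Anything the Python logger writes to stderr inside the with block is discarded as well, so the block should stay as small as possible.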

Hi @CircleOnCircles,

This error seems to be related to CUDA, so I believe the best way to limit these logs is to find the source of the issue.

I would need the following information:

  • Are you using an abstraction layer to run the application (Docker, systemd, etc.)?
  • Is your GPU memory full? Do the logs appear before the GPU memory is full?
  • Can you run the application with sl.DEPTH_MODE.PERFORMANCE? If this changes something, it may point to a bug in the SDK.

  • The app runs via systemd.
  • Yes, memory is full. The machine starts at 3.8/8 GB RAM usage, and for some reason it is now at 7.4/8 GB.
  • OK, I'll try sl.DEPTH_MODE.PERFORMANCE.

That doesn't help. The app uses 180 MB according to top, and memory usage is at 6/8 GB at the moment.

Hi @CircleOnCircles,

Do you notice any difference when you run it as a script outside of systemd?
What I do not quite understand is how the memory usage goes from 3.8 GB to 7.4 GB if the application uses only 180 MB of RAM.

I wrote a C++ version of the code. The same issue persists.

But when I run it outside systemd, there are no unpin memory logs during a 5-minute run.
I'm not sure this would hold forever, but it looks promising.

The Jetson architecture uses the same memory space for both CPU RAM and GPU memory, so the top CLI might not be a sufficient indicator. It might just show CPU RAM usage. I might need to check with e.g. nvidia-smi. I'm not sure.
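
Because CPU and GPU share the same physical memory on Jetson, the per-process figure from top can miss memory that does not show up as the application's resident size. One way to track the system-wide number next to the application's own logs is to read /proc/meminfo, a standard Linux interface. The sketch below is only an illustration. The function names are hypothetical, and NVIDIA's tegrastats utility is likely a more direct source on Jetson.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value} (most values are in kB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def used_memory_gib(meminfo):
    """System-wide used memory in GiB, computed as MemTotal - MemAvailable."""
    return (meminfo["MemTotal"] - meminfo["MemAvailable"]) / (1024 ** 2)


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the kernel's memory summary (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Logging used_memory_gib(read_meminfo()) every few seconds from the grab loop would show whether the jump from 3.8 GB to 7.4 GB coincides with the application starting.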

In this case, this seems to be a permissions problem with the user under which systemd runs the program.

You should probably check the systemd user permissions; this could help.
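
To compare what the service gets under systemd with what an interactive shell gets, it can help to log the effective user, the groups, and the locked-memory limit at startup in both cases. The sketch below uses only the Python standard library. That group membership or RLIMIT_MEMLOCK has anything to do with the unpin memory messages is an assumption and is not established in this thread. describe_process_context is a hypothetical helper.

```python
import grp
import os
import pwd
import resource


def describe_process_context():
    """Collect the user, groups, and locked-memory limit of the current process."""
    uid = os.geteuid()
    try:
        user = pwd.getpwuid(uid).pw_name
    except KeyError:
        user = str(uid)  # uid without a passwd entry
    groups = []
    for gid in os.getgroups():
        try:
            groups.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            groups.append(str(gid))  # gid without a group entry
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return {
        "user": user,
        "groups": sorted(groups),
        "memlock_soft": soft,  # resource.RLIM_INFINITY means unlimited
        "memlock_hard": hard,
    }
```

Calling logger.info(describe_process_context()) once before cam.open(init), first from the service and then from a terminal, gives two records that can be compared directly.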

  • [workaround] How could I run the script without systemd?
  • [root cause analysis] What sets systemd apart from normal execution? Running as root? …

I may have misunderstood. I believed you were running the application as a systemd service, for which you can set a user and permissions configuration different from your own user's permissions.

Can you run the ZED_Diagnostic tool and share the resulting JSON file? This can help me troubleshoot what may be happening.