Causes and Mechanism of ZED Camera Initialization Interfering with CUDA Variable Declaration

stm32h757 · September 2, 2024, 4:45am

Hello.

[I. My Environment]
Jetson TK1, Linux4Tegra 21.8 (Ubuntu 14), JetPack 3.1, ZED 1, ZED SDK 1.2

[II. Problem]
If I declare a CUDA device variable and allocate GPU memory before ZED initialization (calling zed->init(parameters)), the CUDA memcpy function fails with an “invalid argument” error.

I don’t know why this happens and would like to understand the reason from the ZED SDK experts.

[III. Code]
Here is the code for main():

#include <stdio.h>
#include <string.h>
#include <chrono>
#include <iostream>  // std::cout is used in main()

#include <zed/Mat.hpp>
#include <zed/Camera.hpp>
#include <zed/utils/GlobalDefine.hpp>

#include "opencv2/core/core.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"

#include "kernel.cuh"

using namespace sl::zed;
using namespace std;

int main(int argc, char **argv) {
    Camera* zed = new Camera(HD720, 15);
    InitParams parameters;
    parameters.unit = UNIT::METER;
    ERRCODE err = zed->init(parameters);

    int host_side_array[3] = {0,0,0};
    int* gpu_side_array_pointer;

    cudaMalloc((void**)&gpu_side_array_pointer, 3*sizeof(int));
    cudaMemcpy(gpu_side_array_pointer, host_side_array, 3*sizeof(int), cudaMemcpyHostToDevice);

    while(true)
    {
        cuManipulateArray(gpu_side_array_pointer);

        cudaMemcpy(host_side_array, gpu_side_array_pointer, 3*sizeof(int), cudaMemcpyDeviceToHost);

        for(int i = 0; i<3; i++)
        {
            std::cout << "Value passed from GPU : " << host_side_array[i] << std::endl;
        }
    }

    cudaFree(gpu_side_array_pointer);
    delete zed;
  
    return 0;
}

Here is the code for the CUDA kernel:

#include "kernel.cuh"

__global__ void _cuManipulateArray2(int* gpu_side_array_pointer)
{
    atomicAdd(gpu_side_array_pointer + 0, 1);
    atomicAdd(gpu_side_array_pointer + 1, 1);
    atomicAdd(gpu_side_array_pointer + 2, 1);
}

void cuManipulateArray(int* gpu_side_array_pointer)
{
    _cuManipulateArray2<<<5,5>>>(gpu_side_array_pointer);
    cudaDeviceSynchronize();
}

If you place zed->init(parameters) at the beginning of the code, main() works correctly. However, if you place it in the middle of the code, after the CUDA allocation, CUDA does not function properly and may produce errors without sufficient information. (I had a hard time finding the root cause, which I have now identified.) I suspect that the ZED SDK is somehow interfering with CUDA’s operations without an explicit error message. Could someone explain the underlying mechanism?

[IV. What I wanted to do]
I wanted to detect obstacles, but my code failed for no apparent reason.
While debugging, I encountered this strange problem and searched Google but couldn’t find any similar issues.

I couldn’t find any mention of this problem in the documentation or examples. Does this issue also occur with the latest versions of the ZED SDK and CUDA? If so, it might be helpful to include a note about it in the documentation.

adujardin · September 2, 2024, 10:11am

Hi,

As you may know, CUDA must be initialized before it can be used, typically by explicitly creating a CUDA context or by calling the function “cudaSetDevice(0)”.

Since the ZED SDK also uses CUDA, it creates the CUDA context by default and initializes the GPU in the main application thread. This can’t be done twice on the same thread.

This explains why the CUDA code works only when zed->init is called first. If you weren’t using the SDK, you could just call cudaSetDevice. That wouldn’t work as is if you also use the ZED SDK, but you could provide the context in the InitParameters using InitParameters::sdk_cuda_ctx.