Neural Network Inference Speed Using ZED2's Unity Plugin

Hi again!
I’ve spent some time trying to run and optimize custom object detectors inside Unity, using either the ZED2 Unity plugin or just OpenCVforUnity.

It took me a while to notice, but I think the problem is this:
Neither approach actually uses the GPU for network inference. I’m not entirely sure about the ZED examples, but the OpenCVforUnity ones definitely don’t.
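To illustrate what I mean: OpenCV’s DNN module defaults to CPU inference, and selecting the CUDA backend only takes effect if the native libraries were built with CUDA support, which as far as I can tell the stock OpenCVforUnity binaries are not. A minimal sketch (method and constant names mirror the OpenCV Java API that OpenCVforUnity wraps; the model path is hypothetical):

```csharp
using OpenCVForUnity.DnnModule;
using UnityEngine;

public class GpuBackendCheck : MonoBehaviour
{
    void Start()
    {
        // Hypothetical model path, for illustration only.
        Net net = Dnn.readNetFromONNX("detector.onnx");

        // The DNN module defaults to CPU inference. These calls only take
        // effect if the native OpenCV libraries were compiled with CUDA
        // support; otherwise inference silently falls back to the CPU,
        // which is exactly the behaviour described above.
        net.setPreferableBackend(Dnn.DNN_BACKEND_CUDA);
        net.setPreferableTarget(Dnn.DNN_TARGET_CUDA);
    }
}
```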

That’s a huge downside and renders most object detectors useless in practice. Am I missing something, or is that assessment correct?

If so, I’m considering a few options for utilizing the GPU:

- Barracuda (Introduction to Barracuda | Barracuda | 1.0.4) (see the sketch below)
- Unity-TensorRT (GitHub - aman-tiwari/Unity-TensorRT: Unity + TensorRT integration)
- NatML (NatML for Unity - Unity)

I don’t have any experience with these frameworks, but I fear they will be quite unstable or hard to adapt to custom models. Do you have any experience and/or tips regarding them?
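To make the Barracuda option concrete, here is a minimal sketch of loading and running a custom ONNX model with the Barracuda 1.x API (the `modelAsset` field is a hypothetical imported NNModel; output post-processing is omitted):

```csharp
using Unity.Barracuda;
using UnityEngine;

public class BarracudaDetector : MonoBehaviour
{
    // ONNX model imported as an NNModel asset (hypothetical asset).
    public NNModel modelAsset;

    IWorker worker;

    void Start()
    {
        Model model = ModelLoader.Load(modelAsset);
        // ComputePrecompiled runs the model on the GPU via compute shaders,
        // which is the main draw over the CPU-bound OpenCV path.
        worker = WorkerFactory.CreateWorker(WorkerFactory.Type.ComputePrecompiled, model);
    }

    public Tensor Detect(Texture2D frame)
    {
        // Builds a 1xHxWx3 tensor directly from the texture.
        using (var input = new Tensor(frame, channels: 3))
        {
            worker.Execute(input);
            // The returned tensor is owned by the worker and stays valid
            // until the next Execute call.
            return worker.PeekOutput();
        }
    }

    void OnDestroy() => worker?.Dispose();
}
```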

At this point, I’m a bit disappointed in Unity in general when it comes to neural network inference. I’m considering streaming the video out using the ZED SDK and running inference outside of Unity. However, I’m a bit worried about latency and synchronization issues.

Therefore, I was wondering whether you have any tips regarding sending the output of the neural network back. Can I somehow package it into the output stream? If so, how?

Thanks in advance!

Hi,

We decided to use OpenCV for our custom detection sample because it was the easiest to use and understand. The sample is meant to show how to interface the data from a detector with the ZED SDK.
After that, you are free to choose any detector you want.
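Concretely, that interface boils down to ingesting your detector’s 2D boxes into the SDK, which then adds depth, 3D position and tracking on top of them. A rough sketch, assuming the custom object detection API of the ZED SDK 3.x C# wrapper (the exact type and field names are from memory, so please verify them against your plugin version):

```csharp
using System.Collections.Generic;
using UnityEngine;

public class CustomDetectionBridge : MonoBehaviour
{
    public sl.ZEDCamera zedCamera; // the opened camera, e.g. from ZEDManager

    // Push one frame's worth of 2D detections into the SDK. Each box is
    // given in image coordinates; the SDK lifts it to 3D and tracks it.
    public void IngestDetections(List<Rect> boxes)
    {
        var objects = new List<sl.CustomBoxObjectData>();
        foreach (Rect box in boxes)
        {
            var obj = new sl.CustomBoxObjectData();
            obj.uniqueObjectID = System.Guid.NewGuid();
            obj.label = 0;            // your class id
            obj.probability = 0.9f;   // your detector's confidence
            obj.isGrounded = false;
            // Four corners in image space: top-left, top-right,
            // bottom-right, bottom-left.
            obj.boundingBox2D = new[]
            {
                new Vector2(box.xMin, box.yMin),
                new Vector2(box.xMax, box.yMin),
                new Vector2(box.xMax, box.yMax),
                new Vector2(box.xMin, box.yMax),
            };
            objects.Add(obj);
        }
        zedCamera.IngestCustomBoxObjects(objects);
    }
}
```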

I’m sorry, but I haven’t used those frameworks in Unity myself, so I can’t give you much advice about them.

You said that you are disappointed with Unity. I’m really sorry to hear that; could you tell me more about it? We are always happy to receive constructive feedback from our users, as it helps us improve our products.

Stereolabs Support

Hi Benjamin,
Thanks for your response!
Do you have any advice on how to send the output of a neural network (say, bounding boxes from an object detector) alongside the video stream? Is it possible to package that into the stream via your SDK somehow, or via GStreamer?
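To make the question concrete, what I currently have in mind is a small side channel next to the video stream, with each message keyed by the frame timestamp so the receiver can match detections to frames. A minimal sketch (the message layout is made up for illustration; any serialization format would do):

```csharp
using System.Net.Sockets;
using System.Text;

// Sketch: send detector output on a side channel, keyed by the frame
// timestamp, so the receiver can re-associate boxes with the streamed
// video frames.
public class DetectionSender
{
    readonly UdpClient udp = new UdpClient();
    readonly string host;
    readonly int port;

    public DetectionSender(string host, int port)
    {
        this.host = host;
        this.port = port;
    }

    // timestampNs: the ZED frame timestamp (nanoseconds) of the frame the
    // boxes were computed on, so both sides share a common clock.
    public void Send(ulong timestampNs, UnityEngine.Rect[] boxes)
    {
        var sb = new StringBuilder();
        sb.Append("{\"ts\":").Append(timestampNs).Append(",\"boxes\":[");
        for (int i = 0; i < boxes.Length; i++)
        {
            var b = boxes[i];
            if (i > 0) sb.Append(',');
            sb.Append($"[{b.xMin},{b.yMin},{b.width},{b.height}]");
        }
        sb.Append("]}");
        byte[] payload = Encoding.UTF8.GetBytes(sb.ToString());
        udp.Send(payload, payload.Length, host, port);
    }
}
```

The idea is that the receiver reads the same timestamp from the decoded stream via the SDK and pairs it with the matching message, rather than relying on arrival order; this assumes the SDK preserves frame timestamps across the stream, which is worth verifying.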
Thanks a lot!