ZED X Mask inconsistent

I am using a ZED X with a ZED Orin Box. I am trying to use the object detection module to localize a person in each frame and then use the mask returned by the module to extract the corresponding point cloud. The plan is:

  1. Use the object detection module to localize the person.
  2. Use objects.mask() to get the mask for the detected person.
  3. Overlay the mask on the depth map generated by the camera to filter out the noise and keep only the point cloud data of the detected object.
  4. Use that point cloud to create a reconstructed mesh that can be used in a physics engine.

The problem is that the mask generated by the object detection module is inconsistent: it cuts off parts of the detected object, which makes it difficult to overlay on the depth map.

Is there any way to make the mask more accurate? I also tried segmentation models such as YOLO, but in vain. Please guide me on the mask, because it plays a vital role in my reconstruction process. If you have any other ideas for reconstructing a detected object with the ZED X camera, do let me know.

Mask quality depends heavily on depth quality. Please make sure you are using the NEURAL depth mode (or at least ULTRA).
You can also use the masks from the body tracking module instead of the object detection module; you may get better results.
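
For step 3 in the question (overlaying the mask on the depth data), here is a minimal sketch of the filtering logic. It is not ZED SDK code: plain Python lists stand in for the image and point cloud arrays you would retrieve from the camera, and `filter_points_by_mask` is a hypothetical helper name. It assumes the mask is cropped to the object's 2D bounding box rather than covering the full frame (worth verifying against the SDK documentation for your version), that non-zero mask pixels mark the object, and that invalid depth shows up as NaN or infinity. If the mask really is cropped, applying it at the frame origin instead of at the bounding box corner would make the overlay look cut or shifted.

```python
import math


def filter_points_by_mask(point_cloud, mask, top_left):
    """Return the finite XYZ points that lie under non-zero mask pixels.

    point_cloud: full-frame rows of (x, y, z) tuples (stand-in for SDK data)
    mask:        rows of ints cropped to the 2D bounding box; non-zero = object
    top_left:    (x0, y0) pixel position of the mask's corner in the full frame
    """
    x0, y0 = top_left
    height = len(point_cloud)
    width = len(point_cloud[0]) if height else 0
    kept = []
    for row, mask_row in enumerate(mask):
        for col, value in enumerate(mask_row):
            if value == 0:
                continue  # background pixel inside the bounding box
            u, v = x0 + col, y0 + row  # mask pixel -> full-frame pixel
            if not (0 <= u < width and 0 <= v < height):
                continue  # mask extends past the frame edge
            x, y, z = point_cloud[v][u][:3]
            # Drop pixels where depth estimation failed.
            if math.isfinite(x) and math.isfinite(y) and math.isfinite(z):
                kept.append((x, y, z))
    return kept


# Tiny synthetic example: a 4x3 "point cloud" with one invalid pixel.
nan = float("nan")
cloud = [[(c, r, 1.0) for c in range(4)] for r in range(3)]
cloud[1][2] = (nan, nan, nan)
mask = [[255, 255],
        [0, 255]]
points = filter_points_by_mask(cloud, mask, top_left=(1, 1))
```

The returned list contains only valid 3D points belonging to the masked object, which can then be passed to whatever meshing step you use for reconstruction. With real camera data, the same indexing is usually done with array operations instead of Python loops.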