When I run human pose detection models, the detection results are very noisy, on the order of several dozen pixels. With another camera, however, the noise is much lower. I am wondering where the problem might come from, given that the images from the left and right cameras look correct, as does the depth estimation.