I read this post that explains how the ZED SDK performs data association to output object tracks. When they say the “ZED SDK internally uses 2D matching and 3D matching for data association”, do they mean that the ZED SDK “predicts” where the current 2D and 3D bounding boxes will be in the next frame (e.g., using the velocity estimated by a Kalman filter), and then matches that prediction against what is actually observed in the next frame?
Yes. Each tracked object has its own associated filter that estimates and predicts its position.
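To make the predict-then-match idea concrete, here is a minimal sketch in plain Python. It is not the ZED SDK's internal implementation, which is not public; `Track`, `associate`, and the `gate` distance are hypothetical names chosen for illustration. It assumes a constant-velocity motion model and a greedy nearest-neighbour matching on 3D positions. A real Kalman filter would also carry a covariance and update the state after each match.

```python
import math
from dataclasses import dataclass


@dataclass
class Track:
    """Hypothetical track state: position in metres, velocity in metres/second."""
    track_id: int
    position: tuple
    velocity: tuple

    def predict(self, dt):
        """Constant-velocity prediction of the position dt seconds ahead."""
        return tuple(p + v * dt for p, v in zip(self.position, self.velocity))


def associate(tracks, detections, dt, gate=1.0):
    """Greedily match predicted track positions to detected positions.

    Returns {track_id: detection_index}. Pairs farther apart than `gate`
    metres are never matched, so a track can end up without a detection.
    """
    candidates = []
    for ti, track in enumerate(tracks):
        predicted = track.predict(dt)
        for di, detection in enumerate(detections):
            distance = math.dist(predicted, detection)
            if distance <= gate:
                candidates.append((distance, ti, di))

    # Take the closest pairs first; each track and detection is used once.
    candidates.sort()
    matches, used_tracks, used_detections = {}, set(), set()
    for _, ti, di in candidates:
        if ti in used_tracks or di in used_detections:
            continue
        matches[tracks[ti].track_id] = di
        used_tracks.add(ti)
        used_detections.add(di)
    return matches


# Two objects moving towards each other swap places between frames.
tracks = [
    Track(1, (0.0, 0.0, 5.0), (2.0, 0.0, 0.0)),
    Track(2, (1.0, 0.0, 5.0), (-2.0, 0.0, 0.0)),
]
detections = [(0.05, 0.0, 5.0), (0.95, 0.0, 5.0)]
matches = associate(tracks, detections, dt=0.5)
print(matches)
```

In this example, matching on the last known positions would assign each track to the wrong detection, because the objects have crossed. Matching on the predicted positions keeps the identities consistent, which is the reason for running a filter per track.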