How to get position of a detected object relative to the world?

Hi, I am currently working on a project where a ZED 2 camera will be fixed at a certain angle and height and will continuously track any people who enter its field of view. I would like to map the positions of the detected people onto a 2D canvas (imagine a 2D game where a character moves in x-y coordinates). What would be the best way to translate the position information available for the detected objects into canvas coordinates? How should I use the position data from the Object Detection module?

Thank you for your time

PS: What is the unit of measurement for the x, y, z coordinates of the Position given by the Object Detection output? Is it in meters?

Hi,

First, you need to enable the positional tracking module so that object detection can output real-world coordinates.
You can set the camera pose manually by giving an initial world transform (if you want to control the (x-y) position on the canvas), and/or use the set_floor_as_origin parameter (which estimates the height of the camera above the floor and its orientation relative to gravity).

    sl::PositionalTrackingParameters positional_tracking_parameters;
    positional_tracking_parameters.set_as_static = true; // set to true if your camera is static
    positional_tracking_parameters.set_floor_as_origin = true; // use the detected floor plane as the world origin
    positional_tracking_parameters.initial_world_transform = <your_transform>; // initial camera pose in the world frame
    zed.enablePositionalTracking(positional_tracking_parameters);

Then call the grab function with the world reference frame set in the runtime parameters:

    sl::RuntimeParameters rt_param;
    rt_param.measure3D_reference_frame = sl::REFERENCE_FRAME::WORLD; // report 3D measures in the world frame
    sl::ERROR_CODE err = zed.grab(rt_param);

That ensures that object positions are given in the world frame (and not the camera frame).

Then read the position from each sl::ObjectData in the sl::Objects list (objectdata.position.x/y/z…).
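Once positions arrive in the world frame, putting them on a 2D canvas is a matter of dropping the vertical axis and scaling the two horizontal axes to pixels. Here is a minimal sketch with no SDK dependency. CanvasMapping and its fields are hypothetical names, and it assumes X and Z are the horizontal axes, which depends on the InitParameters::coordinate_system you chose (with a Z-up system you would pass x and y instead).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class CanvasMapping:
    """Hypothetical helper: maps a floor rectangle (world units) to canvas pixels."""
    x_min: float    # left edge of the tracked floor area, in world units
    x_max: float    # right edge
    z_min: float    # near edge
    z_max: float    # far edge
    width_px: int   # canvas width in pixels
    height_px: int  # canvas height in pixels

    def to_pixel(self, x: float, z: float) -> Tuple[int, int]:
        """Return (column, row) for a world position, clamped to the canvas."""
        # Normalise each horizontal coordinate to the range 0..1.
        u = (x - self.x_min) / (self.x_max - self.x_min)
        v = (z - self.z_min) / (self.z_max - self.z_min)
        col = round(u * (self.width_px - 1))
        # Flip vertically so that the far edge ends up at the top of the canvas.
        row = round((1.0 - v) * (self.height_px - 1))
        # Clamp positions that fall outside the tracked area.
        col = min(max(col, 0), self.width_px - 1)
        row = min(max(row, 0), self.height_px - 1)
        return col, row
```

For example, with a 4 x 8 unit floor area mapped to a 401 x 801 pixel canvas, you would call mapping.to_pixel(obj_x, obj_z) for every tracked person on each frame and draw a marker at the returned pixel.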

All metric outputs of the SDK are expressed in the unit set by InitParameters::coordinate_units, which is passed to the open() function. The SDK default is millimeters, so set it to sl::UNIT::METER if you want meters.
https://www.stereolabs.com/docs/api/structsl_1_1InitParameters.html#a75cb1e715c1be19c591d21fa2591b421
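If the canvas logic should not depend on which unit was configured, one option is to normalise every position to meters before mapping it. This is only a sketch: the dictionary keys mirror the names of the sl::UNIT values, but to_meters and METERS_PER_UNIT are hypothetical helpers, not part of the SDK.

```python
# Conversion factors from each unit name to meters (hypothetical lookup table).
METERS_PER_UNIT = {
    "MILLIMETER": 0.001,
    "CENTIMETER": 0.01,
    "METER": 1.0,
    "INCH": 0.0254,
    "FOOT": 0.3048,
}


def to_meters(value: float, unit: str) -> float:
    """Convert a length expressed in `unit` to meters."""
    try:
        return value * METERS_PER_UNIT[unit]
    except KeyError:
        raise ValueError(f"unknown unit: {unit!r}") from None
```

Setting coordinate_units to meters in the InitParameters avoids the conversion entirely, which is probably the simpler choice for a new project.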

You should take a look at this sample here:
The bird's-eye view window seems to be exactly what you want to do…

OB.

Hi, thank you so much for your thorough answer. I think I understand almost every part of your response except for when it comes to setting the initial_world_transform, as even after reading through the documentation I am not quite certain what it means. Does this mean the initial x, y, z dimension of the space the camera is exposed to? Also I wasn’t sure as to how to construct the Transform necessary.

Yes, the initial transform corresponds to the first pose of the camera (translation and rotation) relative to the (0,0,0) of the world.
If it is not set, the first pose is the identity, meaning the world origin coincides with the camera's starting pose.
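To make "translation and rotation" concrete: a pose like this is conventionally written as a 4x4 homogeneous matrix, and applying it to a point in the camera frame gives that point in the world frame. The sketch below uses plain Python and hypothetical helper names (make_pose, camera_to_world). It assumes a single tilt about the X axis, and the sign of the angle depends on your coordinate system. How the values are handed to the SDK's transform type is not shown here.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def make_pose(pitch_deg: float, translation: Sequence[float]) -> Matrix:
    """Build a 4x4 pose: rotation about the X axis followed by a translation."""
    a = math.radians(pitch_deg)
    c, s = math.cos(a), math.sin(a)
    tx, ty, tz = translation
    return [
        [1.0, 0.0, 0.0, tx],
        [0.0, c, -s, ty],
        [0.0, s, c, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def camera_to_world(pose: Matrix, point: Sequence[float]) -> Tuple[float, ...]:
    """Apply the pose to a camera-frame point: p_world = R * p_camera + t."""
    x, y, z = point
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in pose[:3])
```

With the identity pose (zero rotation, zero translation) a point keeps the same coordinates in both frames, which is why positions look camera-relative when no initial transform is set. With a translation of, say, 2.5 units along the vertical axis, the camera itself sits at height 2.5 in the world and every object is shifted accordingly.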

And where would the (0, 0, 0) of the world lie if I may ask?

I am working on a similar project. I have set positional_tracking_parameters.initial_world_transform(<my transform>) and runtime_params.measure3D_reference_frame = sl.REFERENCE_FRAME_WORLD,
but when I output the position of detected objects, it is still given relative to the camera. Is there an additional parameter that needs to be set, or is there another way to get the object position other than object.position?