Accurate Surface Area Calculation with ZED 2i Stereo Camera

Hello ZED Team,

I am working with the ZED camera and the ZED SDK. I want to calculate the surface area of tables from a live camera feed (with the ZED 2i stereo camera).

The dimensions of each table are different, i.e., length, breadth, and height. I want to calculate the length and breadth of the table as accurately as possible. The main problem I am facing is that the edge of the table near the camera covers more pixels, while the edge farther away covers fewer pixels of the image. So how can I estimate the right breadth of the table? Moreover, the length of the tables also appears diminished.
I am attaching a reference image with this query. Kindly suggest a solution to achieve this aim.

My approach:
I captured the images from the left camera in grayscale mode and saved the depth information as well. Then I annotated the dataset. I segment the table out of the image using a YOLOv8 model. I have attached the image of the segmented table.

Now, I want to calculate the correct area of the table surface (determining length and breadth). Kindly suggest a solution.

Hi @aakashgoyal,

Welcome to the Stereolabs forums, and thank you for using the ZED! :wave:

There are many ways to solve this problem, depending on whether there are known priors or not. Here are a couple of ways you could approach it.

  1. A simple approach, if the tables you are trying to detect are all rectangular, could be to first detect the four corners of the table in the segmented image using an image corner detector. You can then retrieve these corners either from the depth map or, more easily, from the 3D point cloud, which is also provided by the SDK. Once you have the 3D coordinates of the corners, you can compute the length and breadth of the table.

  2. A more accurate method would use more points on the segmented table. Since you have already annotated the tables in the images using YOLOv8, you can use the bounding box coordinates of the segmented table to extract the corresponding region from the 3D point cloud. This step lets you work with only the points belonging to the table. You can then run a plane-fitting algorithm (e.g., RANSAC) on the extracted points to estimate the table’s surface; this will filter out any unwanted outliers in 3D. After this, you can project the 3D points onto the plane to remove any surface irregularities before the area computation, as we can assume the table is flat. Converting the point cloud into a mesh should then be enough to compute the surface area.
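As a rough illustration of the plane-fitting step in option 2, here is a minimal RANSAC sketch in pure NumPy. The function name, iteration count, and inlier threshold are all illustrative choices, not part of the SDK; in practice you could equally use a library implementation such as Open3D's `segment_plane`:

```python
import numpy as np

def fit_plane_ransac(points, n_iters=200, threshold=5.0, rng=None):
    """Fit a plane n.x + d = 0 to an (N, 3) point array with a basic RANSAC loop.

    threshold is in the same unit as the points (e.g. millimeters).
    Returns (normal, d, inlier_mask).
    """
    rng = np.random.default_rng(rng)
    best_mask, best_plane = None, None
    for _ in range(n_iters):
        # Sample 3 distinct points and build the candidate plane through them
        i, j, k = rng.choice(len(points), size=3, replace=False)
        n = np.cross(points[j] - points[i], points[k] - points[i])
        norm = np.linalg.norm(n)
        if norm < 1e-9:
            continue  # degenerate (collinear) sample
        n = n / norm
        d = -np.dot(n, points[i])
        # Points whose distance to the plane is below the threshold are inliers
        mask = np.abs(points @ n + d) < threshold
        if best_mask is None or mask.sum() > best_mask.sum():
            best_mask, best_plane = mask, (n, d)
    return best_plane[0], best_plane[1], best_mask
```

The inlier mask then keeps only the table-top points; projecting them onto the fitted plane and meshing them (or taking the 2D hull) gives the surface area.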

It’s important to note that the accuracy of the measurement depends on the depth map, the performance of the table segmentation algorithm and the precision of the plane fitting algorithm. You may need to fine-tune the parameters or explore alternative algorithms to obtain the best results for your specific scenario.

I hope this approach helps you estimate the correct area of the table surface. Let me know if you have any further questions!


Thank you so much for the suggestions. They really shed some light on this for me.

Firstly, yes, all tables are rectangular. They all have 4 edges and the surface is flat.

Secondly, what do you mean by known priors? Maybe if you explain a bit, I can add information about them.

Thirdly, I didn’t mention this earlier, but I want to run the code in real time on the live camera, and I want to calculate the area of the flat table as soon as the table is detected in the frame. I hope your point 2 is still correct for my use case. If there are any changes from your side, please suggest them.

Fourth, as you mentioned, "you can use the bounding box coordinates of the segmented table to extract the corresponding region from the 3D point cloud." Here, are you talking about the 4 corner coordinates of the white area (the mask image (black and white) in my first post) or the bounding box as you see in the image below? (The bounding box is not very accurate at the moment, but do I need to extract the four edges of the green box?)

Fifth, in my point cloud I can see some [nan nan nan nan] values, which means OCCLUSION_VALUE: the depth of the pixel cannot be estimated as it is occluded or an outlier. How can I fix this?

I am using cam_init_params.depth_mode = sl.DEPTH_MODE.ULTRA and the depth sensing mode is STANDARD. At the moment I am taking pictures from the left camera in grayscale mode.
I would also like to add that I got [ 42.468235 -64.1018 204.62054 nan] and [-4.0450558e+01 -8.9020546e+01 2.8416415e+02 -2.8768687e+38] values as well. What does the 4th value represent? I understand the 1st value is X, the 2nd is Y, and the 3rd is Z.
PS: I am saving my point cloud in a pickle file.

with open(file_name + '.pkl', 'wb') as file:
    pickle.dump(points_cloud, file)

If you need any other information to answer these queries I am happy to share. @mattrouss

  1. What I meant by priors were hypotheses you could use to build your algorithm, the tables being rectangular is one of them.
  2. The fact that you want it to run in real-time depends on your implementation of the algorithms. Our depth algorithms run in real-time with different choices to tune the quality-performance tradeoff.
  3. I’ll change my previous statement and rather say that you should use the segmentation mask to filter out the unwanted points in the point cloud.
  4. As the doc describes, these values are either an occlusion or a value with confidence lower than the defined threshold. I would suggest using the ZED_Depth_Viewer tool to try out our different depth models and confidence value thresholds (you can select these in the settings of the tool), to see which model fits your use case. NEURAL depth mode can provide depth maps without occlusions as it is based on artificial intelligence.

As mentioned in the documentation, in sl::MEASURE::XYZ the fourth value is not to be used.
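For completeness: if you retrieve sl.MEASURE.XYZRGBA instead, the fourth float is meaningful there, as it packs the RGBA color as four bytes into one float32. A small NumPy sketch of unpacking it (`decode_packed_rgba` is an illustrative helper, not an SDK function):

```python
import numpy as np

def decode_packed_rgba(packed):
    """Reinterpret the 4th float of an XYZRGBA point as four uint8 color channels."""
    if np.isnan(packed):
        return None  # occluded / invalid point: no color available
    return np.frombuffer(np.float32(packed).tobytes(), dtype=np.uint8)

# Round-trip check: pack four known bytes into one float32, then decode them again
packed = np.frombuffer(bytes([10, 20, 30, 255]), dtype=np.float32)[0]
print(decode_packed_rgba(packed))  # [ 10  20  30 255]
```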


Thank you so much @mattrouss ! I am progressing with my use case.

Considering the discussion, I made the necessary changes in my code (now using NEURAL Depth Mode).
Now, I am saving the images along with their point clouds in .ply format. I did the annotations and trained my model again. Further, I am segmenting the table out of the image (as shown in the 1st post (black and white image)). Then, I am extracting the coordinates of the corners of the table (shown in the image here). Now, I want to map these 2D coordinates to the 3D coordinates in the point cloud. Is there a function available in the SDK to map 2D points to 3D points? If not, how can I achieve it?
Once I get the 3D points, I can calculate the distance between them and compute the surface area of the table.

PS: I have not implemented RANSAC yet. I think once I get the 3D points, I can calculate the area using the distance formula.

Hi @aakashgoyal,

Please take a look at this post that provides the formulas to convert a 2d pixel to a 3d point and vice versa:
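The linked formulas boil down to the standard pinhole back-projection. A minimal sketch, where fx, fy, cx, cy come from your camera calibration (the numeric values below are purely illustrative):

```python
def pixel_to_3d(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth Z into camera coordinates (pinhole model)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return x, y, depth

# Sanity check with illustrative intrinsics: the principal point maps to (0, 0, Z)
print(pixel_to_3d(1131.6, 605.3, 1000.0, fx=1929.1, fy=1929.1, cx=1131.6, cy=605.3))
# (0.0, 0.0, 1000.0)
```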

Hello @mattrouss !
The link you mentioned states:

> ## What are f_x, f_y, c_x, and c_y?
>
> They are the intrinsic camera parameters obtained with the camera calibration procedure, using the “pin hole” camera model:
>
> * (f_x, f_y): focal lengths in pixel units
> * (c_x, c_y): coordinates of the principal point

and on the [Camera Calibration - Stereolabs] page, it is mentioned that:

> It is possible to recalibrate your camera manually using the ZED Calibration tool. However, we do not recommend this for ZED 2 cameras. They go through extensive and rigorous multi-step factory calibration (including thermal measurements), and a manual calibration might degrade its calibration parameters.

So, do I need to re-calibrate the camera or not?

I am aware that I can download the config file from

All ZED cameras are factory-calibrated, so you do not need to recalibrate the camera. The camera’s config file is automatically downloaded by the ZED SDK the first time you use the camera, and this is the calibration used by our depth estimation algorithms.

You can retrieve the left camera’s calibration params using: zed.getCameraInformation().camera_configuration.calibration_parameters.left_cam

Hello @mattrouss… I have tried converting the 2D coordinates to 3D coordinates using the link.
I retrieve the saved depth map:

# Depth Map
pgm_file_path = '20240319_161207.pgm'
image_depth_map = cv2.imread(pgm_file_path, -1)  # -1 (IMREAD_UNCHANGED) keeps the original bit depth

and converted the 2D points into 3D using:

X = [((u - cx) * image_depth_map[v, u]) / fx for u, v in rectangle_points]
Y = [((v - cy) * image_depth_map[v, u]) / fy for u, v in rectangle_points]
Z = [image_depth_map[v, u] for u, v in rectangle_points]

where fx, fy, cx, cy are intrinsic parameters

# Extract the intrinsic parameters for 2K resolution
fx = 1929.147094
fy = 1929.14709472
cx = 1131.6126
cy = 605.3469

After this, I am trying to calculate the Euclidean distance between two points, let’s say (316, 929) and (1864, 814) from the 5th post, but the distance between the two is wrong. I am calculating it using the approach below:

import numpy as np

p1 = np.array(p1)
p2 = np.array(p2)

# Calculate the difference vector and compute its L2 norm (Euclidean distance)
distance = np.linalg.norm(p1 - p2)

I don’t have a camera stand and I am just holding the camera in my hand while taking pictures and saving the point cloud and depth map. Could that be the reason behind the inaccurate Euclidean distance values?

Second, how can I retrieve the values from the saved point cloud for particular coordinates, in this case (316, 929) and (1864, 814)?
I am trying to retrieve them as follows and getting an error:

# Load the point cloud
import open3d as o3d
ply_path = "20240319_161207.ply"
point_cloud = o3d.io.read_point_cloud(ply_path)
points = np.asarray(point_cloud.points)

point3D = point_cloud.get_value(316, 929)

point3D = point_cloud.get_value(316, 929)
AttributeError: 'open3d.cuda.pybind.geometry.PointCloud' object has no attribute 'get_value'

Please help

Hi @aakashgoyal,

In your depth map conversion, there could be a few things that may go wrong, please make sure you verify:

  • the depth map images are stored in the correct format and unit, and that the values are correct
  • the calibration values are retrieved using the ZED SDK API, with the CameraConfiguration object (you have to use the left camera rectified calibration values)
  • the conversion itself is performed correctly

The get_value() method does not work because you are using the open3d API to read the saved point cloud, which is not the ZED SDK API. Since it is a plain point cloud, there is no information about “image UV coordinates”, and you will not be able to look up a value by pixel as you are trying to do.

You can save and load an sl.Mat with the sl.Mat.write and sl.Mat.load methods
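One workaround for offline pixel lookups: save the organized array returned by zed_point_cloud.get_data() (e.g. with np.save) instead of, or in addition to, the .ply, since it keeps the image’s row/column layout. A sketch under that assumption (`pixel_to_point` is an illustrative helper, not an SDK call):

```python
import numpy as np

def pixel_to_point(organized_cloud, u, v):
    """Look up the 3D point for pixel (u, v) in an organized H x W x 4 cloud array.

    Returns None when depth could not be estimated at that pixel (NaN values).
    """
    xyz = organized_cloud[v, u, :3]  # row index is v (y), column index is u (x)
    if np.any(np.isnan(xyz)):
        return None
    return xyz

# Tiny synthetic 2 x 2 "cloud" with one invalid pixel
cloud = np.array([[[0, 0, 100, 0], [10, 0, 100, 0]],
                  [[0, 10, 100, 0], [np.nan] * 4]], dtype=np.float32)
print(pixel_to_point(cloud, 1, 0))  # [ 10.   0. 100.]
print(pixel_to_point(cloud, 1, 1))  # None
```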

I am facing two problems at the moment.

  1. How do I use the sl.Mat.write and sl.Mat.load functions? I can’t find any definition or documentation for them anywhere. Can you provide sample code?
    Moreover, will I be able to get a 3D coordinate from the point cloud for a coordinate like (316, 929) with the get_value function once I save and load the point cloud using the sl.Mat.write and sl.Mat.load functions?

  2. I am not able to get the auto-completion option in VS Code for the ZED SDK after importing it in my Python program with import pyzed.sl as sl. How can I solve this? I searched the web for this issue and came across a solution to download the sl.pyx file, rename it, and place it into the pyzed folder. I wasn’t able to find the sl.pyx file, and even after installing the ZED SDK multiple times I can’t see a pyzed folder in my virtual environment. Any solution?

Hi @aakashgoyal,

  1. The namings are and sl.Mat.write, sorry about that.

  2. You can find the path to the pyzed module with the following:

>>> import pyzed
>>> pyzed.__file__
  1. Okay, thank you so much for the correction. I am saving the image, point cloud, and depth map using the .write() function only. I will read them with .read() from now on.

After running the code with suggested changes, I am getting the following output:

Point_Cloud: (SUCCESS, array([ -40.71487808,   23.08045769, 2844.52709961,           nan]))
x: 1104 y: 621
Distance to Camera at {1104;621}: 2844.9120951685723
n/a Ts 1711647054325184993 sl::Mat of size [2208,1242], with 1 channels of type float allocated on CPU (memory owned).

Saved Image, Point Cloud, and Depth Map as zz20240328_183054
Point Cloud at x and Y: (SUCCESS, 2.9388984276775804e-39)
Depth Map Value: (SUCCESS, 7.255130313343095e-39)

I am not able to understand what these values show. Why is there a change in the output before and after saving the files? Please explain.

Point Cloud at x and Y: (SUCCESS, 2.9388984276775804e-39)
Depth Map Value: (SUCCESS, 7.255130313343095e-39)

This is the code I wrote:

import cv2
import pyzed.sl as sl
import math
from datetime import datetime

if __name__ == "__main__":
    # Create a Camera object
    cam = sl.Camera()
    # Create an InitParameters object and set configuration parameters
    cam_init_params = sl.InitParameters()
    cam_init_params.camera_resolution = sl.RESOLUTION.HD2K  
    cam_init_params.camera_fps = 15

    # Depth Mode
    # Depth Modes Available: NEURAL, ULTRA, QUALITY, PERFORMANCE(default)
    cam_init_params.depth_mode = sl.DEPTH_MODE.NEURAL #Recommended by ZED: "ULTRA" mode
    # cam_init_params.depth_stabilization = True (It is True by default and stables the image)

    cam_init_params.coordinate_units = sl.UNIT.MILLIMETER  # Use millimetre units (for depth measurements)
    # cam_init_params.coordinate_units = sl.UNIT.CENTIMETER  #default: Millimeters
    cam_init_params.depth_minimum_distance= 35 # min possible 15 cm/ 150 mm/ 0.15 mtrs
    cam_init_params.depth_maximum_distance = 35000  # Setting maximum depth perception distance to 35m
    # Open the camera
    if != sl.ERROR_CODE.SUCCESS:
        print("Failed to open the camera")
        exit(1)

    # Set the selected sensor for retrieving the image: from the left sensor
    view = sl.VIEW.LEFT_GRAY
    # set the measure types for getting the depth information. This is done based on the selected view
    measure_depth = sl.MEASURE.DEPTH
    point_cloud_mode = sl.MEASURE.XYZRGBA
    # Create the parameters types required for retrieving the image from the camera
    cam_runtime_parameters = sl.RuntimeParameters()
    print("Stream View. Press 's' to save image and cloud points")
    print("Stream View. Press 'q' to exit")
    frame_time_ms = int(1000 / 30)
    # camera_parameters=cam.get_camera_information().camera_configuration.calibration_parameters.left_cam
    # for attr in dir(camera_parameters):
    #     if not attr.startswith("__"):
    #         print(f"{attr}: {getattr(camera_parameters, attr)}")
    # print('fxcccc:', getattr(camera_parameters, 'fx')) 

    while True:
        # StereoLabs data type for grabbing the images
        zed_image = sl.Mat()
        zed_point_cloud = sl.Mat()
        zed_depth_map= sl.Mat()
        if cam.grab(cam_runtime_parameters) == sl.ERROR_CODE.SUCCESS:            
            # Retrieve image from the selected view
            cam.retrieve_image(zed_image, view)
            # Retrieve point cloud. Point cloud is aligned on the selected view
            cam.retrieve_measure(zed_point_cloud, point_cloud_mode)
            cam.retrieve_measure(zed_depth_map, measure_depth) # Retrieve depth
            image = zed_image.get_data()
            points_cloud = zed_point_cloud.get_data()

            x = int(zed_image.get_width() / 2)
            y = int(zed_image.get_height() / 2)
            cv2.imshow("Image", image)

        # Single call to cv2.waitKey()
        key = cv2.waitKey(5)  # Adjust time as needed

        # press s to save image and the cloud points
        if key == ord('s'):
            point_cloud_value = zed_point_cloud.get_value(x, y) 
            print('Point_Cloud:', point_cloud_value)
            print('x:', x, 'y:', y)
            distance = math.sqrt(point_cloud_value[1][0]*point_cloud_value[1][0] + point_cloud_value[1][1]*point_cloud_value[1][1] + point_cloud_value[1][2]*point_cloud_value[1][2])
            print(f"Distance to Camera at {{{x};{y}}}: {distance}")
  , (x, y), 5, (0, 255, 0), thickness=1)

            # save the captured image, point cloud, and depth map
            file_name = "zz" +"%Y%m%d_%H%M%S")
            cv2.imwrite(file_name + ".png", image)
            zed_point_cloud.write(file_name + ".ply")
            zed_depth_map.write(file_name + ".pgm")
            print(f"Saved Image, Point Cloud, and Depth Map as {file_name}")

            # Read the saved files back to verify the stored values
            point_cloud = sl.Mat()
   + ".ply")
            print("Point Cloud at x and Y:", point_cloud.get_value(x, y))

            dp = sl.Mat()
   + ".pgm")
            print("Depth Map Value:", dp.get_value(x, y))
        if key == ord('q'):
            break

    # Close camera
    cam.close()
  1. And on the second issue, I found the location of the pyzed folder. What should I do now to get the auto-completion feature working in VS Code? Sorry to ask this, but I didn’t understand it, and I have been trying to solve this auto-completion issue for a very long time.

Hi @mattrouss, thank you for your great work.
I have almost the same issue; however, I am dealing with irregularly shaped objects. Could you please advise how I can find an accurate surface area when dealing with an irregular shape?

Hi @aakashgoyal,

  1. I apologize, the read method does not handle sl.Mat in float formats (such as pfm or ply), so your earlier approach of using open3d was correct.

  2. Have you tried the solution given by this post? No Auto Suggest in VSCode for

Hi @code_lover,

For an irregular shape, you will have to find a way to segment the surface, either in 2d or in 3d.

I forgot to mention that the ZED SDK has a plane detection sample here: Plane Detection Overview - Stereolabs
It will estimate the plane at any given point of an image, and will provide the normal, center, and bounds of the plane as a polygon.

You can test it to see if this meets the requirements of your application.


Thanks. I have already segmented the surface, but since the object has an irregular shape, I don’t know how to measure the area.

You can look into ways of computing the area of a closed polygon: algorithm - How do I calculate the area of a 2d polygon? - Stack Overflow
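For reference, the linked approach is the shoelace formula. A short sketch (for points lying on a 3D plane, first express them in a 2D basis of the fitted plane, then apply this):

```python
def polygon_area(vertices):
    """Shoelace formula: area of a simple 2D polygon given its ordered vertices."""
    area = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

print(polygon_area([(0, 0), (4, 0), (4, 3), (0, 3)]))  # 12.0
```

The vertices must be ordered along the boundary (clockwise or counter-clockwise); an unordered segmented outline should first be traced, e.g. with a contour extraction step.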


Hello @mattrouss,

As mentioned in the link, I couldn’t find the sl.pyx file anywhere. :frowning:
It would be nice if you could send a link to the file, and then I can try.

I got the link, and I am posting it here for easy access in case someone else can’t find it easily. Maybe this will help:


And yes, it worked, and now I am getting auto code suggestions in VS Code.