Hello, I am currently trying to extract point cloud and mesh data of the human body. My current design is to first run object detection to get a mask of the human body, and then I have two choices:
1) Retrieve the point cloud of the whole scene, then use the mask data (the pixel positions of the human body) to extract the human-body points from that point cloud.
2) Segment the RGB image based on the mask, then align it with the depth information to get the point cloud of the human body.
I feel like 2) should be faster than 1), but 1) is easier. So I tried to implement 1) first, but ran into some problems I would like help with. The following is my main code:
ObjectDetectionRuntimeParameters detection_parameters_rt;
detection_parameters_rt.detection_confidence_threshold = 50;
detection_parameters_rt.object_class_filter = {OBJECT_CLASS::PERSON};
detection_parameters_rt.object_class_detection_confidence_threshold[OBJECT_CLASS::PERSON] = 50;
Objects objects;
Mat mask;
Mat image;
Mat point_cloud;
while (zed.grab() == ERROR_CODE::SUCCESS) {
    zed.retrieveImage(image, VIEW::LEFT, MEM::CPU);
    // save the image in PNG with the image timestamp as name
    auto timestamp = image.timestamp.getMilliseconds();
    zed.retrieveObjects(objects, detection_parameters_rt);
    cout << objects.object_list.size() << " Object(s) detected\n\n";
    if (!objects.object_list.empty()) {
        auto first_object = objects.object_list.front();
        mask = first_object.mask;
        // save mask
        mask.write(("mask/" + to_string(timestamp) + ".png").c_str());
        // Retrieve the point cloud
        zed.retrieveMeasure(point_cloud, MEASURE::XYZRGBA);
        Mat body_point_cloud(point_cloud.getHeight(), point_cloud.getWidth(), MAT_TYPE::F32_C4, MEM::CPU);
        for (int y = 0; y < point_cloud.getHeight(); y++) {
            for (int x = 0; x < point_cloud.getWidth(); x++) {
                // retrieve the points inside the mask
                if (mask.getValue<float>(y, x) > 0) {
                    body_point_cloud.setValue<float>(y, x, point_cloud.getValue<float>(y, x));
                }
            }
        }
        body_point_cloud.write(("body_ptcl/" + to_string(timestamp) + ".ply").c_str());
    }
}
The compiler reports an error on mask.getValue and point_cloud.getValue: “no instance of function template "sl::Mat::getValue" matches the argument list”. Could anyone help me fix this error?
Moreover, could anyone give me some hints on how to implement scheme 2)?
Problem solved for scheme #1. Any suggestion on scheme #2 is welcome!
My original code called getValue and setValue incorrectly. Another error was that the mask is generated per bounding box, so its coordinates are relative to the bounding box, not to the whole scene. My new code, listed below, should fix both problems.
while (zed.grab() == ERROR_CODE::SUCCESS) {
    zed.retrieveImage(image, VIEW::LEFT, MEM::CPU);
    // save the image in PNG with the image timestamp as name
    auto timestamp = image.timestamp.getMilliseconds();
    zed.retrieveObjects(objects, detection_parameters_rt);
    cout << objects.object_list.size() << " Object(s) detected\n\n";
    if (!objects.object_list.empty()) {
        auto first_object = objects.object_list.front();
        mask = first_object.mask;
        // save mask
        //mask.write(("mask/" + to_string(timestamp) + ".png").c_str());
        // Retrieve the point cloud
        zed.retrieveMeasure(point_cloud, MEASURE::XYZRGBA);
        point_cloud.write(("ptcl_ori/" + to_string(timestamp) + ".ply").c_str());
        // Bounding box coordinates
        int bb_x_min = first_object.bounding_box_2d[0][0];
        int bb_y_min = first_object.bounding_box_2d[0][1];
        int bb_x_max = first_object.bounding_box_2d[2][0];
        int bb_y_max = first_object.bounding_box_2d[2][1];
        sl::Mat body_point_cloud(bb_x_max - bb_x_min, bb_y_max - bb_y_min, MAT_TYPE::F32_C4, MEM::CPU);
        for (int y = bb_y_min; y < bb_y_max; y++) {
            for (int x = bb_x_min; x < bb_x_max; x++) {
                // If the pixel belongs to the body (mask pixel is 255), copy the point to the new point cloud
                sl::uchar1 mask_value;
                if (mask.getValue(x - bb_x_min, y - bb_y_min, &mask_value) == ERROR_CODE::SUCCESS && mask_value == 255) {
                    sl::float4 point_value;
                    if (point_cloud.getValue(x, y, &point_value) == ERROR_CODE::SUCCESS) {
                        body_point_cloud.setValue(x - bb_x_min, y - bb_y_min, point_value);
                    }
                }
            }
        }
        body_point_cloud.write(("ptcl_data/" + to_string(timestamp) + ".ply").c_str());
    }
}
Update: this algorithm is very inefficient. On my PC with a 3080 GPU and an i9 CPU, it sometimes triggers “*** buffer overflow detected ***: terminated”. It also seems that the algorithm has some errors, which are discussed in the following.
After running the above algorithm, the extracted point cloud has many points on the human face that do not exist in the original point cloud, messing up the face. For example, the following is the original point cloud
Since you use a lot of loops, if you want to speed this up I suggest making it multithreaded - check out OpenMP, it’s quite easy. It would be even faster with CUDA.
About the 2) solution you suggest, I’m not sure it’s worth it.
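To illustrate the OpenMP suggestion above, here is a minimal sketch of parallelizing the masking loop. It operates on plain float buffers rather than the sl::Mat API (the buffer layout mimics an XYZRGBA measure: 4 floats per pixel), and `maskPointCloud` is a hypothetical helper name; compile with -fopenmp to enable the pragma.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Copy masked points from src to dst, writing NaN elsewhere.
// src/dst hold rows*cols points of 4 floats each (XYZRGBA-style layout);
// mask holds one byte per pixel, 255 meaning "belongs to the body".
// The rows are independent, so OpenMP can split the outer loop
// across threads with a single pragma (ignored if OpenMP is disabled).
void maskPointCloud(const std::vector<float>& src,
                    const std::vector<unsigned char>& mask,
                    std::vector<float>& dst, int rows, int cols) {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    #pragma omp parallel for
    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            const std::size_t px = static_cast<std::size_t>(y) * cols + x;
            const std::size_t i = px * 4;  // first float of this point
            if (mask[px] == 255) {
                for (int c = 0; c < 4; ++c) dst[i + c] = src[i + c];
            } else {
                for (int c = 0; c < 4; ++c) dst[i + c] = nan;
            }
        }
    }
}
```

Since each output element is written by exactly one iteration, no synchronization is needed; this is the easiest kind of loop to parallelize.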
Thank you for your reply. But why are there a lot of noisy points on my face? The algorithm itself should work, even though it takes a long time.
Regarding the 2) solution, why do you think it is not worth it? In my understanding, it can reduce the number of generated points, since we don’t need the point cloud of the whole scene. Thus, it should be faster than 1)?
As I understand it, you still retrieve the full depth and apply the mask afterwards in your second solution. Retrieving the point cloud once you have the depth is not much extra work.
About the points on your face, I can’t really tell - does it also happen with the spatial mapping sample?
Aha yes, the second solution still requires getting the full depth and aligning it with the segmented RGB image. I would like to ask how the point_cloud.getValue(x, y) function works. Does it get the RGB and depth information at pixel (x, y) in the RGB image and depth map, respectively, and then synthesize the point? If so, I may not need to compute the point cloud of the whole scene first. Instead, once I have the mask pixels, I could synthesize the point cloud only within the mask, which is essentially the second solution.
Regarding the points on my face: no, I haven’t tried spatial mapping yet. Will the quality of the point cloud and mesh in spatial mapping be better than in depth sensing? If I use spatial mapping, I should also be able to get the point cloud and mesh from the mask, right?
Moreover, I found that when I generate the point cloud of the whole scene, there is a very long “shadow” of points that should not occur, as I posted in Incorrect initial camera position when displaying point cloud captured by Zed 2i on MeshLab. In the extracted human model, viewed from the side, there are also a lot of these “shadow points”. Not sure if this is the real reason, and I don’t know how to address it.
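For reference, the depth-to-point conversion asked about above is standard pinhole back-projection: a pixel (u, v) with depth Z maps to X = (u - cx) · Z / fx, Y = (v - cy) · Z / fy. This is a generic sketch, not the SDK's internal code; fx, fy, cx, cy stand for the left-camera intrinsics (the ZED SDK exposes calibration parameters), and axis sign conventions depend on the chosen coordinate system.

```cpp
struct Point3 { float x, y, z; };

// Back-project pixel (u, v) with depth Z into a 3D camera-frame point
// using the pinhole model. fx, fy are focal lengths in pixels and
// (cx, cy) is the principal point. Z keeps whatever unit the depth uses.
Point3 backproject(float u, float v, float Z,
                   float fx, float fy, float cx, float cy) {
    return { (u - cx) * Z / fx, (v - cy) * Z / fy, Z };
}
```

Applying this only to mask pixels would indeed avoid building the full-scene cloud, which is the essence of scheme 2); the saving is just the per-pixel division, which is why retrieving the full point cloud once the depth exists "is not much".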
I have fixed the weird points on my face. I made two revisions. First, the mask value is uchar1; I cast it to int before comparing it with 255. Second, my previous code did not assign values to the bounding-box area outside the mask, which is why saving the new point cloud led to “*** buffer overflow detected ***: terminated”. In the new code, I set those points to NaN.
However, I found that some frames have no mask, even though the object (person) was detected. Any reason for this?
sl::ERROR_CODE returned_state;
while (zed.grab() == ERROR_CODE::SUCCESS) {
    zed.retrieveImage(image, VIEW::LEFT, MEM::CPU);
    // save the image in PNG with the image timestamp as name
    auto timestamp = image.timestamp.getMilliseconds();
    image.write(("../outputs/image/" + to_string(timestamp) + ".png").c_str());
    zed.retrieveObjects(objects, detection_parameters_rt);
    cout << objects.object_list.size() << " Object(s) detected\n\n";
    if (!objects.object_list.empty()) {
        auto first_object = objects.object_list.front();
        mask = first_object.mask;
        // if the mask is empty, skip this frame
        if (mask.getWidth() == 0 || mask.getHeight() == 0) {
            continue;
        }
        // save mask
        mask.write(("../outputs/mask/" + to_string(timestamp) + ".png").c_str());
        // Retrieve the point cloud
        zed.retrieveMeasure(point_cloud, MEASURE::XYZRGBA);
        point_cloud.write(("../outputs/ptcl_ori/" + to_string(timestamp) + ".ply").c_str());
        // Bounding box coordinates
        int bb_x_min = first_object.bounding_box_2d[0][0];
        int bb_y_min = first_object.bounding_box_2d[0][1];
        int bb_x_max = first_object.bounding_box_2d[2][0];
        int bb_y_max = first_object.bounding_box_2d[2][1];
        sl::Mat body_point_cloud(bb_x_max - bb_x_min, bb_y_max - bb_y_min, MAT_TYPE::F32_C4, MEM::CPU);
        for (int y = bb_y_min; y < bb_y_max; y++) {
            for (int x = bb_x_min; x < bb_x_max; x++) {
                // If the pixel belongs to the body (mask pixel is 255), copy the point to the new point cloud
                sl::uchar1 mask_value;
                if (mask.getValue(x - bb_x_min, y - bb_y_min, &mask_value) == ERROR_CODE::SUCCESS) {
                    if (int(mask_value) != 255) {
                        // Fill the non-body area of the bounding box with NaN points
                        sl::float4 null_value(NAN, NAN, NAN, NAN);
                        body_point_cloud.setValue(x - bb_x_min, y - bb_y_min, null_value);
                        continue;
                    }
                    sl::float4 point_value;
                    returned_state = point_cloud.getValue(x, y, &point_value);
                    if (returned_state == ERROR_CODE::SUCCESS) {
                        returned_state = body_point_cloud.setValue(x - bb_x_min, y - bb_y_min, point_value);
                        if (returned_state != ERROR_CODE::SUCCESS) {
                            cout << "Error when setting value: " << returned_state << "\n";
                        }
                    } else {
                        cout << "Error when getting value: " << returned_state << endl;
                    }
                }
            }
        }
        returned_state = body_point_cloud.write(("../outputs/ptcl_data/" + to_string(timestamp) + ".ply").c_str());
        if (returned_state != ERROR_CODE::SUCCESS) {
            cout << "Error " << returned_state << ", exit program.\n";
            zed.close();
            return EXIT_FAILURE;
        }
    }
}
About the mask issue, see my code above. I define detection_parameters_rt to detect the person class only, and an object is detected in every frame. However, sometimes both mask.getWidth() and mask.getHeight() return 0 and the mask cannot be saved, so I had to add code that skips the empty mask. Since the object is detected, the mask should exist as well. I can provide my code and SVO file if you need them.
Regarding spatial mapping vs. depth sensing: if I want to extract human mesh data from the ZED, I must use spatial mapping, right? It seems that only the spatial mapping API can produce mesh data. And I should also be able to apply the mask to extract the mesh of the human body only. Is that correct?
You found a bug. Thank you for reporting it, we’ll fix that.
It’s correct that only spatial mapping will retrieve a Mesh. However, you will not be able to apply a mask using only our SDK; you will need to code it yourself.
Spatial mapping will have significant improvements in the very next versions, stay tuned.
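One way to code the mask-on-mesh step yourself, sketched below with hypothetical plain types rather than ZED SDK classes: project each mesh vertex into the left image with the pinhole model and keep only triangles whose three vertices land on mask pixels. `projectToPixel` and `filterTriangles` are illustrative names; fx, fy, cx, cy stand for the camera intrinsics, and the mesh is assumed to be expressed in the camera frame.

```cpp
#include <array>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// Project a camera-frame point (Z forward) to pixel coordinates.
// Returns false if the point is behind the camera or outside the image.
bool projectToPixel(const Vec3& p, float fx, float fy, float cx, float cy,
                    int width, int height, int& u, int& v) {
    if (p.z <= 0.0f) return false;
    u = static_cast<int>(fx * p.x / p.z + cx);
    v = static_cast<int>(fy * p.y / p.z + cy);
    return u >= 0 && u < width && v >= 0 && v < height;
}

// Keep only the triangles whose three vertices project onto mask
// pixels equal to 255 (row-major width x height mask).
std::vector<std::array<uint32_t, 3>> filterTriangles(
        const std::vector<Vec3>& vertices,
        const std::vector<std::array<uint32_t, 3>>& triangles,
        const std::vector<uint8_t>& mask, int width, int height,
        float fx, float fy, float cx, float cy) {
    std::vector<std::array<uint32_t, 3>> kept;
    for (const auto& tri : triangles) {
        bool inside = true;
        for (uint32_t idx : tri) {
            int u, v;
            if (!projectToPixel(vertices[idx], fx, fy, cx, cy,
                                width, height, u, v) ||
                mask[static_cast<std::size_t>(v) * width + u] != 255) {
                inside = false;
                break;
            }
        }
        if (inside) kept.push_back(tri);
    }
    return kept;
}
```

After filtering, unreferenced vertices can be compacted away before saving; for real-time use, the per-triangle loop is also a good candidate for the OpenMP treatment discussed earlier in the thread.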
Thank you for your reply. Regarding extracting a mesh of the human model, could you give me some hints on how to implement it, for example, some open-source algorithms?
BTW, is there a way to conveniently upgrade the ZED SDK? Each time I just delete the old one and download the new one…
Installing a newer version erases the old one; you don’t need anything else. An automatic updater would be nice, but it’s not something we’ll have short-term.
Thank you for your reply. You may have missed my other question in my last post, since I added it by editing. Could you also recommend some open-source projects or algorithms that can extract the human body from a mesh efficiently? We may need this for real-time streaming.