Individual skeleton data looks decent, but fused is terrible

haakonflaar · February 7, 2024, 11:24am

As shown in the following video, skeleton extraction from some of the cameras looks good, but the fused skeleton is rubbish. The calibration may not be perfect, but it was pretty good (in the sense that all skeletons were fairly well on top of one another when the calibration was completed).

I am using the newest Edge Body Tracking app in the hub, and the recording is done using your example of JSON skeleton data extraction of fused data.

Vid:

JPlou · February 7, 2024, 5:51pm

Hi @haakonflaar ,

It looks like the poor fps may be the cause. We’re working on improving it, and you will see an improvement in the next release of the SDK.
May I ask which hardware the ZEDs are running on in ZEDHub?

haakonflaar · February 8, 2024, 7:59am

It looks more like a fusion software error than poor FPS, don’t you think? Some of the raw skeletons from the individual cameras look good. Are you not weighting the fused pose estimation by the confidence of each camera?
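To make the question concrete, here is a minimal sketch of what confidence-weighted fusion of a single joint could look like. It is only an illustration of the idea: `JointObservation` and `fuseJoint` are hypothetical names, not part of the ZED SDK, and nothing in this thread says the Fusion module works this way.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// One camera's estimate of a single joint (hypothetical type, not a ZED SDK type).
struct JointObservation {
    std::array<double, 3> position; // world-frame x, y, z in meters
    double confidence;              // 0 (no trust) to 100 (full trust)
};

// Confidence-weighted mean of the per-camera estimates of one joint.
// Writes the result to `fused` and returns true, or returns false when
// no camera contributes a positive weight.
bool fuseJoint(const std::vector<JointObservation>& observations,
               std::array<double, 3>& fused) {
    std::array<double, 3> sum = {0.0, 0.0, 0.0};
    double total_weight = 0.0;
    for (const auto& obs : observations) {
        assert(obs.confidence >= 0.0 && obs.confidence <= 100.0);
        if (obs.confidence == 0.0) continue; // skip untrusted estimates
        for (std::size_t axis = 0; axis < 3; ++axis)
            sum[axis] += obs.confidence * obs.position[axis];
        total_weight += obs.confidence;
    }
    if (total_weight <= 0.0) return false;
    for (std::size_t axis = 0; axis < 3; ++axis)
        fused[axis] = sum[axis] / total_weight;
    return true;
}
```

With a scheme like this, a camera that sees the joint poorly pulls the fused position only slightly, which is why crunched-up fused joints next to good raw skeletons look suspicious.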

I have been in contact with Stereolabs for about a year now regarding poor FPS on fused skeleton extraction with our 6 cameras. I manage to get around 10-12 fps by setting allow_reduced_precision_inference to true in a custom version of the Edge Body Tracking app. That has previously been at least good enough to get a decent skeleton extraction - miles better than what I get here. Now I get this though. I need it fixed with the current version of the SDK - can’t keep waiting for improvements that may or may not resolve the issue at some point in the future.

Could it be some corrupted files hanging around in the zed boxes? In that case, what should I do?

Each camera is running through its own ZED Box.

haakonflaar · February 8, 2024, 3:16pm

@JPlou Okay so it does seem like FPS makes a big difference. If I use the fast body tracking model I get an FPS of around 14 and the tracking is better. Still not usable as the skeletons are far too unstable, but at least the joints are not crunched up into one another all the time.

I also notice that there can be a big difference in the FPS each camera manages to reach - some may go up to 14 while others are at 10 or below (this seems kinda random and changes frequently). I don’t see any point in capping the FPS at 15 as no camera seems to reach that FPS anyway.
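For comparing cameras, a small sliding-window counter makes the per-camera numbers easier to log side by side. This is a rough sketch: `FpsMeter` is a hypothetical helper, not something from the ZED SDK, and it assumes timestamps in seconds from a monotonic clock.

```cpp
#include <cassert>
#include <deque>

// Sliding-window FPS estimate for one camera (hypothetical helper, not a ZED SDK class).
class FpsMeter {
public:
    explicit FpsMeter(double window_seconds) : window_seconds_(window_seconds) {
        assert(window_seconds > 0.0);
    }

    // Record a frame that arrived at `t_seconds` (monotonic clock, in seconds).
    void tick(double t_seconds) {
        assert(stamps_.empty() || t_seconds >= stamps_.back());
        stamps_.push_back(t_seconds);
        // Drop frames that have fallen out of the window.
        while (t_seconds - stamps_.front() > window_seconds_)
            stamps_.pop_front();
    }

    // Frames per second over the window; 0 until two frames have been seen.
    double fps() const {
        if (stamps_.size() < 2) return 0.0;
        const double span = stamps_.back() - stamps_.front();
        return span > 0.0 ? static_cast<double>(stamps_.size() - 1) / span : 0.0;
    }

private:
    double window_seconds_;
    std::deque<double> stamps_;
};
```

One meter per camera, ticked after each successfully processed frame, would show whether the same box is always the slow one or whether the drop really moves around.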

Okay, so please give me your optimization tips so I can reach a higher FPS. I have created a custom Edge Body Tracking app and enabled “allow_reduced_precision_inference” as advised by @alassagne previously.

Edit: We have Xavier NX ZED Boxes. Please send a pre-release of the updated SDK if you have it ready and have seen FPS gain with it.
Edit 2: Running the newest version of Edge Body Tracking I get [2024/02/08, 15:37:02] [edge_body_tracking_service_1] Using SDK version : 4.0.7 - why not 4.0.8 which is what I have installed on my computer? Does it matter?

JPlou · February 8, 2024, 4:04pm

@haakonflaar Sorry for the delay,

Another tip to gain fps is to disable both fitting and tracking in the senders. Fusion does not use them, so turning them off frees up some compute.
The Fusion API did not change between 4.0.7 and 4.0.8, so there is no issue with the senders running 4.0.7.

We do have improvements for body 18/34. Please contact us through support@stereolabs.com so that the request for the pre-release is formalized, and I’ll see what’s possible with the team.

(Edit: Please reference this post in your mail.)

haakonflaar · February 8, 2024, 4:54pm

@JPlou Thanks for the help - email is sent.

I need to use Body 34 and for that it seems body fitting is automatically turned on(?) Turning it off in the hub app does not improve FPS.

alassagne · February 13, 2024, 9:16am

Hi, please don’t open duplicates. I already sent you the Windows version on our support platform.
As for Edge Body Tracking, you’ll need to build a Dockerfile that installs the latest SDK and the app. In the docker-compose file, you’ll just have to write the name of the new image.
I suggest you read the Docker and docker-compose documentation. It’s really good, and it’s a very useful technology.

haakonflaar · February 13, 2024, 12:04pm

When are you planning to officially release SDK 4.1 with an updated Edge Body Tracking application?

alassagne · February 13, 2024, 12:12pm

I can’t give you a precise timeline, but the development cycle is coming to an end. So, soon.

haakonflaar · February 16, 2024, 9:56am

@alassagne

I am sure it is very useful. However, you know Stereolabs’ docker images far better than I do, so I am guessing that what would take me hours to figure out would take you only minutes. I would greatly appreciate it if you could publish a docker image with SDK 4.1 on Docker Hub that I can pull directly.

To be fair, the solution we use with 6 cameras, which you recommended to us more than a year ago, is simply not working due to poor FPS (and possibly imperfect fusion logic?). With great help from you guys we have come a long way to help the problem, but it is just not quite good enough to be used. I don’t think receiving help in applying an update you believe will solve our issues is too much to ask for, although I understand it comes with some inconveniences for you.

Really not trying to be impolite here btw, we just need this to work.

alassagne · February 19, 2024, 9:40am

Hi,

We will provide a 4.1 Docker image as soon as 4.1 is out. We cannot spend that many resources on releasing compatibility with an early-access version. It’s not final, and some bugs may remain. If I understand correctly, you’d like to use it in production, but that is not really safe.

haakonflaar · February 19, 2024, 9:46am

Hi,

I understand. We are not putting it into production in a fashion where it has to be 100% reliable - we are running some recordings next week where we need the best working body tracking available. I was hoping it wouldn’t be too much work for you to set up the required docker image, but I understand it will be. I might give it a go myself; otherwise, we will revert to the early pre-alpha release we received, which has proved most stable thus far.

Thanks for the help anyways

haakonflaar · February 19, 2024, 1:19pm

@alassagne

By the way, do you have a sample application running individual senders that I can run directly from within the zed box?

That way I don’t need to bother with the docker image - can’t seem to find the Dockerfile of your images (don’t know if they are hidden).

alassagne · February 19, 2024, 1:40pm

You can check this out: Camera Streamer As A Service Memory Leak - #15 by alassagne

haakonflaar · February 20, 2024, 3:51pm

@alassagne

I have made a sender script inspired by your Python example. It works, I get the following confirmation when starting the fusion receiver script:

[RTPSession] Adding Receiver with IP: 10.50.232.191 ID 1386474649

Each ZED Box has ZED SDK 4.1 installed, and I have also installed ZED SDK 4.1 on my Windows computer.

When I run the fusion receiver script (below), the GLViewer opens and I can see the FPS of each camera (image below); however, I can’t see any skeletons. The problem occurs when running the fusion receiver with ZED SDK 4.1 on Windows.

Running ZED SDK 4.0.8 on the computer and the Edge Body Tracking app on the cameras shows skeleton data (standard setup).
Running ZED SDK 4.0.8 on the computer and the sender script on the cameras does not show skeletons.
Running ZED SDK 4.1 on the computer and the Edge Body Tracking app on the cameras does not show skeletons.
Running ZED SDK 4.1 on the computer and the sender script on the cameras does not show skeletons.

Also, for some reason, the first time I run the script the cameras optimize the BODY 18 model instead of BODY 34, even though BODY 34 is what I have set in the script.

Sender script:

#include <iostream> // std::cout, std::endl

#include <sl/Camera.hpp>

int main() {
    // Create a Camera object
    sl::Camera zed;

    sl::InitParameters init_parameters;
    init_parameters.depth_mode = sl::DEPTH_MODE::ULTRA;
    init_parameters.camera_fps = 30;
    init_parameters.camera_resolution = sl::RESOLUTION::HD1080;
    init_parameters.coordinate_units = sl::UNIT::METER;
    init_parameters.coordinate_system = sl::COORDINATE_SYSTEM::RIGHT_HANDED_Y_UP;
    auto state = zed.open(init_parameters);
    if (state != sl::ERROR_CODE::SUCCESS)
    {
        std::cout << "Error with init parameters: " << state << std::endl;
        return 1; // non-zero exit code signals failure (false would convert to 0)
    }

    // in most body tracking setups the cameras are static, so mark this camera as static
    sl::PositionalTrackingParameters positional_tracking_parameters;
    positional_tracking_parameters.set_as_static = true;
    state = zed.enablePositionalTracking(positional_tracking_parameters);
    if (state != sl::ERROR_CODE::SUCCESS)
    {
        std::cout << "Error with positional tracking: " << state << std::endl;
        return 1; // non-zero exit code signals failure (false would convert to 0)
    }

    // define the body tracking parameters; as the fusion can do the tracking and fitting, you don't need to enable them here unless you need them for your app
    sl::BodyTrackingParameters body_tracking_parameters;
    body_tracking_parameters.detection_model = sl::BODY_TRACKING_MODEL::HUMAN_BODY_ACCURATE;
    body_tracking_parameters.body_format = sl::BODY_FORMAT::BODY_34;
    body_tracking_parameters.enable_body_fitting = false;
    body_tracking_parameters.enable_tracking = false;
    body_tracking_parameters.allow_reduced_precision_inference = true;
    state = zed.enableBodyTracking(body_tracking_parameters);
    if (state != sl::ERROR_CODE::SUCCESS)
    {
        std::cout << "Error with body tracking parameters: " << state << std::endl;
        return 1; // non-zero exit code signals failure (false would convert to 0)
    }


    // std::string ip = "10.180.69.199";
    int port = 30020;

    sl::CommunicationParameters communication_parameters;
    communication_parameters.setForLocalNetwork(port);

    zed.startPublishing(communication_parameters);

    sl::Bodies bodies;
    sl::BodyTrackingRuntimeParameters body_runtime_parameters;
    body_runtime_parameters.detection_confidence_threshold = 40;


    // as long as you call the grab function and the retrieveBodies (which runs the detection) the camera will be able to seamlessly transmit the data to the fusion module.
    while (true) {
        if (zed.grab() == sl::ERROR_CODE::SUCCESS) {
            // just be sure to run the bodies detection
            zed.retrieveBodies(bodies, body_runtime_parameters);
        }
    }


    // Close the camera
    zed.close();
    return 0;
}

Fusion receiver script:

// ZED include
#include "ClientPublisher.hpp"
#include "GLViewer.hpp"
#include "utils.hpp"
#include "SK_Serializer.hpp"


int main(int argc, char **argv) {
    if (argc != 2) {
        // this file should be generated with the ZED360 tool
        std::cout << "Need a configuration file as input" << std::endl;
        return 1;
    }
    
    // Defines the Coordinate system and unit used in this sample
    constexpr sl::COORDINATE_SYSTEM COORDINATE_SYSTEM = sl::COORDINATE_SYSTEM::RIGHT_HANDED_Y_UP;
    constexpr sl::UNIT UNIT = sl::UNIT::METER;

    // Read json file containing the configuration of your multicamera setup.    
    auto configurations = sl::readFusionConfigurationFile(argv[1], COORDINATE_SYSTEM, UNIT);

    if (configurations.empty()) {
        std::cout << "Empty configuration File." << std::endl;
        return EXIT_FAILURE;
    }

    // Initialize the fusion module
    sl::InitFusionParameters init_params;
    init_params.coordinate_units = UNIT;
    init_params.coordinate_system = COORDINATE_SYSTEM;
    init_params.verbose = true;

    // create and initialize it
    sl::Fusion fusion;
    fusion.init(init_params);

    // subscribe to every camera of the setup to internally gather their data
    std::vector<sl::CameraIdentifier> cameras;
    for (auto& it : configurations) {
        sl::CameraIdentifier uuid(it.serial_number);
        // to subscribe to a camera you must give its serial number, the way to communicate with it (shared memory or local network), and its world pose in the setup.
        auto state = fusion.subscribe(uuid, it.communication_parameters, it.pose);
        if (state != sl::FUSION_ERROR_CODE::SUCCESS)
            std::cout << "Unable to subscribe to " << std::to_string(uuid.sn) << " . " << state << std::endl;
        else
            cameras.push_back(uuid);
    }

    // check that at least one camera is connected
    if (cameras.empty()) {
        std::cout << "no connections " << std::endl;
        return EXIT_FAILURE;
    }

    // as this sample shows how to fuse body detection from the multi camera setup
    // we enable the Body Tracking module with its options
    sl::BodyTrackingFusionParameters body_fusion_init_params;
    body_fusion_init_params.enable_tracking = true;
    body_fusion_init_params.enable_body_fitting = true; // skeletons will look more natural, but this requires more computation
    fusion.enableBodyTracking(body_fusion_init_params);

    // define fusion behavior 
    sl::BodyTrackingFusionRuntimeParameters body_tracking_runtime_parameters;
    // be sure that the detection skeleton is complete enough
    body_tracking_runtime_parameters.skeleton_minimum_allowed_keypoints = 7;

    // we may also want to keep only skeletons seen by multiple cameras: here at least 2 (the half-of-all-cameras alternative is commented out)
    body_tracking_runtime_parameters.skeleton_minimum_allowed_camera = 2; // cameras.size() / 2.;

    // creation of a 3D viewer
    GLViewer viewer;
    viewer.init(argc, argv);

    std::cout << "Viewer Shortcuts\n" <<
        "\t- 'r': switch on/off for raw skeleton display\n" <<
        "\t- 'p': switch on/off for live point cloud display\n" <<
        "\t- 'c': switch on/off point cloud display with flat color\n" << std::endl;

    // fusion outputs
    sl::Bodies fused_bodies;
    std::map<sl::CameraIdentifier, sl::Bodies> camera_raw_data;
    sl::FusionMetrics metrics;
    std::map<sl::CameraIdentifier, sl::Mat> views;
    std::map<sl::CameraIdentifier, sl::Mat> pointClouds;
    sl::Resolution low_res(512,360);

    // JSON output
    nlohmann::json bodies_json_file;

    bool new_data = false;
    sl::Timestamp ts_new_data = sl::Timestamp(0);

    
    // run the fusion as long as the viewer is available.
    while (viewer.isAvailable()) {
        try {
            // run the fusion process (which gathers data from all cameras, syncs it and processes it)
            if (fusion.process() == sl::FUSION_ERROR_CODE::SUCCESS) {
                // Retrieve fused body
                fusion.retrieveBodies(fused_bodies, body_tracking_runtime_parameters);
                // for debugging, you can retrieve the data sent by each camera
                for (auto& id : cameras) {
                    fusion.retrieveBodies(camera_raw_data[id], body_tracking_runtime_parameters, id);
                    sl::Pose pose;
                    if(fusion.getPosition(pose, sl::REFERENCE_FRAME::WORLD, id, sl::POSITION_TYPE::RAW) == sl::POSITIONAL_TRACKING_STATE::OK)
                        viewer.setCameraPose(id.sn, pose.pose_data);

                    auto state_view = fusion.retrieveImage(views[id], id, low_res);
                    auto state_pc = fusion.retrieveMeasure(pointClouds[id], id, sl::MEASURE::XYZBGRA, low_res);
                    if(state_view == sl::FUSION_ERROR_CODE::SUCCESS && state_pc ==  sl::FUSION_ERROR_CODE::SUCCESS)
                        viewer.updateCamera(id.sn, views[id], pointClouds[id]);                    
                }

                // get metrics about the fusion process for monitoring purposes
                fusion.getProcessMetrics(metrics);
            }
            // update the 3D view
            viewer.updateBodies(fused_bodies, camera_raw_data, metrics);
            //std::this_thread::sleep_for(std::chrono::microseconds(10));

            // Serialize detected bodies into a json container
            bodies_json_file[std::to_string(fused_bodies.timestamp.getMilliseconds())] = sk::serialize(fused_bodies); 
        } catch (std::exception &e) {
            std::cerr << "Error: " << e.what() << std::endl;
            break;
        }
    } 

    // Set outFileName to detected_bodies_<timestamp>.json
    std::string outfileName("detected_bodies_" + std::to_string(fused_bodies.timestamp.getMilliseconds()) + ".json");
    if (bodies_json_file.size())
    {
        std::ofstream file_out(outfileName);
        file_out << std::setw(4) << bodies_json_file << std::endl;
        file_out.close();
        std::cout << "Successfully saved the body data to " + outfileName << std::endl;        
    }

    viewer.exit();

    fusion.close();

    return EXIT_SUCCESS;
}
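As a side note on the two runtime parameters set above, the sketch below shows how thresholds like `skeleton_minimum_allowed_keypoints` and `skeleton_minimum_allowed_camera` would discard fused skeletons. This is an assumption about their effect based on their names and the sample comments, written in plain C++. `FusedSkeletonCandidate`, `FusionFilter` and `filterSkeletons` are hypothetical and do not exist in the ZED SDK.

```cpp
#include <cassert>
#include <vector>

// Summary of one fused skeleton before filtering (hypothetical type).
struct FusedSkeletonCandidate {
    int valid_keypoints; // number of keypoints with a usable position
    int cameras_seen;    // number of cameras that detected this person
};

// Thresholds mirroring the two runtime parameters used in the receiver (hypothetical type).
struct FusionFilter {
    int min_keypoints; // plays the role of skeleton_minimum_allowed_keypoints
    int min_cameras;   // plays the role of skeleton_minimum_allowed_camera
};

// Keep only the candidates that satisfy both thresholds.
std::vector<FusedSkeletonCandidate> filterSkeletons(
        const std::vector<FusedSkeletonCandidate>& candidates,
        const FusionFilter& filter) {
    assert(filter.min_keypoints >= 0 && filter.min_cameras >= 0);
    std::vector<FusedSkeletonCandidate> kept;
    for (const auto& candidate : candidates) {
        if (candidate.valid_keypoints >= filter.min_keypoints &&
            candidate.cameras_seen >= filter.min_cameras)
            kept.push_back(candidate);
    }
    return kept;
}
```

Under that reading, a receiver with thresholds of 7 keypoints and 2 cameras outputs nothing if the senders themselves report zero bodies, so an empty viewer does not by itself point at the fusion side.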

GLViewer:

EDIT: Printing body format and number of bodies shows no skeletons were detected by the sender:

            std::cout << "Body format: " << bodies.body_format << "" << std::endl;
            std::cout << "Number of bodies detected: " << bodies.body_list.size() << " bodies" << std::endl;

Body format: BODY_34
Number of bodies detected: 0 bodies
Body format: BODY_34
Number of bodies detected: 0 bodies

alassagne · February 20, 2024, 4:59pm

Thank you for the report. Testing this kind of setup is part of our QA process, and we’ll make sure it works when the final 4.1 is released.

haakonflaar · February 20, 2024, 9:37pm

@alassagne You are very welcome

You don’t happen to have an idea why this is happening, and a potential fix? Maybe a syntax change for fetching skeleton data and body tracking parameters? Feel like I’m close with this, and I could see the FPS gain so it would be nice to have this working for next week.

Edit: Trying to build the sender on ZED box with 4.0.8 SDK gives

/usr/bin/ld: /usr/local/zed/lib/libsl_zed.so: undefined reference to `NvBufSurfaceGetMapParams'

Running the sender built with SDK 4.1 on a ZED Box that has ZED SDK 4.0.8 installed captures no skeletons.

Do you see any issues with the sender?

haakonflaar · February 23, 2024, 8:32am

@alassagne @JPlou Bump - do you see any faults in my code that could cause the issue?

alassagne · February 23, 2024, 5:21pm

This is a problem in your SDK installation. You are probably using the wrong L4T version, or there is a mismatch between the ZED SDK and Hub, if you use both.

haakonflaar · February 23, 2024, 6:47pm

Yeah, never mind that. I managed to build it with the current SDK as well; however, it is still detecting no bodies. Is there something wrong with the sender?