Depth-based “humanoid” mask, feet-vs-floor separation issues

Jeff · September 15, 2025, 9:50am

Context

Unity 6000.0.33, URP
ZED Unity SDK 5.0.1
I render a humanoid mask in a custom shader using depth. The RGB image is cropped by distance and I subtract the floor using a one-shot Y-height calibration (from MEASURE.XYZ) to keep the feet while removing the floor.
Attached screenshot: either the mask stops around the ankles, or, if I widen the band, feet are included but so is the floor and nearby objects.
Camera: ZED 2 / ZED 2i
Depth mode: Neural Depth Mode

Current results

Narrow band → mask cuts at the ankles.

Wider band (floorHeightBandMeters) → feet included better, but floor starts to leak.

Even wider band → floor and floor objects included.

Implementation (short)

Full-screen Graphics.Blit into a RenderTexture with Custom/ZEDMask.
Runtime mask uses MEASURE.DEPTH.
Floor calibration uses MEASURE.XYZ; I take Y (meters) to build a “Floor” RenderTexture that I subtract in the shader.
Key knob: floorHeightBandMeters (tolerance around floor height).

Minimal C# excerpt (param passing)

public void CalibrateFloor()
{
    // RGB eye texture: used as the Blit source and for the output resolution.
    Texture rgb = zedPlane?.TextureEye;
    if (rgb == null || maskMaterial == null) return;

    // Prefer the XYZ measure; fall back to the plane's depth texture.
    Texture xyzTex = zedCamera != null
        ? zedCamera.CreateTextureMeasureType(sl.MEASURE.XYZ)
        : zedPlane?.Depth;
    if (xyzTex == null) return;

    // (Re)allocate the floor render texture when the resolution changes.
    int w = rgb.width, h = rgb.height;
    if (floorRT == null || floorRT.width != w || floorRT.height != h)
    {
        if (floorRT != null) floorRT.Release();
        floorRT = new RenderTexture(w, h, 0, RenderTextureFormat.ARGB32);
        floorRT.Create();
    }

    // Disable any previous floor mask while the new one is built.
    maskMaterial.SetFloat("_HasFloor", 0f);
    maskMaterial.SetTexture("_FloorTex", Texture2D.blackTexture);

    // Build-floor pass: the shader marks pixels whose Y is within the band.
    maskMaterial.SetTexture("_DepthTex", xyzTex);
    maskMaterial.SetFloat("_UseYFloor", 1f);
    maskMaterial.SetFloat("_FloorY", floorHeightMeters);
    maskMaterial.SetFloat("_FloorYBand", floorHeightBandMeters);
    maskMaterial.SetFloat("_BuildFloor", 1f);

    Graphics.Blit(rgb, floorRT, maskMaterial);

    // Back to runtime mode.
    maskMaterial.SetFloat("_BuildFloor", 0f);
    floorReady = true;
}

Minimal shader excerpt (HLSL)

// Build floor mode: mark floor pixels (Y close to _FloorY)
if (_BuildFloor > 0.5)
{
    float3 P = depth.rgb; // XYZ in meters (camera space)
    if (all(P == 0)) return half4(0, 0, 0, 1); // no valid measurement
    float dy = abs(P.y - _FloorY);
    float isFloor = step(dy, _FloorYBand); // 1 when dy <= _FloorYBand

    return half4(isFloor, isFloor, isFloor, 1);
}

// Runtime: combine mask and floor subtraction
float4 maskRaw = SAMPLE_TEXTURE2D(_MaskTex, sampler_MaskTex, uv);
float4 maskCol = float4(maskRaw.b, maskRaw.g, maskRaw.r, maskRaw.a); // swap R and B channels

Works

Distance-based mask
Y-based floor subtraction (generally)
Feet vs floor: not reliable (see Issue)

Issue

Feet vs floor boundary is unstable: either ankles get cut, or floor leaks into the mask when widening the band.

Tried

floorHeightBandMeters 0.02–0.08 m
Floor normal dot thresholding

Questions

I currently build a floor mask from a Y-band (MEASURE.XYZ) and subtract it from a DEPTH-based humanoid mask.

Is there a better way to make the feet vs floor split more robust?
Is MEASURE.XYZ Y strictly in camera space (roll-independent), or should I transform camera→world to stabilize floor height?

BenjaminV · September 15, 2025, 12:04pm

Hi,
If you know the camera’s height in advance, you can set it directly to avoid any inaccuracy from the ZED SDK’s own estimation.

Also, I’d recommend setting “tracking is static” to true (see image). With this option enabled, the camera’s position is estimated once (same as your floor subtraction).

Yes, the depth is in camera space. You may want to transform it to world space (you can use zedManager.GetZedRootTransform().TransformPoint()).
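A minimal sketch of the camera-to-world transform suggested above, assuming `zedManager` references the scene's ZEDManager and that the point is already expressed in Unity's coordinate convention. The class name `FloorHeightProbe`, the method `HeightAboveFloor` and the field `floorWorldY` are hypothetical. The accessor name is taken from the reply as written, so check its exact spelling in your plugin version.

```csharp
using UnityEngine;

// Hypothetical helper: converts a camera-space point to world space
// and reports its height above an assumed floor level.
public class FloorHeightProbe : MonoBehaviour
{
    public ZEDManager zedManager;   // assigned in the Inspector
    public float floorWorldY = 0f;  // assumed world-space floor height, in meters

    public float HeightAboveFloor(Vector3 cameraSpacePoint)
    {
        // Accessor named as in the reply above; TransformPoint applies
        // the transform's position, rotation and scale.
        Transform zedRoot = zedManager.GetZedRootTransform();
        Vector3 worldPoint = zedRoot.TransformPoint(cameraSpacePoint);

        // With a world-space Y, camera pitch or roll no longer shifts the floor level.
        return worldPoint.y - floorWorldY;
    }
}
```

For a per-pixel test in the shader, the same idea would presumably mean passing the transform's `localToWorldMatrix` to the material with `SetMatrix` and applying it to each XYZ sample before comparing Y.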

Jeff · September 15, 2025, 3:09pm

Hello, and thank you for the quick answer!

I’ve enabled Tracking is static. I did not express myself clearly: my issue isn’t pose stability, but the stability of the boundary between the feet and the floor in the mask.

What I’m doing now:

Build a foreground mask from depth (range-based).
Build a floor mask once from MEASURE.XYZ in a low ROI by marking pixels whose Y is near Y0.
Subtract the floor mask from the foreground mask.
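
The three steps above can be sketched as a single per-pixel decision. This is an illustration only, not the actual shader: the class `MaskSketch`, the method `MaskValue` and the `nearMeters`/`farMeters` parameters are hypothetical, and the floor test is evaluated inline here, whereas the description builds it once into a texture.

```csharp
using System;

public static class MaskSketch
{
    // Returns 1 for a pixel kept in the humanoid mask, 0 otherwise.
    public static float MaskValue(float depthMeters, float y,
                                  float nearMeters, float farMeters,
                                  float floorY, float floorBand)
    {
        // Step 1: range-based foreground test. NaN or non-positive depth fails
        // every comparison and is rejected.
        bool inRange = depthMeters > 0f
                       && depthMeters >= nearMeters
                       && depthMeters <= farMeters;

        // Step 2: floor test, Y within the band around the calibrated floor height.
        bool isFloor = Math.Abs(y - floorY) <= floorBand;

        // Step 3: subtract the floor from the foreground.
        return (inRange && !isFloor) ? 1f : 0f;
    }
}
```

Written this way, the trade-off is visible: any shoe pixel whose Y falls inside `floorBand` is classified as floor, so widening the band to absorb depth noise also removes more of the foot.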

Outcome: with a narrow Y band I lose the ankles; if I widen the band, the floor starts leaking (and floor objects come in too).

Could you suggest a more reliable method than “Y close to Y₀” to keep shoes/feet while removing the floor? Is using MEASURE.XYZ as a starting point the right approach?

My goal is a stable feet-vs-floor split in a game area, keeping shoes while removing the ground.

Thanks a lot for any best practices or parameters you can share!

Best Regards, Jeff

BenjaminV · September 16, 2025, 8:33am

I guess the most important thing is to make sure the camera’s estimated pose is correct. Otherwise, the plane you use to remove the floor will not be aligned with the actual floor, which can also cause issues.
It’s also possible that the depth values on your shoes are not correct. Could you display the point cloud to validate the depth at your feet?
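
To illustrate why the alignment of the floor plane matters, here is a minimal sketch that tests a point against a plane instead of a fixed Y value. It assumes the point is already in world space; the class `FloorPlaneTest`, the method `IsFloor` and the `band` parameter are hypothetical names.

```csharp
using UnityEngine;

public static class FloorPlaneTest
{
    // True when the point lies within `band` meters of the floor plane.
    public static bool IsFloor(Plane floor, Vector3 worldPoint, float band)
    {
        // GetDistanceToPoint returns a signed distance: positive on the side
        // the plane normal points to, negative on the other side.
        float signedDistance = floor.GetDistanceToPoint(worldPoint);
        return Mathf.Abs(signedDistance) <= band;
    }
}
```

A level floor at height `floorWorldY` would be `new Plane(Vector3.up, new Vector3(0f, floorWorldY, 0f))`. If the estimated pose is tilted, the normal is effectively no longer the true floor normal, and the error in `signedDistance` grows with distance from the camera, which can look like floor leaking in some regions and ankles being cut in others.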