Combining ZED 2 and Segment Anything

Hey guys, can anybody please help me with capturing images from a ZED 2i and then sending them to Segment Anything so that I can detect objects? This is for my university project, and any help would be highly appreciated. I am currently using Python for coding and a ZED Box as my power unit.

Hi @Tilak1709,

I imagine you’re referring to Segment Anything (GitHub) from Meta AI?
As first steps, you can try following the doc about Custom Object Detection along with our Python samples.

Don’t hesitate to ask if you have more specific questions.


Hey @JPlou

Thank you for your quick response. I used the object detection sample to draft some code, but the problem I am facing is that the ZED camera starts and I cannot see any image.

import pyzed.sl as sl
import cv2
from segment_anything import SamPredictor, sam_model_registry

def main():
    # Create a Camera object
    zed = sl.Camera()

    # Create an InitParameters object and set configuration parameters
    init_params = sl.InitParameters()
    init_params.camera_resolution = sl.RESOLUTION.HD1080  # Use HD1080 video mode
    init_params.coordinate_units = sl.UNIT.METER
    init_params.camera_fps = 30                          # Set fps at 30
    init_params.coordinate_system = sl.COORDINATE_SYSTEM.RIGHT_HANDED_Y_UP

    # Open the camera
    err = zed.open(init_params)
    if err != sl.ERROR_CODE.SUCCESS:
        print(f"Failed to open the camera: {err}")
        return

    # Enable the key callback function
    cv2.namedWindow("ZED Camera")
    cv2.setMouseCallback("ZED Camera", key_callback)

    # Load the Segment Anything model
    model_type = "vit_h"  # Replace with your desired model type
    checkpoint_path = "/usr/local/zed/samples/Tilak/sam_vit_h_4b8939.pth"  # Replace with your checkpoint path
    sam = sam_model_registry[model_type](checkpoint=checkpoint_path)
    predictor = SamPredictor(sam)

    # Capture a frame when 'i' key is pressed
    image = sl.Mat()
    runtime_parameters = sl.RuntimeParameters()
    while True:
        if zed.grab(runtime_parameters) == sl.ERROR_CODE.SUCCESS:
            zed.retrieve_image(image, sl.VIEW.LEFT)

            # Convert the ZED image to RGB format
            image_rgb = image.get_data()[:, :, ::-1]

            # Display the image
            cv2.imshow("ZED Camera", image_rgb)

        # Check for key press
        key = cv2.waitKey(1)
        if key == ord('i'):
            # Save the image to a file
            cv2.imwrite("captured_image.jpg", image_rgb)
            print("Image captured and saved successfully!")

            # Send the image to Segment Anything for segmentation
            masks, _, _ = predictor.predict(image_rgb)
            # Perform further processing with the generated masks

            print("Image segmented successfully!")

        # Check for 'Esc' key press to exit
        if key == 27:
            break

    # Close the camera and destroy the OpenCV windows
    zed.close()
    cv2.destroyAllWindows()

def key_callback(event, x, y, flags, param):
    # Empty key callback function
    pass

if __name__ == "__main__":
    main()

I followed the instructions in the Segment Anything repository (GitHub - facebookresearch/segment-anything). I downloaded the ViT-H SAM model from there and used it as my checkpoint.
If you could please let me know where I am going wrong, it would be really helpful.
Thanks in advance for your suggestions.


Do you have any errors, or just no image? Did you try waiting for a bit?

I was able to run this code fine after following the installation instructions in Segment Anything’s repo. (There are definitely still some things to fix: the Segment Anything part does not work as is, but that’s not the issue here.)
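For when you get to that part, here is a minimal sketch of how the SAM call could be arranged. It assumes the `SamPredictor` interface from the segment-anything repo (`set_image` first, then `predict` with a point prompt) and that the ZED left view arrives as a 4-channel BGRA array. The helper names `bgra_to_rgb` and `segment_at_point` are made up for illustration, and the predictor is passed in as an argument so the snippet itself only needs NumPy:

```python
import numpy as np


def bgra_to_rgb(frame_bgra):
    """Drop the alpha channel of a BGRA frame and reorder it to RGB."""
    # Assumption: the frame is an H x W x 4 BGRA array (as from sl.Mat.get_data()).
    # The slice 2::-1 picks channels 2, 1, 0 (R, G, B) and discards alpha.
    return np.ascontiguousarray(frame_bgra[:, :, 2::-1])


def segment_at_point(predictor, frame_bgra, x, y):
    """Return the highest-scoring mask for one foreground click at pixel (x, y)."""
    # SAM needs the image registered before any prompt is given.
    predictor.set_image(bgra_to_rgb(frame_bgra))
    masks, scores, _ = predictor.predict(
        point_coords=np.array([[x, y]], dtype=np.float32),
        point_labels=np.array([1]),  # 1 marks a foreground point
        multimask_output=True,       # ask for several candidate masks
    )
    # Keep the candidate the model scored highest.
    return masks[int(np.argmax(scores))]
```

With the real model, the `predictor` built in your code would be passed in, and `(x, y)` could come from the mouse callback you already register. Note that `cv2.imshow` and `cv2.imwrite` expect BGR order, so the RGB copy is only meant for SAM.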

I do get the image from the ZED. It can take 5-10 seconds to open (and I have a pretty beefy machine), so maybe that’s the issue?
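If it is the startup delay, one way to tell "still starting" from "never works" is to poll for the first frame with a timeout instead of watching an empty window. A rough sketch; `wait_for_first_frame` and its parameters are hypothetical and not part of the ZED SDK:

```python
import time


def wait_for_first_frame(grab_once, timeout_s=15.0, poll_s=0.1):
    """Call grab_once() until it returns True; raise if timeout_s elapses first.

    Returns the number of attempts that were needed.
    """
    deadline = time.monotonic() + timeout_s
    attempts = 0
    while time.monotonic() < deadline:
        attempts += 1
        if grab_once():
            return attempts
        time.sleep(poll_s)  # give the camera a moment before retrying
    raise TimeoutError(f"no frame after {timeout_s} s ({attempts} attempts)")
```

In your script this could be called once before the display loop, for example with `lambda: zed.grab(runtime_parameters) == sl.ERROR_CODE.SUCCESS` as `grab_once`. A `TimeoutError` would then point at the camera rather than at the display code.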