Tutorial - Using 3D Object Detection

This tutorial shows how to use your ZED 3D camera to detect, classify, and locate persons in space (compatible with ZED 2 only). Detection and localization work with both static and moving cameras.

Getting Started

  • First, download the latest version of the ZED SDK.
  • Download the C++ or Python sample code.

Code Overview

Open the camera

In this tutorial, we will use the Object Detection AI module of the ZED SDK. As in previous tutorials, we create, configure and open the camera.

// Create ZED objects
Camera zed;
InitParameters initParameters;
initParameters.camera_resolution = RESOLUTION::HD720;
initParameters.depth_mode = DEPTH_MODE::ULTRA;
initParameters.sdk_verbose = true;

// Open the camera
ERROR_CODE zed_error = zed.open(initParameters);
if (zed_error != ERROR_CODE::SUCCESS) {
	std::cout << "Error " << zed_error << ", exit program.\n";
	return 1; // Quit if an error occurred
}

Enable 3D Object detection

Before enabling object detection, we specify the ObjectDetectionParameters of the module. In this tutorial, we use the following settings:

// Define the Objects Detection module parameters
ObjectDetectionParameters detection_parameters;
detection_parameters.image_sync = true;
detection_parameters.enable_tracking = true;
detection_parameters.enable_mask_output = true;

// Object tracking requires camera tracking to be enabled
if (detection_parameters.enable_tracking)
	zed.enablePositionalTracking();

  • image_sync determines whether object detection runs synchronously for each grabbed frame or asynchronously in a separate thread.
  • enable_tracking allows objects to be tracked across frames, keeping the same ID for as long as possible. Positional tracking must be active in order to track object movement independently of camera motion.
  • enable_mask_output outputs 2D masks over detected objects. Since mask generation requires additional processing, disable this option if you do not use it.

Now let’s enable object detection, which loads an AI model. This operation can take a few seconds. The first time the module is used, the model is optimized for your hardware, which can take up to a few minutes. Model optimization is performed only once.

cout << "Object Detection: Loading Module..." << endl;
ERROR_CODE err = zed.enableObjectDetection(detection_parameters);
if (err != ERROR_CODE::SUCCESS) {
	cout << "Error " << err << ", exit program.\n";
	zed.close();
	return 1;
}

Retrieve object data

To retrieve detected objects in an image, use the retrieveObjects() function with an Objects parameter that will store the object data.

Since image_sync is enabled, each grab call feeds the image into the AI module, which outputs the detected objects for that frame. We also set the detection confidence threshold to 40, discarding any detection with a confidence score below that value.

// Set runtime parameter confidence to 40
ObjectDetectionRuntimeParameters detection_parameters_runtime;
detection_parameters_runtime.detection_confidence_threshold = 40;

Objects objects;

// Grab new frames and detect objects
while (zed.grab() == ERROR_CODE::SUCCESS) {
	err = zed.retrieveObjects(objects, detection_parameters_runtime);

	if (err == ERROR_CODE::SUCCESS && objects.is_new && !objects.object_list.empty()) {
        // Count the number of objects detected
        cout << objects.object_list.size() << " Object(s) detected\n";

        // Display the 3D location of the first detected object
        ObjectData first_object = objects.object_list.front();
        cout << " 3D position: " << first_object.position;

        // Display its 3D bounding box coordinates
        cout << " Bounding box 3D \n";
        for (auto it : first_object.bounding_box)
            cout << "    " << it;
	}
}

Disable modules and exit

Before exiting the application, modules need to be disabled and the camera closed. Note that zed.close() also properly disables all active modules. The close() function is called automatically by the destructor if necessary.

// Disable object detection and close the camera
zed.disableObjectDetection();
zed.close();
return 0;

And that’s it!

Next Steps

At this point, you know how to retrieve image, depth and 3D objects data from ZED stereo cameras.

To detect objects in the scene and display their 3D bounding boxes over the live point cloud, check the 3D Object Detection advanced sample code.