Object Detection

In this tutorial, you will place virtual boxes around real-world people detected by your ZED 2.

Note: The object detection module requires a ZED 2 camera; the original ZED and ZED Mini cameras do not support this feature.

What is 3D Object Detection?

The ZED SDK Object Detection module uses a highly optimized AI model to recognize specific objects (currently people and vehicles) within the video feed. Using depth, it goes a step further than similar algorithms and calculates each object’s 3D position in the world, not just its position within the 2D image.
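
To give a feel for how depth turns a 2D detection into a 3D position, here is a minimal Python sketch of the generic pinhole camera back-projection. It is an illustration only, not the ZED SDK's internal method: the function name and the intrinsics (fx, fy, cx, cy) are hypothetical values chosen for the example.

```python
from typing import Tuple


def pixel_to_camera_point(
    u: float, v: float, depth_m: float,
    fx: float, fy: float, cx: float, cy: float,
) -> Tuple[float, float, float]:
    """Back-project pixel (u, v) with a known depth into camera space.

    Uses the standard pinhole model. fx/fy are focal lengths in pixels and
    cx/cy is the principal point; all are assumed values for illustration.
    The result uses image conventions (y points down), so an engine such as
    Unity, where y points up, would need the y component flipped.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


# A pixel 500 px to the right of the image center, 2 m away,
# with an assumed focal length of 1000 px.
point = pixel_to_camera_point(1460, 540, 2.0, 1000.0, 1000.0, 960.0, 540.0)
print(point)
```

The farther a pixel is from the image center, and the greater its depth, the farther the resulting point lies from the camera's optical axis.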

In Unity, this lets us represent real objects/people in a 3D scene with virtual GameObjects. This tutorial will simply add boxes to encompass each person, but this feature could be extended to many applications, such as counting and visualizing occupancy in a zone or having virtual characters react to nearby people.
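
As an illustration of the occupancy idea mentioned above, the following Python sketch counts how many detected positions fall inside a rectangular floor zone. The Zone class, its field names, and the sample positions are all hypothetical; in a real project the positions would come from the detection results, and the logic would live in a Unity script.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

Position = Tuple[float, float, float]  # (x, y, z) in meters, y is height


@dataclass(frozen=True)
class Zone:
    """Axis-aligned rectangle on the floor plane (hypothetical helper)."""
    min_x: float
    max_x: float
    min_z: float
    max_z: float

    def contains(self, position: Position) -> bool:
        x, _, z = position  # height is ignored: only the floor footprint matters
        return self.min_x <= x <= self.max_x and self.min_z <= z <= self.max_z


def count_occupants(zone: Zone, positions: Iterable[Position]) -> int:
    """Return how many positions lie inside the zone."""
    return sum(1 for p in positions if zone.contains(p))


zone = Zone(min_x=-1.0, max_x=1.0, min_z=0.0, max_z=3.0)
people = [(0.0, 0.0, 1.0), (2.0, 0.0, 1.0), (0.5, 1.7, 2.9)]
print(count_occupants(zone, people))
```

Ignoring height keeps the check robust when a person crouches or raises their arms, since only their footprint on the floor decides whether they are in the zone.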

Preparing your ZED 2

The Object Detection module uses the floor to help make sense of object positions, which significantly improves the accuracy of estimated 3D positions. The ZED will automatically detect the floor on the first frame after it is initialized, provided it is pointed so that a decent amount of the floor is visible. The more floor that is visible when the camera starts, the better.

Setting Up the Scene

Basic Settings

  • Create a new scene and delete the Main Camera

  • In the Project window, go to ZED -> Prefabs and drag ZED_Rig_Mono into the Hierarchy

  • Select the new ZED_Rig_Mono in the Hierarchy.

  • In the Motion Tracking section, make sure Estimate Initial Position is checked. This enables floor detection.

  • In the Inspector, set the Resolution to 1080p. This is not required but increases object detection accuracy.

  • Set Depth Mode to ULTRA. This is also not required.

  • If your camera is fixed and will not move, enable Tracking Is Static to prevent incorrect drift throughout the scene.

Object Detection Settings

Scroll down to the Object Detection section.

You don’t need to change the default settings, but let’s take a moment to review them for future reference.

  • Object Detection Model: The AI model used for detection. You can choose between MULTI_CLASS_BOX and MULTI_CLASS_BOX_ACCURATE. MULTI_CLASS_BOX_ACCURATE produces better detections but runs slower than the base version.
  • Enable Object Tracking: If enabled, the ZED SDK will track objects between frames, providing more accurate data and giving access to more information, such as velocity.
  • Enable 2D Mask: Enabling this allows scripts to access a 2D image that shows exactly which pixels in an object’s 2D bounding box belong to the object. Since we’re using the 3D bounding boxes only in this tutorial, leave this unchecked to save performance.
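
To see why tracking is what makes velocity available, consider the sketch below. Velocity needs the same object's position in two different frames, which is only possible if detections are matched between frames. This Python example is a plain finite-difference illustration with hypothetical names; the ZED SDK reports velocity itself and may compute it differently.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def estimate_velocity(prev_pos: Vec3, curr_pos: Vec3, dt_s: float) -> Vec3:
    """Finite-difference velocity (m/s) of one tracked object between frames."""
    if dt_s <= 0:
        raise ValueError("dt_s must be positive")
    return (
        (curr_pos[0] - prev_pos[0]) / dt_s,
        (curr_pos[1] - prev_pos[1]) / dt_s,
        (curr_pos[2] - prev_pos[2]) / dt_s,
    )


def speed(velocity: Vec3) -> float:
    """Magnitude of a velocity vector in m/s."""
    return math.sqrt(sum(c * c for c in velocity))


# The same person seen 0.5 s apart, having moved 0.5 m along x.
v = estimate_velocity((0.0, 0.0, 2.0), (0.5, 0.0, 2.0), 0.5)
print(v, speed(v))
```

Without tracking, there is no way to know that the detection in the second frame is the same person as in the first, so no such difference can be taken.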

The ZED SDK can detect multiple classes of objects. The following settings are available for each class:

  • Confidence Threshold: Sets the minimum confidence value for a detected object to be published. For example, if set to 40, the ZED SDK needs to be at least 40% confident that a detected object exists. Available for each class.
  • Class Filter: If enabled, the ZED SDK will detect this class. Available for each class.
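
The combined effect of the per-class threshold and class filter can be sketched as follows. This is an illustrative Python example, not the SDK's code: the Detection type, the class labels, and the convention that a class missing from the dictionary is disabled are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Detection:
    label: str         # object class, e.g. "PERSON" (hypothetical labels)
    confidence: float  # 0 to 100


def filter_detections(
    detections: List[Detection], thresholds: Dict[str, float]
) -> List[Detection]:
    """Keep detections whose class is enabled and whose confidence is high enough.

    `thresholds` maps each enabled class to its minimum confidence. A class
    that is absent from the mapping is treated as filtered out.
    """
    return [
        d for d in detections
        if d.label in thresholds and d.confidence >= thresholds[d.label]
    ]


detections = [
    Detection("PERSON", 85.0),
    Detection("PERSON", 30.0),   # below the PERSON threshold
    Detection("VEHICLE", 90.0),  # VEHICLE class is not enabled
]
kept = filter_detections(detections, {"PERSON": 40.0})
print([d.confidence for d in kept])
```

Raising a threshold reduces false positives at the cost of missing less certain detections, so the right value depends on the scene.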

Last is the “Start Object Detection” button. Normally, the Object Detection module doesn’t start when the ZED does because loading it causes a long delay. This button is one of two ways to start the module. The other is via script, which is what we’ll do here.

Adding Visuals

  • Create a new empty GameObject in the Hierarchy and rename it “3D Object Visualizer”

  • Add the ZED3DObjectVisualizer component to it

  • In the Project window, go to ZED -> Examples -> Object Detection -> Prefabs

  • Drag the 3D Bounding Box prefab into Bounding Box Prefab in the Inspector

Note that “Start Object Detection Automatically” is checked by default. This will call the function in ZEDManager to initialize the Object Detection module as soon as the ZED 2 itself is ready.

Run the Scene

Double-check that your ZED can see the floor, then run the scene. After a short initialization period, the app will pause for 10-20 seconds as the object detection module loads.

Once the module has loaded, step into the ZED’s view. You should see yourself enclosed in a box. Walk around, crouch, and wave your arms, and the box will transform itself accordingly. Bring some friends into view to see it track multiple people at once.