Using the Body Tracking API

Since v4.0, the AI detection module has been split into two separate modules: body tracking and object detection, each with its own data structures, methods, and parameters. Previously, the body tracking feature was integrated directly into the object detection module.

Body Tracking Configuration #

To configure the body tracking module, use BodyTrackingParameters at initialization and BodyTrackingRuntimeParameters to change specific parameters during use.

Note: The initial configuration is set only once, when enabling the module, while the runtime configuration can be changed at any time during use.

  • BodyTrackingParameters::detection_model selects the model used for human body detection. This preset configures the trade-off between runtime performance and accuracy of the human body detector:

    • BODY_TRACKING_MODEL::HUMAN_BODY_FAST: real-time performance, even on NVIDIA® Jetson or low-end GPU cards

    • BODY_TRACKING_MODEL::HUMAN_BODY_MEDIUM: a compromise between accuracy and speed

    • BODY_TRACKING_MODEL::HUMAN_BODY_ACCURATE: state-of-the-art accuracy, requires a powerful GPU

  • BodyTrackingParameters::enable_body_fitting: enables the fitting process for each detected person. This must be enabled to retrieve the local rotation of each keypoint; otherwise, the data will be empty.

  • BodyTrackingParameters::body_format is the body format output by the ZED SDK. The currently supported body formats are:

    • BODY_FORMAT::BODY_18: 18-keypoint body model. It follows the COCO18 format and is not directly compatible with third-party software such as Unreal or Unity. For this reason, local keypoint rotations and translations are not available with this format.

    • BODY_FORMAT::BODY_34: 34-keypoint body model. This model is compatible with third-party software, and all data available for BODY_18 can also be extracted with this format. The enable_body_fitting option must be enabled to use this format.

    • BODY_FORMAT::BODY_38: 38-keypoint body model, including simplified hand and foot keypoints.

The code below shows how to set these attributes (in C++, Python, and C#).

// Set initialization parameters
BodyTrackingParameters detection_parameters;
detection_parameters.detection_model = BODY_TRACKING_MODEL::HUMAN_BODY_ACCURATE; //specific to human skeleton detection
detection_parameters.enable_tracking = true; // Bodies will keep the same ID between frames
detection_parameters.enable_body_fitting = true; // Fitting process is called; the user has access to all available data for each person processed by the SDK
detection_parameters.body_format = BODY_FORMAT::BODY_34; // selects the 34 keypoints body model for SDK outputs

// Set runtime parameters
BodyTrackingRuntimeParameters detection_parameters_rt;
detection_parameters_rt.detection_confidence_threshold = 40;
# Set initialization parameters
detection_parameters = sl.BodyTrackingParameters()
detection_parameters.detection_model = sl.BODY_TRACKING_MODEL.HUMAN_BODY_ACCURATE  
detection_parameters.enable_tracking = True
detection_parameters.enable_body_fitting = True
detection_parameters.body_format = sl.BODY_FORMAT.BODY_34

# Set runtime parameters
detection_parameters_rt = sl.BodyTrackingRuntimeParameters()
detection_parameters_rt.detection_confidence_threshold = 40
// Set initialization parameters
BodyTrackingParameters detection_parameters = new BodyTrackingParameters();
detection_parameters.enableObjectTracking = true; // Objects will keep the same ID between frames
detection_parameters.detectionModel = sl.BODY_TRACKING_MODEL.HUMAN_BODY_ACCURATE;
detection_parameters.enableBodyFitting = true;
detection_parameters.bodyFormat = sl.BODY_FORMAT.BODY_34;

// Set runtime parameters
BodyTrackingRuntimeParameters detection_parameters_rt = new BodyTrackingRuntimeParameters();
detection_parameters_rt.detectionConfidenceThreshold = 40;

If you want to track people's motion within their environment, set detection_parameters.enable_tracking to true. In that case, you first need to activate the positional tracking module:

if (detection_parameters.enable_tracking) {
    // Set positional tracking parameters
    PositionalTrackingParameters positional_tracking_parameters;
    // Enable positional tracking
    zed.enablePositionalTracking(positional_tracking_parameters);
}
if detection_parameters.enable_tracking:
    # Set positional tracking parameters
    positional_tracking_parameters = sl.PositionalTrackingParameters()
    # Enable positional tracking
    zed.enable_positional_tracking(positional_tracking_parameters)
if (detection_parameters.enableObjectTracking) {
    // Set positional tracking parameters
    PositionalTrackingParameters trackingParams = new PositionalTrackingParameters();
    // Enable positional tracking
    zed.EnablePositionalTracking(ref trackingParams);
}

With these parameters configured, you can enable the Body Tracking module:

// Enable body tracking with initialization parameters
zed_error = zed.enableBodyTracking(detection_parameters);
if (zed_error != ERROR_CODE::SUCCESS) {
    cout << "enableBodyTracking: " << zed_error << "\nExit program.";
    zed.close();
    exit(-1);
}
# Enable body tracking with initialization parameters
zed_error = zed.enable_body_tracking(detection_parameters)
if zed_error != sl.ERROR_CODE.SUCCESS:
    print("enable_body_tracking", zed_error, "\nExit program.")
    zed.close()
    exit(-1)
// Enable body tracking with initialization parameters
zed_error = zed.EnableBodyTracking(ref detection_parameters);
if (zed_error != ERROR_CODE.SUCCESS) {
    Console.WriteLine("EnableBodyTracking: " + zed_error + "\nExit program.");
    zed.Close();
    Environment.Exit(-1);
}
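
When body tracking is no longer needed, the module can be disabled to free its resources before closing the camera. A minimal C++ sketch of the shutdown sequence:

// Disable the body tracking module once processing is finished
zed.disableBodyTracking();
// Then close the camera
zed.close();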

Note: The Body Tracking module can be used with all our ZED cameras, except for the ZED 1 camera.
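
If your application can run on several camera models, you can check the connected model before enabling the module. A minimal C++ sketch, assuming the camera is already opened:

// Body tracking is not available on the ZED 1
if (zed.getCameraInformation().camera_model == sl::MODEL::ZED) {
    cout << "Body tracking is not supported on this camera model.\n";
}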

Getting Human Body Data #

To get the detected persons in a scene, grab a new image with grab(...) and extract the detected bodies with retrieveBodies(). This process is exactly the same as retrieving new objects with the Object Detection module.

sl::Bodies bodies; // Structure containing all the detected bodies
// grab runtime parameters
RuntimeParameters runtime_parameters;
runtime_parameters.measure3D_reference_frame = sl::REFERENCE_FRAME::WORLD;

if (zed.grab(runtime_parameters) == ERROR_CODE::SUCCESS) {
  zed.retrieveBodies(bodies, detection_parameters_rt); // Retrieve the detected bodies
}
bodies = sl.Bodies() # Structure containing all the detected bodies
# grab runtime parameters
runtime_params = sl.RuntimeParameters()
runtime_params.measure3D_reference_frame = sl.REFERENCE_FRAME.WORLD

if zed.grab(runtime_params) == sl.ERROR_CODE.SUCCESS:
  zed.retrieve_bodies(bodies, detection_parameters_rt) # Retrieve the detected bodies
sl.Bodies bodies = new sl.Bodies(); // Structure containing all the detected bodies
// grab runtime parameters
RuntimeParameters runtimeParameters = new RuntimeParameters();
runtimeParameters.measure3DReferenceFrame = sl.REFERENCE_FRAME.WORLD;

if (zed.Grab(ref runtimeParameters) == ERROR_CODE.SUCCESS) {
  zed.RetrieveBodies(ref bodies, ref detection_parameters_rt); // Retrieve the detected bodies
}
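
Since the runtime parameters are passed at each call, they can be adjusted between frames without re-initializing the module. A minimal C++ sketch, reusing the variables from the C++ snippet above, that raises the confidence threshold on the fly:

// Runtime parameters can be changed between grabs,
// e.g. to filter out low-confidence detections more aggressively
detection_parameters_rt.detection_confidence_threshold = 60;
if (zed.grab(runtime_parameters) == ERROR_CODE::SUCCESS) {
  zed.retrieveBodies(bodies, detection_parameters_rt);
}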

The sl::Bodies class stores all data regarding the different persons present in the scene in its vector<sl::BodyData> body_list attribute. Each person's data is stored as a sl::BodyData. sl::Bodies also contains the timestamp of the detection, which can help connect the bodies to the corresponding images.
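
For example, you can iterate over body_list to process each detected person; a minimal C++ sketch:

// Each element of body_list describes one detected person
for (auto& body : bodies.body_list) {
    // process each sl::BodyData here
}
// Timestamp of the detection, useful to match the bodies with an image
sl::Timestamp detection_timestamp = bodies.timestamp;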

All 2D data are related to the left image, while the 3D data are expressed in either the CAMERA or WORLD reference frame, depending on the RuntimeParameters.measure3D_reference_frame value given to the grab() function. The 2D data are expressed in the initial camera resolution (RESOLUTION); a scaling can be applied if the values are needed at another resolution.
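
For instance, to draw the 2D keypoints on an image displayed at a different resolution, multiply them by the ratio between the display resolution and the camera resolution. A minimal C++ sketch, where the 1280×720 display_resolution is a hypothetical target:

// Scale 2D keypoints from the camera resolution to a hypothetical display resolution
sl::Resolution camera_resolution = zed.getCameraInformation().camera_configuration.resolution;
sl::Resolution display_resolution(1280, 720);
float scale_x = (float)display_resolution.width / (float)camera_resolution.width;
float scale_y = (float)display_resolution.height / (float)camera_resolution.height;

for (auto& kp_2d : body.keypoint_2d) {
    sl::float2 scaled_kp(kp_2d.x * scale_x, kp_2d.y * scale_y);
    // use scaled_kp for rendering
}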

For 3D data, the coordinate frame and units can be set by the user with COORDINATE_SYSTEM and UNIT respectively. These settings are part of the InitParameters used when opening the ZED camera with the open() function.
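
For example, to get 3D data in meters in a right-handed, Y-up coordinate system (a common convention for 3D engines), set these fields before opening the camera:

// Choose the 3D frame and units before opening the camera
InitParameters init_parameters;
init_parameters.coordinate_units = UNIT::METER;
init_parameters.coordinate_system = COORDINATE_SYSTEM::RIGHT_HANDED_Y_UP;
zed.open(init_parameters);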

Accessing 2D and 3D body keypoints #

Once a sl::BodyData is retrieved from the body_list vector, you can access information such as its ID, position, velocity, label, and tracking_state, as well as its keypoint positions and rotations.
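
A minimal C++ sketch reading some of these fields, given a body taken from bodies.body_list as in the sketch above:

// General information about a detected person
int person_id = body.id;             // unique ID, kept between frames when tracking is enabled
sl::float3 position = body.position; // 3D position
sl::float3 velocity = body.velocity; // 3D velocity
auto state = body.tracking_state;    // e.g. OBJECT_TRACKING_STATE::OK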

The 2D and 3D keypoints of a detected person are accessible through keypoint_2d, a vector of 2D pixel positions, and keypoint, a vector of 3D positions.

// collect all 2D keypoints
for (auto& kp_2d : body.keypoint_2d) {
  // user code using each kp_2d point
}

// collect all 3D keypoints
for (auto& kp_3d : body.keypoint) {
  // user code using each kp_3d point
}
# collect all 2D keypoints
for kp_2d in body.keypoint_2d:
    pass  # user code using each kp_2d point

# collect all 3D keypoints
for kp_3d in body.keypoint:
    pass  # user code using each kp_3d point
// collect all 2D keypoints
foreach (var kp_2d in body.keypoints2D)
{
  // user code using each kp_2d point
}

// collect all 3D keypoints
foreach (var kp_3d in body.keypoints)
{
  // user code using each kp_3d point
}

See the keypoint index and name correspondence as well as the output skeleton format here.
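
To avoid hardcoding indices, you can use the keypoint enums shipped with the SDK. A C++ sketch, assuming the BODY_34_PARTS enum and its getIdx() helper from the SDK headers:

// Retrieve the 3D position of the left hand in the BODY_34 model
// (getIdx() converts a BODY_34_PARTS value into a keypoint index)
sl::float3 left_hand = body.keypoint[getIdx(BODY_34_PARTS::LEFT_HAND)];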

Getting more results #

When fitting is enabled at the initial configuration stage, more results become available depending on the chosen BODY_FORMAT. With the BODY_FORMAT::BODY_34 or BODY_FORMAT::BODY_38 formats, the local rotation and translation of each keypoint become available to the user.

// collect local rotation for each keypoint
for (auto &kp : body.local_orientation_per_joint)
{
   // kp is the local keypoint rotation represented by a quaternion
   // user code
}

// collect local translation for each keypoint
for (auto &kp : body.local_position_per_joint)
{
   // kp is the local keypoint translation
   // user code
}

// get global root orientation
auto global_root_orientation = body.global_root_orientation;

// note that the global root translation is available in body.keypoint[root_index], where root_index is the root index of the body model
# collect local rotation for each keypoint
for kp in body.local_orientation_per_joint:
    # kp is the local keypoint rotation represented by a quaternion
    pass  # user code

# collect local translation for each keypoint
for kp in body.local_position_per_joint:
    # kp is the local keypoint translation
    pass  # user code

# get global root orientation
global_root_orientation = body.global_root_orientation

# note that the global root translation is available in body.keypoint[root_index], where root_index is the root index of the body model
// collect local rotation for each keypoint
foreach (var kp in body.localOrientationPerJoint)
{
   // kp is the local keypoint rotation represented by a quaternion
   // user code
}

// collect local translation for each keypoint
foreach (var kp in body.localPositionPerJoint)
{
   // kp is the local keypoint translation
   // user code
}

// get global root orientation
Quaternion globalRootOrientation = body.globalRootOrientation;

Note: All these data are expressed in the chosen COORDINATE_SYSTEM and UNIT.

Code Example #

For code examples, check out the Tutorial and Sample on GitHub.