Positional Tracking Overview

Positional tracking is the ability of a device to estimate its position relative to the world around it. Also called motion tracking, or match moving in the movie industry, it is used to track the movement of a camera or user in 3D space with six degrees of freedom (6DoF).

How It Works

The ZED uses visual tracking of its surroundings to understand the movement of the user or system holding it. As the camera moves in the real world, it reports its new position and orientation. This information is called the camera 6DoF pose. Pose information is output at the frame rate of the camera, up to 100 times per second in WVGA mode.
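
The loop below is a minimal sketch of this process, assuming the ZED SDK 3.x C++ API (sl::Camera, enablePositionalTracking, getPosition); exact names and parameters may differ between SDK versions.

    #include <sl/Camera.hpp>
    #include <cstdio>

    int main() {
        sl::Camera zed;
        sl::InitParameters init_params;
        init_params.camera_resolution = sl::RESOLUTION::VGA;   // WVGA, up to 100 FPS
        init_params.camera_fps = 100;
        init_params.coordinate_units = sl::UNIT::METER;
        if (zed.open(init_params) != sl::ERROR_CODE::SUCCESS) return 1;

        // Start positional tracking with default parameters.
        zed.enablePositionalTracking();

        sl::Pose pose;
        for (int i = 0; i < 1000; i++) {
            if (zed.grab() == sl::ERROR_CODE::SUCCESS) {
                // Pose of the left eye in the fixed World Frame, updated once per grabbed frame.
                zed.getPosition(pose, sl::REFERENCE_FRAME::WORLD);
                sl::Translation t = pose.getTranslation();
                printf("Position: [%.3f, %.3f, %.3f] m\n", t.tx, t.ty, t.tz);
            }
        }
        zed.disablePositionalTracking();
        zed.close();
        return 0;
    }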

Inertial Information Fusion

The ZED Mini was the first ZED camera equipped with an inertial sensor capable of providing reliable, high-frequency measurements of camera acceleration and angular velocity. When an IMU (Inertial Measurement Unit) is available, the inertial information is fused with the visual tracking information to provide an even more reliable estimate of camera movement.
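
The fusion itself is handled inside the SDK and requires no additional calls. For reference, the raw inertial measurements can also be read alongside the fused pose; the fragment below is a sketch meant to sit inside the grab loop shown earlier, assuming the SDK 3.x getSensorsData call and SensorsData::imu fields.

    // Raw IMU measurements synchronized with the most recent image.
    sl::SensorsData sensors_data;
    if (zed.getSensorsData(sensors_data, sl::TIME_REFERENCE::IMAGE) == sl::ERROR_CODE::SUCCESS) {
        sl::float3 acc = sensors_data.imu.linear_acceleration;  // m/s^2
        sl::float3 gyr = sensors_data.imu.angular_velocity;     // deg/s
        printf("IMU acceleration:     [%.3f, %.3f, %.3f] m/s^2\n", acc.x, acc.y, acc.z);
        printf("IMU angular velocity: [%.3f, %.3f, %.3f] deg/s\n", gyr.x, gyr.y, gyr.z);
    }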

Getting Position and Orientation

The ZED API returns pose information for each frame. The pose is given for the left eye of the camera. It is relative to a reference coordinate frame, usually the World Frame which is fixed in space.
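
For illustration, assuming the getPosition call from the sketch above, the reference frame is selected by its second argument: WORLD returns the pose relative to the fixed World Frame, while CAMERA returns the motion relative to the previous camera position.

    sl::Pose world_pose, delta_pose;
    // Left-eye pose relative to the fixed World Frame (the usual choice).
    zed.getPosition(world_pose, sl::REFERENCE_FRAME::WORLD);
    // Motion since the previous frame, expressed relative to the previous camera pose.
    zed.getPosition(delta_pose, sl::REFERENCE_FRAME::CAMERA);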

Pose

The following pose data is returned:

  • Position: the location of the camera in space is available as a vector [X, Y, Z]. Its norm is the distance between the current camera position and the origin of the reference coordinate frame.

  • Orientation: the orientation of the camera in space is available as a quaternion [X, Y, Z, W]. The quaternion can be converted to a rotation matrix or to Euler angles (yaw, pitch and roll) expressed in different coordinate frames.

The sl::Pose class also contains a timestamp, a confidence value, and a rotation matrix that describes the rotation of the camera with respect to the World Frame.
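
The fragment below sketches how these fields can be read from a pose retrieved in the loop above, assuming the SDK 3.x sl::Pose accessors (getTranslation, getOrientation, getRotationMatrix) and members (timestamp, pose_confidence).

    sl::Translation t = pose.getTranslation();           // position vector [X, Y, Z]
    sl::Orientation q = pose.getOrientation();           // quaternion [X, Y, Z, W]
    sl::Rotation    R = pose.getRotationMatrix();        // 3x3 rotation w.r.t. the World Frame
    uint64_t ts_ns   = pose.timestamp.getNanoseconds();  // image timestamp of this pose
    int confidence   = pose.pose_confidence;             // tracking confidence value (0-100)
    printf("q = [%.3f, %.3f, %.3f, %.3f], confidence: %d, timestamp: %llu ns\n",
           q.ox, q.oy, q.oz, q.ow, confidence, (unsigned long long)ts_ns);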

To learn how to get Position and Orientation, see the Using the API section.