Positional Tracking Overview

Positional tracking is the ability of a device to estimate its position and orientation in the world around it — in other words, understanding where it is and how it moves in 3D space. It enables cameras to be tracked in real time with six degrees of freedom (6DoF): translation (X, Y, Z) and rotation (roll, pitch, yaw).

Powered by the ZED SDK's advanced stereo SLAM (Simultaneous Localization and Mapping) algorithms, the ZED camera family brings precise and reliable 6DoF positional tracking to a wide range of applications — from mobile robotics and autonomous navigation to augmented reality, VR, and cinematic visual effects.

Learn how to install and use the Positional Tracking sample in the quickstart section of the module documentation.

How It Works #

The ZED cameras use visual tracking to continuously estimate the motion of the camera within its environment. By analyzing and matching visual features across consecutive image frames, the SDK extracts the camera's 6DoF pose relative to a chosen reference frame.

Every ZED camera is also equipped with an inertial sensor that provides reliable, high-frequency measurements of the camera's accelerations and angular velocities while in motion. This inertial information can be enabled and fused directly with the visual tracking data to provide an even more reliable estimate of camera movement.
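
For illustration, here is a minimal Python sketch that opens a camera and enables positional tracking with the `enable_imu_fusion` parameter set explicitly. On IMU-equipped models this fusion is typically on by default, so treat the flag as an assumption to verify against your SDK version.

```python
import pyzed.sl as sl

# Open the camera with metric units and a right-handed, Y-up world frame.
init_params = sl.InitParameters()
init_params.coordinate_units = sl.UNIT.METER
init_params.coordinate_system = sl.COORDINATE_SYSTEM.RIGHT_HANDED_Y_UP

zed = sl.Camera()
if zed.open(init_params) != sl.ERROR_CODE.SUCCESS:
    exit(1)

# Enable positional tracking; visual tracking is fused with IMU data when available.
tracking_params = sl.PositionalTrackingParameters()
tracking_params.enable_imu_fusion = True  # assumption: verify the default for your camera model
zed.enable_positional_tracking(tracking_params)
```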

Learn more about the different positional tracking modes and settings in the dedicated pages of the module documentation.

Visual-Inertial SLAM (VSLAM) #

The Positional Tracking module is powered by an advanced stereo Visual-Inertial SLAM pipeline that seamlessly integrates:

  • Stereo Visual Odometry: Estimates motion by tracking visual features in 3D across frames.

  • Inertial Measurement Unit (IMU) data: Provides high-frequency accelerations and angular velocities for robust motion estimation during rapid movement or in low-texture areas.

  • SLAM: Builds and maintains a sparse map of 3D landmarks to reduce drift and enable loop closure, ensuring a scalable localization system capable of operating indoors, outdoors, or in dynamic environments — even when GPS is unavailable. Loop closure is a technique used in SLAM systems to correct accumulated drift when the camera returns to a previously visited location.

To learn more about VSLAM usage and its associated modes (Mapping, Lifelong Mapping, and Relocalization), please consult this dedicated page.
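
As a hedged sketch of how these modes are exposed through the Python API, the snippet below turns on spatial memory and points the tracker at a previously saved area file. `enable_area_memory`, `area_file_path`, and `save_area_map()` are the names used in recent SDK releases, and `office.area` is a hypothetical file name; check the dedicated page for the exact options of your version.

```python
import pyzed.sl as sl

zed = sl.Camera()
if zed.open(sl.InitParameters()) != sl.ERROR_CODE.SUCCESS:
    exit(1)

tracking_params = sl.PositionalTrackingParameters()
tracking_params.enable_area_memory = True       # build a sparse landmark map to reduce drift and close loops
tracking_params.area_file_path = "office.area"  # hypothetical: relocalize against a map saved in a previous session
zed.enable_positional_tracking(tracking_params)

# ... grab frames while moving the camera ...

# Persist the learned map so a later session can relocalize in the same space.
zed.save_area_map("office.area")
zed.disable_positional_tracking()
zed.close()
```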

Pose Information #

The ZED SDK outputs the camera's pose in real time. The pose is given for the left eye of the stereo camera and is expressed relative to a reference coordinate frame. Each pose includes:

  • Position: [X, Y, Z] — the 3D location of the left camera in the reference coordinate frame. The vector’s norm is the straight-line distance between the current camera position and the origin of the reference coordinate frame.

  • Orientation: [X, Y, Z, W] — a quaternion representing the camera’s rotation. This can be converted into Roll, Pitch, and Yaw or expressed as a rotation matrix.

  • Linear and angular velocity.

  • Additional metadata: timestamp, tracking confidence, and transformation matrices for convenient integration into robotic and simulation frameworks.

Pose data is available through the ZED SDK and its ecosystem plugins for Unity 3D, ROS 2, Unreal Engine, and other platforms.

To learn how to retrieve Position and Orientation, see the Using the API section.
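
As a quick illustration of the fields listed above, here is a minimal Python sketch that polls the pose every frame; it is a simplified sketch based on the Python API, not a substitute for the full example in that section.

```python
import pyzed.sl as sl

zed = sl.Camera()
init_params = sl.InitParameters()
init_params.coordinate_units = sl.UNIT.METER
if zed.open(init_params) != sl.ERROR_CODE.SUCCESS:
    exit(1)

zed.enable_positional_tracking(sl.PositionalTrackingParameters())

pose = sl.Pose()
runtime = sl.RuntimeParameters()
for _ in range(500):
    if zed.grab(runtime) == sl.ERROR_CODE.SUCCESS:
        # Pose of the left camera relative to the world (reference) frame.
        state = zed.get_position(pose, sl.REFERENCE_FRAME.WORLD)
        if state == sl.POSITIONAL_TRACKING_STATE.OK:
            tx, ty, tz = pose.get_translation(sl.Translation()).get()       # position [X, Y, Z]
            ox, oy, oz, ow = pose.get_orientation(sl.Orientation()).get()   # quaternion [X, Y, Z, W]
            roll, pitch, yaw = pose.get_euler_angles(radian=False)          # orientation converted to Euler angles
            print(f"t={pose.timestamp.get_milliseconds()} ms  "
                  f"pos=({tx:.2f}, {ty:.2f}, {tz:.2f})  "
                  f"rpy=({roll:.1f}, {pitch:.1f}, {yaw:.1f})  "
                  f"confidence={pose.pose_confidence}")

zed.disable_positional_tracking()
zed.close()
```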

Typical Use Cases #

Robotics Applications #

Modern robotics depends on accurate perception and localization. From mobile robots navigating complex environments to humanoids performing precise manipulation, positional tracking is a core component of autonomy. With the ZED’s stereo VSLAM, robots can:

  • Navigate autonomously indoors and outdoors.

  • Perform simultaneous localization and mapping (SLAM) for real-time map building.

  • Estimate velocity and trajectory for motion planning and control.

  • Fuse with external data (wheel odometry, IMU, LiDAR) for hybrid localization pipelines.¹

Various types of robots can benefit from this module:

  • Autonomous Mobile Robots (AMR): autonomous delivery, warehouse navigation, or last-mile logistics.

  • Drones and UAVs: GPS-denied flight and obstacle avoidance.

  • Humanoids and manipulators: dynamic tasks such as picking, carrying, or interacting with objects that require accurate 3D awareness.

The ZED SDK provides native ROS 2 integration, enabling seamless deployment in robotic frameworks such as Nav2 and MoveIt.

All positional tracking features are fully accessible through the ZED ROS 2 wrapper packages.
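
As a sketch of how that output can be consumed from ROS 2, the rclpy node below subscribes to the wrapper's pose topic. The topic name /zed/zed_node/pose assumes the default camera name and namespace from the wrapper's launch files, so adjust it to match your configuration.

```python
import rclpy
from rclpy.node import Node
from geometry_msgs.msg import PoseStamped


class PoseListener(Node):
    """Minimal subscriber that logs the camera pose published by the ZED ROS 2 wrapper."""

    def __init__(self):
        super().__init__("zed_pose_listener")
        # Topic name assumes the wrapper's default namespace and node name.
        self.create_subscription(PoseStamped, "/zed/zed_node/pose", self.on_pose, 10)

    def on_pose(self, msg: PoseStamped):
        p = msg.pose.position
        self.get_logger().info(f"Camera position: x={p.x:.2f} y={p.y:.2f} z={p.z:.2f}")


def main():
    rclpy.init()
    rclpy.spin(PoseListener())
    rclpy.shutdown()


if __name__ == "__main__":
    main()
```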

¹ External data fusion is not provided by the ZED SDK. It requires custom implementation or external libraries.

Virtual and Augmented Reality #

In VR and AR, inside-out positional tracking allows users to move freely in physical space while maintaining perfect alignment between the virtual and real worlds. The ZED’s tracking enables:

  • Full 6DoF user motion estimation in mixed reality applications.
  • Real-time spatial understanding for occlusion and interaction.
  • Integration with Unity 3D or Unreal Engine for immersive MR experiences.

For an example of how to use tracking in Unity, see the sample Build Your First MR App.

Visual Effects and Match Moving #

In film and VFX production, positional tracking allows precise match moving — reproducing real-world camera motion in a 3D scene. With the ZED’s real-time 6DoF tracking, filmmakers can:

  • Capture camera movement directly on set.
  • Drive virtual cameras in real time for previsualization or compositing.
  • Seamlessly align CG elements with live-action footage for perfect perspective matching.

Learn more in the Green Screen Mixed Reality example.