Positional Tracking Modes
The StereoLabs Positional Tracking module uses visual-inertial SLAM (VSLAM) to deliver precise, real-time estimates of the camera’s 3D position and orientation. By fusing stereo vision with IMU data, it builds and refines a 3D map of the environment while simultaneously tracking motion within it. This approach ensures accurate, drift-minimized localization—even in dynamic or visually challenging conditions—making it suitable for demanding robotics, AR/VR, and autonomous navigation applications.
The ZED SDK includes multiple generations of positional tracking algorithms, GEN_1 and GEN_3 [1], each designed to meet different performance and accuracy requirements.
The positional tracking module supports two localization strategies:
- VIO (Visual-Inertial Odometry): performs pure localization without using a prior map.
- Relocalization: performs localization within an area map, allowing the system to recognize known environments and maintain consistent tracking across sessions.
These modes and localization strategies allow you to choose the best balance between computational load, precision, and robustness for your specific use case.
[1] The GEN_2 positional tracking generation has been deprecated and will be removed in a future release.
GEN_1 mode
GEN_1 implements a dense VSLAM approach that leverages depth data from the stereo camera to estimate motion. It generates a dense representation directly from depth information rather than sparse keypoints, enabling more stable and drift-resistant tracking in low-texture or feature-poor areas.
The fusion of stereo vision and IMU data improves robustness against motion blur and rapid movements. Originally designed for AR/VR applications, GEN_1 delivers smooth and consistent positional tracking, especially during fast or complex movements.
GEN_1 is optimized primarily for VIO (Visual-Inertial Odometry) mode. While it excels at real-time localization and smooth tracking, its dense VSLAM architecture was not designed with loop closure or area map relocalization in mind. For applications requiring persistent mapping or multi-session relocalization, GEN_3 is the recommended choice.
GEN_1 Load Performance
GEN_1 Accuracy Performance
Performance measured with ZED SDK v5.1.1, ZED X Driver v1.3.2, and a ZED X camera using the positional tracking sample available on GitHub. Each value represents the average result from a series of real-world tests collected under a specific scenario dataset (e.g., indoor, outdoor). The average test sequence spans approximately 400 m.
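The note above says each reported value is an average over several test sequences but does not name the accuracy metric. As a minimal sketch, assuming one common choice (final-position error as a percentage of the ground-truth path length), the averaging could look like the following. The function names and the sample trajectories are hypothetical and are not taken from the ZED SDK or its benchmark.

```python
import math

def path_length(positions):
    """Sum of straight-line distances between consecutive 3D positions (meters)."""
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))

def final_drift_percent(estimated, ground_truth):
    """Final-position error as a percentage of the ground-truth path length.

    Assumed metric for illustration; the source does not state which one is used.
    """
    length = path_length(ground_truth)
    if length == 0:
        raise ValueError("ground-truth trajectory has zero length")
    error = math.dist(estimated[-1], ground_truth[-1])
    return 100.0 * error / length

def mean_drift_percent(sequences):
    """Average the per-sequence drift over (estimated, ground_truth) pairs."""
    drifts = [final_drift_percent(est, gt) for est, gt in sequences]
    return sum(drifts) / len(drifts)

# Hypothetical data: a 400 m straight run that ends 5 m off target,
# and a 200 m run that ends 1 m off target.
gt_a = [(0, 0, 0), (100, 0, 0), (200, 0, 0), (400, 0, 0)]
est_a = [(0, 0, 0), (100, 1, 0), (200, 2, 1), (400, 3, 4)]
gt_b = [(0, 0, 0), (0, 200, 0)]
est_b = [(0, 0, 0), (0, 200, 1)]

if __name__ == "__main__":
    print(mean_drift_percent([(est_a, gt_a), (est_b, gt_b)]))
```

Averaging per-sequence percentages weights every sequence equally regardless of its length. Weighting by distance traveled is an equally plausible design and would give a different number.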
GEN_3 mode
GEN_3 introduces a scalable, feature-based VSLAM pipeline built for robustness and precision. By extracting and tracking high-quality visual features, rather than relying on GEN_1’s dense mapping, it maintains an accurate, lightweight map with strong loop closure and global optimization capabilities. This minimizes drift, improves long-term consistency, and delivers fast, reliable relocalization.
Combined with visual-inertial fusion, GEN_3 adapts seamlessly to large, dynamic, or revisited environments—ideal for advanced robotics and autonomous navigation applications.
GEN_3 is recommended for both the VIO and relocalization strategies.
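The recommendations in the GEN_1 and GEN_3 sections can be summarized as a small decision helper. The sketch below is illustrative only: `TrackingChoice`, `choose_tracking`, and the `prefer_dense` flag are hypothetical names, not part of the ZED SDK. The optional `to_sdk_parameters` function assumes the Python API exposes `sl.PositionalTrackingParameters` with `mode` and `area_file_path` fields and an `sl.POSITIONAL_TRACKING_MODE` enum with `GEN_1` and `GEN_3` members. Verify these against the API reference for your SDK version.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrackingChoice:
    generation: str      # "GEN_1" or "GEN_3"
    strategy: str        # "VIO" or "RELOCALIZATION"
    area_map_path: str   # empty when no prior map is used

def choose_tracking(needs_area_map: bool,
                    area_map_path: str = "",
                    prefer_dense: bool = False) -> TrackingChoice:
    """Pick a generation and strategy following the guidance in this page."""
    if needs_area_map:
        # Relocalization works within an area map; GEN_3 is the recommended choice.
        if not area_map_path:
            raise ValueError("relocalization requires an area map path")
        return TrackingChoice("GEN_3", "RELOCALIZATION", area_map_path)
    # VIO: GEN_3 is recommended; GEN_1 remains an option when dense,
    # depth-based tracking is preferred (hypothetical flag).
    generation = "GEN_1" if prefer_dense else "GEN_3"
    return TrackingChoice(generation, "VIO", "")

def to_sdk_parameters(choice: TrackingChoice):
    """Map a choice onto SDK parameters (assumed field names, see lead-in)."""
    import pyzed.sl as sl  # requires the ZED SDK Python API to be installed
    params = sl.PositionalTrackingParameters()
    params.mode = getattr(sl.POSITIONAL_TRACKING_MODE, choice.generation)
    if choice.area_map_path:
        params.area_file_path = choice.area_map_path
    return params

if __name__ == "__main__":
    print(choose_tracking(needs_area_map=False))
    print(choose_tracking(needs_area_map=True, area_map_path="office.area"))
```

Keeping the decision logic separate from the SDK call lets the choice be unit-tested on a machine without a camera or the SDK installed.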
GEN_3 Load Performance
Performance measured with ZED SDK v5.1.1, ZED X Driver v1.3.2, and a ZED X camera using the positional tracking sample available on GitHub.
GEN_3 Accuracy Performance
Performance measured with ZED SDK v5.1.1, ZED X Driver v1.3.2, and a ZED X camera using the positional tracking sample available on GitHub. Each value represents the average result from a series of real-world tests collected under a specific scenario dataset (e.g., indoor, outdoor). The average test sequence spans approximately 400 m.

