Robotics & Autonomy

LiDAR SLAM

A spinning laser sweeps the world, a point cloud forms, scan matching ties each frame to the last, and loop closure snaps the drift away — the localization stack inside every modern autonomous vehicle

LiDAR SLAM — Simultaneous Localization And Mapping with light-based ranging — is the family of algorithms that lets a robot or vehicle build a map of an unknown environment while continuously tracking its own pose inside that map. A spinning laser builds a point cloud; scan registration aligns each new sweep to the previous; a pose graph accumulates the deltas; loop closure on revisit fixes the global frame; an optimiser snaps the drift back to zero. It is the technology that won the DARPA Urban Challenge, it runs inside every Waymo robotaxi and Boston Dynamics Spot, and it has reduced the cost of autonomous mapping from tens of thousands of dollars per unit to under a thousand.

ICPBesl & McKay, 1992
EKF-SLAMSmith, Self & Cheeseman, 1990
LOAMZhang & Singh, 2014
Front-endICP / NDT scan registration
Back-endPose-graph optim. (g2o, GTSAM, Ceres)
DARPA winnerVelodyne HDL-64, 2007

Interactive visualization

Press play, or step through manually. The visualization is yours to drive — try it before reading on.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

The chicken-and-egg problem

A robot dropped into an unknown building has two coupled questions to answer. Where am I? And what does the building look like? Each answer needs the other. To build a map you have to know where you are when each sensor return comes in — otherwise you would just be plotting noise. To localise yourself you have to know the map — otherwise you have no landmarks to measure against. Solve either one in isolation and the other becomes easy. Solve neither in advance and the joint problem is what the SLAM literature spent thirty years cracking.

The classical formulation is Bayesian. Let x_{0:t} be the trajectory of robot poses, m the (unknown) map, u_{1:t} the control inputs, and z_{1:t} the sensor measurements. The full SLAM posterior is

p(x_{0:t}, m | z_{1:t}, u_{1:t})  ∝  p(x_0) · ∏ p(x_k | x_{k-1}, u_k) · ∏ p(z_k | x_k, m)

This joint distribution is intractable in general — it lives over a trajectory in SE(3)^t and a map of unbounded dimension. Every practical SLAM algorithm is a different approximation of this posterior, with different trade-offs between expressiveness, runtime, and memory.

Three generations of SLAM

The field has gone through three distinct paradigms, each defined by what it approximates.

EKF-SLAM (1990–2000). Smith, Self and Cheeseman wrote down the first practical SLAM framework in 1990: model the joint distribution over robot pose and map landmarks as a single multivariate Gaussian, and update it recursively with an Extended Kalman Filter. The covariance matrix has size (3 + 3N)² for a planar robot with N landmarks. Every measurement update touches the entire covariance, costing O(N²). The Gaussian approximation also struggles with the multimodal data-association uncertainty that arises when two landmarks look identical. By the late 1990s EKF-SLAM was the textbook approach but already hitting its scaling limits in the hundreds of landmarks.

Particle-filter SLAM (2002–2008). FastSLAM, by Montemerlo, Thrun, Koller and Wegbreit, used the Rao-Blackwellised particle filter to factor the trajectory and the map. Each particle carries a hypothesised trajectory plus a per-particle map; conditioned on a trajectory, each landmark posterior factors into its own small Kalman filter. FastSLAM scaled to thousands of landmarks. The catch: particle depletion meant that long trajectories required exponentially many particles, and the algorithm forgot the past once a sample died out. FastSLAM's sweet spot was forest-floor robotics with a few hundred metres of operation.

Graph SLAM (2008–present). The modern paradigm is to keep the full trajectory as a pose graph and solve it offline (or near-online) as a sparse nonlinear least-squares problem. Lu and Milios proposed it in 1997; it became practical when efficient sparse solvers (g2o in 2011, GTSAM, Ceres) made graph sizes of tens of thousands of poses tractable in real time. Graph SLAM is what powers every state-of-the-art LiDAR or visual SLAM system shipped this decade.

Front-end: turning point clouds into pose deltas

The front-end of a LiDAR SLAM stack takes raw scans — typically 100,000 to 2,000,000 points per sweep — and produces a single relative pose between consecutive (or near-consecutive) scans. The two dominant techniques are ICP and NDT.

Iterative Closest Point. The Besl-McKay 1992 algorithm is conceptually simple. Given a source cloud S and a target cloud T, find the rigid (R, t) ∈ SE(3) that minimises

E(R, t) = ∑_i ‖ (R s_i + t) − nearest(T, R s_i + t) ‖²

by iterating: (1) for each source point, find its nearest target point in a k-d tree; (2) solve the resulting weighted Procrustes problem in closed form via SVD (Horn 1987); (3) apply the transform to S; (4) repeat until convergence. The basin of attraction is limited — ICP needs a decent initial guess, typically from IMU or wheel odometry — but inside that basin it converges in 5–20 iterations.

The point-to-point version slides on planar surfaces (think: aligning two corridor scans). The point-to-plane variant (Chen-Medioni 1991) replaces the residual with the distance from each source point to the local target plane, computed from k-nearest-neighbour normals; this kills the planar-sliding failure mode and is the production default. Generalized ICP (Segal-Haehnel-Thrun 2009) takes one further step and models each correspondence as a probabilistic pairing between two local plane patches.

Normal Distributions Transform. NDT (Biber-Strasser 2003) bins the target cloud into a 3D voxel grid and fits a Gaussian (mean and covariance) per voxel. Registration then maximises the likelihood of the source points under that Gaussian mixture model. NDT avoids the k-d-tree lookup at every iteration and tolerates poorer initial guesses than ICP. Autoware and many automotive stacks use NDT as the primary front-end.

Feature-based fronts — LOAM and descendants

Running full-cloud ICP at 10 Hz on a million-point cloud is expensive. LOAM (Zhang & Singh, CMU, 2014) introduced the production trick that every modern LiDAR stack uses some version of: classify points by local curvature, register only the salient ones.

For each point, compute the smoothness

c = (1 / (|N| · ‖p‖)) · ‖ ∑_{q∈N} (p − q) ‖

where N is the set of nearby points along the same scan line. High c means an edge feature (corner, post, pole); low c means a planar feature (wall, ground, ceiling). LOAM extracts a few hundred edge and planar features per scan, then runs scan-to-scan matching of edge-to-edge-line and planar-to-planar-patch correspondences. The result is two orders of magnitude faster than full-cloud ICP and topped the KITTI odometry benchmark in 2014.

LOAM also introduced the two-rate architecture: a fast (10 Hz) scan-to-scan odometry produces a coarse pose immediately; a slower (1 Hz) scan-to-map matching refines that pose against an accumulated voxel map. Real-time downstream consumers get bounded-latency pose; the map gets the benefit of multi-scan refinement.

Back-end: the pose graph and how to solve it

The back-end converts a sequence of relative-pose measurements into a globally consistent trajectory. Each measurement is an edge in the pose graph. Nodes are 6-DOF poses; edges encode relative transforms with covariances.

Mathematically the back-end minimises

∑_{(i,j) ∈ E}  e_ij(x_i, x_j)^T  Ω_ij  e_ij(x_i, x_j)

where e_ij is the error between the measured transform z_ij and the transform implied by the current pose estimates x_i and x_j, and Ω_ij = Σ_ij^{-1} is the information matrix. This is a nonlinear least-squares problem on the manifold SE(3)^N. Gauss-Newton or Levenberg-Marquardt solve it iteratively, exploiting the fact that each edge touches only two poses so the linearised system H δx = -g has a banded plus low-rank sparse structure — solvable in O(N) for a chain and O(N^{1.5}) for typical loopy graphs.

The three reference solvers are g2o (Kümmerle et al., 2011, optimised for sparse pose graphs with custom SE(3) parameterisation), GTSAM (Frank Dellaert, the academic reference, with iSAM2 supporting incremental updates), and Ceres (Google's general nonlinear solver, slower per iteration but with the deepest pre-conditioner library). All three implement essentially the same Gauss-Newton-on-manifold algorithm; the differences are threading model, incremental support, and language bindings.

Loop closure — the cure for drift

Every scan-to-scan registration has a small error. Over a kilometre of dead-reckoning a good LiDAR-only system drifts 1–10 metres. The cure is loop closure: when the robot revisits a previously mapped place, the back-end gets a long-range constraint of the form "pose at time t_now equals pose at time t_then, up to noise", and the entire intermediate trajectory snaps into global consistency.

The detection problem is non-trivial. A naive scan-to-every-past-scan search is O(N²). Production stacks use compact global descriptors. Scan Context (Kim & Kim 2018) summarises a LiDAR scan as a 20-by-60 polar-grid image; two scans are similar if their column-shifted L1 distance is below a threshold, which is rotation-invariant for the robot's yaw. Other approaches use deep features (PointNetVLAD), bag-of-words on geometric descriptors, or aggressive geometric verification with global ICP on candidates retrieved by a coarse descriptor.

Once a candidate loop closure is found, a separate geometric verification step runs full ICP between the two scans. If the residual is below threshold and the transformation is consistent, the edge is added to the pose graph with its measured covariance and the back-end re-solves. The trajectory snaps; the map deduplicates.

Worked example: drift on a 1 km loop

Suppose a ground robot makes a 1 km loop through a warehouse, returning to its start. The LiDAR runs at 10 Hz; each scan-to-scan registration adds a translation error of σ_t = 5 mm and a rotation error of σ_θ = 0.05° (typical for a Velodyne VLP-16 with point-to-plane ICP).

Loop length         = 1000 m
Speed               = 1 m/s
Scans per loop      = 10000
Pose-to-pose error  ≈ (σ_t · √N) for random-walk translation
                    = 5 mm · √10000
                    = 0.5 m

Rotation drift      ≈ σ_θ · √N
                    = 0.05° · √10000
                    = 5°

At the end-of-loop, a 5° heading error compounds with the translation:
End-of-loop position error  ≈ 1000 · sin(5°)  ≈ 87 m  (dominant term)

An 87 m miss at the end of a 1 km loop is a useless map. Loop closure on the start point provides the constraint "x_10000 ≈ x_0" with covariance Σ = diag((0.05 m)², (0.05 m)², (0.5°)²) — the noise of a careful geometric verification. The back-end redistributes the 87 m residual across 10000 poses, leaving a residual end-to-end error of order σ_t · √N ≈ 0.5 m. Two orders of magnitude.

This is why loop closure is the most consequential single design decision in a SLAM stack. A good front-end without loop closure delivers a useless global map; a mediocre front-end with reliable loop closure delivers a usable one.

Open-source stacks side-by-side

System	Year	Front-end	Back-end	IMU	Strengths
LOAM	2014	Edge + planar features	Scan-to-map refinement	Optional, loose	Real-time, KITTI-winning
LeGO-LOAM	2018	LOAM + ground separation	Pose graph + loop closure	Loose	Lightweight, ground-vehicle
LIO-SAM	2020	Feature-based ICP	iSAM2 factor graph	Tight (pre-integration)	Industrial standard for outdoor
FAST-LIO / FAST-LIO2	2021–2022	Iterated EKF on full cloud	EKF on points + voxel map	Tight (EKF)	Lowest latency, drones
Cartographer	2016	Local-grid scan matching	Sparse pose graph + branch-and-bound LC	Optional	2D & 3D, very robust LC
HDL Graph SLAM	2018	NDT / GICP	g2o pose graph	Optional	Modular, easy to extend
SuMa / SuMa++	2018–2019	Surfel projection + ICP	Pose graph + semantic LC	Optional	Dense surface maps, semantics
KISS-ICP	2023	Adaptive voxel ICP, no features	Odometry only (no LC)	None	Minimalist, parameter-free

The LOAM family dominates outdoor ground-vehicle deployments. Cartographer is Google's contribution and is the default for many indoor 2D applications. KISS-ICP, a recent surprise, shows that careful adaptive voxel ICP without any feature classification can match the LOAM family on KITTI — suggesting that much of the feature-engineering complexity may have been over-fit to a narrow benchmark set.

The sensor side

Sensor	Year	Channels	Range	Rate	Where it ships
SICK LMS-291 (2D)	~2000	1 (planar)	80 m	75 Hz	DARPA Grand Challenge, AGVs
Velodyne HDL-64E	2007	64	120 m	10 Hz	Stanley, Tartan Racing, Waymo gen 1
Velodyne VLP-16 (Puck)	2014	16	100 m	10 Hz	Boston Dynamics Spot, low-cost AGV
Ouster OS1-64 / OS1-128	2018 / 2020	64 / 128	120 / 240 m	10–20 Hz	Robotaxi, Spot upgrade kits
Hesai Pandar / AT128	2019–2022	40–128	200 m	10 Hz	Chinese robotaxi fleets, Cruise
Livox Mid-70 / Mid-360	2020 / 2022	Non-repetitive scanning	90 / 70 m	10 Hz	Inspection drones, mobile robots
Waymo / Aurora in-house	2020+	Proprietary FMCW	300+ m	20 Hz	Production robotaxis

The 2007 Urban Challenge was the inflection point. Stanford's Stanley (2005, Grand Challenge winner) and CMU's Boss (2007, Urban Challenge winner, Tartan Racing) both relied on Velodyne's HDL-64; it became the de facto sensor for autonomous vehicle research for the next decade. Cost has fallen 70× since then, primarily because solid-state designs (Livox, MEMS scanning, FMCW silicon photonics) avoid the precision mechanical rotation of the HDL-64.

Where LiDAR SLAM ships today

Autonomous vehicles. Waymo, Cruise, Zoox, Pony.ai, Wayve, Aurora — every robotaxi platform uses a LiDAR-SLAM-derived localizer (typically against a pre-built HD map) for safety-critical lane-level positioning. Tesla famously refuses LiDAR; everyone else uses it.
Boston Dynamics Spot. Spot's autonomous mission mode builds a graph of recorded routes using its Velodyne or Ouster head; on playback it localises in real time against that graph. Atlas uses tighter sensor fusion for short-range manipulation.
Warehouse AGVs. Locus Robotics, 6 River Systems, Fetch Robotics — fleets of thousands of LiDAR-equipped picking robots, all running 2D LiDAR SLAM (often built on Cartographer or proprietary descendants) against a warehouse map updated nightly.
Mining and construction. Komatsu, Caterpillar autonomous haul trucks. Built-Robotics, Sarcos. Open-pit mining has been autonomous longer than highway driving has, partly because the environment is private and well-mapped.
Inspection drones. Skydio, DJI Matrice with Livox payloads. Indoor inspection of bridges, tunnels, refineries — environments without GPS. The Mid-360 has become the default sensor for the under-2 kg autonomous drone segment.
Survey-grade mobile mapping. NavVis, Leica Pegasus, GeoSLAM — backpack and trolley-mounted scanners that produce sub-centimetre 3D maps of buildings, tunnels, and underground spaces in a single walk-through. The same SLAM math, different latency target.

LiDAR SLAM versus visual SLAM

Property	LiDAR SLAM	Visual SLAM
Depth measurement	Direct (time-of-flight)	Indirect (stereo or motion triangulation)
Range precision	≤ 1 cm @ 100 m	Quadratic degradation with distance
Texture dependence	None	Fails on blank walls, fog
Lighting dependence	None (active sensor)	Fails in darkness without IR
Sensor cost	$500 – $75,000	$10 – $500 (camera) + compute
Power	10 – 50 W	0.5 – 5 W (CPU/GPU dependent)
Weight	0.1 – 13 kg	Grams
Loop-closure descriptors	Scan Context, geometric	Bag-of-words on visual features
Dominant in	Outdoor AV, mining, AGV	AR/VR, micro-drones, indoor consumer

The choice is rarely religious — modern systems fuse both. Waymo's stack uses LiDAR for safety-critical localisation, vision for traffic-light recognition and pedestrian intent prediction, and radar for adverse weather. The reason LiDAR SLAM warrants its own article is that the localisation backbone in the fused stack — the layer that says "where am I, to within ten centimetres" — almost always comes from LiDAR, because no other sensor reaches that precision robustly at automotive ranges.

Pitfalls that bite practitioners

Motion distortion within a scan. A spinning LiDAR sweeps for 100 ms; at 30 m/s the platform moves 3 m during a single scan. Ignoring this, every scan is a smear. The fix is to deskew the cloud using an IMU-derived motion estimate (LIO-SAM does this; LOAM has a built-in deskewing pass). Failing to deskew is the most common bug in newcomer LiDAR-SLAM implementations.
Degenerate environments. Long featureless tunnels, open fields, parking decks. Scan registration has rank-deficient information in the direction of motion. The IMU has to carry the pose estimate through. Stacks that detect this regime (via Hessian-rank analysis) and dial the front-end's weight down avoid catastrophic drift; stacks that do not, fail abruptly.
Dynamic objects. Pedestrians, vehicles, fluttering leaves. ICP correspondences on dynamic points pull the pose toward the dynamic motion. Filtering by velocity estimate (DynaSLAM-style) or by semantic segmentation (SuMa++ uses RangeNet++) materially improves accuracy in dense urban traffic.
Loop-closure false positives. Two visually similar but geographically distant locations (two identical aisles in a warehouse) can produce wrong loop closures that catastrophically corrupt the map. Robust back-ends use switchable constraints (Sünderhauf 2012) or DCS (dynamic covariance scaling) to down-weight inconsistent edges automatically.
Map versioning and updates. A SLAM-built HD map is stale the moment construction starts on the road. Production AV stacks have a separate map-update pipeline that detects discrepancies between live observations and the stored map and re-runs SLAM offline on flagged tiles.
Calibration drift. The extrinsic transform between the LiDAR and the IMU has to be sub-millimetre accurate. A 1 mm calibration error at 1 m baseline equals 0.06° of angular bias — enough to corrupt long-trajectory odometry. Modern stacks (LIO-SAM, Kalibr-style routines) jointly estimate extrinsics online or run an offline auto-calibration before deployment.

Where the field is heading

Three threads converge in mid-2020s LiDAR-SLAM research. Learned components are replacing parts of the classical pipeline — PointNetVLAD for descriptors, learned ICP weighting, deep loop-closure verification — without yet replacing the geometric core (the back-end remains a hand-derived sparse least-squares solver, because the structure is well-understood and trustable). Continuous-time representations (Anderson-Barfoot's GP-based trajectories, CT-ICP) replace discrete pose nodes with smooth splines, better matching the physical reality of a moving sensor. Tight multi-modal fusion with cameras (V-LOAM, R3LIVE) and 4D imaging radar (RadarSLAM) is becoming standard, with LiDAR providing the geometric backbone and other modalities contributing where LiDAR is degenerate or saturated.

The underlying problem — joint inference of trajectory and map from noisy sensor returns — is unlikely to ever be "solved" in the way path-planning has been. The combinations of environment, sensor, and motion are too numerous. But the toolbox — ICP variants, factor graphs, robust loop closure, semantic priors — has consolidated to the point that ten years of new work tends to refine, rather than overturn, the LOAM-era foundations.

Frequently asked questions

What does SLAM actually solve, and why is it hard?

Given a stream of sensor returns from an unknown environment, recover both the trajectory of the sensor and a map of the environment, simultaneously. The two estimates are coupled: to build a map you need to know where you are, and to localize you need a map. SLAM is the joint inference problem. It is hard because the trajectory accumulates error monotonically (every odometry tick adds a little drift), because the data association problem (which past return does this new return correspond to?) is combinatorial, and because the map can grow to gigabytes — so the back-end has to scale to thousands of poses and millions of features in real time.

What is ICP and how does it match two LiDAR scans?

Iterative Closest Point, published by Besl and McKay in 1992, finds the rigid transformation (R, t) that best aligns one point cloud to another. Each iteration has two phases. (1) For every point in the source cloud, find the nearest point in the target cloud — typically via a k-d tree. (2) Compute the (R, t) that minimises the sum of squared distances between these correspondences — a closed-form SVD problem (Horn 1987). Apply the transform and repeat until the residual stops decreasing. ICP is the workhorse of LiDAR odometry. Variants — point-to-plane, generalized ICP, NDT — improve robustness, particularly on planar environments like roads and corridors where point-to-point gets trapped sliding tangentially.

Why does drift accumulate, and how does loop closure fix it?

Every scan-to-scan registration has a small error — a few millimetres of translation, a fraction of a degree of rotation. These errors are not random with zero mean; they are correlated, and they integrate. After a few hundred metres of dead-reckoning a LiDAR-only system has typically drifted 0.5 to 2 percent of distance travelled. Loop closure is the cure. When the robot returns to a place it has visited before, the recognition (via global descriptors like Scan Context, or by matching keyframes) provides a long-range constraint: this current pose should equal that previous pose, up to measurement noise. Adding that constraint to the pose graph and re-optimising redistributes the accumulated drift across all the intermediate poses, snapping the entire trajectory back to global consistency in a single solve.

What is a pose graph and how is it optimized?

A pose graph is a factor graph in which the nodes are robot poses (typically SE(3) — 6 DOF) and the edges are relative transformations measured between pairs of poses, each annotated with a covariance encoding the measurement's uncertainty. Most edges connect consecutive poses (the odometry chain); a few are long-range loop-closure edges. Graph optimisation finds the pose assignment that minimises the total weighted residual ∑ (e_ij^T Ω_ij e_ij), where e_ij is the difference between the measured edge and the edge implied by the current pose estimates and Ω_ij is the information matrix. Gauss-Newton or Levenberg-Marquardt iteration on this nonlinear least-squares problem converges quickly because the Jacobian is sparse — each edge touches only two poses. Open-source solvers g2o (Kümmerle 2011), GTSAM (Dellaert), and Ceres (Google) all do this; production stacks pick based on threading model and incremental support.

How is LiDAR SLAM different from visual SLAM?

LiDAR measures range directly, with sub-centimetre precision out to a hundred metres, independent of texture and lighting. A camera measures angular position only; depth has to be triangulated from baseline or motion, and the precision degrades quadratically with distance. LiDAR works in pitch darkness and snowstorms (until snow obscures the optics); cameras fail at night without infrared illumination and struggle in featureless environments. The trade-off is cost, weight, power, and resolution: a single Velodyne HDL-64 was sixty thousand dollars at launch and weighs thirteen kilograms; a stereo camera pair costs a few hundred dollars and weighs grams. Modern outdoor autonomy almost always fuses both. Indoor robotics increasingly uses visual SLAM (cameras are cheap and rooms have texture); long-range autonomous vehicles still depend on LiDAR for safety-critical localization.

What is LOAM, and why is it still the reference design?

LOAM — LiDAR Odometry and Mapping — was published by Ji Zhang and Sanjiv Singh at CMU in 2014 and topped the KITTI odometry benchmark for years. The key idea is the two-rate architecture: a high-frequency (10 Hz) scan-to-scan registration produces a coarse but real-time pose, and a low-frequency (1 Hz) scan-to-map registration refines the pose against an accumulated voxel map. Features are split into edge points (high local curvature) and planar points (low curvature) and matched separately. The split decouples low-latency odometry from accurate mapping, and the feature classification makes registration faster than full-cloud ICP. LeGO-LOAM specialised it for ground vehicles by exploiting the ground plane; LIO-SAM tightly coupled IMU pre-integration into the back-end factor graph. All three are still widely used as baselines for new LiDAR-SLAM research.

What hardware actually goes on autonomous vehicles?

Most production autonomous vehicles run a roof-mounted spinning LiDAR plus a corona of forward-and-side units. The Velodyne HDL-64 — 64 channels, 10 Hz, 100 m range — was the canonical sensor for the DARPA Urban Challenge winners and shipped on Waymo's first generation. Modern stacks have largely moved to denser solid-state or hybrid units: the Ouster OS1-128 (128 channels, digital readout, 240 m range) and Hesai AT128 dominate the spinning class; Livox Mid-70 and Mid-360 represent the cheaper non-repetitive solid-state class used widely on quadruped robots. Boston Dynamics Spot ships with a small spinning Velodyne Puck Lite or Ouster; Waymo's fifth-generation Pacifica uses a proprietary in-house design. Per-unit costs have fallen from $75,000 (HDL-64, 2010) to under $1,000 (Livox Mid-70, 2024); the volume scale of autonomous-vehicle and AGV deployment is the reason.

How do modern SLAM stacks handle IMU and wheel-odometry fusion?

Tight inertial coupling is now standard. The IMU runs at 100–1000 Hz and provides high-bandwidth pose deltas between LiDAR scans (which arrive at 10 Hz). LIO-SAM and FAST-LIO pre-integrate IMU measurements between LiDAR keyframes (Forster 2017's IMU pre-integration), inject the resulting relative pose factor into the factor graph alongside the LiDAR-registration factor, and jointly estimate IMU biases. The IMU provides motion compensation during the scan itself — a critical correction since a spinning LiDAR sweeps over 100 ms and the platform has moved during that window — and provides robust odometry through degenerate environments (long tunnels, open fields) where LiDAR registration alone has no constraints in the direction of motion. Wheel odometry adds redundancy on ground vehicles; GPS adds an absolute reference where available.