April 2020
tl;dr: Multi-modal behavior prediction via anchor trajectory with 6-second horizon.
Behavior prediction is inherently stochastic, as it is impossible to know what the agent may do next. Most previous methods, including Fast and Furious, IntentNet and ChauffeurNet, only predict a single MAP trajectory. Rules of the Road predicts multiple future trajectories, but only as a set of unweighted samples. Sample-based generative methods have drawbacks: they are non-deterministic, their errors are hard to estimate, and they offer no way to perform probabilistic inference (e.g. to know the probability of collision in a space-time region). Sample-based approaches also require repeated inference to obtain a multi-modal prediction.
Anchor trajectories are obtained by grouping logged trajectories in the collected data into modes, and serve as coarse-grained templates of an agent's possible behavior. This idea neatly resolves the exchangeability issue in multiple-future prediction, as detailed in Rules of the Road.
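The grouping step can be illustrated with a plain k-means over flattened trajectories. This is a minimal sketch under stated assumptions, not the paper's exact procedure: the function name `cluster_anchors`, the squared Euclidean distance, the random initialization, and the synthetic "straight" and "left turn" data are all hypothetical choices made for illustration.

```python
import numpy as np

def cluster_anchors(trajectories, num_anchors, num_iters=20, seed=0):
    """Group trajectories of shape (N, T, 2) into num_anchors mean trajectories.

    Assumption: plain k-means with squared Euclidean distance on the
    flattened (x, y) waypoints; other distances or groupings are possible.
    """
    rng = np.random.default_rng(seed)
    n = trajectories.shape[0]
    flat = trajectories.reshape(n, -1)
    # Initialize centers from randomly chosen distinct logged trajectories.
    centers = flat[rng.choice(n, size=num_anchors, replace=False)].copy()
    for _ in range(num_iters):
        # Squared distance from every trajectory to every center: (N, K).
        dists = ((flat[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for k in range(num_anchors):
            members = flat[labels == k]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[k] = members.mean(axis=0)
    # Final assignment against the updated centers.
    dists = ((flat[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    labels = dists.argmin(axis=1)
    anchors = centers.reshape(num_anchors, *trajectories.shape[1:])
    return anchors, labels

# Synthetic logged data (hypothetical): noisy straight and left-turn paths.
t = np.linspace(0.0, 1.0, 12)
straight = np.stack([10 * t, np.zeros_like(t)], axis=-1)
left = np.stack([10 * t, 5 * t**2], axis=-1)
noise_rng = np.random.default_rng(1)
logged = np.concatenate([
    straight + 0.1 * noise_rng.standard_normal((50, 12, 2)),
    left + 0.1 * noise_rng.standard_normal((50, 12, 2)),
])
anchors, labels = cluster_anchors(logged, num_anchors=2)
```

Each row of `anchors` is one template trajectory, and `labels` records which anchor each logged trajectory was assigned to.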
MultiPath uses the same semantic map representation as previous methods such as IntentNet, ChauffeurNet and Rules of the Road.
IntentNet also predicts intention, but it mainly focuses on a single MAP trajectory. It predicts only one set of trajectories, which makes it unsuitable for multiple-future path prediction. This could be changed to predict multiple paths, one per intent; during inference we could then pick the K most likely trajectories, each associated with one of the top intents. The discrete intent prediction roughly corresponds to the discrete anchors in MultiPath, but the anchor design is more data-driven and flexible.
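The per-intent selection described above can be sketched as follows. This is an illustration of the idea only, not IntentNet's or MultiPath's actual code: the function name `top_k_trajectories`, the array shapes, and the example probabilities are assumptions.

```python
import numpy as np

def top_k_trajectories(intent_probs, trajectories, k):
    """Return the k most likely intents with their associated trajectories.

    intent_probs: shape (M,), one probability per discrete intent.
    trajectories: shape (M, T, 2), one predicted trajectory per intent.
    """
    # Sort intent indices by descending probability and keep the first k.
    order = np.argsort(intent_probs)[::-1][:k]
    return order, intent_probs[order], trajectories[order]

# Hypothetical example: 4 intents, 10 waypoints each.
intent_probs = np.array([0.1, 0.6, 0.05, 0.25])
trajectories = np.zeros((4, 10, 2))
top_idx, top_probs, top_trajs = top_k_trajectories(intent_probs, trajectories, k=2)
```

Because every returned trajectory carries its intent probability, the output is a weighted set of futures rather than a set of unweighted samples.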
The paper was later extended to MultiPath++, which achieved SOTA on the Waymo Open Motion Dataset (WOMD) in late 2021.