May 2020
tl;dr: Egocentric (first-person) vehicle trajectory prediction.
The paper introduces the HEV-I (Honda Egocentric View--Intersection) dataset.
First-person (egocentric) video is easier to collect than bird's-eye-view data and also captures rich information about object appearance and behavior.
However, the front camera has a narrow FOV, so tracklets are usually short. The paper selects tracklets that are at least 2 seconds long, using 1 second of history to predict 1 second into the future.
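The 1 s history / 1 s future split can be sketched as below. The 10 Hz annotation rate and the `(x, y, w, h)` box format are assumptions for illustration, not stated in these notes.

```python
FPS = 10  # assumed annotation rate; adjust to the dataset's actual frame rate

def split_tracklet(boxes):
    """Split a >=2 s tracklet of per-frame boxes into history and future."""
    assert len(boxes) >= 2 * FPS, "tracklet must span at least 2 seconds"
    history = boxes[:FPS]         # 1 s of observed boxes (model input)
    future = boxes[FPS:2 * FPS]   # 1 s of boxes to predict (target)
    return history, future

# Example: 20 dummy frames of (x, y, w, h) boxes
track = [(i, i, 50, 30) for i in range(20)]
hist, fut = split_tracklet(track)
```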
The inclusion of dense optical flow improves results substantially. Incorporating future ego motion is also important in reducing prediction error. Note that during training the future ego motion is fed as GT; at inference the system assumes the future ego motion comes from the motion planner.
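A toy illustration (not the paper's actual RNN encoder-decoder) of why planned future ego motion helps: apparent motion in the image mixes true object motion with camera motion, so the decoder must compensate for the latter. The `future_ego` interface (per-step image-plane shift induced by planned camera motion) is a hypothetical simplification.

```python
import numpy as np

def predict_centers(history, future_ego, horizon=10):
    """Constant-velocity rollout of box centers, corrected at each step by
    the image-plane shift caused by the planned ego motion (hypothetical)."""
    history = np.asarray(history, dtype=float)
    velocity = history[-1] - history[-2]   # observed image-plane velocity
    preds = []
    center = history[-1]
    for t in range(horizon):
        # future_ego[t]: shift of the scene caused by planned camera motion
        center = center + velocity - future_ego[t]
        preds.append(center.copy())
    return np.array(preds)

hist = [(0.0, 0.0), (1.0, 0.0)]          # apparent drift of 1 px/frame right
ego = np.tile([1.0, 0.0], (10, 1))       # camera also pans right 1 px/frame
pred = predict_centers(hist, ego)        # ego term cancels the apparent drift
```

With the ego-motion term subtracted, the prediction stays put, reflecting that the object's apparent drift was entirely camera-induced.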