August 2019
tl;dr: Detect 2D oriented bbox with BEV maps by adding angle regression to YOLO.
Overall impression
The paper is clearly written, though the innovation is limited. The performance, however, is really nice – this is exactly the type of paper industry likes.
At 50 fps it runs at less than half the speed of PointPillars, which achieves 115 fps.
Key ideas
- Add angle regression to YOLO.
- The IoU calculation is updated to accommodate oriented bboxes.
- The input encoding is based on MV3D.
- Each grid cell has only five anchor bboxes with different headings. The anchors do not span a full grid of parameter values, only a finite set of parameter combinations.
- The angle loss is only active when the oriented bbox IoU is larger than a threshold.
- Almost 10 times faster than VoxelNet, at 50 fps. In comparison, PointPillars achieves 115 fps.
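To make the oriented-bbox IoU idea concrete, here is a minimal sketch in pure Python. It is not the paper's implementation: the box parameterization `(cx, cy, w, l, yaw)`, the function names, and the use of Sutherland-Hodgman polygon clipping are all assumptions chosen for illustration.

```python
import math


def box_corners(cx, cy, w, l, yaw):
    """Corners of an oriented box in counter-clockwise order (yaw in radians)."""
    c, s = math.cos(yaw), math.sin(yaw)
    local = [(l / 2, w / 2), (-l / 2, w / 2), (-l / 2, -w / 2), (l / 2, -w / 2)]
    return [(cx + c * x - s * y, cy + s * x + c * y) for x, y in local]


def polygon_area(poly):
    """Shoelace formula; positive for counter-clockwise polygons."""
    area = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        area += x1 * y2 - x2 * y1
    return area / 2.0


def side(a, b, p):
    """Positive if p lies to the left of the directed edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def clip(subject, clipper):
    """Sutherland-Hodgman: clip convex `subject` by convex CCW `clipper`."""
    output = subject
    for i in range(len(clipper)):
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]
        inputs, output = output, []
        if not inputs:
            break
        for j in range(len(inputs)):
            p, q = inputs[j], inputs[(j + 1) % len(inputs)]
            sp, sq = side(a, b, p), side(a, b, q)
            if sp >= 0:
                output.append(p)  # p is inside this clip edge
            if sp * sq < 0:
                # Edge p -> q crosses the clip line: keep the crossing point.
                t = sp / (sp - sq)
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output


def rotated_iou(box_a, box_b):
    """IoU of two oriented boxes given as (cx, cy, w, l, yaw)."""
    inter_poly = clip(box_corners(*box_a), box_corners(*box_b))
    inter = abs(polygon_area(inter_poly)) if len(inter_poly) >= 3 else 0.0
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0
```

For example, `rotated_iou((0, 0, 2, 4, 0), (0, 0, 2, 4, math.pi / 2))` compares a box with the same box turned by 90 degrees; the overlap is a 2 x 2 square, so the IoU is about 1/3. A threshold on this value could then gate the angle loss as described in the list above.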
Technical details
- FOV is 40 m x 80 m (same as radar). The BEV image size is 512 x 1024.
- RGB map encoded by height, intensity and density.
- The camera FOV is only about 90 degrees (similar to radar). The heatmap of GT is very helpful. Output outside the FOV is filtered before evaluation.
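The height/intensity/density encoding can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's code: the axis assignment (40 m mapped to 512 rows, 80 m to 1024 columns, giving about 0.078 m per cell), the height range `z_range`, using the intensity of the highest point in each cell, and the `log(N + 1) / log(64)` density normalization are all assumptions. The function name `encode_bev` and the sparse dictionary output are hypothetical choices made to keep the example small.

```python
import math


def encode_bev(points, x_range=(0.0, 40.0), y_range=(-40.0, 40.0),
               z_range=(-2.0, 1.25), rows=512, cols=1024):
    """Map (x, y, z, intensity) points to sparse BEV cells.

    Returns {(row, col): (height, intensity, density)}. Height and density
    are normalized to [0, 1]; intensity is passed through unchanged.
    """
    cell_x = (x_range[1] - x_range[0]) / rows   # 40 m / 512  = 0.078125 m
    cell_y = (y_range[1] - y_range[0]) / cols   # 80 m / 1024 = 0.078125 m
    z_span = z_range[1] - z_range[0]

    top = {}    # (row, col) -> (z, intensity) of the highest point so far
    count = {}  # (row, col) -> number of points in the cell
    for x, y, z, intensity in points:
        inside = (x_range[0] <= x < x_range[1]
                  and y_range[0] <= y < y_range[1]
                  and z_range[0] <= z <= z_range[1])
        if not inside:
            continue  # drop points outside the assumed FOV
        key = (int((x - x_range[0]) / cell_x), int((y - y_range[0]) / cell_y))
        count[key] = count.get(key, 0) + 1
        if key not in top or z > top[key][0]:
            top[key] = (z, intensity)

    bev = {}
    for key, (z, intensity) in top.items():
        height = (z - z_range[0]) / z_span
        # Assumed density normalization; saturates at 63 points per cell.
        density = min(1.0, math.log(count[key] + 1) / math.log(64))
        bev[key] = (height, intensity, density)
    return bev
```

A dense 512 x 1024 x 3 array for the network input could be filled from this dictionary, with empty cells left at zero.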
Notes