September 2019
tl;dr: Largest dataset for object detection in aerial images.
Overall impression
This dataset addresses a specific field called Earth Vision/Earth Observation/Remote Sensing.
In aerial images objects are rarely occluded, so every object can be annotated individually (vs. the crowd class in the COCO dataset).
Besides aerial images, text region detection also involves oriented bbox detection.
Key ideas
- Annotations are 8-DoF quadrilaterals, but most of them essentially are (or are converted to) oriented bounding boxes (OBB).
- With horizontal bounding boxes (HBB), the overlap between neighboring objects is sometimes so large that detection algorithms cannot tell them apart (NMS suppresses one of them).
- Cars are split into two categories: big cars (trucks, etc.) and small cars.
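
The HBB overlap issue can be illustrated with a minimal sketch. It is not taken from the dataset's toolkit: the helper names (`obb_to_quad`, `quad_to_hbb`, `hbb_iou`) and the two example objects are assumptions made up for illustration. Two long, thin objects sit side by side at 45 degrees. Their oriented boxes do not touch, yet their horizontal boxes overlap heavily.

```python
import math

def obb_to_quad(cx, cy, w, h, theta):
    """Corners (4 points, 8 numbers) of a box of size w x h rotated by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    half = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]

def quad_to_hbb(quad):
    """Tightest axis-aligned box (x_min, y_min, x_max, y_max) around a quadrilateral."""
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    return min(xs), min(ys), max(xs), max(ys)

def hbb_iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Hypothetical example: two 20 x 2 objects rotated by 45 degrees.
theta = math.pi / 4
offset = 2.5  # centre distance across the long axis; box height is 2, so a 0.5 gap remains
quad_a = obb_to_quad(0.0, 0.0, 20.0, 2.0, theta)
quad_b = obb_to_quad(-offset * math.sin(theta), offset * math.cos(theta), 20.0, 2.0, theta)

iou = hbb_iou(quad_to_hbb(quad_a), quad_to_hbb(quad_b))
print(f"HBB IoU of two non-touching oriented boxes: {iou:.2f}")  # about 0.65
```

With a typical NMS threshold of 0.5, one of the two horizontal detections would be suppressed even though the objects are clearly separate, which is the motivation for oriented annotations.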
Technical details
- Dataset statistics are analyzed to filter out anomalous annotations.
Notes
- According to their implementation of Faster R-CNN (OBB), they keep the original horizontal anchor proposals, reparameterize each anchor box as its four corners (8 numbers), and change the regression output from 4 numbers to 8. No oriented anchors are used. -> Compare with RoI Transformer.
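
A rough sketch of what this reparameterization could look like follows. It is not the authors' code: the function names, the corner order (top-left, top-right, bottom-right, bottom-left), and the normalization of offsets by anchor width and height are all assumptions chosen for illustration.

```python
def anchor_to_corners(anchor):
    """Rewrite a horizontal anchor (x1, y1, x2, y2) as 8 numbers: four corners, clockwise from top-left."""
    x1, y1, x2, y2 = anchor
    return [x1, y1, x2, y1, x2, y2, x1, y2]

def encode_quad(anchor, gt_quad):
    """8 regression targets: per-corner offsets, normalized by anchor width (x) or height (y).

    Assumes gt_quad lists its corners in the same order as anchor_to_corners.
    """
    x1, y1, x2, y2 = anchor
    w, h = x2 - x1, y2 - y1
    corners = anchor_to_corners(anchor)
    return [(g - a) / (w if i % 2 == 0 else h)
            for i, (g, a) in enumerate(zip(gt_quad, corners))]

def decode_quad(anchor, deltas):
    """Inverse of encode_quad: turn 8 predicted offsets back into corner coordinates."""
    x1, y1, x2, y2 = anchor
    w, h = x2 - x1, y2 - y1
    corners = anchor_to_corners(anchor)
    return [a + d * (w if i % 2 == 0 else h)
            for i, (a, d) in enumerate(zip(corners, deltas))]

# Hypothetical anchor and ground-truth quadrilateral.
anchor = (0.0, 0.0, 10.0, 20.0)
gt_quad = [1.0, -2.0, 12.0, 1.0, 9.0, 22.0, -1.0, 19.0]

targets = encode_quad(anchor, gt_quad)
recovered = decode_quad(anchor, targets)
print(targets)
print(recovered)
```

Because the anchors stay axis-aligned, a strongly rotated object needs large corner offsets, which is one reason to compare against methods that use oriented proposals.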