Localization is an essential part of object detection and is usually accomplished by bounding box regression guided by ℓn-norm-based or IoU-based loss functions, where IoU is known for its scale invariance. However, introducing scale invariance into the regression loss, as traditional IoU-based methods do, may bias training in favor of smaller boxes and cause redundancy and unstable oscillations. To address these shortcomings of IoU-based losses, we propose a Scale-Balanced Factor (SF) that stabilizes the regression process via a simple adaptive factor. Furthermore, to compensate for the imbalance among different types of losses caused by SF and other IoU-based loss functions, regression losses are always multiplied by a hyperparameter, which is chosen purely empirically and is hard to tune to an optimum. To address this issue, we propose a Multi-Task Reinforced Equilibrium (MRE) that dynamically adjusts the learning rate of each task based on reinforcement learning. MRE can guarantee more balanced parameters and maximize the benefit of SF or other improvements to IoU. By incorporating the proposed SF and MRE into classic detectors (RetinaNet, YOLO, Faster R-CNN, etc.), we achieve significant performance gains on MS COCO (0.8 to 1.9 AP) and PASCAL VOC (0.6 to 2.2 AP).
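
The scale invariance attributed to IoU above can be illustrated with a minimal sketch. It is not the proposed SF or MRE, whose definitions are not given in this abstract; the helper names `iou`, `l1_distance`, and `scale_box`, and the example boxes, are assumptions chosen only for illustration. Scaling both boxes by the same factor leaves IoU unchanged, while an ℓ1 distance between the same boxes grows with the scale.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes have no overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def l1_distance(a, b):
    """Sum of absolute coordinate differences (an l1-norm-style error)."""
    return sum(abs(p - q) for p, q in zip(a, b))


def scale_box(box, k):
    """Multiply every coordinate of a box by k (hypothetical helper)."""
    return tuple(k * v for v in box)


# Illustrative boxes, not taken from the paper.
pred = (0.0, 0.0, 2.0, 2.0)
target = (1.0, 1.0, 3.0, 3.0)

for k in (1.0, 10.0):
    p, t = scale_box(pred, k), scale_box(target, k)
    print(f"scale={k:>4}: IoU={iou(p, t):.4f}  L1={l1_distance(p, t):.1f}")
```

In this example the IoU value is the same at both scales while the ℓ1 error is ten times larger for the larger boxes, which is the property that distinguishes the two families of regression losses mentioned in the abstract.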