Because person attributes carry high-level semantic cues (what) and spatial cues (where), some recent works introduce them into person re-identification. However, jointly learning attributes and identity by directly combining their loss functions does not work well, because the two tasks differ significantly. To address this problem, we propose an Attribute-Identity Feature Fusion Network (AFFNet) for person re-ID, which fuses the attribute and identity recognition tasks not only at the loss level but also at the feature level. Specifically, to learn distinct features for attributes and identity, we split the network into two branches, avoiding interference between the two tasks; the two types of features are then concatenated to form the final representation. In the attribute branch, we combine hierarchical features and apply a Feature Attention Block (FAB) to mine high-level semantic and spatial information, respectively. Experimental results on two public datasets show that the proposed method performs favorably against state-of-the-art methods.
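The feature-level fusion described above can be illustrated with a minimal sketch. This is not the authors' implementation: the two CNN branches are replaced by simple linear projections, and all dimensions and names (`W_id`, `W_attr`, 128-d backbone features, 64-d branch outputs) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def branch(x, W):
    """One branch: a linear projection with ReLU (stand-in for a CNN branch)."""
    return np.maximum(x @ W, 0.0)

# Hypothetical dimensions: 128-d backbone features, 64-d per-branch outputs.
x = rng.standard_normal((4, 128))        # mini-batch of shared backbone features
W_id = rng.standard_normal((128, 64))    # identity-branch weights
W_attr = rng.standard_normal((128, 64))  # attribute-branch weights

f_id = branch(x, W_id)       # identity features (trained with an ID loss)
f_attr = branch(x, W_attr)   # attribute features (trained with an attribute loss)

# Feature-level fusion: concatenate the two branch outputs into the final
# representation used for re-ID matching.
f_final = np.concatenate([f_id, f_attr], axis=1)
print(f_final.shape)  # (4, 128)
```

Keeping the two branches separate lets each head specialize before fusion, which is the stated motivation for splitting the tasks rather than sharing one feature vector.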