TY - JOUR
T1 - JÂA-Net
T2 - joint facial action unit detection and face alignment via adaptive attention
AU - Shao, Zhiwen
AU - Liu, Zhilei
AU - Cai, Jianfei
AU - Ma, Lizhuang
PY - 2021
Y1 - 2021
AB - Facial action unit (AU) detection and face alignment are two highly correlated tasks, since facial landmarks can provide precise AU locations to facilitate the extraction of meaningful local features for AU detection. However, most existing AU detection works handle the two tasks independently by treating face alignment as a preprocessing step, and often use landmarks to predefine a fixed region or attention for each AU. In this paper, we propose a novel end-to-end deep learning framework for joint AU detection and face alignment, which has not been explored before. In particular, multi-scale shared features are learned first, and high-level face alignment features are fed into AU detection. Moreover, to extract precise local features, we propose an adaptive attention learning module to refine the attention map of each AU adaptively. Finally, the assembled local features are integrated with the face alignment features and global features for AU detection. Extensive experiments demonstrate that our framework (i) significantly outperforms the state-of-the-art AU detection methods on the challenging BP4D, DISFA, GFT and BP4D+ benchmarks, (ii) can adaptively capture the irregular region of each AU, (iii) achieves competitive performance for face alignment, and (iv) also works well under partial occlusions and non-frontal poses. The code for our method is available at https://github.com/ZhiwenShao/PyTorch-JAANet.
KW - Adaptive attention learning
KW - Face alignment
KW - Facial AU detection
KW - Joint learning
UR - http://www.scopus.com/inward/record.url?scp=85090757188&partnerID=8YFLogxK
U2 - 10.1007/s11263-020-01378-z
DO - 10.1007/s11263-020-01378-z
M3 - Article
AN - SCOPUS:85090757188
SN - 0920-5691
VL - 129
SP - 321
EP - 340
JO - International Journal of Computer Vision
JF - International Journal of Computer Vision
ER -