Thesis information

Thesis title (Chinese):

 Masked Face Recognition Method Combining ImArcFace and Knowledge Distillation

Name:

 王蓓 (Wang Bei)

Student ID:

 20207040023    

Confidentiality level:

 Public

Thesis language:

 chi    

Discipline code:

 0810    

Discipline name:

 Engineering - Information and Communication Engineering

Student type:

 Master's

Degree level:

 Master of Engineering

Degree year:

 2023    

Institution:

 Xi'an University of Science and Technology

School/Department:

 College of Communication and Information Engineering

Major:

 Information and Communication Engineering

Research direction:

 Digital Image Processing

First supervisor name:

 朱周华 (Zhu Zhouhua)

First supervisor affiliation:

 Xi'an University of Science and Technology

Thesis submission date:

 2023-06-15    

Thesis defense date:

 2023-06-06    

Thesis title (foreign language):

 Face Recognition Method Combining ImArcFace and Knowledge Distillation    

Keywords (Chinese):

 ArcFace ; Attention mechanism ; Knowledge distillation ; Masked face recognition

Keywords (foreign language):

 ArcFace ; Attention mechanism ; Knowledge distillation ; Mask face recognition    

Abstract (Chinese):

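Face recognition is a fundamental and important task in machine vision and artificial intelligence. With the development of deep learning, face recognition technology has largely matured. In unconstrained environments, however, factors such as illumination, occlusion, and pose variation can still affect recognition results. In particular, during periods of high incidence of respiratory diseases such as COVID-19, influenza A, and norovirus, people commonly wear masks to prevent the spread of viruses. Masks cover a large part of the face, which causes the performance of face recognition algorithms to drop sharply. To address this problem, this thesis improves an existing algorithm by combining an attention mechanism with knowledge distillation. The main work is as follows:

(1) To address mask occlusion, the ArcFace algorithm is improved (ImArcFace) and applied to the recognition task. The method uses ArcFace as the base framework and selects IResNet, which surpasses the original network in both accuracy and training convergence, as the feature extraction network. To make better use of the eyebrow and eye region that is not covered by the mask, a CBAM module is introduced for adaptive feature refinement, and an eyebrow-eye attention module is designed that aggregates features across channels and applies a dedicated activation to the eyebrow-eye region. This increases the influence of that region on the model's decision and reduces the effect of spurious features introduced by the mask on the model's discriminative ability.

(2) To address the large parameter count and heavy computation of the model, ImArcFace is improved with knowledge distillation (ImArcFace-KD) and applied to the recognition task. The method uses a large 100-layer deep residual network as the teacher network and a 50-layer IResNet as the student network. Unmasked faces are fed to the teacher network, while the student network receives a random mix of masked and unmasked faces, and the two networks are trained interactively. A grouped distillation strategy is adopted: the final distillation loss keeps the primary group and the binary group and discards the secondary group, which carries little knowledge and harms the distillation effect. This lowers the difficulty of distillation while constraining the feature mappings produced by the student and teacher networks to be more similar.

(3) A series of experiments was designed and carried out to compare the proposed algorithm with state-of-the-art face recognition algorithms on masked face recognition. The results show that, on the LFW dataset under two scenarios, the proposed algorithm improves the metrics by between 1% and 8%. In the mask vs mask scenario, ACC improves by 1.08%, TAR(0.01) by 2.36%, and TAR(0.001) by 5.37%. In the mask vs nomask scenario, ACC improves by 1.9%, TAR(0.01) by 3.87%, and TAR(0.001) by 8.77%. With accuracy kept roughly unchanged, knowledge distillation speeds up single-image inference by an average of 1.55 ms compared with the original network model. In the visualization experiments, the proposed algorithm also shows the best basis for its recognition decisions. Finally, a real-time masked face recognition visual interface was built on the proposed algorithm, and tests show that the algorithm also performs well in real-time recognition.

In summary, the proposed method is notably effective in masked face recognition scenarios and improves the performance of masked face recognition models.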
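
For readers unfamiliar with the base framework named above, the additive angular margin that ArcFace applies to the target-class logit can be sketched as follows. This is a minimal pure-Python illustration of the published ArcFace idea, not the thesis's ImArcFace code. The function names, the default margin of 0.5 and scale of 64, and the clamp of the shifted angle at pi are assumptions chosen for this example.

```python
import math


def arcface_logits(cosines, target, margin=0.5, scale=64.0):
    """Scaled logits with an additive angular margin on the target class.

    cosines: cosine similarities between one L2-normalized embedding and
             each L2-normalized class centre (values in [-1, 1]).
    target:  index of the ground-truth class.
    """
    logits = []
    for j, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # guard acos against rounding error
        if j == target:
            theta = math.acos(c)
            # Assumed simplification: clamp theta + margin at pi so the
            # penalized cosine never starts increasing again.
            c = math.cos(min(theta + margin, math.pi))
        logits.append(scale * c)
    return logits


def softmax_cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for a single sample."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]
```

Only the target-class logit is changed. Because the margin lowers that logit, the loss for the same embedding is larger than with a plain scaled softmax, which pushes embeddings of the same identity closer together on the hypersphere.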

Abstract (foreign language):

Face recognition is a fundamental and important task in machine vision and artificial intelligence. With the development of deep learning, face recognition technology has matured. However, in unconstrained environments, factors such as lighting, occlusion, and pose changes can affect recognition results. In particular, during periods of high incidence of respiratory diseases such as COVID-19, influenza A, and norovirus, people commonly wear masks to prevent the spread of viruses. Masks obscure a large area of the face, leading to a sharp decline in the performance of face recognition algorithms. To address this problem, this thesis combines an attention mechanism with knowledge distillation to improve an existing algorithm. The main research work is as follows:

(1) The ArcFace algorithm is improved (ImArcFace) to handle mask occlusion in the recognition task. The method uses ArcFace as the base framework and selects IResNet, which outperforms the original network in accuracy and learning convergence, as the feature extraction network. To make better use of the features of the eyebrow and eye regions that are not obscured by the mask, a CBAM module is introduced for adaptive feature refinement, and an eyebrow and eye attention module is designed for channel aggregation of features and special activation of the eyebrow and eye regions. This increases the influence of the eyebrow and eye regions on the model's decision and reduces the influence of false features introduced by the mask on the model's discriminative ability.

(2) The ImArcFace algorithm is improved with knowledge distillation (ImArcFace-KD) to address the large number of model parameters and the heavy computation. The method uses a large deep residual network with 100 layers as the teacher network and an IResNet network with 50 layers as the student network. Unmasked faces are fed into the teacher network, while the student network is randomly fed masked and unmasked faces. The two networks are trained interactively, and a grouped distillation strategy is used: the final distillation loss retains the primary group and the binary group and discards the secondary group, which contains less knowledge and harms the distillation effect. This reduces the distillation difficulty while constraining the student network and the teacher network to produce more similar feature mappings.

(3) A series of experiments was designed and conducted to compare the proposed algorithm with advanced face recognition algorithms on masked face recognition. The experimental results show that the proposed algorithm improves the metrics on the LFW dataset in two scenarios by between 1% and 8%. In the mask vs mask scenario, ACC improves by 1.08%, TAR(0.01) by 2.36%, and TAR(0.001) by 5.37%. In the mask vs nomask scenario, ACC improves by 1.9%, TAR(0.01) by 3.87%, and TAR(0.001) by 8.77%. With accuracy kept roughly unchanged, knowledge distillation speeds up single-image inference by an average of 1.55 ms compared with the original network model. In the visualization experiments, the algorithm also achieves the best basis for its recognition decisions. Finally, a real-time masked face recognition visualization interface was designed based on the proposed algorithm, and the test results show that the algorithm also performs well in real-time recognition.

In summary, the method proposed in this thesis is notably effective in the masked face recognition scenario and improves the performance of the masked face recognition model.
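
The abstract does not define TAR(0.01) and TAR(0.001). A common reading, assumed here, is the true accept rate at a false accept rate of at most 0.01 or 0.001. Under that assumption, the metric could be computed from pairwise similarity scores roughly as in the sketch below. The function name, the argument names, and the strict "greater than" accept rule are hypothetical choices for this illustration, not the thesis's evaluation code.

```python
import math


def tar_at_far(genuine_scores, impostor_scores, far):
    """True accept rate at the loosest threshold whose FAR does not exceed `far`.

    genuine_scores:  similarity scores of same-identity pairs.
    impostor_scores: similarity scores of different-identity pairs.
    far:             target false accept rate, e.g. 0.01 or 0.001.
    A pair is accepted when its score is strictly above the threshold.
    """
    impostors = sorted(impostor_scores, reverse=True)
    # Number of impostor pairs that may be (wrongly) accepted.
    # The small epsilon guards against float products like 28.999999.
    allowed = math.floor(far * len(impostors) + 1e-9)
    if allowed < len(impostors):
        # Using the (allowed + 1)-th highest impostor score as the threshold
        # lets at most `allowed` impostor pairs pass the strict comparison.
        threshold = impostors[allowed]
    else:
        threshold = float("-inf")
    accepted = sum(1 for s in genuine_scores if s > threshold)
    return accepted / len(genuine_scores)
```

A stricter FAR target raises the threshold, so TAR at 0.001 can never exceed TAR at 0.01 on the same score set. That is consistent with the larger TAR(0.001) gains reported above being measured in the harder operating regime.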

Chinese Library Classification (CLC) number:

 TP391    

Release date:

 2023-06-15    
