Thesis title (Chinese): |
Pedestrian re-identification based on deep learning
|
Name: |
李蕊心
|
Student ID: |
18208051006
|
Confidentiality level: |
Public
|
Thesis language: |
chi
|
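Discipline code: |
081202
|
Discipline name: |
Engineering - Computer Science and Technology (degrees in engineering or science may be conferred) - Computer Software and Theory
|
Student type: |
Master's
|
Degree level: |
Master of Engineering
|
Degree year: |
2021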
|
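Degree-granting institution: |
Xi'an University of Science and Technology
|
School: |
College of Computer Science and Technology
|
Major: |
Computer Software and Theory
|
Research direction: |
Artificial Intelligence and Information Processing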
|
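First supervisor name: |
厍向阳
|
First supervisor affiliation: |
Xi'an University of Science and Technology
|
Thesis submission date: |
2021-06-21
|
Thesis defense date: |
2021-06-03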
|
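Thesis title (foreign language): |
Pedestrian re-identification based on deep learning
|
Keywords (Chinese): |
pedestrian re-identification ; random erasing ; residual network ; attention mechanism ; deep learning
|
Keywords (foreign language): |
pedestrian re-identification ; random erasing ; residual network ; attention mechanism ; deep learning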
|
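Abstract (Chinese): |
︿
In recent years, pedestrian re-identification has drawn wide attention as an active topic in computer vision research. It still faces problems such as changes in camera viewpoint, changes in illumination, changes in pedestrian pose, and partial occlusion of pedestrian images. How to extract more discriminative pedestrian information remains an open problem for pedestrian re-identification. To address these problems, this thesis studies pedestrian re-identification from the perspective of models and algorithms using deep learning methods, and proposes the following models.
Traditional pedestrian re-identification methods rely on hand-crafted visual features, are easily affected by external factors, and achieve low recognition accuracy. Deep learning models can extract features automatically, but gradients vanish as the network grows deeper. Residual networks alleviate the vanishing gradient problem, but the feature information they extract is not used effectively. Partial occlusion of pedestrian images is another important factor that affects re-identification accuracy. To address these problems, this thesis proposes a pedestrian re-identification algorithm that combines random erasing with a residual attention network. The algorithm: ① introduces an attention module on top of the residual network, improving the discriminative ability of the network by strengthening useful features and suppressing less useful ones; ② introduces the random erasing data augmentation method to reduce overfitting, improve the generalization ability of the network, and handle occlusion in pedestrian re-identification; ③ uses triplet loss to supervise the training of the fused network so that samples cluster better in the feature space, improving re-identification accuracy. Experiments verify the recognition accuracy of the algorithm on the Market-1501 and DukeMTMC-reID datasets.
The pedestrian features extracted by traditional pedestrian re-identification methods have weak discriminative ability, which makes it difficult for the model to achieve better recognition results. In the metric feature space, recognition accuracy is low because the distance between samples of the same class is too small. To address these problems, this thesis proposes a multi-loss pedestrian re-identification algorithm based on the attention mechanism. The algorithm: ① uses a Siamese attention network to extract features, improving the discriminative ability of the network by emphasizing useful channel features and suppressing less useful ones; ② introduces center loss and a multi-loss training method so that samples of the same class gather more tightly in the feature space while samples of different classes move farther apart. Experiments verify the recognition accuracy of the algorithm on the Market-1501 and DukeMTMC-reID datasets.
﹀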
|
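The first algorithm in the abstract combines random erasing with triplet loss supervision. The sketch below illustrates both ideas in plain Python under stated assumptions: the function names, the default probability `p`, the area and aspect-ratio ranges, and the `margin` value are illustrative choices, not settings reported in this record, and images are modeled as nested lists rather than tensors.

```python
import math
import random


def random_erase(image, p=0.5, area_range=(0.02, 0.4),
                 aspect_range=(0.3, 3.3), rng=None):
    """Return a copy of `image` (a list of rows of pixel values) in which,
    with probability `p`, one random rectangle is overwritten by random
    values in 0..255. All default values here are assumptions."""
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # never modify the caller's image
    if rng.random() >= p:
        return out  # this sample is left unchanged
    for _ in range(100):  # retry until a rectangle fits inside the image
        area = rng.uniform(*area_range) * height * width
        aspect = rng.uniform(*aspect_range)
        h = int(round(math.sqrt(area * aspect)))
        w = int(round(math.sqrt(area / aspect)))
        if 0 < h < height and 0 < w < width:
            top = rng.randint(0, height - h)
            left = rng.randint(0, width - w)
            for y in range(top, top + h):
                for x in range(left, left + w):
                    out[y][x] = rng.randint(0, 255)
            return out
    return out  # no rectangle fit; return the unmodified copy


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge-style triplet loss on Euclidean distances: the anchor should be
    closer to the positive than to the negative by at least `margin`."""
    d_ap = math.dist(anchor, positive)
    d_an = math.dist(anchor, negative)
    return max(0.0, d_ap - d_an + margin)
```

Erasing a random rectangle during training imitates partial occlusion, which is why the abstract pairs it with the occlusion problem. The triplet loss is zero once the negative sample is farther from the anchor than the positive by the margin, so only violating triplets contribute to training.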
Abstract (foreign language): |
︿
In recent years, pedestrian re-identification has attracted wide attention from researchers as an active topic in computer vision. Pedestrian re-identification still faces problems such as changes in camera viewpoint, changes in lighting, changes in pedestrian posture, and partial occlusion of pedestrian images. How to extract more discriminative pedestrian information is still an open problem for pedestrian re-identification. To solve the above problems, this thesis studies pedestrian re-identification in terms of models and algorithms based on deep learning methods, and proposes the following models.
Traditional pedestrian re-identification methods rely on hand-crafted visual features, which are easily affected by external factors and yield low recognition accuracy. Deep learning models can extract features automatically, but gradients vanish as the number of network layers increases. The residual network can alleviate the vanishing gradient problem, but the extracted feature information is not used effectively. Partial occlusion of pedestrian images is another important factor that affects the accuracy of pedestrian re-identification. To address the above problems, this thesis proposes a pedestrian re-identification algorithm combining random erasing and a residual attention network. First, an attention module is introduced on top of the residual network, and the discriminative ability of the network is improved by strengthening useful features and suppressing less useful ones. Second, the random erasing data augmentation method is introduced to reduce overfitting, improve the generalization ability of the network, and handle the occlusion problem in pedestrian re-identification. Third, triplet loss is used to supervise the training of the fused network so that samples cluster better in the feature space, improving the accuracy of pedestrian re-identification. The recognition accuracy of the algorithm on the Market-1501 and DukeMTMC-reID datasets is verified through experiments.
The pedestrian features extracted by traditional pedestrian re-identification methods have weak discriminative ability, which makes it difficult for the model to achieve better recognition results. In the metric feature space, recognition accuracy is low because of the small distance between samples of the same class. Initial matching failure is another important factor that leads to low recognition accuracy. To solve the above problems, this thesis proposes a multi-loss pedestrian re-identification algorithm based on the attention mechanism. First, a Siamese attention network is used to extract features, and the discriminative ability of the network is improved by emphasizing useful channel features and suppressing less useful ones. Second, center loss is introduced and a multi-loss training method is used so that samples of the same class gather more tightly in the feature space while samples of different classes move farther apart. The recognition accuracy of the algorithm on the Market-1501 and DukeMTMC-reID datasets is verified through experiments.
﹀
|
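The second algorithm in the abstract relies on channel attention and a center loss combined with other losses. The abstract does not specify the attention module or the loss weights, so the sketch below assumes a squeeze-and-excitation style gate and a simple weighted sum; the function names, the weight matrices `w1` and `w2`, and the weighting scheme are all hypothetical and chosen only for illustration.

```python
import math


def channel_attention(feature_map, w1, w2):
    """Squeeze-and-excitation style channel gate (an assumed design).
    feature_map: list of C channels, each a list of rows of floats.
    w1: hidden x C weight matrix, w2: C x hidden weight matrix."""
    # Squeeze: global average pooling gives one value per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_map]
    # Excitation: linear layer, ReLU, linear layer, sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Rescale: useful channels keep a gate near 1, others are suppressed.
    scaled = [[[g * v for v in row] for row in ch]
              for g, ch in zip(gates, feature_map)]
    return scaled, gates


def center_loss(features, labels, centers):
    """Half the mean squared distance between each feature vector and the
    center of its class; `centers` is indexed by class label."""
    total = 0.0
    for feature, label in zip(features, labels):
        total += sum((a - b) ** 2 for a, b in zip(feature, centers[label]))
    return 0.5 * total / len(features)


def weighted_sum(losses, weights):
    """Combine several loss values into one training objective."""
    return sum(w * l for w, l in zip(weights, losses))
```

Minimizing the center loss pulls features of the same identity toward a shared center, which matches the stated goal of tighter same-class clusters; pairing it with a loss that separates identities is what the abstract calls multi-loss training.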
Chinese Library Classification number: |
TP391.41
|
Open access date: |
2021-06-21
|