Thesis Information

Title (Chinese):

 基于孪生网络的目标跟踪算法研究 (Research on Object Tracking Algorithm Based on Siamese Network)

Author:

 李娇 (Li Jiao)

Student ID:

 19207040028

Confidentiality:

 Public

Language:

 Chinese

Discipline Code:

 0810

Discipline:

 Engineering - Information and Communication Engineering

Student Type:

 Master's

Degree:

 Master of Engineering

Degree Year:

 2022

Institution:

 西安科技大学 (Xi'an University of Science and Technology)

School:

 School of Communication and Information Engineering

Major:

 Information and Communication Engineering

Research Area:

 Image Processing

Primary Supervisor:

 侯颖 (Hou Ying)

Supervisor's Institution:

 西安科技大学 (Xi'an University of Science and Technology)

Submission Date:

 2022-06-23

Defense Date:

 2022-06-10

Title (English):

 Research on Object Tracking Algorithm Based on Siamese Network

Keywords (Chinese):

 计算机视觉 ; 目标跟踪 ; 孪生网络 ; 特征匹配

Keywords (English):

 Computer Vision ; Object Tracking ; Siamese Network ; Feature Matching

Abstract (Chinese, translated):

      As one of the key technologies in computer vision, object tracking has important practical value in video surveillance, human-computer interaction, autonomous driving, and other fields. Its main task is, given the position and size of a target in the first frame of a video, to predict the target's motion state in subsequent frames from the contextual information of the video sequence, yielding the target's complete motion trajectory. In recent years, object tracking algorithms based on Siamese networks have attracted wide attention for their fast tracking speed, high precision, and end-to-end offline-trained models, and are currently among the mainstream tracking algorithms.

       Because of the complexity of tracking environments and the randomness of target motion, Siamese-network trackers are prone to drift when the target is occluded or leaves the field of view. To address this problem, this thesis proposes SiamFM, a Siamese-network object tracking algorithm based on feature matching. The algorithm uses the maximum peak response and the average peak-to-correlation energy to judge tracking confidence. When the confidence is high, the current frame is considered tracked accurately and the tracking result is output; otherwise, for low-confidence frames, a target feature-matching strategy yields a coarsely localized matched centroid, and the SiamRPN tracker is then used for re-detection to obtain the target's precise position. SiamFM reaches a precision of 0.889 and a success rate of 0.673 on the OTB100 dataset, and an accuracy of 0.556 and an average overlap of 0.297 on the VOT2018 dataset, showing that SiamFM effectively improves tracking precision and mitigates tracking drift in complex scenes.
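The average peak-to-correlation energy mentioned above is a standard response-map confidence measure: a sharp, unimodal peak scores high, while a flat or noisy map (typical of occlusion) scores low. A minimal NumPy sketch, with the map sizes and example maps chosen purely for illustration:

```python
import numpy as np

def apce(response):
    """Average peak-to-correlation energy of a tracker response map.

    A high APCE together with a high maximum peak suggests a confident,
    unimodal response; low values hint at occlusion or tracking drift.
    """
    r_max = response.max()
    r_min = response.min()
    # Mean squared fluctuation of the map relative to its minimum.
    denom = np.mean((response - r_min) ** 2)
    return (r_max - r_min) ** 2 / denom

# A sharp single-peak map scores far higher than a noisy one.
peaked = np.zeros((17, 17))
peaked[8, 8] = 1.0                      # one clean peak -> APCE = 289
rng = np.random.default_rng(0)
noisy = rng.random((17, 17))            # no dominant peak -> APCE ~ 3
print(apce(peaked) > apce(noisy))       # True
```

In SiamFM's scheme, a frame whose response map falls below thresholds on both this score and the maximum peak value would be routed to the feature-matching re-detection path instead of being trusted directly.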

       To further increase the tracking speed of SiamFM, this thesis proposes FSiamFM_MV, a fast feature-matching Siamese-network tracking algorithm based on motion vectors. A fixed-grouping and motion-vector-parameter strategy classifies the video frames, and SiamFM performs the tracking, raising speed while preserving precision. Comparative experiments of FSiamFM_MV against SiamFM on the OTB100 dataset show that FSiamFM_MV is 25.2% faster, demonstrating the effectiveness of the improved strategy.

Abstract (English):

         As one of the key technologies of computer vision, object tracking has important practical value in video surveillance, human-computer interaction, autonomous driving, and other fields. The main task of object tracking is, given the position and size of the object in the first frame of a video, to predict the motion state of the object in subsequent frames from the contextual information of the video sequence, so as to obtain the object's complete motion trajectory. In recent years, object tracking algorithms based on Siamese networks have attracted wide attention for their fast tracking speed, high accuracy, and end-to-end offline-trained models, and are currently among the mainstream object tracking algorithms.

         Due to the complexity of the tracking environment and the randomness of the object's motion, Siamese-network trackers are prone to drift when the object is occluded or leaves the field of view. To solve this problem, this paper proposes a Siamese-network object tracking algorithm based on feature matching (SiamFM). The algorithm uses the maximum peak response and the average peak-to-correlation energy to judge tracking confidence. When the tracking confidence is high, the current frame is tracked accurately and the tracking result is output; otherwise, for frames with low tracking confidence, an object feature-matching strategy yields a coarsely localized matched centroid, and the SiamRPN tracker is then used for re-detection to obtain the precise position of the object. SiamFM reaches a precision of 0.889 and a success rate of 0.673 on the OTB100 dataset, and an accuracy of 0.556 and an average overlap of 0.297 on the VOT2018 dataset. The results show that SiamFM effectively improves tracking accuracy and mitigates tracking drift in complex scenes.

         To further improve the tracking speed of SiamFM, a fast feature-matching Siamese-network tracking algorithm based on motion vectors (FSiamFM_MV) is proposed in this paper. A fixed-grouping and motion-vector-parameter strategy classifies the video frames, and SiamFM is used to track the object, increasing tracking speed while preserving accuracy. Comparative experiments against SiamFM on the OTB100 dataset show that FSiamFM_MV is 25.2% faster, demonstrating the effectiveness of the improved strategy.
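The abstract does not spell out the exact fixed-grouping or motion-vector thresholds. The Python sketch below illustrates one plausible reading, in which only the first frame of each fixed group, plus frames whose motion-vector magnitude exceeds a threshold, are routed to the full (slower) confidence-checked SiamFM path; the function name, group size, and threshold are all hypothetical:

```python
def classify_frames(motion_mags, group_size=5, mv_thresh=2.0):
    """Hypothetical sketch of a fixed-grouping + motion-vector strategy.

    motion_mags: per-frame mean motion-vector magnitude (e.g. from the
    compressed video stream). Returns indices of frames that would run
    the full confidence-checked tracking path; the remaining frames
    would use the fast prediction only.
    """
    full_check = []
    for start in range(0, len(motion_mags), group_size):
        group = motion_mags[start:start + group_size]
        for i, mag in enumerate(group):
            # First frame of each group is always fully checked
            # (assumption); large motion also triggers the full path.
            if i == 0 or mag > mv_thresh:
                full_check.append(start + i)
    return full_check

mags = [0.5, 0.4, 3.1, 0.2, 0.3, 0.6, 0.1, 0.2, 4.0, 0.5]
print(classify_frames(mags))  # [0, 2, 5, 8]
```

Skipping the expensive confidence check and re-detection on low-motion frames is what would account for the reported 25.2% speedup while leaving accuracy largely intact.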


CLC Number:

 TP391.4    
