Thesis Information

Thesis title (Chinese): 基于深度学习的长时目标跟踪算法研究
Author: 李阳
Student ID: 19207040024
Confidentiality level: Public
Thesis language: Chinese (chi)
Discipline code: 081001
Discipline: Engineering - Information and Communication Engineering - Communication and Information Systems
Student type: Master's candidate
Degree: Master of Engineering
Degree year: 2022
Degree-granting institution: 西安科技大学 (Xi'an University of Science and Technology)
School: College of Communication and Information Engineering
Major: Information and Communication Engineering
Research direction: Image processing
First supervisor: 侯颖
First supervisor's institution: 西安科技大学 (Xi'an University of Science and Technology)
Submission date: 2022-06-22
Defense date: 2022-06-10
Thesis title (English): Research on Long-term Target Tracking Algorithm Based on Deep Learning
Keywords (Chinese): 深度学习; 孪生网络; 长时跟踪; 目标重检测; 模板匹配
Keywords (English): Deep learning; Siamese network; Long-term tracking; Object re-detection; Template matching

Abstract (Chinese, translated):

Object tracking plays an important role in fields such as intelligent surveillance, autonomous driving, and military guidance. In recent years, as the performance of deep-learning short-term tracking algorithms has kept improving, attention has turned to long-term tracking applications that are closer to real-world scenarios. Long-term tracking sequences are far longer than short-term ones, and problems such as target deformation, disappearance, and reappearance are especially prominent; applying short-term tracking algorithms directly cannot cope with these difficulties, and tracking performance drops sharply. This thesis therefore proposes the following two improved deep-learning long-term target tracking algorithms.

To address target disappearance and reappearance during long-term tracking, this thesis designs a Siamese-network long-term target tracking algorithm based on dynamic template matching (SiamDTM_LT). A confidence score is used to judge whether the target has been lost; if it is judged lost, a global-search re-detection mechanism based on dynamic template matching is triggered to obtain a coarse prediction of the target's location, after which the SiamFC++ tracker localizes the target precisely, thereby resolving target loss. To improve the accuracy of the coarse prediction during re-detection, an adaptive update strategy for the dynamic matching template is also proposed. Evaluated on five long-term datasets (VOT2018_LT, VOT2019_LT, UAV20L, TLP, and LaSOT), SiamDTM_LT not only improves tracking performance significantly, achieving a success rate of 0.556 on LaSOT, but also reaches a tracking speed of 45.5 FPS, satisfying the requirements of real-time target tracking.

To further improve tracking performance, this thesis designs a dual-model Siamese-network long-term target tracking algorithm based on dynamic template matching (DMSiamDTM_LT), with two main improvements: (1) a "local-global-local" tracking strategy, which improves tracking stability under challenges such as the target leaving the field of view or being partially occluded; (2) a dual-model SiamFC++ tracking strategy, which fully adapts to changes in target appearance and strengthens the algorithm's robustness to interference. Evaluated on the same five long-term datasets, DMSiamDTM_LT improves tracking performance markedly, achieving a success rate of 0.574 on LaSOT. Compared with other state-of-the-art trackers, it performs well in complex scenes involving target deformation, illumination change, and partial occlusion, and reaches a tracking speed of 40.7 FPS, satisfying the requirements of real-time target tracking.

Abstract (English):

Object tracking plays an important role in intelligent surveillance, autonomous driving, and military guidance. In recent years, with the continuous improvement of deep-learning short-term tracking algorithms, attention has shifted toward long-term tracking applications that are closer to real-world scenarios. Video sequences in long-term tracking are much longer than in short-term tracking, and the problems of target deformation, disappearance, and reappearance are particularly prominent. Directly applying a short-term tracking algorithm cannot handle these difficulties, and its tracking performance drops sharply. Therefore, this thesis proposes the following two improved deep-learning long-term target tracking algorithms.

Aiming at the problem of target disappearance and reappearance during long-term tracking, this thesis designs a Siamese-network long-term target tracking algorithm based on dynamic template matching (SiamDTM_LT). A confidence score is used to judge the tracking state; if the target is judged lost, a global-search re-detection mechanism based on dynamic template matching is started to obtain a coarse estimate of the target's position, and the SiamFC++ tracker is then used to locate the target precisely, thereby resolving the target-loss problem. To improve the accuracy of the coarse estimate during re-detection, an adaptive strategy for updating the dynamic matching template is also proposed. Tested on five long-term datasets (VOT2018_LT, VOT2019_LT, UAV20L, TLP, and LaSOT), SiamDTM_LT not only improves tracking performance significantly, reaching a success rate of 0.556 on LaSOT, but also runs at 45.5 FPS, meeting the requirements of real-time target tracking.
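The control flow just described can be summarized in a short sketch. The Python code below is illustrative only, not the thesis implementation: the tracker and re-detector objects, their method names (init, update, reinit, match, set_template), and the two thresholds are assumptions introduced for illustration.

```python
# Illustrative sketch of the SiamDTM_LT control flow; all interfaces and
# thresholds below are assumptions, not the thesis implementation.

CONF_LOST = 0.30      # assumed threshold: below this the target is declared lost
CONF_UPDATE = 0.85    # assumed threshold: above this the matching template is refreshed


def crop(frame, box):
    """Cut the target patch (x, y, w, h) out of an H x W x 3 image array."""
    x, y, w, h = (int(v) for v in box)
    return frame[y:y + h, x:x + w]


def track_sequence(frames, tracker, redetector, init_box):
    """frames: list of H x W x 3 image arrays.
    tracker: assumed SiamFC++-style object with init(frame, box),
             update(frame) -> (box, score) and reinit(frame, box).
    redetector: assumed dynamic template matcher with set_template(patch)
                and match(frame) -> coarse_box (global search over the frame).
    """
    tracker.init(frames[0], init_box)
    redetector.set_template(crop(frames[0], init_box))
    boxes = []
    for frame in frames[1:]:
        box, score = tracker.update(frame)           # local search around the last position
        if score < CONF_LOST:
            coarse_box = redetector.match(frame)     # global re-detection gives a coarse location
            tracker.reinit(frame, coarse_box)        # SiamFC++ refines around the coarse box
            box, score = tracker.update(frame)
        elif score > CONF_UPDATE:
            # adaptive dynamic-template update: only refresh on confident frames
            redetector.set_template(crop(frame, box))
        boxes.append(box)
    return boxes
```

The intent of updating the matching template only on high-confidence frames is to keep the coarse re-detection estimate accurate without letting tracking drift corrupt the template.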

To further improve tracking performance, this thesis designs a dual-model Siamese-network long-term target tracking algorithm based on dynamic template matching (DMSiamDTM_LT), with two main improvements: (1) a "local-global-local" tracking strategy, which improves tracking stability when the target leaves the field of view or is partially occluded; (2) a dual-model SiamFC++ tracking strategy, which adapts to changes in target appearance and improves the algorithm's robustness to interference. Tested on the same five long-term datasets (VOT2018_LT, VOT2019_LT, UAV20L, TLP, and LaSOT), DMSiamDTM_LT improves tracking performance markedly, reaching a success rate of 0.574 on LaSOT. Compared with other state-of-the-art trackers, DMSiamDTM_LT performs well in complex scenes involving target deformation, illumination change, and partial occlusion, and runs at 40.7 FPS, meeting the requirements of real-time target tracking.
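A similarly hedged sketch of the two DMSiamDTM_LT ideas follows. The abstract does not specify how the two SiamFC++ models are maintained; here one model is assumed to keep the first-frame template while the other is re-seeded after global re-detection, and the more confident per-frame prediction is used. All names and thresholds are illustrative assumptions.

```python
# Illustrative sketch of the "local-global-local" switching and the dual-model
# SiamFC++ idea in DMSiamDTM_LT; interfaces, thresholds, and the choice of the
# two models are assumptions, not the thesis implementation.

CONF_LOST = 0.30   # assumed: switch local -> global below this score
CONF_BACK = 0.60   # assumed: switch global -> local above this score


def dual_model_update(frame, model_fixed, model_adaptive):
    """Run both SiamFC++ models (update(frame) -> (box, score) assumed)
    and keep the more confident prediction for this frame."""
    box_f, score_f = model_fixed.update(frame)
    box_a, score_a = model_adaptive.update(frame)
    return (box_f, score_f) if score_f >= score_a else (box_a, score_a)


def track_sequence(frames, model_fixed, model_adaptive, redetector):
    """model_fixed keeps the first-frame template; model_adaptive is re-seeded
    after global re-detection (both choices are assumptions for illustration)."""
    mode = "local"
    boxes = []
    for frame in frames:
        if mode == "local":
            box, score = dual_model_update(frame, model_fixed, model_adaptive)
            if score < CONF_LOST:
                mode = "global"                       # target likely out of view or occluded
        else:
            coarse_box = redetector.match(frame)      # global template-matching search
            model_adaptive.reinit(frame, coarse_box)  # re-seed the adaptive model
            box, score = dual_model_update(frame, model_fixed, model_adaptive)
            if score > CONF_BACK:
                mode = "local"                        # target recovered: back to local tracking
        boxes.append(box)
    return boxes
```

Keeping one model anchored to the first frame while the other adapts is one plausible way to realize the stated goal of tolerating appearance change without losing the original target identity.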


CLC number: TP391.4
Open-access date: 2022-06-22
