Thesis Information

Chinese Title:

Research on Vehicle Tracking Algorithm Based on Deep Learning

Name:

Chen Bowen

Student ID:

19208208052

Confidentiality Level:

Public

Language:

Chinese (chi)

Discipline Code:

085212

Discipline:

Engineering - Engineering - Software Engineering

Student Type:

Master's

Degree Level:

Master of Engineering

Degree Year:

2022

Institution:

Xi'an University of Science and Technology

School:

College of Computer Science and Technology

Major:

Software Engineering

Research Direction:

Machine Learning and Computer Vision

First Supervisor:

Yang Xiaoqiang

First Supervisor's Institution:

Xi'an University of Science and Technology

Submission Date:

2022-06-22

Defense Date:

2022-06-06

English Title:

Research on Vehicle Tracking Algorithm Based on Deep Learning

Chinese Keywords:

object detection ; object tracking ; k-means clustering ; PANet ; YOLOv4 ; DeepSort

English Keywords:

object detection ; object tracking ; k-means ; PANet ; YOLOv4 ; DeepSort

Chinese Abstract:

As urbanization in China continues to accelerate, private car ownership keeps rising. Vehicles in motion generate large amounts of behavioral information about traffic management, route planning, and vehicle transport, which provides the basic data for efficient, intelligent service systems for traffic information, traffic management, and transport safety. Key systems such as traffic control decisions, real-time road conditions in navigation software, and automated vehicle control all depend on vehicle operation data collected at road nodes, so vehicle tracking plays an important role in intelligent transportation systems. With the application of deep learning to machine vision, computer-vision-based algorithms are widely used for object detection and tracking, and detecting and tracking vehicles efficiently and accurately has become a research hotspot in this field. In real road scenes, many vehicles appear far from the camera and occupy few pixels in the image or video, which lowers detection and tracking accuracy; a tracking algorithm must also strike the best balance between accuracy and speed. To address these problems, this thesis does the following work:

(1) This thesis proposes an object detector based on an improved YOLOv4 (You Only Look Once v4) algorithm. Bisecting k-means clustering replaces standard k-means to obtain more representative anchor boxes, making training more targeted and accelerating model convergence. In the YOLOv4 network, target features are propagated through bottom-up convolution, but repeated convolutions gradually lose the features and location information of small and occluded targets. To address this, the thesis improves the PAN in YOLOv4 by skip-concatenating the features output by the previous ResBlock of CSPDarkNet53 with the next layer, enlarging the feature detection scale and thereby preserving features from different levels in the high-level feature space. Compared by the combined metrics of mAP and recall, in cloudy, night, sunny, and rainy test scenes the improved algorithm raises mAP over YOLOv4 by 0.23%-2.07% (1.97% in the mixed scene) and raises recall by 1.88%-4.03% (2.24% in the mixed scene).
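The anchor-clustering step in (1) can be sketched as follows. This is a minimal illustration rather than the thesis's implementation: treating each box as a (width, height) pair, using 1 - IoU as the distance (with boxes aligned at a common corner), and updating cluster centers with the per-cluster median are common conventions in YOLO anchor clustering, assumed here.

```python
import numpy as np

def iou_wh(boxes, center):
    """IoU between (n, 2) width-height boxes and one (w, h) center,
    assuming all boxes share a common top-left corner."""
    inter = np.minimum(boxes[:, 0], center[0]) * np.minimum(boxes[:, 1], center[1])
    union = boxes[:, 0] * boxes[:, 1] + center[0] * center[1] - inter
    return inter / union

def two_means_split(boxes, rng, n_iter=20):
    """Split one cluster into two by running k=2 means under 1 - IoU."""
    centers = boxes[rng.choice(len(boxes), size=2, replace=False)].astype(float)
    labels = np.zeros(len(boxes), dtype=int)
    for _ in range(n_iter):
        dist = np.stack([1.0 - iou_wh(boxes, c) for c in centers])  # (2, n)
        labels = dist.argmin(axis=0)
        for j in (0, 1):
            if np.any(labels == j):
                centers[j] = np.median(boxes[labels == j], axis=0)
    return [boxes[labels == j] for j in (0, 1) if np.any(labels == j)]

def bisecting_anchors(boxes, k, seed=0):
    """Bisecting k-means: start from one cluster, repeatedly bisect the
    cluster with the largest total 1 - IoU cost until k clusters remain,
    then return the k anchor (w, h) centers sorted by area."""
    rng = np.random.default_rng(seed)
    clusters = [boxes]
    while len(clusters) < k:
        costs = [np.sum(1.0 - iou_wh(c, np.median(c, axis=0))) for c in clusters]
        worst = clusters.pop(int(np.argmax(costs)))
        clusters.extend(two_means_split(worst, rng))
    anchors = np.array([np.median(c, axis=0) for c in clusters])
    return anchors[np.argsort(anchors.prod(axis=1))]
```

Because bisecting k-means grows the clustering one split at a time instead of seeding all k centers at random, it is less sensitive to initialization, which is the property the thesis relies on to get more representative anchors.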

(2) Since DeepSort's deep appearance model is trained on a person re-identification dataset, this thesis modifies the DeepSort appearance feature extraction network for the actual characteristics of vehicles, and also modifies the network's input according to vehicle appearance. These two modifications make the model better suited to vehicle tracking.
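As a sketch of what adapting the appearance branch to vehicles might look like: DeepSort's published re-identification network consumes tall 128x64 pedestrian crops, so a vehicle variant would naturally use a wider crop. The 64x128 shape and the nearest-neighbor resize below are illustrative assumptions, not the thesis's exact settings; the gallery-based cosine cost is the standard DeepSort association metric.

```python
import numpy as np

def preprocess_crop(image, box, out_hw=(64, 128)):
    """Crop one detection and resize it (nearest neighbor) to a wide,
    vehicle-shaped input instead of DeepSort's tall 128x64 pedestrian
    crop. The 64x128 shape here is illustrative only."""
    x1, y1, x2, y2 = (int(v) for v in box)  # box is (x1, y1, x2, y2)
    crop = image[y1:y2, x1:x2]
    h, w = out_hw
    rows = np.arange(h) * crop.shape[0] // h  # nearest-neighbor row index
    cols = np.arange(w) * crop.shape[1] // w  # nearest-neighbor col index
    return crop[rows][:, cols]

def appearance_cost(track_galleries, det_features):
    """DeepSort-style appearance cost: each track keeps a gallery of its
    past embeddings; the cost to a detection is the smallest cosine
    distance between that detection and any embedding in the gallery."""
    def unit(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)
    dets = unit(np.asarray(det_features, dtype=float))
    rows = []
    for gallery in track_galleries:
        sim = unit(np.asarray(gallery, dtype=float)) @ dets.T  # (g, m)
        rows.append(1.0 - sim.max(axis=0))  # best gallery match per detection
    return np.stack(rows)  # (n_tracks, n_dets)
```

In a full pipeline, the crops would be fed to the retrained feature network and its embeddings stored in each track's gallery before computing the cost matrix.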

(3) During real-time tracking, the DeepSort algorithm matches the target vehicle's appearance features by nearest neighbor; after tracking and cascade matching, every frame requires feature extraction, storage, and comparison, which costs substantial time and lowers tracking speed. The Sort algorithm, using only a simple Kalman filter and Hungarian matching, performs well at high frame rates but ignores the appearance of detected objects, so occlusion between objects hurts its accuracy. Based on an analysis of vehicle occlusion in the dataset, this thesis mixes the DeepSort algorithm with a certain proportion of the Sort algorithm for vehicle tracking. Experiments confirm that the proposed method improves FPS with almost no loss of accuracy. Finally, compared with other vehicle tracking algorithms, the improved DeepSort tracker with the improved YOLOv4 as its detector improves tracking accuracy, tracking precision, and detection speed, achieving the best balance between accuracy and speed.
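The mixing scheme can be sketched as a per-frame switch: run the expensive appearance-based (DeepSort-style) association only on a fraction of frames, and fall back to cheap IoU-only (Sort-style) matching on the rest. The one-in-three schedule and the greedy assignment (a simple stand-in for the Hungarian algorithm) below are assumptions for illustration, not the thesis's exact design.

```python
import numpy as np

def iou_matrix(a, b):
    """Pairwise IoU between (n, 4) and (m, 4) boxes in (x1, y1, x2, y2) form."""
    tl = np.maximum(a[:, None, :2], b[None, :, :2])
    br = np.minimum(a[:, None, 2:], b[None, :, 2:])
    inter = np.prod(np.clip(br - tl, 0, None), axis=2)
    area_a = np.prod(a[:, 2:] - a[:, :2], axis=1)
    area_b = np.prod(b[:, 2:] - b[:, :2], axis=1)
    return inter / (area_a[:, None] + area_b[None, :] - inter)

def greedy_match(cost, max_cost):
    """Greedy assignment on a cost matrix (a cheap stand-in for Hungarian
    matching); returns (track_idx, det_idx) pairs below the cost gate."""
    cost = cost.astype(float).copy()
    pairs = []
    while cost.size and cost.min() <= max_cost:
        t, d = np.unravel_index(cost.argmin(), cost.shape)
        pairs.append((int(t), int(d)))
        cost[t, :] = np.inf  # each track and detection used at most once
        cost[:, d] = np.inf
    return pairs

class HybridMatcher:
    """Use appearance (DeepSort-style) association only every `ratio`-th
    frame; use IoU-only (Sort-style) matching on the other frames."""
    def __init__(self, ratio=3):
        self.ratio = ratio
        self.frame = 0

    def match(self, track_boxes, det_boxes, appearance_cost=None):
        self.frame += 1
        if appearance_cost is not None and self.frame % self.ratio == 0:
            return greedy_match(appearance_cost, max_cost=0.5)
        return greedy_match(1.0 - iou_matrix(track_boxes, det_boxes), max_cost=0.7)
```

Skipping the feature-extraction and gallery-comparison work on most frames is what buys the FPS gain, while the periodic appearance frames keep identities stable through occlusions.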

English Abstract:

As urbanization continues to accelerate and private car ownership keeps rising, moving vehicles generate large amounts of behavioral information about traffic management, route planning, and vehicle transport, which provides the basic data for efficient, intelligent service systems for traffic information, traffic management, and transport safety. Key systems such as traffic control decisions, real-time road conditions in navigation software, and automated vehicle control all depend on vehicle operation data collected at road nodes. Vehicle tracking therefore plays an important role in intelligent transportation systems. With the application of deep learning to machine vision, computer-vision-based algorithms are widely used for object detection and tracking, and detecting and tracking vehicles efficiently and accurately has become a hot research topic in this field. In real road scenes, vehicles far from the camera are often numerous and occupy few pixels in the image or video, which lowers detection and tracking accuracy; a tracking algorithm must also strike the best balance between accuracy and speed. To address these problems, this paper does the following specific work.

(1) This paper proposes an object detector based on an improved YOLOv4 (You Only Look Once v4) algorithm, which replaces standard k-means with bisecting k-means clustering to obtain more representative anchor boxes, making training more targeted and accelerating model convergence. In the YOLOv4 network, target features are propagated through bottom-up convolution, but repeated convolutions gradually lose the features and location information of small and occluded targets. To address this, the paper improves the PAN in YOLOv4 by skip-concatenating the features output by the previous ResBlock of CSPDarkNet53 with the next layer, enlarging the feature detection scale and preserving features from different levels in the high-level feature space. Measured by mAP and recall, in cloudy, night, sunny, and rainy test scenes the proposed algorithm improves mAP over YOLOv4 by 0.23%-2.07% (1.97% in the mixed scene) and recall by 1.88%-4.03% (2.24% in the mixed scene).

(2) Since DeepSort's deep appearance model is trained on a person re-identification dataset, this paper modifies the DeepSort appearance feature extraction network for the actual characteristics of vehicles, and also modifies the network's input according to vehicle appearance. These two modifications make the model better suited to vehicle tracking.

(3) During real-time tracking, the DeepSort algorithm matches the target vehicle's appearance features by nearest neighbor; after tracking and cascade matching, every frame requires feature extraction, storage, and comparison, which costs substantial time and lowers tracking speed. The Sort algorithm, using only a simple Kalman filter and Hungarian matching, performs well at high frame rates but ignores the appearance of detected objects, so occlusion between objects hurts its accuracy. Based on an analysis of vehicle occlusion in the dataset, this paper proposes mixing the DeepSort algorithm with a certain proportion of the Sort algorithm for vehicle tracking. Experiments confirm that the proposed method improves FPS with almost no loss of accuracy. Finally, compared with other vehicle tracking algorithms, the improved DeepSort tracker with the improved YOLOv4 as its detector improves tracking accuracy, tracking precision, and detection speed, achieving the best balance between accuracy and speed.


CLC Number:

TP391.41

Open Access Date:

2022-06-22
