Thesis Information

Chinese title: Research on behavior recognition algorithms for mine personnel based on video surveillance

Name: 李珺瑜 (Li Junyu)

Student ID: 21307223004

Confidentiality level: Public

Thesis language: Chinese

Discipline code: 085400

Discipline name: Engineering - Electronic Information

Student type: Master's

Degree level: Master of Engineering

Degree year: 2024

Degree-granting institution: Xi'an University of Science and Technology

School: School of Communication and Information Engineering

Major: Electronic Information

Research direction: Communication Engineering

First supervisor: 王树奇 (Wang Shuqi)

First supervisor's institution: Xi'an University of Science and Technology

Submission date: 2024-06-14

Defense date: 2024-05-21

English title: Research on behavior recognition algorithm for mine personnel based on video surveillance

Chinese keywords: Video surveillance; mine personnel behavior recognition; YOLOv7-Pose; ST-GCN

English keywords: Video surveillance; Mine personnel behavior recognition; YOLOv7-Pose; ST-GCN

Chinese abstract:

In underground mining operations, unsafe behavior by personnel can lead to safety accidents. Current behavior-recognition research suffers from insufficient feature extraction and heavy model computation, making it unsuitable for practical use in mines. Building on an analysis of the state of behavior recognition at home and abroad, this thesis studies video-surveillance-based algorithms for recognizing the behavior of underground personnel, which is of real significance for safeguarding mine production safety. The main research content is as follows:

(1) An improved algorithm based on YOLOv7-Pose is used for underground personnel pose estimation. To address the deployment constraints imposed by limited computing power in coal mines, the YOLOv7-Pose head network is restructured with the GSConv convolution module to compress the model; the UPSample upsampling module is replaced with the CARAFE module, reconstructing image detail while preserving computational efficiency; and the ACMix attention mechanism is embedded in the head feature layers to sharpen the model's perception of targets and improve detection accuracy. Experiments show that, compared with the original YOLOv7-Pose algorithm, the improved algorithm raises AP by 1.2%, recall by 1.5%, and precision by 1.5%, shrinks the model by 26.5%, and reduces per-frame processing time by 14 ms, achieving good detection performance.

(2) An improved algorithm based on ST-GCN is used for underground personnel behavior recognition. To address the insufficient feature extraction of the traditional ST-GCN model, the ST-GCN attention mechanism is optimized to incorporate joint-association information and improve training; joint-domain feature extraction is added to enrich the feature information; and a dual-branch feature-fusion network is designed to reduce model parameters while improving recognition accuracy. Experiments show that, compared with the original ST-GCN algorithm, the improved algorithm raises accuracy by 1.7% and reduces per-frame processing time by 10 ms, improving behavior-recognition performance.

(3) Based on the above algorithms and model optimization strategies, a mine personnel behavior recognition system is designed and developed. The system covers real-time video-stream viewing, alarm statistics management, device management, and user management, providing strong support for mine production safety. Testing shows that the system runs normally and detects well.

English abstract:

In underground mining operations, unsafe behavior of personnel can lead to safety accidents. At present, research on behavior recognition has problems such as insufficient feature extraction and large model computation, which make it unsuitable for practical application in mines. On the basis of analyzing the current state of behavior recognition both domestically and internationally, this thesis studies underground personnel behavior recognition algorithms based on video surveillance, which is of great significance for ensuring production safety in mines. The main research content is as follows:

(1) An improved algorithm based on YOLOv7-Pose is used to achieve underground personnel pose estimation. To address the limits that underground computing power places on model deployment, the YOLOv7-Pose head network structure is optimized with the GSConv convolution module to compress the model; the UPSample upsampling module is replaced with the CARAFE module, achieving reconstruction of image detail while maintaining computational efficiency; and the ACMix attention mechanism is embedded in the head feature layers to improve the model's perception of targets and enhance detection accuracy. Experimental results show that, compared with the original YOLOv7-Pose algorithm, the improved algorithm increases AP by 1.2%, recall by 1.5%, and precision by 1.5%, reduces the model size by 26.5%, and reduces per-frame processing time by 14 ms, demonstrating good detection performance.
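The keypoint AP figures reported for pose estimators such as YOLOv7-Pose are computed from Object Keypoint Similarity (OKS) between predicted and ground-truth keypoints. Below is a minimal sketch of the OKS computation; the two-keypoint example, per-keypoint sigmas, and object area are illustrative assumptions, not values from the thesis:

```python
import math

def oks(pred, gt, sigmas, area):
    """COCO-style OKS: mean of exp(-d^2 / (2 * area * sigma^2)) over visible keypoints.

    pred:   [(x, y), ...] predicted keypoints
    gt:     [(x, y, visibility), ...] ground-truth keypoints (visibility 0 = unlabeled)
    sigmas: per-keypoint tolerance constants (hypothetical values here)
    area:   object scale (e.g. bounding-box area)
    """
    total, n = 0.0, 0
    for (px, py), (gx, gy, vis), s in zip(pred, gt, sigmas):
        if vis == 0:  # skip keypoints without ground-truth annotation
            continue
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        total += math.exp(-d2 / (2.0 * area * s ** 2))
        n += 1
    return total / n if n else 0.0
```

A perfect prediction yields an OKS of 1.0, and the score decays toward 0 as predicted keypoints drift from the ground truth; AP is then obtained by thresholding OKS the way IoU is thresholded in box detection.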

(2) An improved algorithm based on ST-GCN is used to achieve underground personnel behavior recognition. To address the insufficient feature extraction of traditional ST-GCN models, the ST-GCN attention mechanism is optimized to incorporate joint-correlation information and improve model training; joint-domain feature extraction is added to enrich the feature information; and a dual-feature-branch fusion network is designed to reduce model parameters and improve recognition accuracy. Experimental results show that, compared with the original ST-GCN algorithm, the improved algorithm increases accuracy by 1.7% and reduces per-frame processing time by 10 ms, improving behavior-recognition performance.
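The core spatial operation in ST-GCN-style models aggregates each joint's features over its neighbors in the skeleton graph. The following is a minimal pure-Python sketch of one spatial graph-convolution step; the five-joint skeleton, edge list, scalar features, and shared weight are hypothetical simplifications, not the thesis's actual configuration:

```python
# Hypothetical 5-joint skeleton: joint 1 acts as a hub connected to the others.
EDGES = [(0, 1), (1, 2), (1, 3), (1, 4)]

def adjacency(num_joints, edges):
    """Symmetric adjacency matrix with self-loops, row-normalized."""
    a = [[0.0] * num_joints for _ in range(num_joints)]
    for i in range(num_joints):
        a[i][i] = 1.0  # self-loop so each joint keeps its own feature
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    for row in a:
        s = sum(row)
        for k in range(len(row)):
            row[k] /= s  # normalize so each row sums to 1
    return a

def spatial_gcn_step(features, adj, weight=1.0):
    """One graph-conv step: neighborhood average scaled by a shared weight."""
    n = len(features)
    return [weight * sum(adj[i][j] * features[j] for j in range(n)) for i in range(n)]
```

In a real ST-GCN the features are multi-channel, the weight is a learned matrix per neighbor partition, and a temporal convolution follows each spatial step; this sketch only shows the graph-aggregation idea.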

(3) Based on the above algorithms and model optimization strategies, a mine personnel behavior recognition system has been designed and developed. The system covers real-time video-stream viewing, alarm statistics management, equipment management, and user management, providing strong support for the production safety of mines. Through testing, the system operates normally and achieves a good detection effect.
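The alarm-statistics function described above amounts to aggregating detection events per behavior class. A hypothetical sketch of the kind of record and aggregation such a system might use (all names and fields are assumptions for illustration, not the thesis's actual design):

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class AlarmEvent:
    """One detected unsafe-behavior alarm (hypothetical schema)."""
    camera_id: str
    behavior: str          # e.g. "lying down", "climbing" (illustrative labels)
    confidence: float      # recognition confidence from the model
    timestamp: datetime = field(default_factory=datetime.now)

def count_by_behavior(events):
    """Aggregate alarm counts per behavior class for a statistics view."""
    counts = {}
    for e in events:
        counts[e.behavior] = counts.get(e.behavior, 0) + 1
    return counts
```

A production system would persist such records to a database and filter by camera or time range; this sketch only illustrates the aggregation step behind the statistics page.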


CLC number: TP391

Open access date: 2024-06-17
