Thesis Information

Chinese title:

 Research on Identification Methods for Violations in the Belt Area of Underground Coal Mines

Name:

 Jiang Mei

Student ID:

 21208223034

Confidentiality level:

 Confidential (open after 1 year)

Thesis language:

 chi

Discipline code:

 085400

Discipline name:

 Engineering - Electronic Information

Student type:

 Master's

Degree level:

 Master of Engineering

Degree year:

 2024

Degree-granting institution:

 Xi'an University of Science and Technology

School/Department:

 College of Computer Science and Technology

Major:

 Software Engineering

Research direction:

 Media Computing and Visualization

First supervisor:

 Ma Tian

First supervisor's institution:

 Xi'an University of Science and Technology

Thesis submission date:

 2024-06-17

Thesis defense date:

 2024-05-31

English title:

 Research on Identification Methods for Violations in the Belt Area of Underground Coal Mines

Chinese keywords:

 behavior recognition; multi-feature fusion; separable convolution; weighting strategy

English keywords:

 behavior recognition; multi-feature fusion; separable convolution; weighting strategies

Chinese abstract:

As sites with a high incidence of hazardous accidents, coal mines see most accidents caused by miners' violations. The belt area is the main part of the underground transportation system, and recognizing violations in this area helps ensure the safe production of coal mines. However, when identifying typical violations in this area, current methods suffer from an imbalance between feature extraction and computational cost, and they have difficulty recognizing rare violations, leading to a high false alarm rate. To address these problems, this thesis proposes a typical-violation recognition network and a few-shot violation recognition network that extract multi-dimensional features while keeping computational cost low, so as to recognize violations in this area accurately. The main research content and innovations are as follows:

(1) To address the problems that existing methods for recognizing violations in the underground belt area extract features insufficiently and have difficulty accounting for temporal differences between actions, a typical-violation recognition method based on a multi-feature fusion temporal difference network is proposed. First, a short-term multi-feature fusion module fuses and models the local multi-features of actions in the early stage of the network, so that subtle differences between actions can be captured even when behaviors are similar. Then, a long-term multi-feature fusion module correlates features from different time periods in the later stage of the network to better exploit contextual information and complete global multi-feature fusion modeling. Finally, separable convolutions are used in both fusion modules to reduce the model's computational cost. Experimental results show that the proposed method achieves an average recognition accuracy of 89.62% on a self-built dataset of typical violations in the underground belt area, with a parameter count of 197.2 M. It achieves multi-feature fusion while keeping the parameter count relatively low, enabling more accurate recognition of typical violations in the underground belt area.

(2) To address the few-shot problem caused by the low frequency of some violations in the belt area, a few-shot violation recognition method based on weighted spatio-temporal co-analysis is proposed. First, a weighted temporal adjustment module locates actions and warps them to the action duration, resolving the temporal misalignment between query videos and support videos. Then, a weighted spatio-temporal co-analysis module performs temporal rearrangement and spatial offset prediction to ensure that query features stay coordinated with the action evolution of support features. Finally, task-specific embeddings are learned by analyzing correlations within and across videos, and a weighting strategy is introduced in both modules to suppress interference from irrelevant features. Experimental results show that under the 5-way 5-shot setting, the method achieves a top-1 accuracy of 92.28% on a self-built few-shot violation dataset for the belt area, effectively recognizing rare violations in this area.

(3) Building on the above work, a belt-area behavior recognition system was designed and implemented. First, the overall system design was carried out, covering four modules: monitoring management, data management, behavior analysis, and system management; these modules were then designed and implemented in detail. Finally, functional and performance tests verified that the system has good stability and reliability and can effectively monitor and recognize violations in the belt area.

English abstract:

Coal mines are sites with a high incidence of hazardous accidents, most of which are caused by miners' violations. The belt area is the main part of the underground transportation system, and identifying violations in this area can ensure the safe production of coal mines. However, current methods suffer from an imbalance between feature extraction and computational cost when identifying typical violations in this area, and they struggle to recognize rare violations, which leads to a high false alarm rate. To solve these problems, this paper proposes a typical-violation recognition network and a few-shot violation recognition network that perform multi-dimensional feature extraction while keeping computational cost low, achieving accurate recognition of violations in this area. The main research content and innovations are as follows:

(1) To address the problems that existing methods for identifying violations in the underground belt area are inadequate at feature extraction and have difficulty taking the temporal differences of actions into account, a method for identifying typical violations in the belt area based on a multi-feature fusion temporal difference network is proposed. First, a short-term multi-feature fusion module fuses and models the local multi-features of actions in the early stage of the network, so that subtle differences between actions can be captured even when behaviors are similar. Then, a long-term multi-feature fusion module correlates features from different time periods in the later stage of the network, making better use of contextual information to accomplish global multi-feature fusion modeling. Finally, separable convolution is used in both fusion modules to reduce the computational effort of the model. Experimental results show that the proposed method achieves an average recognition accuracy of 89.62% on the self-built dataset of typical violations in the underground belt area, with a parameter count of 197.2 M. Multi-feature fusion is achieved while maintaining a low parameter count, enabling more accurate recognition of typical violations in the underground belt area.
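The parameter savings that motivate the use of separable convolution can be illustrated with a small counting sketch. The channel sizes and kernel size below are hypothetical examples, not the actual configuration used in the thesis:

```python
def conv3d_params(c_in, c_out, k):
    """Weights of a standard 3D convolution (bias omitted): every output
    channel sees every input channel through a k*k*k kernel."""
    return c_in * c_out * k ** 3

def separable_conv3d_params(c_in, c_out, k):
    """Weights of a separable 3D convolution (bias omitted): a depthwise
    k*k*k filter per input channel, then a 1x1x1 pointwise convolution."""
    return c_in * k ** 3 + c_in * c_out

# Hypothetical layer: 256 -> 256 channels with a 3x3x3 kernel
standard = conv3d_params(256, 256, 3)            # 1,769,472 weights
separable = separable_conv3d_params(256, 256, 3) # 72,448 weights
print(standard, separable, round(standard / separable, 1))  # ~24x fewer
```

The ratio grows with both the kernel volume and the channel count, which is why factorizing the fusion modules this way keeps the overall parameter budget low.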

(2) To address the few-shot problem caused by the low frequency of some violations in the belt area, a method for identifying few-shot violations in the belt area based on weighted spatio-temporal co-analysis is proposed. First, a weighted temporal adjustment module localizes actions and warps them to the action duration, solving the problem of temporal deviation between query videos and support videos. Then, a weighted spatio-temporal co-analysis module performs temporal rearrangement and spatial offset prediction to ensure that the query features are coordinated with the action evolution of the support features. Finally, task-specific embeddings are learned by analyzing correlations within and across videos, and weighting strategies are introduced in both modules to reduce interference from irrelevant features. Experimental results show that under the 5-way 5-shot setting, the method achieves a top-1 accuracy of 92.28% on the self-built few-shot violation dataset for the belt area, effectively identifying rare violations in this area.
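The 5-way 5-shot evaluation protocol referenced above can be sketched with a generic prototype-matching baseline. This is not the thesis's weighted co-analysis method; the synthetic embeddings, embedding dimension, and class separation are assumptions purely for illustrating how an episode is scored:

```python
import numpy as np

def few_shot_episode_top1(support, support_labels, queries, query_labels):
    """Classify each query by its nearest class prototype (the mean of that
    class's support embeddings) and return top-1 accuracy for the episode."""
    classes = np.unique(support_labels)
    prototypes = np.stack([support[support_labels == c].mean(axis=0) for c in classes])
    # Euclidean distance from every query to every class prototype
    dists = np.linalg.norm(queries[:, None, :] - prototypes[None, :, :], axis=2)
    preds = classes[np.argmin(dists, axis=1)]
    return float((preds == query_labels).mean())

# Synthetic 5-way 5-shot episode: each class clusters around its own center
rng = np.random.default_rng(0)
n_way, k_shot, dim = 5, 5, 64
centers = rng.normal(size=(n_way, dim)) * 5.0
support = np.concatenate([centers[c] + rng.normal(size=(k_shot, dim)) for c in range(n_way)])
support_labels = np.repeat(np.arange(n_way), k_shot)
queries = np.concatenate([centers[c] + rng.normal(size=(3, dim)) for c in range(n_way)])
query_labels = np.repeat(np.arange(n_way), 3)
print(few_shot_episode_top1(support, support_labels, queries, query_labels))
```

Reported few-shot accuracies such as the 92.28% above are typically averaged over many such randomly sampled episodes, each drawing 5 novel classes with 5 labeled clips apiece.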

(3) Based on the above work, a belt-area behavior recognition system was designed and completed. First, the overall design of the system was carried out, covering the construction of four modules: monitoring management, data management, behavior analysis, and system management; these four modules were then designed and implemented in detail. Finally, functional and performance tests verified that the system has good stability and reliability and can effectively monitor and identify violations in the belt area.


CLC number:

 TP391

Open access date:

 2025-06-19
