View thesis information

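Thesis title (Chinese):	基于图像与视频数据的煤矸石快速识别方法研究
Name:	Shi Yuhong (师玉红)
Student ID:	20206043044
Confidentiality level:	Confidential (open after 1 year)
Thesis language:	chi
Discipline code:	0811
Discipline name:	Engineering - Control Science and Engineering
Student type:	Master's
Degree level:	Master of Engineering
Degree year:	2023
Degree-granting institution:	Xi'an University of Science and Technology (西安科技大学)
School:	College of Electrical and Control Engineering
Major:	Control Science and Engineering
Research direction:	Image processing
First supervisor name:	Pan Hongguang (潘红光)
First supervisor affiliation:	Xi'an University of Science and Technology (西安科技大学)
Thesis submission date:	2023-06-19
Thesis defense date:	2023-06-02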
Thesis title (foreign language):	Research on Rapid Recognition Method of Coal Gangue Based on Image and Video Data
Keywords (Chinese):	gangue ; image object recognition ; Tiny YOLOv3 ; video object recognition ; long short-term storage
Keywords (foreign language):	Coal gangue ; Image object recognition ; Tiny YOLOv3 ; Video object recognition ; Long short-term storage
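Abstract (Chinese, translated):	The large amount of gangue mixed in during coal mining lowers the production efficiency and economic benefit of coal. Recognizing and sorting coal gangue is an effective way to improve the quality of coal resources, but the commonly used recognition methods have many limitations and no longer suit modern coal mines, which are becoming increasingly informatized and intelligent. Exploring coal gangue recognition methods with better performance has therefore become a research focus. On this basis, this thesis studies coal gangue recognition methods based on image and video data. The specific contents are as follows:
1. To address inconsistent feature-map sizes, low weights on important channels, and the large number of convolutional-layer parameters in coal gangue images, this thesis first designs a spatial pyramid pooling network that combines pooling with multiple kernels so that input feature maps are processed to a fixed size, designs a squeeze-and-excitation module to strengthen the attention paid to important channels in coal gangue images, and introduces dilated convolutional layers to capture contextual image information and enlarge the receptive field. Second, it proposes a fast coal gangue recognition model based on an improved tiny YOLOv3. Finally, experiments show that the image model reaches a recognition accuracy of 99.4% and, compared with tiny YOLOv3, reduces training time by 7.41% and improves the loss value by 53.01%, a significant performance advantage.
2. To address the drawbacks of image data, such as randomness and low efficiency, this thesis first designs a spatio-temporal relation network that performs multi-scale feature aggregation on coal gangue video frame sequences, reducing the computational burden caused by redundant data. Second, it designs a key-frame selection framework and an attention mechanism to select key frames, constructs long-term and short-term video frames, and adjusts the weights among video frames to strengthen the attention paid to key features. Finally, it designs a long short-term storage module that stores the features of the long-term and short-term video frames and fuses them during key-frame recognition to improve recognition accuracy. On this basis, the thesis designs a fast coal gangue video recognition model based on long short-term aggregated feature storage.
3. This thesis collects video of real coal gangue sorting scenes at a coal preparation plant in Ningxia to build a coal gangue video recognition dataset. In addition, the performance of the fast coal gangue video recognition model based on long short-term aggregated feature storage is verified on two datasets and compared with the recognition results of MEGA, FGFA, RDN, DFF, and other models. Experiments show that the proposed video recognition model has the highest recognition accuracy, 77.12% on the ILSVRC2015 dataset and 81.97% on the self-built coal gangue video dataset, which verifies the feasibility and superior performance of the model.
The fast coal gangue recognition methods based on image and video data achieve good recognition results on the self-built coal gangue image dataset, the self-built coal gangue video dataset, and the ILSVRC2015 dataset, and show a significant performance advantage over other state-of-the-art models in the same field. The design and validation of these models provide a theoretical basis and experimental data for applying coal gangue image and video recognition methods at industrial sites.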
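As a rough illustration of the fixed-size property that item 1 of the abstract attributes to spatial pyramid pooling, the sketch below pools a 2-D feature map of any size into a vector of constant length. It is a minimal pure-Python sketch, not the thesis's network: the function name `spp_max_pool`, the pyramid levels (1, 2, 4), and the use of max pooling over a single channel are all assumptions made for illustration.

```python
import math


def spp_max_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2-D feature map (list of rows) into a fixed-length vector.

    For each pyramid level n (an assumed setting), the map is divided into
    an n x n grid of bins and the maximum of each bin is kept, so the output
    length is sum(n * n for n in levels) regardless of the input size.
    """
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # floor/ceil bin edges keep every bin non-empty and cover the map
            r0 = math.floor(i * height / n)
            r1 = math.ceil((i + 1) * height / n)
            for j in range(n):
                c0 = math.floor(j * width / n)
                c1 = math.ceil((j + 1) * width / n)
                pooled.append(
                    max(feature_map[r][c]
                        for r in range(r0, r1)
                        for c in range(c0, c1))
                )
    return pooled


# Two hypothetical feature maps of different sizes
small = [[r * 5 + c for c in range(5)] for r in range(6)]          # 6 x 5
large = [[(r * c) % 7 for c in range(9)] for r in range(13)]       # 13 x 9

print(len(spp_max_pool(small)), len(spp_max_pool(large)))  # 21 21
```

Whatever the input height and width, the output has 1 + 4 + 16 = 21 values for these levels, which is what lets a following fixed-size layer accept feature maps of varying size.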
Abstract (foreign language):	The large amount of gangue mixed in during coal mining reduces the production efficiency and economic benefits of coal. Recognizing and separating coal gangue is an effective way to improve the quality of coal resources, but current recognition methods have many limitations and are not suited to modern coal mines, which are becoming increasingly informatized and intelligent. Exploring coal gangue recognition technology with better performance has therefore become a research focus. On this basis, this thesis studies coal gangue recognition methods based on image and video data, as follows:
1. To address the problems of varying feature-map size, low weight of important channels, and the large number of convolutional-layer parameters in coal gangue images, this thesis first designs a spatial pyramid pooling network with multi-kernel combined pooling to ensure that the input feature map is processed to a fixed size, designs a squeeze-and-excitation module to enhance the attention paid to important channels in gangue images, and introduces a dilated convolutional layer to capture contextual image information and enlarge the receptive field. Secondly, a fast recognition model based on an improved tiny YOLOv3 is proposed. Finally, experiments show that the recognition accuracy of this model is 99.4%, and that compared with tiny YOLOv3 the training time is reduced by 7.41% and the loss value is improved by 53.01%, which is a significant performance advantage.
2. To address the drawbacks of image data, such as randomness and inefficiency, this thesis first designs a spatio-temporal relation network for multi-scale feature aggregation of coal gangue video frame sequences, which reduces the computational burden caused by redundant data. Secondly, a key-frame selection framework and an attention mechanism are designed to select key frames, long-term and short-term video frames are constructed, and the weights between different video frames are adjusted to improve the focus on key features. Finally, a long short-term storage module is designed to store long-term and short-term video frame features and fuse them during key-frame recognition to enhance recognition accuracy. Based on this, the thesis introduces a fast recognition model for coal gangue video based on long short-term aggregated feature storage.
3. Video of real coal gangue sorting scenes at a coal preparation plant in Ningxia has been collected to construct a coal gangue video recognition dataset. In addition, the performance of the fast recognition model for gangue video based on long short-term aggregated feature storage has been verified on two datasets, and its recognition results are compared with those of MEGA, FGFA, RDN, and DFF. The experimental data show that the proposed video recognition model has the highest recognition accuracy, 77.12% on the ILSVRC2015 dataset and 81.97% on the self-built gangue video dataset, which verifies the feasibility and superior performance of the model.
The fast coal gangue recognition methods based on image and video data designed in this thesis achieve good recognition results on the self-built coal gangue image dataset, the self-built coal gangue video dataset, and the ILSVRC2015 dataset, and have significant performance advantages over other state-of-the-art models. The design and validation of these models provide a theoretical basis and experimental data for applying coal gangue image and video recognition at industrial sites.
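The idea in item 2 of storing long-term and short-term frame features and fusing them when a key frame is recognized can be sketched as attention-weighted aggregation. The following is only a sketch under stated assumptions: scaled dot-product attention with a softmax, the additive fusion, the toy feature vectors, and the names `softmax` and `aggregate` are generic choices introduced here, and the abstract does not say that the thesis uses them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(key_feature, memory):
    """Fuse stored frame features into a key-frame feature.

    Assumption: each stored feature is weighted by scaled dot-product
    similarity to the key-frame feature, and the weighted average is
    added to the key-frame feature.
    """
    scale = math.sqrt(len(key_feature))
    scores = [
        sum(k * v for k, v in zip(key_feature, feat)) / scale
        for feat in memory
    ]
    weights = softmax(scores)
    context = [
        sum(w * feat[d] for w, feat in zip(weights, memory))
        for d in range(len(key_feature))
    ]
    fused = [k + c for k, c in zip(key_feature, context)]
    return fused, weights


# Hypothetical 2-D features: two long-term frames and one short-term frame
key = [1.0, 0.0]
long_term = [[0.9, 0.1], [0.0, 1.0]]
short_term = [[1.0, 0.0]]

fused, weights = aggregate(key, long_term + short_term)
print(weights)  # the frame most similar to the key frame gets the largest weight
```

Stored frames that resemble the key frame contribute more to the fused feature, while dissimilar frames are down-weighted rather than discarded.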
Chinese Library Classification (CLC) number:	TP391
Open-access date:	2024-06-19
