Thesis information

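Thesis title (Chinese):	基于深度学习的煤矿井下视频清洗算法研究
Author:	Han Ze (韩泽)
Student ID:	18208052009
Confidentiality level:	Public
Thesis language:	chi (Chinese)
Discipline code:	081203
Discipline:	Engineering - Computer Science and Technology (engineering or science degrees may be conferred) - Computer Applied Technology
Student type:	Master's
Degree level:	Master of Engineering
Degree year:	2021
Degree-granting institution:	Xi'an University of Science and Technology (西安科技大学)
School:	College of Computer Science and Technology
Major:	Computer Applied Technology
Research direction:	Computer graphics and image processing
First supervisor:	Fu Yan (付燕)
First supervisor's institution:	Xi'an University of Science and Technology (西安科技大学)
Thesis submission date:	2021-06-21
Thesis defense date:	2021-06-03
Thesis title (English):	Research on video cleaning algorithm in coal mine underground based on deep learning
Keywords (translated from the Chinese):	video data quality ; near-duplicate video ; video cleaning ; FD-Means ; multi-head attention residual network
Keywords (English):	video data quality ; near-duplicate videos ; video cleaning ; VGG-16 deep network ; feature distance-means clustering ; multi-head attention ResNet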
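Abstract (translated from the Chinese):	︿Coal mine production safety and the prevention of major coal mine disasters have long been priorities emphasized by the state, and intelligent video surveillance is one effective way to prevent and mitigate these problems. As the number of underground video surveillance devices keeps growing, large volumes of near-duplicate video data are produced, video data quality problems become increasingly prominent, and maintaining and managing the video data becomes ever more challenging. Taking underground coal mine video datasets as its object of study, this thesis carries out fundamental research on the quality problems of near-duplicate underground video data. The specific contents are: (1) Although existing near-duplicate video retrieval methods can identify near-duplicate videos effectively, they cannot automatically clean near-duplicate video data while preserving data integrity, and so cannot improve video data quality. This thesis therefore proposes a near-duplicate video cleaning method that combines the VGG-16 deep network with FD-Means clustering. The method uses the VGG-16 deep network to extract deep spatial features from the videos. Because near-duplicate videos arise from multiple sources, the K-Means model is difficult to apply fully to near-duplicate video cleaning, so this thesis derives an FD-Means model from K-Means. When clusters are updated, outlying points whose distance to the cluster center exceeds a preset distance threshold are formed into new clusters; to eliminate clusters that lie too close together, clusters whose centers are closer to each other than the distance threshold are merged into one cluster. Finally, the near-duplicate video data other than each cluster's center point are removed. Experimental results show that the method effectively solves the near-duplicate video cleaning problem, improves video data quality, and reduces wasted storage. (2) Because the quality of underground coal mine video is hard to guarantee, the spatial features extracted by the VGG deep network alone struggle to represent the high-level semantic features of a video accurately. This thesis therefore proposes a multi-head attention residual network that improves the representational power of video features by combining spatio-temporal features with attention models. The method first adopts ResNet34 with embedded CBAM as the base network to strengthen the features of salient image regions; next, it uses a long short-term memory network with temporal attention to capture the temporal features of the video frame sequence; finally, it connects the two networks in series to build the multi-head attention residual network. Experimental results show that, compared with the pre-trained VGG16 feature extraction of Chapter 3, the multi-head attention residual network proposed in this chapter raises the F1-score by 8.7% and 15.7% on CC_WEB_VIDEO and the underground coal mine dataset respectively, improving the accuracy of near-duplicate video cleaning.﹀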
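The clustering-and-cleaning procedure described in item (1) of the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not specify the distance measure, the initialization, or how merged centers are computed, so Euclidean distance, seeding with the first point, an unweighted mean for merged centers, and a single threshold shared by the split and merge steps are all assumptions, as are the function names `fd_means` and `clean_near_duplicates`. In the thesis the input vectors would be VGG-16 features; here they are plain tuples of floats.

```python
import math


def dist(a, b):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(col) / n for col in zip(*vectors))


def fd_means(points, threshold, max_iter=100):
    """Threshold-driven K-Means variant; returns the final cluster centers."""
    centers = [tuple(points[0])]  # assumption: seed with the first point
    for _ in range(max_iter):
        # Assignment: a point joins its nearest center, or opens a new
        # cluster when even the nearest center is farther than the threshold.
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            if dist(p, centers[j]) > threshold:
                centers.append(tuple(p))
                clusters.append([p])
            else:
                clusters[j].append(p)
        # Update: recompute each center, dropping clusters left empty.
        updated = [mean(c) for c in clusters if c]
        # Merge: fuse centers closer to each other than the threshold.
        merged = []
        for c in updated:
            for k, m in enumerate(merged):
                if dist(c, m) < threshold:
                    merged[k] = mean([c, m])  # assumption: unweighted mean
                    break
            else:
                merged.append(c)
        if merged == centers:  # nothing moved, split, or merged
            break
        centers = merged
    return centers


def clean_near_duplicates(points, threshold):
    """Return sorted indices of the items kept: one per cluster center."""
    centers = fd_means(points, threshold)
    keep = set()
    for c in centers:
        keep.add(min(range(len(points)), key=lambda i: dist(points[i], c)))
    return sorted(keep)
```

For example, calling `clean_near_duplicates` on six 2-D points that form two tight groups plus one isolated point, with `threshold=1.0`, keeps one index per group and the isolated point. Unlike plain K-Means, the number of clusters is not supplied in advance; it follows from the threshold.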
Abstract (English):	︿ Coal mine production safety and disaster prevention have always been important issues emphasized in our country, and intelligent video surveillance and monitoring (VSAM) technology is one of the most effective ways to prevent and relieve these problems. The continuous increase of underground video surveillance equipment in coal mines has led to the production of a large amount of near-duplicate video data. Video data quality problems have become increasingly prominent, and the maintenance and management of video data have become more and more challenging. This thesis focuses on the data quality of near-duplicate video in coal mines, which is studied using coal mine video datasets. The specific contents include: (1) Although existing near-duplicate video retrieval methods can effectively identify near-duplicate videos, they have difficulty automatically cleaning near-duplicate video data while ensuring data integrity, and therefore cannot improve the quality of the video data. This thesis proposes a near-duplicate video cleaning method combining the VGG-16 network and FD-Means clustering. The method uses the VGG-16 network model to extract high-level semantic features of the video. In the unsupervised video cleaning task, the K-Means algorithm needs the number of clusters K to be fixed in advance, but the video content is random, so the number of clusters cannot be determined beforehand. This thesis therefore proposes an FD-Means clustering model. When clusters are updated, discrete points whose distance to the cluster center is greater than the distance threshold are formed into a new cluster; to eliminate clusters that are too close, clusters whose centers are closer to each other than the distance threshold are merged into one cluster. Finally, the near-duplicate video data other than the cluster center points are removed. 
Experimental results show that this method can effectively solve the problem of near-duplicate video data cleaning, improve the data quality of the video, and reduce the waste of storage resources. (2) Because the quality of underground coal mine video is difficult to guarantee, it is difficult to represent the high-level semantic features of videos accurately using only the spatial features extracted by the VGG deep network model. To this end, this thesis proposes a multi-head attention residual network, which combines spatio-temporal features and attention models to improve the representation ability of video features. The method first selects ResNet34 with embedded CBAM as the base network to improve feature extraction for the salient areas of the image; secondly, it uses a long short-term memory (LSTM) network with temporal attention to capture the temporal features of the video frame sequence; finally, the two networks are connected in series to construct the multi-head attention ResNet. The method has been experimentally verified on CC_WEB_VIDEO and the coal mine dataset. The experimental results show that, in near-duplicate video detection, the multi-head attention ResNet proposed in this chapter increases the F1-score by 18% and 7% over the comparison methods on CC_WEB_VIDEO and on the coal mine dataset. In the video cleaning tasks on CC_WEB_VIDEO and the coal mine dataset, compared with the pre-trained VGG16 feature extraction method in Chapter 3, the F1-score increases by 8.7% and 15.7% respectively, greatly improving the accuracy of near-duplicate video cleaning. ﹀
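The temporal-attention step mentioned in item (2) of the abstract can be illustrated with a small attention-pooling sketch over per-frame feature vectors, such as LSTM hidden states. The abstract does not give the scoring function, so this is only one common formulation under stated assumptions: dot-product scoring against a query vector that would be learned in practice. The names `temporal_attention_pool` and `query` are hypothetical, and the CBAM-ResNet34 and LSTM stages that would produce the frame features are not shown.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention_pool(hidden_states, query):
    """Pool a sequence of per-frame vectors into one video-level vector.

    hidden_states: list of equal-length vectors, one per frame.
    query: scoring vector (hypothetical; learned in a real model).
    Returns (pooled_vector, attention_weights).
    """
    # Score each frame by its dot product with the query.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    # Normalize the scores into weights that sum to 1 across frames.
    weights = softmax(scores)
    # Weighted sum of frame vectors, computed dimension by dimension.
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return pooled, weights
```

Frames whose vectors align more strongly with the query receive larger weights, so they contribute more to the pooled video descriptor than an unweighted average over frames would allow.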
Chinese Library Classification (CLC) number:	TP391.413
Release date:	2021-06-22
