View thesis information

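Thesis title (Chinese):

 复杂环境下的货物图像识别方法研究    

Name:

 Gao Ruifang (高瑞芳)    

Student ID:

 19308207004    

Confidentiality level:

 Public    

Thesis language:

 chi    

Discipline code:

 085400    

Discipline name:

 Engineering - Electronic Information    

Student type:

 Master's    

Degree level:

 Master of Engineering    

Degree year:

 2022    

Institution:

 Xi'an University of Science and Technology (西安科技大学)    

School/department:

 College of Computer Science and Technology    

Major:

 Computer Technology    

Research direction:

 Computer graphics and image processing technology    

First supervisor name:

 Li Aiguo (李爱国)    

First supervisor affiliation:

 Xi'an University of Science and Technology (西安科技大学)    

Thesis submission date:

 2022-06-23    

Thesis defense date:

 2022-06-07    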

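Thesis title (foreign language):

 Research on Cargo Image Recognition Method in Complex Environment    

Thesis keywords (translated from Chinese):

 Image recognition ; Deep learning ; ESRGAN ; CGAN ; Automatic inspection system    

Thesis keywords (foreign language):

 Image recognition ; Deep learning ; ESRGAN ; CGAN ; Automatic patrol system    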

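Thesis abstract (translated from Chinese):

       The automatic inspection system is an important component of an integrated security system; its role is to use image recognition to protect important materials from external threats. These materials are stored in complex environments, and recognition is affected by shooting angle, illumination, and occlusion, so the captured images suffer from small targets, heavy background interference, and low resolution, and recognition accuracy is insufficient. To address these problems, this thesis studies cargo image recognition methods for complex environments. The research covers the following three points:

      (1) An image recognition method that fuses target detection with ESRGAN. To address the small targets, heavy background interference, and low resolution in the proprietary image dataset Cargo-images, an image recognition method named TEResNet (Target-detection and ESRGAN before ResNet) is proposed. The method has three main steps: first, a target detection method is used to obtain the target image; then the ESRGAN (Enhanced Super-Resolution Generative Adversarial Networks) model raises the resolution of the detected target image; finally, an improved ResNet model performs image recognition. Comparative experiments on three public datasets and one proprietary dataset show that TEResNet achieves higher recognition accuracy than four convolutional neural network models: ResNet, AlexNet, GoogleNet, and MobileNet.

      (2) For images that TEResNet fails to recognize, a new post-recognition method named CISCGAN (Compute Image Similarity and Conditional Generative Adversarial Network) is proposed. The method has three main steps: first, using three metrics, mean squared error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity (SSIM), the sample image most similar to the failed image is selected from the correctly recognized samples in the training image library; then the selected sample image is fed into the CGAN model to generate a new image; finally, TEResNet recognizes the image generated by the CGAN model. Comparative experiments on the proprietary dataset show that CISCGAN can further improve image recognition accuracy.

      (3) Building on the two research items above, an automatic inspection system with an image recognition function was developed. The system was developed with Microsoft Visual Studio 2012 and a Microsoft SQL Server 2012 database. The image recognition function is implemented with the deep learning framework PyTorch. System tests show that the image recognition accuracy meets the design requirements.

      Through this research, a recognition model for simulated cargo images in an automatic inspection system operating in complex environments was constructed, fully automatic real-time inspection was achieved, and a more accurate and reliable means of judging the safety of important materials was provided.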

Thesis abstract (foreign language):

      As an important component of an integrated security system, the automatic inspection system uses image recognition technology to protect important materials from external threats. The storage environment of the protected materials is complex, and recognition is affected by factors such as shooting angle, illumination, and occlusion, so the captured images suffer from small targets, heavy background interference, and low resolution, and recognition accuracy is insufficient. To address these problems, this thesis studies cargo image recognition methods in complex environments. It covers the following three research topics:

      (1) An image recognition method that fuses target detection with ESRGAN was studied. To overcome the small targets, heavy background interference, and low resolution of the proprietary image dataset Cargo-images, an image recognition method named TEResNet (Target-detection and ESRGAN before ResNet) was proposed. The method has three main steps: first, the target image is obtained with a target detection method; then the ESRGAN (Enhanced Super-Resolution Generative Adversarial Networks) model is used to raise the resolution of the detected target image; finally, an improved ResNet model is used for image recognition. Comparative experiments were conducted on three public datasets and one proprietary dataset. The results showed that TEResNet had higher recognition accuracy than ResNet, AlexNet, GoogleNet, and MobileNet.
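
      The three-step flow described in (1) can be pictured as a simple composition of stages. The sketch below is illustrative only and is not the thesis implementation: the names `recognize`, `crop`, `toy_detect`, `toy_upscale`, and `toy_classify` are hypothetical, images are assumed to be plain nested lists of grayscale values, and the toy stand-ins replace the real detector, ESRGAN model, and improved ResNet so that the example runs without trained models.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]  # assumption: grayscale image as rows of pixel values


def crop(image: Image, box: Sequence[int]) -> Image:
    """Return the region inside box = (top, left, bottom, right)."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def recognize(
    image: Image,
    detect: Callable[[Image], Sequence[int]],
    super_resolve: Callable[[Image], Image],
    classify: Callable[[Image], str],
) -> str:
    """Three-stage flow: detect the target, upscale the crop, classify it."""
    target = crop(image, detect(image))  # step 1: target detection
    enhanced = super_resolve(target)     # step 2: stand-in for ESRGAN
    return classify(enhanced)            # step 3: stand-in for the improved ResNet


# Hypothetical toy stages, used only to make the sketch executable.
def toy_detect(image: Image) -> Sequence[int]:
    """Bounding box of all non-zero pixels."""
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for row in image for c, v in enumerate(row) if v]
    return (min(rows), min(cols), max(rows) + 1, max(cols) + 1)


def toy_upscale(image: Image) -> Image:
    """Nearest-neighbour 2x upscaling (not a GAN)."""
    out: Image = []
    for row in image:
        wide = [v for v in row for _ in range(2)]
        out.extend([wide, list(wide)])
    return out


def toy_classify(image: Image) -> str:
    """Label the image by its mean brightness."""
    mean = sum(map(sum, image)) / (len(image) * len(image[0]))
    return "bright" if mean > 0.5 else "dark"


scene = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 0.8, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
label = recognize(scene, toy_detect, toy_upscale, toy_classify)
```

      Passing the stages in as callables keeps the ordering (detect, then super-resolve, then classify) explicit while leaving each model free to be swapped for a real one.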

      (2) A new post-recognition method named CISCGAN (Compute Image Similarity and Conditional Generative Adversarial Network) was proposed for images that the TEResNet method fails to recognize. CISCGAN has three steps: first, according to the mean squared error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity (SSIM), the sample image most similar to the failed image is selected from the correctly recognized samples in the training image set; then the selected image is fed into the CGAN model to generate a new image; finally, the image generated by the CGAN model is recognized with the TEResNet method. Comparative experiments were conducted on a proprietary dataset. The results showed that CISCGAN can further improve the accuracy of image recognition.
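
      The first CISCGAN step, picking the most similar correctly recognized sample, can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis code: images are assumed to be equal-length flat lists of pixel values in [0, 255]; `ssim_global` is a simplified single-window SSIM (the standard metric averages over local windows); and because the abstract does not say how the three metrics are combined, the hypothetical `most_similar` ranks by SSIM and breaks ties by PSNR, which orders candidates the same way as lower MSE.

```python
import math
from typing import List, Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length pixel sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(a: Sequence[float], b: Sequence[float], peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(a, b)
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)


def ssim_global(a: Sequence[float], b: Sequence[float], peak: float = 255.0) -> float:
    """Single-window SSIM over the whole image (simplifying assumption)."""
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2  # usual stabilising constants
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    )


def most_similar(failed: Sequence[float], samples: List[Sequence[float]]) -> int:
    """Index of the sample closest to `failed` (assumed rule: SSIM, then PSNR)."""
    return max(
        range(len(samples)),
        key=lambda i: (ssim_global(failed, samples[i]), psnr(failed, samples[i])),
    )


failed_image = [10, 20, 30, 40]
correct_samples = [
    [40, 30, 20, 10],      # same values, reversed structure
    [12, 19, 31, 41],      # nearly identical
    [200, 200, 200, 200],  # flat and much brighter
]
best = most_similar(failed_image, correct_samples)
```

      In the method described above, the selected sample would then be passed to the CGAN model and the generated image re-submitted to TEResNet; those steps need trained models and are not shown.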

      (3) Based on the two research results above, an automatic inspection system with an image recognition function was developed. The system was developed with Microsoft Visual Studio 2012 and a Microsoft SQL Server 2012 database. Image recognition was implemented with the deep learning framework PyTorch. System tests showed that the image recognition accuracy met the design requirements.

      Through this research, a recognition model for simulated cargo images in an automatic inspection system for complex environments was constructed. The system inspects important materials automatically and in real time, providing a more accurate and reliable means of judging their safety.

Chinese Library Classification number:

 TP391.4    

Open access date:

 2022-06-24    
