Thesis Information

Chinese Title: 基于改进YOLO的无人零售商品识别研究

Name: 李佗 (Li Tuo)

Student ID: 17207205069

Confidentiality Level: Public

Thesis Language: Chinese

Discipline Code: 085208

Discipline Name: Engineering - Electronics and Communication Engineering

Student Type: Master's candidate

Degree Level: Master of Engineering

Degree Year: 2022

Degree-Granting Institution: Xi'an University of Science and Technology

School: School of Communication and Information Engineering

Major: Electronics and Communication Engineering

Research Direction: Computer Vision

First Supervisor: 王晓路 (Wang Xiaolu)

First Supervisor's Institution: Xi'an University of Science and Technology

Submission Date: 2022-06-20

Defense Date: 2022-06-06

English Title: Research on Unmanned Retail Commodity Recognition Based on Improved YOLO

Chinese Keywords: 商品识别; YOLOv5s; DeepSORT; 智能货柜

English Keywords: Commodity Recognition; YOLOv5s; DeepSORT; Smart Container

Chinese Abstract (translated):

In recent years, with the rapid development of artificial intelligence, unmanned retail has gradually become an industry hotspot. The unmanned retail model not only saves labor costs and improves the shopping experience, but also further reduces selling costs through digital operation. For the intelligent unmanned vending cabinet, one of the main forms of unmanned retail, the key difficulty lies in identifying commodity types and quantities quickly and accurately. Taking the intelligent unmanned vending cabinet as its research object, this thesis studies commodity detection, tracking, and counting based on a dynamic-vision scheme.

For commodity detection under dynamic vision, a lightweight detection model based on an improved YOLOv5s is constructed. Ghost convolution modules optimize the YOLOv5s backbone to reduce the model's computation and parameter count; a coordinate attention mechanism raises the network's focus on important regions, improving detection accuracy; and the CIoU bounding-box regression loss is introduced on top of the original loss function to speed up bounding-box regression. Experiments show that, compared with the original model, the improved YOLOv5s-G-CA lightweight detection model reduces computation by 50% and parameters by 42% while improving recognition accuracy by 2.6%.

For commodity tracking and counting under dynamic vision, a tracking and counting algorithm based on DeepSORT is designed. The YOLOv5s-G-CA detector provides the bounding boxes and class labels of commodities in each video frame and initializes target trackers; a Kalman filter estimates each tracked target's motion state and generates predicted boxes; and the Hungarian algorithm matches detection boxes to predicted boxes, realizing commodity tracking and identification. A virtual counting line is introduced during tracking: by examining the position of a commodity's trajectory relative to the line, the algorithm judges whether the commodity has been taken out, completing tracking and counting. Experiments show a commodity recognition accuracy of 93.5%, effectively solving the dynamic tracking and counting problem.

To address the drop in recognition accuracy caused by visual occlusion, a recognition algorithm fusing visual and weight sensors is proposed. The commodity weight on each cabinet shelf is collected in real time, and weight-based recognition matches the weight change caused by a customer removing an item against the weight distributions of the different commodities; information entropy is then used to weight and fuse the visual and weight-based recognition results. Experiments show the fusion algorithm reaches a recognition accuracy of 98.1%, 4.6% higher than vision alone, compensating for the shortcomings of visual recognition.

Following the principle of system stability and reliability, an intelligent unmanned vending cabinet system is designed that supports a fast shopping flow: scan a code to open the door, select items freely, and settle automatically when the door closes. Tests show the system runs stably and the shopping process is simple, meeting practical application needs. The results of this thesis provide a useful reference for research on unmanned retail commodity recognition.

English Abstract:

In recent years, with the rapid development of artificial intelligence, unmanned retail has gradually become an industry hotspot. The unmanned retail model not only saves labor costs and improves the consumer experience, but also further reduces the cost of goods sold through digital operations. For intelligent unmanned vending cabinets, one of the most important forms of unmanned retail, the core difficulty is identifying the type and quantity of goods quickly and accurately. Taking the intelligent unmanned vending cabinet as its research object, this thesis studies commodity recognition, tracking, and counting based on a dynamic-vision scheme.

To recognize commodities under dynamic vision, a lightweight recognition model based on an improved YOLOv5s is designed. Ghost convolution modules replace standard convolutions in the YOLOv5s backbone, reducing the model's computation and parameter count; a coordinate attention mechanism is introduced to strengthen the network's feature extraction and thereby improve detection accuracy; and the CIoU bounding-box regression loss is added to the original loss function to speed up regression and improve localization accuracy. Experimental results show that, compared with the original model, the improved YOLOv5s-G-CA lightweight model reduces computation by 50% and parameters by 42% while raising recognition accuracy by 2.6%.
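The CIoU loss mentioned above combines the IoU overlap term with a center-distance penalty and an aspect-ratio consistency term. A minimal plain-Python sketch of the standard CIoU formulation follows (the thesis presumably uses its training framework's built-in implementation; this version is for illustration only):

```python
import math

def ciou_loss(box_a, box_b):
    """Complete IoU (CIoU) loss between two boxes given as (x1, y1, x2, y2).

    CIoU = 1 - IoU + rho^2/c^2 + alpha*v, where rho is the distance
    between box centers, c the diagonal of the smallest enclosing box,
    and v measures aspect-ratio inconsistency.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection and union areas
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    iou = inter / (area_a + area_b - inter + 1e-9)

    # Squared distance between box centers
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 + \
           ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal of the smallest enclosing box
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + 1e-9

    # Aspect-ratio consistency term and its trade-off weight
    v = (4 / math.pi ** 2) * (math.atan((ax2 - ax1) / (ay2 - ay1)) -
                              math.atan((bx2 - bx1) / (by2 - by1))) ** 2
    alpha = v / (1 - iou + v + 1e-9)

    return 1 - iou + rho2 / c2 + alpha * v

print(ciou_loss((0, 0, 10, 10), (0, 0, 10, 10)))    # ≈ 0 for identical boxes
print(ciou_loss((0, 0, 10, 10), (20, 20, 30, 30)))  # > 1 for disjoint boxes
```

Unlike plain IoU loss, the distance and aspect-ratio terms keep the gradient informative even when boxes do not overlap, which is why CIoU converges faster in bounding-box regression.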

To track and count commodities under dynamic vision, a tracking and counting algorithm based on DeepSORT is designed. The YOLOv5s-G-CA model provides the bounding boxes and class labels of the products in each video frame and initializes target trackers; a Kalman filter estimates each tracked target's motion state and generates predicted boxes; and the Hungarian algorithm matches detection boxes to predicted boxes, realizing product tracking and identification. A virtual counting line is introduced during tracking: by examining the position of a product's trajectory relative to the line, the algorithm judges whether the product has been taken out, completing tracking and counting. Experimental results show a commodity recognition accuracy of 93.5%, effectively solving the dynamic tracking and counting problem.
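The virtual-counting-line test described above can be sketched as a simple trajectory check. This is an illustrative assumption of how such a check might work, not the thesis's exact implementation: the line position, orientation, and crossing direction would all be calibrated to the cabinet's camera view in practice.

```python
def crossed_counting_line(trajectory, line_y):
    """Return True if a tracked centroid crossed the virtual counting line
    from the cabinet side (y < line_y) to the outside (y >= line_y).

    trajectory: list of (x, y) centroids, in frame order, for one
    DeepSORT track. A horizontal line and a downward "taken out"
    direction are illustrative assumptions.
    """
    for (x0, y0), (x1, y1) in zip(trajectory, trajectory[1:]):
        if y0 < line_y <= y1:   # two consecutive frames straddle the line
            return True
    return False

# A hypothetical track whose centroid moves downward past y = 100:
track = [(50, 10), (52, 40), (55, 75), (53, 120)]
print(crossed_counting_line(track, line_y=100))  # True: item counted as taken out
```

Checking consecutive frame pairs (rather than only the last position) makes the count robust to a track that ends shortly after crossing; a real system would also decrement the count on crossings in the opposite direction, when an item is put back.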

To address the drop in recognition accuracy caused by visual occlusion, a recognition algorithm fusing visual sensors and load cells is proposed. The commodity weight on each cabinet shelf is collected in real time, and weight-based recognition matches the weight change caused by a customer removing an item against the weight distributions of the different commodities; information entropy is then used to weight and fuse the visual and weight-based recognition results. Experimental results show that the fusion algorithm reaches a recognition accuracy of 98.1%, 4.6% higher than visual recognition alone, compensating for the shortcomings of visual recognition.
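The entropy-based fusion idea can be sketched as follows: the sensor whose class-probability vector has lower entropy (i.e., is more certain) receives the larger weight. The specific confidence mapping `1 - H/H_max` and the example probabilities below are illustrative assumptions, not the thesis's exact formula:

```python
import math

def entropy(p):
    """Shannon entropy (in bits) of a discrete probability vector."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def entropy_weighted_fusion(p_vision, p_weight):
    """Fuse two class-probability vectors: the more certain sensor
    (lower entropy) gets the larger weight. Returns the fused class index.
    """
    n = len(p_vision)
    h_max = math.log2(n)                     # entropy of a uniform guess
    c_v = 1 - entropy(p_vision) / h_max      # confidence of the vision sensor
    c_w = 1 - entropy(p_weight) / h_max      # confidence of the weight sensor
    w_v = c_v / (c_v + c_w + 1e-9)           # normalized fusion weights
    w_w = c_w / (c_v + c_w + 1e-9)
    fused = [w_v * pv + w_w * pw for pv, pw in zip(p_vision, p_weight)]
    return max(range(n), key=lambda i: fused[i])

# Vision is unsure (occlusion), while the weight change clearly matches class 1:
p_vision = [0.40, 0.35, 0.25]
p_weight = [0.05, 0.90, 0.05]
print(entropy_weighted_fusion(p_vision, p_weight))  # prints 1
```

Because an occluded camera view yields a flat, high-entropy distribution, this scheme automatically shifts trust to the weight sensor exactly in the cases where vision fails, which matches the accuracy gain the thesis reports.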

Following the principle of system stability and reliability, an intelligent unmanned vending cabinet system is designed that supports a fast shopping flow: scan a code to open the door, pick items freely, and settle automatically when the door closes. Tests show that the system runs stably and the shopping process is simple, meeting practical application needs. The results of this thesis provide a useful reference for research on unmanned retail commodity recognition.


CLC Number: TP391.4

Release Date: 2022-06-27
