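Thesis title (Chinese): | 基于深度学习的西餐食材识别方法研究与实现 |
Name: | |
Student ID: | 22208223044 |
Confidentiality level: | Public |
Thesis language: | chi |
Discipline code: | 085400 |
Discipline name: | Engineering - Electronic Information |
Student type: | Master's |
Degree level: | Master of Engineering |
Degree year: | 2025 |
Institution: | Xi'an University of Science and Technology |
School/Department: | |
Major: | |
Research direction: | Computer Vision |
First supervisor name: | |
First supervisor affiliation: | |
Thesis submission date: | 2025-06-16 |
Thesis defense date: | 2025-05-27 |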
Thesis title (English): | Research and Implementation of Western Food Ingredient Recognition Method Based on Deep Learning |
Keywords (Chinese): | |
Keywords (English): | Object Detection ; Semantic Segmentation ; Image Recognition ; Food Computing ; Food Recognition |
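Abstract (translated from Chinese): |
The emergence of multi-source data from social networks and the Internet of Things has produced a massive volume of Western food images. Against the backdrop of rapid progress in deep learning, this trend has driven the formation of food computing, an emerging interdisciplinary field. As one of the core tasks of food computing, Western food image recognition has become an important foundation for intelligent catering services, smart health, and nutrition analysis systems. However, its two subtasks, Western dish recognition and Western ingredient recognition, face two main problems in practice: dish recognition methods have large parameter counts and slow inference, and ingredient recognition algorithms struggle with edge overlap when recognizing multiple ingredient categories. This thesis therefore studies dish recognition and ingredient recognition as two consecutive stages of one workflow. The specific research content is as follows:
(1) To address the large parameter counts and slow inference of existing Western dish recognition models, a lightweight dish recognition algorithm based on DBFMViT-ASDH-YOLO11 (Dual-Branch Fusion MobileViT Alterable Shared Detection Head-YOLO11) is proposed. First, a dual-branch fusion mobile vision transformer, DBFMViT, is designed for the YOLO11 backbone. Built on the lightweight MobileViT v3 network, DBFMViT uses a dual-branch skip-connection structure to fuse shallow and deep features across stages, capturing dish features effectively while reducing the parameter count. Second, an alterable shared detection head, ASDH, is designed; it introduces alterable-kernel convolution and shared convolution to make the model lighter still. Experimental results on the Food Detection dataset show that mAP improves by 1.11%, the parameter count (Params) falls by about 37.9%, and inference speed rises by 31.25%.
(2) To address the edge-overlap problem in multi-category ingredient recognition, a Western ingredient recognition method based on MEFPN-SC-Mask2Former (Multi-scale Enhanced Feature Pyramid Network Semantic Concern-Mask2Former) is proposed. First, a multi-scale enhanced feature pyramid network, MEFPN, is designed; it processes the feature maps at each level of the multi-scale hierarchy separately to obtain global and local features, which are used to extract the irregular shape features of ingredient objects. Second, a semantic concern (SC) module is designed to focus on semantic information and thereby capture the overlap relationships between ingredient objects. Experimental results on the FoodSeg103 dataset show that, compared with the better-performing representative Western ingredient recognition methods, the model improves mIoU, mAcc, and aAcc by 2.16%, 2.37%, and 1.18%, respectively.
(3) Based on the dish recognition and ingredient recognition models proposed in this thesis, a Western ingredient recognition system is designed and implemented. Built on a B/S architecture, the system provides user registration and login, Western dish recognition, Western ingredient recognition, model optimization, and AI nutrition analysis. |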
Abstract (English): |
The proliferation of multi-source data from social networks and the Internet of Things has generated massive volumes of Western food images. Against the backdrop of rapid advances in deep learning, this trend has fostered the emergence of food computing as an interdisciplinary field. As one of its core tasks, Western food image recognition has become a foundation for intelligent dining services, smart health, and nutrition analysis systems. However, its two subtasks (Western dish recognition and Western ingredient recognition) face two major challenges in practical applications: dish recognition methods typically suffer from large parameter counts and slow inference, and ingredient recognition algorithms struggle to resolve edge overlap when recognizing multiple ingredient categories. This study therefore addresses dish recognition and ingredient recognition as two consecutive stages of one workflow, with the following contributions:
(1) To address the large parameter counts and slow inference of existing Western dish recognition models, we propose a lightweight algorithm named DBFMViT-ASDH-YOLO11 (Dual-Branch Fusion MobileViT Alterable Shared Detection Head-YOLO11). Specifically, we design a dual-branch fusion MobileViT (DBFMViT) backbone for YOLO11, which builds on the lightweight MobileViT v3 network and fuses shallow and deep features across stages via dual-branch skip connections, capturing dish features effectively while reducing the number of parameters. In addition, we introduce an Alterable Shared Detection Head (ASDH), which uses alterable-kernel convolutions and shared convolutions to lighten the model further. Experimental results on the Food Detection dataset show a 1.11% improvement in mAP, a reduction of approximately 37.9% in the number of parameters, and a 31.25% increase in inference speed.
(2) To tackle edge overlap in multi-category ingredient recognition, we propose a Western ingredient recognition method based on MEFPN-SC-Mask2Former (Multi-scale Enhanced Feature Pyramid Network Semantic Concern-Mask2Former). First, we design a Multi-scale Enhanced Feature Pyramid Network (MEFPN) that processes the feature maps at each level separately to extract both global and local features, which are used to capture irregular ingredient shapes. Second, we develop a Semantic Concern (SC) module that focuses on semantic information in order to model the overlap relationships among ingredient objects. Experimental results on the FoodSeg103 dataset show that, compared with representative high-performing ingredient recognition methods, our approach improves mIoU, mAcc, and aAcc by 2.16%, 2.37%, and 1.18%, respectively.
(3) Based on the proposed dish and ingredient recognition models, we design and implement a Western ingredient recognition system. Built on a B/S (Browser/Server) architecture, the system supports user registration and login, Western dish recognition, ingredient recognition, model optimization, and AI-driven nutrition analysis. |
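The segmentation results in the abstract are reported as mIoU, mAcc, and aAcc. As a reading aid, the following minimal Python sketch shows how these three metrics are commonly derived from a pixel-level confusion matrix. The function name `segmentation_metrics`, the toy matrix, and the handling of absent classes are illustrative assumptions, not taken from the thesis, which may compute the metrics differently (for example through an evaluation toolkit).

```python
def segmentation_metrics(conf):
    """Compute mIoU, mAcc and aAcc from a square confusion matrix.

    conf[i][j] is the number of pixels whose true class is i and whose
    predicted class is j. Assumed convention: a class is left out of
    the IoU average when it appears in neither the ground truth nor
    the prediction, and out of the accuracy average when it has no
    ground-truth pixels.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    if total == 0:
        raise ValueError("confusion matrix contains no pixels")

    correct = sum(conf[i][i] for i in range(n))
    ious, accs = [], []
    for i in range(n):
        tp = conf[i][i]                           # correctly labelled pixels of class i
        gt = sum(conf[i])                         # pixels whose true class is i
        pred = sum(conf[r][i] for r in range(n))  # pixels predicted as class i
        union = gt + pred - tp
        if union > 0:
            ious.append(tp / union)               # per-class IoU
        if gt > 0:
            accs.append(tp / gt)                  # per-class pixel accuracy

    return {
        "mIoU": sum(ious) / len(ious),            # mean of per-class IoU
        "mAcc": sum(accs) / len(accs),            # mean of per-class accuracy
        "aAcc": correct / total,                  # overall pixel accuracy
    }


# Hypothetical two-class example (toy numbers, not data from the thesis).
toy = [[3, 1],
       [1, 5]]
metrics = segmentation_metrics(toy)
print({name: round(value, 4) for name, value in metrics.items()})
```

Because mIoU and mAcc average over classes while aAcc averages over pixels, a model that handles frequent ingredients well but rare ones poorly can show a high aAcc alongside a much lower mIoU, which is one reason all three figures are usually reported together.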
Chinese Library Classification (CLC) number: | TP391 |
Open-access date: | 2025-06-24 |