Thesis Information

Chinese Title:

 Research on a Dual-Source-Input Pedestrian Fall Detection Algorithm Based on an Improved YOLOv8 Model

Name:

 贾仟国

Student ID:

 21206043036

Confidentiality Level:

 Confidential (open after 1 year)

Language:

 Chinese (chi)

Discipline Code:

 081104

Discipline:

 Engineering - Control Science and Engineering - Pattern Recognition and Intelligent Systems

Student Type:

 Master's

Degree Level:

 Master of Engineering

Degree Year:

 2024

Institution:

 Xi'an University of Science and Technology

School:

 College of Electrical and Control Engineering

Major:

 Control Science and Engineering

Research Direction:

 Object Detection

First Supervisor:

 杨勇

First Supervisor's Institution:

 Xi'an University of Science and Technology

Submission Date:

 2024-06-17

Defense Date:

 2024-06-06

English Title:

 Research on Human Fall Detection Algorithm with Dual-Input Based on Improved YOLOv8 Model

Chinese Keywords:

 YOLOv8 ; SE Attention Mechanism ; 3D Dilated Convolution ; Deformable Convolution ; Dual-Path Classification Network

English Keywords:

 YOLOv8 ; SE Attention Mechanism ; 3D Dilated Convolution ; Deformable Convolution ; Dual-Path Classification Network

Chinese Abstract:

Falling can cause severe harm to the human body and is especially likely to prove fatal for the elderly, so timely and effective detection of pedestrian fall incidents is of great significance for subsequent rescue and treatment. Most current research on fall detection uses two-stage approaches and generally loses both temporal-dimension information and the comparative information between videos taken from different angles. On this basis, this thesis takes dual-source fall videos as the research object and studies fall detection methods built on the YOLOv8 algorithm. The specific work is as follows:

1. Given that falling behavior has distinct spatiotemporal characteristics, an improved version of the YOLOv8 detection algorithm, named S3DD-YOLOv8, is developed. The first 2D convolution layer of the backbone network is replaced with a 3D dilated convolution to extract temporal features; an SE attention mechanism is introduced and combined with the C2f module to emphasize pedestrian motion information; and, considering the postural characteristics of falling pedestrians, the last 2D convolution of the backbone is replaced with a deformable convolution. Because the original data augmentation scheme is rotation-invariant, which conflicts with the postural characteristics of falls, it is optimized accordingly. Experiments show that the network achieves an $mAP50$ of 0.989, a 9.889\% improvement over YOLOv8, with precision and recall improved by 10.91\% and 8.705\% to reach 0.986 and 0.974, respectively.

2. To make full use of multi-source data from public platforms, a comparative study of pedestrian falls from multiple angles is conducted and a dual-path classification algorithm is proposed. The network consists of two backbone networks. Because the regression boxes from different views cannot be unified, the original network is converted into a classification model: the bounding-box regression of the original YOLOv8 is removed, the original neck network is discarded, and a redesigned classification module forms the new head network. Dual-path detection faces the problem of mismatched video information, since differing viewing angles may cause the target to appear in only one camera. The loss function is therefore improved to use class labels for judgment: when the two inputs' classes agree, the classification result is output directly; when they disagree, the label is judged as a fall. The new algorithm is named YOLOv8-DualPath. Experiments show that, compared with YOLOv8's classification algorithm, its top-1 accuracy and F1 score improve by 7.36\% and 6.97\%, reaching 0.948 and 0.951. When the S3DD-YOLOv8 backbone is used, the dual-path classification algorithm achieves its highest detection accuracy, with a top-1 accuracy of 0.987 and an F1 score of 0.986.

In summary, exploiting the temporal continuity of falls and the comparative information between videos from different angles, this thesis improves the structure, data preprocessing, and relevant loss functions of the YOLOv8 network. Experiments on the UR Fall Detection Dataset validate the excellent fall-detection performance of the proposed algorithms.

English Abstract:

Falls can inflict significant harm on the human body, especially among the elderly, often leading to fatalities. Timely and effective detection of falls is crucial for subsequent rescue and treatment. Current research on fall detection mainly focuses on two-stage approaches, which tend to lose temporal dimension information and comparative information from different video angles. This study examines dual-source fall videos and builds upon the YOLOv8 algorithm to investigate fall detection methods. The specific work contents are outlined below:

1. Given the distinct spatiotemporal nature of falling behaviors, this work presents an improved detection algorithm, termed S3DD-YOLOv8, derived from YOLOv8. Enhancements involve replacing the backbone's initial 2D layer with a 3D dilated convolution for temporal feature extraction, introducing an SE attention mechanism fused with C2f to emphasize human motion cues, and converting the final 2D convolution to a deformable convolution to better capture falling postures. The data augmentation strategy was refined to accommodate the anisotropic attributes of falls, overcoming its inherent rotational invariance. Empirical validation attests to S3DD-YOLOv8's superior performance, achieving an $mAP50$ of 0.989, reflecting a 9.889\% enhancement over YOLOv8, with precision and recall improved by 10.91\% and 8.705\%, reaching 0.986 and 0.974, respectively.
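As a concrete illustration of the first modification, the temporal stem might look like the following PyTorch sketch. The channel counts, clip length, dilation rate, and the mean-pooling used to collapse the temporal axis are assumptions for illustration, not the thesis's exact settings:

```python
import torch
import torch.nn as nn

class Stem3D(nn.Module):
    """Hypothetical replacement for the first 2D conv of the YOLOv8
    backbone: a 3D dilated convolution over a short clip of frames,
    whose output collapses the temporal axis so the remaining 2D
    backbone can stay unchanged."""

    def __init__(self, in_ch=3, out_ch=64, n_frames=4, dilation=2):
        super().__init__()
        # Dilate along the temporal dimension only; spatial padding
        # of 1 preserves H and W for the 3x3 spatial kernel.
        self.conv = nn.Conv3d(
            in_ch, out_ch,
            kernel_size=(n_frames, 3, 3),
            dilation=(dilation, 1, 1),
            padding=(0, 1, 1),
        )

    def forward(self, x):
        # x: (B, C, T, H, W); requires T >= (n_frames - 1) * dilation + 1
        y = self.conv(x)       # -> (B, out_ch, T', H, W)
        return y.mean(dim=2)   # collapse time -> (B, out_ch, H, W)
```

With the defaults above, the effective temporal receptive field is (4 - 1) * 2 + 1 = 7 frames, so a 7-frame clip yields a single fused feature map per clip.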

2. To harness diverse data from public platforms, a dual-path detection model for comparative analysis of falls from multiple angles was devised. Dubbed YOLOv8-DualPath, this framework consists of twin backbones reconfigured into a classification model: because bounding boxes cannot be unified across perspectives, the bounding-box regression and neck components are removed and replaced with a newly designed classification head. To tackle mismatches between video streams from different angles, the loss function was amended to utilize class labels: consistent inputs prompt direct classification outputs, while conflicting inputs denote a fall. Evaluation outcomes show YOLOv8-DualPath outperforms YOLOv8's classification variant, boosting top-1 accuracy and F1 score by 7.36\% and 6.97\% to 0.948 and 0.951, respectively. Moreover, adopting the S3DD-YOLOv8 backbone elevates the dual-path classification system's performance, yielding a top-1 accuracy of 0.987 and an F1 score of 0.986, highlighting its effectiveness for fall detection from varied viewpoints.
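The label rule described above, where agreeing views keep their shared class and disagreeing views are judged a fall, can be sketched as follows. The class indices and the averaging of the two heads' cross-entropy losses are assumptions about details the abstract does not specify:

```python
import torch
import torch.nn.functional as F

FALL, NOT_FALL = 1, 0  # illustrative class indices

def fuse_dual_labels(labels_a, labels_b):
    """When the two camera views agree, keep the shared label; when
    they disagree (e.g. the person is visible in only one view),
    judge the fused label as 'fall'."""
    return torch.where(labels_a == labels_b,
                       labels_a,
                       torch.full_like(labels_a, FALL))

def dual_path_loss(logits_a, logits_b, labels_a, labels_b):
    """Cross-entropy of both classification heads against the fused
    label; averaging the two heads is an assumed fusion choice."""
    target = fuse_dual_labels(labels_a, labels_b)
    return 0.5 * (F.cross_entropy(logits_a, target)
                  + F.cross_entropy(logits_b, target))
```

Driving the supervision target from label agreement rather than per-view annotations is what lets the two paths train jointly even when the subject leaves one camera's field of view.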

In summary, this study improves the structure, data preprocessing, and relevant loss functions of the YOLOv8 network based on the temporal continuity of falls and the comparative information between videos from different angles. Experiments on the UR Fall Detection Dataset demonstrate the excellent performance of the proposed algorithms for fall detection.

CLC Number:

 TP391

Open Access Date:

 2025-06-18
