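Thesis title (Chinese): | 基于骨架信息与图卷积的人体跌倒检测算法研究 |
Name: | |
Student ID: | 21206223053 |
Confidentiality level: | Public |
Thesis language: | chi |
Discipline code: | 085400 |
Discipline name: | Engineering - Electronic Information |
Student type: | Master's |
Degree level: | Master of Engineering |
Degree year: | 2024 |
Degree-granting institution: | Xi'an University of Science and Technology |
School/department: | |
Major: | |
Research direction: | Image processing |
First supervisor name: | |
First supervisor affiliation: | |
Second supervisor name: | |
Thesis submission date: | 2024-06-17 |
Thesis defense date: | 2024-06-06 |
Thesis title (foreign language): | Research On Human Fall Detection Algorithm Based On Skeleton Information And Graph Convolution Network |
Keywords (Chinese): | |
Keywords (foreign language): | Fall Detection ; Pose Estimation ; Skeleton Sequence ; Graph Convolution Networks ; Model Lightweight |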
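Thesis abstract (translated from Chinese): |
Falls occur frequently in daily life and at work and cause many injuries and safety problems. For the elderly in particular, a fall can have serious consequences, ranging from bruises and fractures in mild cases to brain damage and coma in severe ones. Detecting and recognizing falls quickly and accurately is therefore of great significance for reducing fall-related injuries and improving quality of life and work. To address the poor accuracy of existing fall detection algorithms and their susceptibility to false detections caused by the environment, this thesis proposes a human fall detection algorithm based on skeleton information and graph convolution. The main contents are as follows:
(1) To address inaccurate, low-confidence detection of human joints in complex environments, as well as the redundant parameters of skeleton sequence extractor networks, this thesis proposes a skeleton sequence extraction method based on YOLO v8 and GCB-HRNet. The method uses the YOLO v8n algorithm to locate human targets in video and then applies the designed GCB-HRNet to extract the skeleton from each detected human region. In GCB-HRNet, an attention-based lightweight convolution module, GCA, replaces the convolution modules of HRNet, reducing the parameter count and computation while making the model focus more on joint locations; the fourth stage of HRNet is restructured and BiFPN is introduced to avoid the feature redundancy caused by feature fusion, which further reduces parameters and computation and improves generalization. The proposed joint extraction method reaches 72.9% mAP on the COCO dataset, with only 1.47G of computation and 3.89M parameters. Experiments show that the proposed method obtains joint data effectively.
(2) To address the insufficient mining of spatio-temporal features in current fall detection algorithms, which leads to low fall recognition rates and false detections of similar actions, this thesis proposes a skeleton-based human fall detection algorithm built on SMA-GCN. The method designs a mixed shift spatial graph convolution module, MShift-GCN, so that the model fully exploits the latent features of joints that are not physically connected, improving detection accuracy; it designs a multi-scale temporal convolution module, MS-TGC, to address the model's insensitivity to temporal features, applying dilated convolution twice in succession to avoid losing temporal information; and it uses a spatio-temporal joint attention module, STA, to increase the weights of key frames and joints, improving accuracy. The proposed algorithm achieves accuracies of 91.1% and 97.1% on the X-sub and X-view benchmarks of the NTU 60 RGB+D dataset, with fall-class accuracies of 99.64% and 100%, respectively. On the LFD dataset, accuracy and recall are 98.6% and 98.86%, respectively. Experiments show that the proposed algorithm achieves a higher recognition rate and better robustness. |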
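The accuracy and recall figures quoted for the LFD dataset can be read against the standard definitions of those metrics. The sketch below assumes fall detection is scored as a binary fall / non-fall task; the function names and the confusion counts are hypothetical and are not taken from the thesis.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of all samples (fall and non-fall) classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def recall(tp: int, fn: int) -> float:
    """Fraction of true falls that the detector actually flagged."""
    return tp / (tp + fn)


# Hypothetical toy counts for illustration only (not results from the thesis).
tp, fp, tn, fn = 8, 1, 9, 2
print(f"accuracy = {accuracy(tp, fp, tn, fn):.2f}")  # 17 correct out of 20
print(f"recall   = {recall(tp, fn):.2f}")            # 8 of 10 falls detected
```

Recall matters separately from accuracy here because a missed fall (a false negative) is usually the costlier error in a monitoring setting.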
Thesis abstract (foreign language): |
Falls occur frequently in daily life and at work, causing many injuries and safety problems. For the elderly in particular, a fall can have serious consequences, ranging from bruises and fractures to severe brain damage and coma. Rapid and accurate detection and identification of fall behavior is of great significance for reducing the harm caused by falls and improving the quality of people's lives and work. Therefore, this thesis proposes a human fall detection algorithm based on skeleton information and graph convolution to address the poor accuracy of existing fall detection algorithms and the false detections caused by their sensitivity to the environment. The main contents of this thesis are as follows:
(1) To address the inaccurate detection and low confidence of human joint points in complex environments, as well as the redundant parameters of the skeleton sequence extractor network, this thesis proposes a skeleton sequence data extraction method based on YOLO v8 and GCB-HRNet. The method uses the YOLO v8n algorithm to detect the position of the human target in the video and then uses the designed GCB-HRNet to extract the skeleton from the detected human target area. For the GCB-HRNet network, an attention-based lightweight convolution module, GCA, is designed to replace the convolution module in HRNet, which reduces the number of model parameters and the amount of computation while making the model pay more attention to the location of joint points. The fourth-stage structure of HRNet is improved, and BiFPN is introduced to avoid the feature redundancy caused by feature fusion, reducing the parameter count and computation of the model and improving its generalization. The proposed joint point extraction method achieves an mAP of 72.9% on the COCO dataset, with only 1.47G of computation and 3.89M parameters. Experiments show that the proposed method can effectively obtain joint point data.
(2) To address the problem that current fall detection algorithms do not fully mine spatio-temporal features, resulting in a low recognition rate for fall behavior and false detections of similar behavior, this thesis proposes a human skeleton fall detection algorithm based on SMA-GCN. In this method, a mixed shift spatial graph convolution module, MShift-GCN, is designed so that the model fully exploits the latent features of joint points that are not physically connected, improving detection accuracy. A multi-scale temporal convolution module, MS-TGC, is designed to address the model's insensitivity to temporal features; by applying dilated convolution twice in a row, it avoids the loss of temporal information. A spatio-temporal joint attention module, STA, is used to increase the weights of key frames and joint points and improve the accuracy of the model. The proposed fall detection algorithm achieves accuracies of 91.1% and 97.1% on the X-sub and X-view benchmarks of the NTU 60 RGB+D dataset, with fall-behavior accuracies of 99.64% and 100%, respectively. On the LFD dataset, the accuracy and recall are 98.6% and 98.86%, respectively. Experiments show that the proposed fall detection algorithm has a higher recognition rate and better robustness. |
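The abstract mentions applying dilated convolution twice in a row along the time axis. The sketch below is not the MS-TGC module itself; it is a dependency-free illustration of how two stacked dilated 1D convolutions widen the temporal receptive field. The kernel size (3), the dilation rate (2), and the helper names `dilated_conv1d` and `receptive_field` are assumptions chosen for illustration.

```python
from typing import Iterable, List, Sequence, Tuple


def dilated_conv1d(x: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """'Valid' 1D dilated convolution (cross-correlation form) along the time axis."""
    span = (len(kernel) - 1) * dilation  # distance between the first and last tap
    return [
        sum(w * x[t + i * dilation] for i, w in enumerate(kernel))
        for t in range(len(x) - span)
    ]


def receptive_field(layers: Iterable[Tuple[int, int]]) -> int:
    """Receptive field of stacked stride-1 layers given (kernel_size, dilation) pairs."""
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf


# Hypothetical settings: kernel size 3 and dilation 2 for both layers.
frames = [float(t) for t in range(20)]  # stand-in for one feature channel over 20 frames
kernel = [1.0, 1.0, 1.0]
once = dilated_conv1d(frames, kernel, dilation=2)
twice = dilated_conv1d(once, kernel, dilation=2)

print(len(once), len(twice))              # 16 12
print(receptive_field([(3, 2), (3, 2)]))  # 9
```

With these assumed settings, each output of the second layer depends on a window of 9 input frames, compared with 5 frames for a single dilated layer, without adding any weights per layer.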
Chinese Library Classification (CLC) number: | TP391.4 |
Open access date: | 2024-06-17 |