Thesis Information

Thesis Title (Chinese): 基于骨架信息与图卷积的人体跌倒检测算法研究

Name: 张尚辉 (Zhang Shanghui)

Student ID: 21206223053

Confidentiality Level: Public

Thesis Language: Chinese (chi)

Discipline Code: 085400

Discipline Name: Engineering - Electronic Information

Student Type: Master's candidate

Degree: Master of Engineering

Degree Year: 2024

Degree-Granting Institution: 西安科技大学 (Xi'an University of Science and Technology)

School: School of Electrical and Control Engineering (电气与控制工程学院)

Major: Pattern Recognition and Intelligent Systems

Research Direction: Image Processing

First Supervisor: 杨学存 (Yang Xuecun)

First Supervisor's Institution: Xi'an University of Science and Technology

Second Supervisor: 张金玉 (Zhang Jinyu)

Submission Date: 2024-06-17

Defense Date: 2024-06-06

Thesis Title (English): Research on Human Fall Detection Algorithm Based on Skeleton Information and Graph Convolution Network

Keywords (Chinese): 跌倒检测; 姿态估计; 骨架序列; 图卷积网络; 轻量化模块

Keywords (English): Fall Detection; Pose Estimation; Skeleton Sequence; Graph Convolution Networks; Lightweight Module

Abstract (translated from the Chinese):

Falls occur frequently in everyday life and work, causing numerous injuries and safety problems. For the elderly in particular, a fall can have serious consequences, ranging from bruises and fractures to brain damage and coma. Detecting and recognizing fall behavior quickly and accurately is therefore of great significance for reducing fall-related injuries and improving quality of life and work. To address the poor accuracy of existing fall detection algorithms and their susceptibility to environment-induced false detections, this thesis proposes a human fall detection algorithm based on skeleton information and graph convolution. The main contributions are as follows:

(1) To address inaccurate, low-confidence joint detection in complex environments and the parameter redundancy of skeleton-sequence extractor networks, this thesis proposes a skeleton-sequence extraction method based on YOLOv8 and GCB-HRNet. The method first locates human targets in video with the YOLOv8n detector, then extracts skeletons from the detected regions with the proposed GCB-HRNet. Within GCB-HRNet, an attention-based lightweight convolution module (GCA) replaces HRNet's convolution blocks, cutting parameters and computation while directing the model's attention to joint-position information; the fourth stage of HRNet is restructured by introducing BiFPN, which avoids the feature redundancy caused by feature fusion, further reduces parameters and computation, and improves generalization. The proposed joint extraction method reaches 72.9% mAP on the COCO dataset with only 1.47 GFLOPs of computation and 3.89M parameters. Experiments show that it acquires joint data effectively.
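The two-stage pipeline can be sketched as follows. This is a minimal illustration under stated assumptions: person detection uses the ultralytics YOLOv8 API, while `pose_net` is a stand-in for GCB-HRNet (whose implementation is not public); any top-down pose network producing 17-channel COCO keypoint heatmaps could be plugged in.

```python
# Minimal sketch of the two-stage skeleton extraction pipeline: YOLOv8n
# finds person boxes, then a top-down pose network turns each crop into
# COCO keypoint heatmaps. `pose_net` is a stand-in for GCB-HRNet.
import cv2
import torch
from ultralytics import YOLO  # assumption: ultralytics package for YOLOv8

def decode_heatmaps(heatmaps, box):
    """Map per-joint heatmap argmaxes back to image coordinates."""
    x1, y1, x2, y2 = box
    _, k, h, w = heatmaps.shape                      # (1, 17, h, w)
    scores, idx = heatmaps.view(k, -1).max(dim=1)    # peak per joint
    xs = (idx % w).float() / w * (x2 - x1) + x1
    ys = (idx // w).float() / h * (y2 - y1) + y1
    return torch.stack([xs, ys, scores], dim=1)      # (17, 3) per person

def extract_skeletons(frame, detector, pose_net):
    """Return one (17, 3) keypoint tensor (x, y, confidence) per person."""
    skeletons = []
    for box in detector(frame, classes=[0])[0].boxes:    # class 0 = person
        x1, y1, x2, y2 = map(int, box.xyxy[0])
        crop = cv2.resize(frame[y1:y2, x1:x2], (192, 256))  # HRNet W x H
        inp = torch.from_numpy(crop).permute(2, 0, 1).float().unsqueeze(0) / 255
        with torch.no_grad():
            heatmaps = pose_net(inp)                 # e.g. (1, 17, 64, 48)
        skeletons.append(decode_heatmaps(heatmaps, (x1, y1, x2, y2)))
    return skeletons

detector = YOLO("yolov8n.pt")  # pretrained person detector
# pose_net = <load GCB-HRNet or any top-down COCO keypoint model here>
```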

(2) To address the insufficient spatio-temporal feature mining of current fall detection algorithms, which leads to low fall recognition rates and false detections of similar actions, this thesis proposes a skeleton-based fall detection algorithm, SMA-GCN. The method designs a mixed-shift spatial graph convolution module (MShift-GCN) that lets the model fully exploit latent features between joints that are not physically connected, improving detection accuracy; a multi-scale temporal convolution module (MS-TGC) that remedies the model's insensitivity to temporal features and, by applying dilated convolution twice in succession, avoids losing temporal information; and a spatio-temporal joint attention module (STA) that raises the weights of key frames and key joints to further improve accuracy. The proposed algorithm achieves 91.1% and 97.1% accuracy on the X-sub and X-view splits of the NTU RGB+D 60 dataset, with fall-action accuracies of 99.64% and 100%; on the LFD dataset, accuracy and recall are 98.6% and 98.86%, respectively. Experiments show that the proposed fall detection algorithm has a higher recognition rate and stronger robustness.
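MShift-GCN builds on shift-style graph convolution. The sketch below shows the generic non-local spatial shift from Shift-GCN (Cheng et al.): each channel reads its value from a different joint, so information flows between joints that are not physically connected, and a pointwise convolution then mixes the shifted channels. It illustrates the underlying operation only, not the thesis's mixed-shift variant.

```python
# Sketch of the non-local spatial shift used by Shift-GCN-style blocks:
# channel c of joint v reads from joint (v + c) mod V, then a 1x1 conv
# fuses the shifted channels. Illustrative, not the thesis code.
import torch
import torch.nn as nn

class ShiftGraphConv(nn.Module):
    def __init__(self, in_ch, out_ch, num_joints=17):
        super().__init__()
        # precompute, per channel, which joint each position reads from
        joint_idx = torch.arange(num_joints).view(1, num_joints)
        chan_idx = torch.arange(in_ch).view(in_ch, 1)
        self.register_buffer("shift", (joint_idx + chan_idx) % num_joints)
        self.fuse = nn.Conv2d(in_ch, out_ch, kernel_size=1)  # pointwise mixing

    def forward(self, x):                       # x: (N, C, T, V)
        n, c, t, v = x.shape
        idx = self.shift.view(1, c, 1, v).expand(n, c, t, v)
        shifted = torch.gather(x, dim=3, index=idx)  # roll each channel
        return self.fuse(shifted)

x = torch.randn(2, 64, 30, 17)                  # 2 clips, 30 frames, 17 joints
print(ShiftGraphConv(64, 64)(x).shape)          # torch.Size([2, 64, 30, 17])
```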

Abstract (English):

Nowadays, falls occur frequently in life and work, causing many injuries and safety problems. For the elderly in particular, a fall can have serious consequences, ranging from bruises and fractures to severe brain damage and coma. Rapid and accurate detection and recognition of fall behavior is of great significance for reducing the harm caused by falls and improving people's quality of life and work. Therefore, this thesis proposes a human fall detection algorithm based on skeleton information and graph convolution to address the poor accuracy of existing fall detection algorithms and their susceptibility to environmental interference. The main contents of this thesis are as follows:

(1) Aiming at inaccurate, low-confidence detection of human joints in complex environments and the redundant parameters of skeleton-sequence extractor networks, this thesis proposes a skeleton-sequence extraction method based on YOLOv8 and GCB-HRNet. The method uses the YOLOv8n algorithm to detect the positions of human targets in video, and then applies the designed GCB-HRNet to extract the skeleton from each detected region. For the GCB-HRNet network, an attention-based lightweight convolution module, GCA, is designed to replace the convolution blocks in HRNet, reducing the model's parameters and computation while making it attend more closely to joint-position information. The fourth stage of HRNet is also improved: BiFPN is introduced to avoid the feature redundancy caused by feature fusion, further reducing the parameters and computation and improving the model's generalization. The proposed joint extraction method achieves 72.9% mAP on the COCO dataset with only 1.47 GFLOPs of computation and 3.89M parameters. Experiments show that it can effectively obtain joint data.
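The GCA module's internals are not given here; as an illustration only, the sketch below composes a GhostNet-style cheap convolution with coordinate attention, which matches the "attention-based lightweight convolution" description, but the thesis's actual GCA design may differ.

```python
# A guess at the flavor of the GCA module: GhostNet-style cheap convolution
# followed by coordinate attention. Treat as an illustrative composition,
# not the thesis's actual design.
import torch
import torch.nn as nn

class GhostConv(nn.Module):
    """Half the output channels from a normal conv, half from a cheap
    depthwise conv on the primary output (GhostNet idea)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        primary = out_ch // 2
        self.primary = nn.Conv2d(in_ch, primary, 3, padding=1, bias=False)
        self.cheap = nn.Conv2d(primary, out_ch - primary, 3, padding=1,
                               groups=primary, bias=False)
    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

class CoordAttention(nn.Module):
    """Coordinate attention: pool along H and W separately so the attention
    map preserves positional (joint-location) information."""
    def __init__(self, ch, reduction=8):
        super().__init__()
        mid = max(ch // reduction, 8)
        self.conv1 = nn.Conv2d(ch, mid, 1)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, ch, 1)
        self.conv_w = nn.Conv2d(mid, ch, 1)
    def forward(self, x):
        n, c, h, w = x.shape
        pool_h = x.mean(dim=3, keepdim=True)                    # (N, C, H, 1)
        pool_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)  # (N, C, W, 1)
        y = self.act(self.conv1(torch.cat([pool_h, pool_w], dim=2)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = self.conv_h(y_h).sigmoid()                        # (N, C, H, 1)
        a_w = self.conv_w(y_w.permute(0, 1, 3, 2)).sigmoid()    # (N, C, 1, W)
        return x * a_h * a_w

gca = nn.Sequential(GhostConv(64, 64), CoordAttention(64))
print(gca(torch.randn(1, 64, 64, 48)).shape)   # torch.Size([1, 64, 64, 48])
```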

(2) Aiming at the insufficient mining of spatio-temporal features in current fall detection algorithms, which results in low recognition rates for falls and false detections of similar actions, this thesis proposes a human-skeleton fall detection algorithm based on SMA-GCN. A mixed-shift spatial graph convolution module, MShift-GCN, is designed so that the model fully exploits the latent features of joints that are not physically connected, improving detection accuracy. A multi-scale temporal convolution module, MS-TGC, is designed to address the model's insensitivity to temporal features; by applying dilated convolution twice in succession, it avoids the loss of temporal information. A spatio-temporal joint attention module, STA, increases the weights of key frames and key joints to improve accuracy further. The proposed fall detection algorithm achieves accuracies of 91.1% and 97.1% on the X-sub and X-view splits of the NTU RGB+D 60 dataset, with fall-action accuracies of 99.64% and 100%. On the LFD dataset, the accuracy and recall are 98.6% and 98.86%, respectively. Experiments show that the proposed algorithm offers a higher recognition rate and greater robustness.
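To make the multi-scale temporal idea concrete, the sketch below runs parallel temporal branches with different dilation rates over the standard (N, C, T, V) skeleton tensor used by ST-GCN-style models. The branch widths, kernel size, and dilation rates are illustrative assumptions, not the thesis's exact MS-TGC configuration.

```python
# Sketch of a multi-scale temporal convolution block in the spirit of MS-TGC:
# parallel temporal branches with different dilations over a skeleton tensor
# of shape (N, C, T, V) = (batch, channels, frames, joints).
import torch
import torch.nn as nn

class MultiScaleTemporalConv(nn.Module):
    def __init__(self, channels, dilations=(1, 2)):
        super().__init__()
        branch_ch = channels // (len(dilations) + 1)
        self.branches = nn.ModuleList()
        for d in dilations:
            self.branches.append(nn.Sequential(
                nn.Conv2d(channels, branch_ch, kernel_size=1),
                nn.BatchNorm2d(branch_ch),
                nn.ReLU(inplace=True),
                # temporal-only kernel: (5, 1) acts along frames, not joints;
                # padding 2*d keeps T unchanged for any dilation d
                nn.Conv2d(branch_ch, branch_ch, kernel_size=(5, 1),
                          padding=(2 * d, 0), dilation=(d, 1)),
                nn.BatchNorm2d(branch_ch),
            ))
        # a plain 1x1 branch preserves undilated information
        rest = channels - branch_ch * len(dilations)
        self.branches.append(nn.Sequential(
            nn.Conv2d(channels, rest, kernel_size=1),
            nn.BatchNorm2d(rest),
        ))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):                      # x: (N, C, T, V)
        out = torch.cat([b(x) for b in self.branches], dim=1)
        return self.relu(out + x)              # residual keeps gradients stable

x = torch.randn(2, 64, 30, 17)                 # 2 clips, 30 frames, 17 joints
print(MultiScaleTemporalConv(64)(x).shape)     # torch.Size([2, 64, 30, 17])
```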


CLC Number: TP391.4

Open Access Date: 2024-06-17
