Thesis Information

Thesis Title (Chinese):

 基于发音想象脑机接口的字符分类与生成技术研究    

Author:

 李卓逸 (Li Zhuoyi)

Student ID:

 20206223053    

Confidentiality Level:

 Confidential (to be opened after 1 year)

Thesis Language:

 chi (Chinese)

Discipline Code:

 085400    

Discipline Name:

 Engineering - Electronic Information

Student Type:

 Master's

Degree Level:

 Master of Engineering

Degree Year:

 2023    

Degree-Granting Institution:

 西安科技大学 (Xi'an University of Science and Technology)

School:

 电气与控制工程学院 (College of Electrical and Control Engineering)

Major:

 Control Engineering

Research Direction:

 Speech imagery brain-computer interfaces

Primary Supervisor:

 潘红光 (Pan Hongguang)

Primary Supervisor's Institution:

 西安科技大学 (Xi'an University of Science and Technology)

Submission Date:

 2023-06-15    

Defense Date:

 2023-06-02    

Thesis Title (English):

 Research on Character Classification and Generation Techniques Based on a Speech Imagery Brain–Computer Interface

Keywords (Chinese):

 脑机接口 ; 字符发音想象 ; 数据集 ; 多分类 ; 脑-文本交流

Keywords (English):

 Brain–computer interface ; Character speech imagery ; Dataset ; Multiclassification ; Brain-to-text communication    

Abstract (Chinese):

发音想象脑机接口(Brain–Computer Interface, BCI)作为一种新型的BCI方式,通过解码语言功能障碍患者的内心默读,有望为其提供一种有效、舒适的言语沟通方式。英语作为全球通用语言,已成为国际交流以及文化与科技沟通的重要工具。若能够将大脑想象的发音内容生成为文本,则可以提供一种全新的脑-文本沟通途径。因此,本文以英文小写字母及常用标点符号作为发音想象内容,对其分类与生成技术展开研究。具体如下:

1. 针对目前想象发音内容不够完整而无法实现文本交流的问题,首先,研究大脑生理功能分区,定位采集区域并设计实验范式;其次,构建字符与句子发音想象脑电(Electroencephalograph, EEG)数据集,其中字符EEG数据集包含26个英文小写字母a~z以及“,”、“.”、“>”三个字符,空格以“>”表示,发音为“/greɪt/”;句子EEG数据集依据英国国家语料库,选取来自不同语境且涵盖全部字符的句子;最后,利用EEGLAB工具箱对采集到的数据进行预处理。

2. 针对发音想象EEG信号信噪比低、数据表征能力差的不足,本文提出了一种小波包分解(Wavelet Packet Decomposition, WPD)结合核主成分分析(Kernel Principal Component Analysis, KPCA)的特征提取算法,以解决WPD所提特征维数较大、状态信息不凝聚的问题;其次,采用t-分布随机近邻嵌入对提取到的字符特征进行可视化;最后,利用LightGBM进行多分类研究。结果表明,采用WPD-KPCA特征与LightGBM分类器的平均分类准确率为90.17%,相较于单独使用WPD和KPCA分别提高了8.37%和14.11%;同时,LightGBM与传统分类器的对比结果均证明了字符发音想象EEG信号的可分性。

3. 针对被试者发音方式不同、构建句子时想象每个字符的时长不一致,导致难以对句子中的单个字符打标签并训练网络模型的问题,首先,采用时间扭曲模型,将字符EEG数据集中29种字符在多次重复实验的事件上对齐并生成字符神经模板;其次,以字符神经模板初始化隐马尔科夫模型,并采用维特比算法对句子EEG数据集中每个句子的字符进行打标签;最后,采用长短期记忆网络训练字符与标签之间的映射关系,并将其翻译为文本,从而实现脑-文本交流。结果表明,所有被试者想象句子中每个字符的平均正确率为77.80%。

本文对字符发音想象BCI的分类与生成技术进行了研究,通过自建字符与句子发音想象EEG数据集、验证EEG信号的可分性、建立字符生成文本模型等手段,为发音想象BCI的研究提供了一种新的方法,开辟了一条全新的脑-文本沟通途径。
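A minimal Python sketch of the character-classification pipeline outlined in the abstract (WPD feature extraction, KPCA dimensionality reduction, LightGBM multi-class classification). It is an illustration only: the trial count, channel count, wavelet, decomposition level, and hyperparameters are assumptions, and the data are synthetic placeholders rather than the thesis's actual code or dataset.

```python
import numpy as np
import pywt
from sklearn.decomposition import KernelPCA
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import lightgbm as lgb

def wpd_features(trial, wavelet="db4", level=3):
    """Log-energy of every level-3 wavelet packet node of every channel.
    `trial` is a (channels, samples) array for one imagined character."""
    feats = []
    for ch in trial:
        wp = pywt.WaveletPacket(data=ch, wavelet=wavelet, mode="symmetric", maxlevel=level)
        for node in wp.get_level(level, order="natural"):
            feats.append(np.log(np.sum(node.data ** 2) + 1e-12))
    return np.asarray(feats)

# Synthetic stand-in data: 290 trials, 8 channels, 1 s at 256 Hz, 29 character classes.
rng = np.random.default_rng(0)
X_raw = rng.standard_normal((290, 8, 256))
y = rng.integers(0, 29, size=290)

X = np.vstack([wpd_features(t) for t in X_raw])                 # WPD feature matrix (290 x 64)
X = KernelPCA(n_components=30, kernel="rbf").fit_transform(X)   # nonlinear dimensionality reduction

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = lgb.LGBMClassifier(n_estimators=200)                      # gradient-boosted multi-class classifier
clf.fit(X_tr, y_tr)
print("held-out accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```

In practice the synthetic trials would be replaced by the preprocessed character EEG epochs, and t-SNE could be applied to the KPCA features to reproduce the visualization step mentioned above.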

Abstract (English):

As a new type of brain–computer interface (BCI), the speech imagery BCI has the potential to provide effective and comfortable speech communication for patients with speech dysfunction by decoding their inner silent reading. As a universal language, English has become an important tool for international, cultural, and scientific communication. If the imagined speech content could be generated as text, it would provide a new way of brain-to-text communication. Therefore, this paper takes English lowercase letters and common punctuation marks as the content of speech imagery and studies their classification and generation techniques. The details are as follows:

1. To address the problem that current speech imagery content is too limited to support text communication, the physiological functional regions of the brain are first studied, the acquisition areas are located, and the experimental paradigms are designed. Secondly, electroencephalograph (EEG) datasets of characters and sentences are constructed. The character EEG dataset contains the 26 lowercase English letters a~z together with “,”, “.”, and “>”, where the space is represented as “>” and pronounced as “/greɪt/”. The sentence EEG dataset selects sentences from different contexts of the British National Corpus that together contain all of the characters. Finally, the collected data are preprocessed with the EEGLAB toolbox.

2. To address the low signal-to-noise ratio and poor representational power of speech imagery EEG signals, this paper proposes an algorithm that combines wavelet packet decomposition (WPD) with kernel principal component analysis (KPCA), which alleviates the large feature dimension and poorly concentrated state information of the features extracted by WPD. t-Distributed stochastic neighbor embedding is then used to visualize the extracted character features, and multi-class classification is studied with LightGBM. The results show that the average classification accuracy of the WPD-KPCA features with the LightGBM classifier is 90.17%, which is 8.37% and 14.11% higher than using WPD or KPCA alone, respectively. Comparisons between LightGBM and traditional classifiers further confirm the separability of character speech imagery EEG signals.

3. Because subjects articulate differently, the time spent imagining each character while constructing a sentence varies, which makes it difficult to label individual characters within a sentence and to train a network model. To address this, a time warping model is first used to align the 29 characters of the character EEG dataset across repeated trials and to generate character neural templates. The character neural templates are then used to initialize a hidden Markov model, and the characters of each sentence in the sentence EEG dataset are labeled with the Viterbi algorithm. Finally, a long short-term memory network is trained to learn the mapping between characters and labels and to translate them into text, realizing brain-to-text communication. The results show that the average accuracy for each character of the imagined sentences across all subjects is 77.80%.

In this paper, the classification and generation techniques of a character speech imagery BCI are studied. By constructing EEG datasets of character and sentence speech imagery, verifying the separability of the EEG signals, and establishing a character-to-text generation model, this work opens a new approach to speech imagery BCI research and provides a new brain-to-text communication pathway.
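As a rough illustration of the sentence-labeling step described above, the sketch below runs the Viterbi algorithm over a 29-state hidden Markov model in which each state corresponds to one character and a frame's emission score is its negative squared distance to that character's neural template. The templates, frames, and transition probabilities are synthetic placeholders; in the thesis the templates come from time-warped character EEG, and the resulting labels are used to train the LSTM decoder.

```python
import numpy as np

N_CHARS = 29  # a-z plus ",", ".", ">" (space)

def viterbi(log_emis, log_trans, log_prior):
    """log_emis: (T, S) frame-wise emission log-scores; log_trans: (S, S); log_prior: (S,).
    Returns the most likely state (character) index for every frame."""
    T, S = log_emis.shape
    delta = np.full((T, S), -np.inf)
    back = np.zeros((T, S), dtype=int)
    delta[0] = log_prior + log_emis[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans           # (S, S): previous state -> current state
        back[t] = np.argmax(scores, axis=0)
        delta[t] = scores[back[t], np.arange(S)] + log_emis[t]
    path = np.zeros(T, dtype=int)
    path[-1] = int(np.argmax(delta[-1]))
    for t in range(T - 2, -1, -1):                           # backtrace
        path[t] = back[t + 1, path[t + 1]]
    return path

# Synthetic templates and sentence frames (feature dimension 64, 120 frames).
rng = np.random.default_rng(1)
templates = rng.standard_normal((N_CHARS, 64))               # stand-in for time-warped neural templates
frames = rng.standard_normal((120, 64))                      # stand-in for one sentence recording

# Emission score: negative squared distance between each frame and each template.
log_emis = -((frames[:, None, :] - templates[None, :, :]) ** 2).sum(axis=2)

# Sticky transitions: a character tends to persist across neighbouring frames.
stay, move = 0.99, 0.01 / (N_CHARS - 1)
trans = np.full((N_CHARS, N_CHARS), move)
np.fill_diagonal(trans, stay)
log_trans = np.log(trans)
log_prior = np.log(np.full(N_CHARS, 1.0 / N_CHARS))

labels = viterbi(log_emis, log_trans, log_prior)             # frame-wise character labels
print(labels[:20])
```

The sticky diagonal of the transition matrix encodes the assumption that a subject dwells on one character for many consecutive frames before moving on to the next.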


CLC Number:

 TP391    

Open Access Date:

 2024-06-19    
