Thesis Information

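Thesis title (Chinese):	时变时延水声信道下基于强化学习的自适应调制
Name:	Zhu Jingru
Student ID:	19207205075
Confidentiality level:	Public
Thesis language:	chi
Discipline code:	085208
Discipline name:	Engineering - Engineering - Electronics and Communication Engineering
Student type:	Master's
Degree level:	Master of Engineering
Degree year:	2022
Institution:	Xi'an University of Science and Technology
School:	College of Communication and Information Engineering
Major:	Electronics and Communication Engineering
Research direction:	Underwater acoustic communication
First supervisor name:	Zhang Yuzhi
First supervisor affiliation:	Xi'an University of Science and Technology
Thesis submission date:	2022-06-22
Thesis defense date:	2022-06-06
Thesis title (foreign language):	Reinforcement Learning based Adaptive Modulation in Time-varying and Propagation Delay Underwater Acoustic Channels
Keywords (Chinese):	Underwater acoustic communication ; Adaptive modulation ; Reinforcement learning ; Deep reinforcement learning
Keywords (foreign language):	Underwater acoustic communication ; Adaptive modulation ; Reinforcement learning ; Deep reinforcement learning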
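Abstract (Chinese):	︿Adaptive modulation is a key technology in underwater acoustic communication, but the complex and variable underwater acoustic channel severely limits its performance. In particular, the time-varying and propagation-delay characteristics of the channel make the channel state information (CSI) fed back to the transmitter outdated, so the traditional threshold-switching method selects modulation schemes inaccurately and communication efficiency is low. To address this problem, this thesis uses reinforcement learning, which handles complex problems well, to propose two adaptive modulation methods. (1) A Q-learning adaptive modulation method based on long short-term memory (LSTM) network prediction is proposed, together with an online training strategy for the LSTM. The transmitter first uses the LSTM prediction model to predict the CSI at the current time from the outdated CSI sequence, and then uses the prediction as the Q-learning input to learn the adaptive modulation policy. Simulation and experimental results show that, compared with the Q-learning adaptive modulation method based on state-transition prediction, this method achieves a higher equivalent information rate while satisfying the bit error rate constraint. (2) An adaptive modulation method based on the deep Q-network (DQN) is proposed. To reduce the decision bias caused by the partial observability of the underwater acoustic channel, an LSTM layer is added in front of the fully connected deep neural network of the conventional DQN algorithm to strengthen the transmitter's memory of the environment (LSTM-DQN). This method takes the outdated CSI sequence directly as the LSTM-DQN input to learn the adaptive modulation policy. Simulation and experimental results show that, compared with the fully connected DQN adaptive modulation method, this method improves the equivalent information rate while satisfying the bit error rate constraint, and it learns faster. In summary, in time-varying underwater acoustic channels with propagation delay, both proposed methods improve the equivalent information rate while satisfying the bit error rate constraint. Moreover, compared with the Q-learning method based on LSTM prediction, the LSTM-DQN method needs no separate prediction algorithm or training data set, avoids error accumulation, and achieves a higher equivalent information rate and a faster learning rate.﹀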
Abstract (foreign language):	︿ Adaptive modulation is a key technology in underwater acoustic communication. However, the complexity of the underwater acoustic channel severely constrains its performance. In particular, because the channel is time-varying and has a long propagation delay, the channel state information (CSI) fed back to the transmitter is outdated, so the traditional fixed-threshold switching method selects modulation schemes inaccurately and communication efficiency is reduced. To address this problem, this thesis uses reinforcement learning, which is well suited to complex problems, and proposes two adaptive modulation methods. (1) A Q-learning adaptive modulation method based on long short-term memory (LSTM) network prediction is proposed, and an online training strategy for the LSTM is designed. The transmitter uses the LSTM prediction model to predict the CSI of the current time slot from the outdated CSI sequence, and then uses the prediction as the Q-learning input to learn the adaptive modulation strategy. Simulation and experimental results show that, compared with the Q-learning adaptive modulation method based on state-transition prediction, this method achieves a higher equivalent information rate under the bit error rate constraint. (2) An adaptive modulation method based on the Deep Q Network (DQN) is proposed. To alleviate the decision bias caused by the partial observability of the underwater acoustic channel, this thesis adds an LSTM layer in front of the deep neural network of the traditional DQN algorithm to enhance the transmitter's memory of the environment (LSTM-DQN). This method uses the outdated CSI sequence directly as the LSTM-DQN input to learn the adaptive modulation strategy. Simulation and experimental results show that, compared with the traditional DQN adaptive modulation method, this method improves the equivalent information rate under the bit error rate constraint and learns faster. In summary, in time-varying underwater acoustic channels with propagation delay, both adaptive modulation methods proposed in this thesis improve the equivalent information rate under the bit error rate constraint. In addition, compared with the Q-learning adaptive modulation method based on LSTM prediction, the LSTM-DQN method does not require a separate prediction algorithm or training data set, avoids the accumulation of errors, and achieves a higher equivalent information rate and a faster learning rate. ﹀
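As a rough illustration of the structure of method (1), in which the transmitter predicts the current CSI from outdated feedback and then feeds the prediction to Q-learning, the following Python sketch may help. It is not the thesis implementation: the modulation set, SNR thresholds, quantization bins, synthetic channel, reward definition, and all function names are assumptions made for illustration, and a simple linear extrapolation stands in for the LSTM predictor.

```python
import math
import random

# Hypothetical setup (not from the thesis): bits per symbol of each candidate
# modulation and an assumed SNR threshold (dB) above which it meets the BER limit.
MODULATIONS = [1, 2, 4]
SNR_THRESHOLD_DB = [4.0, 8.0, 15.0]
SNR_BINS_DB = [0.0, 6.0, 12.0, 18.0]  # bin edges that turn SNR into a state


def quantize(snr_db):
    """Map an SNR value (dB) to a discrete state index in 0..len(SNR_BINS_DB)."""
    return sum(snr_db >= edge for edge in SNR_BINS_DB)


def predict_current_snr(outdated, delay):
    """Stand-in for the LSTM predictor: extrapolate the last two samples by `delay` slots."""
    if len(outdated) < 2:
        return outdated[-1]
    return outdated[-1] + delay * (outdated[-1] - outdated[-2])


def reward(action, true_snr_db):
    """Assumed reward: bits per symbol if the BER constraint is met, otherwise zero."""
    return MODULATIONS[action] if true_snr_db >= SNR_THRESHOLD_DB[action] else 0


def greedy_policy(q):
    """Best action index for every state under the learned Q-table."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


def train(num_slots=20000, delay=2, alpha=0.1, gamma=0.3, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * len(MODULATIONS) for _ in range(len(SNR_BINS_DB) + 1)]
    # Synthetic slowly varying channel: a sinusoid in dB plus small Gaussian noise.
    snr = [10.0 + 8.0 * math.sin(2 * math.pi * t / 200) + rng.gauss(0, 0.3)
           for t in range(num_slots + 1)]
    for t in range(delay + 1, num_slots):
        # The transmitter only knows feedback up to slot t - delay.
        state = quantize(predict_current_snr(snr[t - delay - 1:t - delay + 1], delay))
        if rng.random() < epsilon:
            action = rng.randrange(len(MODULATIONS))  # explore
        else:
            action = max(range(len(MODULATIONS)), key=q[state].__getitem__)  # exploit
        r = reward(action, snr[t])  # outcome depends on the true current channel
        next_state = quantize(predict_current_snr(snr[t - delay:t - delay + 2], delay))
        # Standard tabular Q-learning update.
        q[state][action] += alpha * (r + gamma * max(q[next_state]) - q[state][action])
    return q


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the chosen modulation does not influence the channel, so the discount factor mainly affects the scale of the Q-values and not the ordering of actions within a state. Replacing `predict_current_snr` with a trained sequence model would bring the sketch closer to the approach described in the abstract.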
Chinese Library Classification number:	TN929.3
Open-access date:	2022-06-22
