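Thesis title (Chinese): | 面向通信基带信号处理的可重构阵列处理器研究与设计 |
Name: | |
Student ID: | 19207205059 |
Confidentiality level: | Public |
Thesis language: | chi |
Discipline code: | 085208 |
Discipline name: | Engineering - Engineering - Electronics and Communication Engineering |
Student type: | Master's |
Degree level: | Master of Engineering |
Degree year: | 2022 |
Degree-granting institution: | Xi'an University of Science and Technology |
School/Department: | |
Major: | |
Research direction: | Integrated circuit design |
First supervisor name: | |
First supervisor affiliation: | |
Thesis submission date: | 2022-06-29 |
Thesis defense date: | 2022-06-08 |
Thesis title (English): | Research and Design of Reconfigurable Array Processor for Communication Baseband Signal Processing |
Keywords (Chinese): | |
Keywords (English): | Reconfigurable architecture; Array processor; Communication baseband algorithm; Calculation granularity; Parallelization |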
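Abstract (translated from the Chinese original): |
Reconfigurable architectures can be configured flexibly and have great potential for compute-intensive and memory-intensive applications. Emerging applications in mobile communications place higher demands on the hardware performance of communication baseband signal processing, and reconfigurable architectures, which are strong in parallel computing, are an ideal hardware platform for baseband signal processing algorithms. However, implementing these algorithms on a reconfigurable array processor suffers from poor adaptability and low computational efficiency, so this thesis studies and designs a reconfigurable array processor for baseband signal processing.
First, the operators of typical communication baseband signal processing algorithms are extracted and the fixed-point accuracy of the algorithms is evaluated, to guide the design of the reconfigurable array processor. On the one hand, the Profile performance analysis tool is used to characterize the Fast Fourier Transform (FFT), Finite Impulse Response (FIR), and massive Multiple-Input Multiple-Output (MIMO) detection algorithms, from which abstract coarse-grained operators are extracted. On the other hand, fixed-point simulation of the algorithms shows that the fixed-point accuracy curve converges when the hardware has a data width of more than 15 bits.
Second, to address the poor adaptability of baseband signal processing algorithms on reconfigurable array processors, a reconfigurable processing element for communication applications is designed. The processing element (PE) extends the data width from 16 bits to 32 bits to accommodate complex-number operations, and instructions dedicated to baseband signal processing are added to the PE. Experiments running complex matrix multiplication on the reconfigurable PE show that, compared with general-purpose instructions, the dedicated instructions reduce the number of code lines by 74%, the number of memory accesses by 61%, and the average relative error by 85%.
Third, to address the low computational efficiency caused by the mismatch between data of different granularities and the underlying hardware, a dynamically configurable computational-granularity structure is proposed. The structure supports 8-bit, 16-bit, and 32-bit computational granularities and provides four functions (data combination, data splitting, parallel addition, and parallel multiplication), which improve the parallelism and flexibility of the array. Experiments show that the dynamic granularity configuration circuit reaches a maximum operating frequency of 133.5 MHz and can dynamically configure data of different granularities during computation.
Finally, a reconfigurable array prototype system for communication baseband signal processing is developed, reconfigurable implementations of the FFT, FIR, and massive MIMO detection algorithms are designed, and the system is verified on a Field Programmable Gate Array (FPGA). The reconfigurable implementations show that parallelizing the butterfly operation module gives a 2.90x speedup for the 8-point FFT, the pipeline-parallel filtering scheme gives a 7.28x speedup for the 8th-order FIR filter, and parallelizing the Gram matrix computation gives a speedup of up to 5.57x for massive MIMO detection. Hardware experiments on the ZC706 development board show that the reconfigurable array processor uses less than 60% of the available resources at an operating frequency of 112 MHz, and that it supports flexible configuration and parallel acceleration of different algorithms on the array. |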
Abstract (English): |
Reconfigurable architectures have great potential for compute-intensive and memory-intensive applications because they can be configured flexibly. Emerging applications in mobile communications place higher demands on the hardware performance of communication baseband signal processing, and reconfigurable architectures, with their advantages in parallel computing, have become an ideal hardware platform for baseband signal processing algorithms. However, implementing these algorithms on a reconfigurable array processor suffers from poor adaptability and low computational efficiency. This thesis therefore studies and designs a reconfigurable array processor for baseband signal processing.
First, the operators of typical communication baseband signal processing algorithms are extracted, and the fixed-point accuracy of the algorithms is evaluated to guide the design of the reconfigurable array processor. On the one hand, the characteristics of the Fast Fourier Transform (FFT), Finite Impulse Response (FIR), and massive Multiple-Input Multiple-Output (MIMO) detection algorithms are obtained with the Profile performance analysis tool, and abstract coarse-grained operators are extracted. On the other hand, fixed-point simulation results show that the fixed-point accuracy curve converges when the hardware has a data width of more than 15 bits.
Second, to address the poor adaptability of baseband signal processing algorithms on reconfigurable array processors, a reconfigurable processing element for communication applications is designed. The processing element (PE) extends the data width from 16 bits to 32 bits to accommodate complex-number operations, and instructions dedicated to baseband signal processing are added to the PE. Experiments running complex matrix multiplication on the reconfigurable PE show that, compared with general-purpose instructions, the dedicated instructions reduce the number of code lines by 74%, the number of memory accesses by 61%, and the average relative error by 85%.
Third, to address the low computational efficiency caused by the mismatch between data of different granularities and the underlying hardware, a dynamically configurable computational-granularity structure is proposed. The structure supports 8-bit, 16-bit, and 32-bit computational granularities and provides four functions (data combination, data splitting, parallel addition, and parallel multiplication), which improve the parallelism and flexibility of the array. Experimental results show that the dynamic configuration circuit reaches a maximum operating frequency of 133.5 MHz and can dynamically configure data of different granularities during computation.
Finally, a reconfigurable array prototype system for communication baseband signal processing is developed, reconfigurable implementations of the FFT, FIR, and massive MIMO detection algorithms are designed, and Field Programmable Gate Array (FPGA) verification is completed. The reconfigurable implementations show that parallelizing the butterfly operation module provides a 2.90x speedup for the 8-point FFT, the pipeline-parallel filtering scheme provides a 7.28x speedup for the 8th-order FIR filter, and parallelizing the Gram matrix computation provides a speedup of up to 5.57x for massive MIMO detection. Hardware experiments on the ZC706 development board show that the reconfigurable array processor uses less than 60% of the available resources at a frequency of 112 MHz, achieving flexible configuration and parallel acceleration of different algorithms on the array. |
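Illustrative note: the abstract describes a structure that treats a 32-bit datapath as 8-bit, 16-bit, or 32-bit lanes and offers data splitting, data combination, and parallel addition. The following Python sketch models that idea in software only. The function names, the least-significant-lane-first ordering, and the wrap-around (modular) overflow behaviour are all assumptions made for illustration; the record does not describe how the thesis circuit orders lanes or handles overflow.

```python
from typing import List

WORD_BITS = 32  # assumed datapath width, matching the 32-bit PE in the abstract


def split_lanes(word: int, lane_bits: int) -> List[int]:
    """Split a 32-bit word into unsigned lanes, least significant lane first."""
    mask = (1 << lane_bits) - 1
    return [(word >> shift) & mask for shift in range(0, WORD_BITS, lane_bits)]


def combine_lanes(lanes: List[int], lane_bits: int) -> int:
    """Pack lanes back into one word; each lane is truncated to lane_bits."""
    mask = (1 << lane_bits) - 1
    word = 0
    for index, lane in enumerate(lanes):
        word |= (lane & mask) << (index * lane_bits)
    return word


def parallel_add(a: int, b: int, lane_bits: int) -> int:
    """Add two words lane by lane; carries never cross a lane boundary.

    Overflow wraps around inside each lane (an assumption, not taken
    from the thesis).
    """
    if lane_bits not in (8, 16, 32):
        raise ValueError("lane_bits must be 8, 16, or 32")
    sums = [x + y for x, y in zip(split_lanes(a, lane_bits),
                                  split_lanes(b, lane_bits))]
    return combine_lanes(sums, lane_bits)
```

With 8-bit lanes, `parallel_add(0x01FF0203, 0x01010101, 8)` gives `0x02000304`: the `0xFF + 0x01` lane wraps to `0x00` without carrying into its neighbour. With 16-bit or 32-bit lanes the same call gives `0x03000304`, because the carry stays inside the wider lane.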
Chinese Library Classification (CLC) number: | TN492 |
Open access date: | 2022-06-29 |