
Research on Acceleration of Matrix Multiplication Based on Parallel Scheduling on MPSoC

Authors: Yang Fei (杨飞); Ma Yuchun (马昱春); Hou Jin (侯金); Xu Ning (徐宁)

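Affiliations: Hubei Key Laboratory of Intelligent Wireless Communications, South-Central University for Nationalities, Wuhan 430074; Department of Computer Science and Technology, Tsinghua University, Beijing 100084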

Source: Computer Science (《计算机科学》), 2017, No. 8, pp. 36-41 (6 pages)

Abstract: Matrix multiplication underlies numerical analysis as well as graphics and image processing algorithms, and the design of general-purpose matrix multiplication accelerators has long been a research focus in embedded system design. Because of its high computational complexity and low processing efficiency, however, matrix multiplication often becomes the bottleneck of computation speed in embedded systems. To make matrix multiplication more usable in the embedded field, this paper proposes a hardware/software co-designed acceleration architecture based on MPSoC (MultiProcessor System-on-Chip). Within this architecture, a matrix partitioning method that accounts for hardware constraints is designed, which yields a general-purpose matrix multiplication accelerator system. In addition, task partitioning and load-balancing scheduling algorithms are proposed for the multi-core architecture of the MPSoC, improving parallel efficiency and the overall system speed-up ratio. Experimental results show that the proposed architecture and algorithms support general matrix multiplication, and that the multi-core parallel scheduling algorithm realized through hardware/software co-design achieves a significant improvement in computational efficiency over a traditional single-core design.

Keywords: matrix multiplication; parallel computing; load balancing
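
The two ideas named in the abstract, partitioning a matrix under a hardware size limit and balancing the resulting tasks across cores, can be illustrated with a minimal sketch. This is not the paper's algorithm, which the abstract does not spell out. The tile limit `TILE`, the cost model (multiply-add count per output block), and the greedy longest-task-first assignment are all assumptions chosen for illustration, and the "cores" are simulated by running each task list sequentially.

```python
import heapq

TILE = 2  # hypothetical limit: largest block edge the accelerator is assumed to accept


def tile_ranges(n, tile):
    """Split range(n) into consecutive (start, stop) pieces of length <= tile."""
    return [(s, min(s + tile, n)) for s in range(0, n, tile)]


def make_tasks(m, k, n, tile):
    """Create one task per output block; its cost is the block's multiply-add count."""
    tasks = []
    for i0, i1 in tile_ranges(m, tile):
        for j0, j1 in tile_ranges(n, tile):
            cost = (i1 - i0) * (j1 - j0) * k
            tasks.append((cost, (i0, i1, j0, j1)))
    return tasks


def balance(tasks, workers):
    """Greedy assignment: largest task first, always to the least-loaded worker."""
    heap = [(0, w) for w in range(workers)]
    plan = [[] for _ in range(workers)]
    for cost, block in sorted(tasks, reverse=True):
        load, w = heapq.heappop(heap)
        plan[w].append(block)
        heapq.heappush(heap, (load + cost, w))
    loads = [load for load, _ in sorted(heap, key=lambda item: item[1])]
    return plan, loads


def compute_block(A, B, C, block, tile):
    """Accumulate one output block, walking the inner dimension in tile-sized steps."""
    i0, i1, j0, j1 = block
    for p0, p1 in tile_ranges(len(B), tile):
        for i in range(i0, i1):
            for j in range(j0, j1):
                C[i][j] += sum(A[i][p] * B[p][j] for p in range(p0, p1))


def blocked_matmul(A, B, workers=2, tile=TILE):
    """Multiply A (m x k) by B (k x n); return the product and per-worker loads."""
    m, k, n = len(A), len(B), len(B[0])
    C = [[0] * n for _ in range(m)]
    plan, loads = balance(make_tasks(m, k, n, tile), workers)
    for blocks in plan:  # each list would run on its own core; sequential here
        for block in blocks:
            compute_block(A, B, C, block, tile)
    return C, loads
```

Because every sub-multiplication is bounded by `tile` in each dimension, matrices of any shape reduce to operations the assumed hardware unit could accept, and the returned `loads` list shows how evenly the work was spread across workers.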
