Design of Synchronization Accelerator in HPC Computing Node
doi: 10.3969/j.issn.1001-0548.2012.01.018
- Received Date: 2011-07-15
- Revision Received Date: 2011-11-15
- Publish Date: 2012-02-15
Key words:
- collective operation
- communication system
- computing node
- fine-grain synchronization
- high performance computer
- hybrid programming
- message passing
Abstract: With the wide use of acceleration devices, the hardware parallelism of a single high performance computer (HPC) node has increased greatly. As a result, both on-chip and inter-node communication have become more frequent, and communication is becoming the bottleneck of system performance. This paper proposes a hardware module, called the synchronization accelerator, that accelerates synchronization communication patterns: fine-grain synchronization, barrier, and all-reduce. At a scale of 16 processes, the synchronization accelerator achieves a speedup of about 4 times over software-based collective operations. It also improves the performance of the LU benchmark by 20%.
Citation: CHEN Fei, CAO Zheng, WANG Kai, HU Nong-da, AN Xue-jun. Design of Synchronization Accelerator in HPC Computing Node[J]. Journal of University of Electronic Science and Technology of China, 2012, 41(1): 92-97. doi: 10.3969/j.issn.1001-0548.2012.01.018