ISBN-13: 9783639060775 / English / Paperback / 2008 / 80 pp.
The rapid shift in microprocessor architecture toward multi-core and many-core designs presents an unprecedented challenge to application developers and system software designers alike: how can the performance potential of such architectures be exploited effectively and efficiently? The author conducts an in-depth study of optimizing the Fast Fourier Transform (FFT) on a many-core architecture, the IBM Cyclops-64 (C64) chip. The study demonstrates how many-core architectures can be used to achieve a scalable, high-performance implementation of the FFT in both the 1D and 2D cases. The author also analyzes the optimization challenges and opportunities, including problem decomposition, load balancing, work distribution, and data reuse, together with the exploitation of underlying architectural features such as massive parallelism, an explicit multi-level memory hierarchy, and large register files. The study provides quantitative evidence of the importance of the optimization techniques applied, as well as valuable experience toward establishing an effective programming methodology for C64-like many-core architectures.