ISBN-13: 9783639172836 / English / Paperback / 2009 / 108 pp.
Scaling the performance of applications with little thread-level parallelism is one of the biggest impediments to the success of multi-core architectures. At the same time, the long latency of memory accesses represents one of the largest performance bottlenecks for individual program threads. Various prefetching techniques have been previously developed to fetch critical data before they are requested by the processor and eliminate processor stalls. This dissertation proposes two new techniques that utilize extra cores of a chip multiprocessor (CMP) as prefetching engines to increase the performance of single program threads. These approaches leverage the execution capabilities of chip multiprocessors to compute data addresses that are likely to miss in the cache and prefetch them ahead of program thread load requests. We demonstrate significant performance improvements over a baseline that already includes an aggressive hardware stream prefetcher. We also show that the proposed techniques provide competitive performance, incur less energy overhead, and require considerably simpler hardware support than previously proposed prefetching mechanisms.