Computational clusters have long provided a mechanism for accelerating high-performance computing (HPC) applications. This book addresses the issue of fault tolerance through checkpointing. It presents a general overview of checkpointing and how it'