ISBN-13: 9783659147326 / English / Paperback / 2012 / 128 pp.
The objective of this project report is to explore the architecture of the E1350 IBM eServer Cluster and parallel programming with OpenMP, MPI, and MPI+OpenMP using the Intel C/C++ compilers. By implementing several applications with these programming models, we can analyze the effect of each model on speedup. I used four applications for this analysis: the Jacobi iterative method, the Alternating Direction Implicit (ADI) method, matrix multiplication, and bucket sort. The general observations are as follows. Using MPI or MPI+OpenMP does not give significant speedup if it is applied directly to the same program structure; the programs must be restructured to reduce the communication-to-computation ratio as much as possible. Running threads on different physical processors can cause significant barrier overhead. With 8 threads, synchronization must occur between all threads, and this requires communication between two physical processors. Communication between two physical processors is expensive compared with communication between cores on the same processor, so the extra cost is paid as additional waiting time at the barrier.
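As an illustration of where the barrier cost mentioned above comes from, here is a minimal sketch of an OpenMP Jacobi iteration in C. It is not the code from the report. The 1D Laplace problem, the function names jacobi_sweep and jacobi_solve, and the copy-back scheme are assumptions chosen for brevity. Each sweep ends with the implicit barrier of the parallel loop, so every thread waits for the slowest one once per iteration. When threads run on two physical processors, that wait includes cross-processor communication.

```c
#include <assert.h>
#include <stddef.h>

/* One Jacobi sweep for a 1D Laplace problem with fixed end values
 * (a hypothetical test problem, not taken from the report).
 * Reads u, writes u_new, and returns the largest absolute change. */
double jacobi_sweep(const double *u, double *u_new, size_t n)
{
    assert(n >= 2);
    double max_diff = 0.0;

    /* Boundary values are held fixed. */
    u_new[0] = u[0];
    u_new[n - 1] = u[n - 1];

    /* Each interior point depends only on the old array, so iterations are
     * independent. The loop ends with an implicit barrier: all threads must
     * arrive before the function returns. Without OpenMP enabled, the pragma
     * is ignored and the loop runs serially. */
    #pragma omp parallel for reduction(max : max_diff)
    for (long i = 1; i < (long)n - 1; i++) {
        u_new[i] = 0.5 * (u[i - 1] + u[i + 1]);
        double d = u_new[i] - u[i];
        if (d < 0.0) d = -d;
        if (d > max_diff) max_diff = d;
    }
    return max_diff;
}

/* Repeat sweeps until the largest change drops below tol or max_iters is
 * reached. Returns the number of sweeps performed. One barrier is paid
 * per sweep, which is why barrier latency scales with iteration count. */
int jacobi_solve(double *u, double *scratch, size_t n, double tol, int max_iters)
{
    int iters = 0;
    while (iters < max_iters) {
        double diff = jacobi_sweep(u, scratch, n);
        for (size_t i = 0; i < n; i++) u[i] = scratch[i];  /* copy back */
        iters++;
        if (diff < tol) break;
    }
    return iters;
}
```

With an OpenMP-capable compiler the pragma is enabled by a flag such as -fopenmp (GCC) or -qopenmp (Intel), and the thread count can be set through the OMP_NUM_THREADS environment variable. Comparing timings at 4 and 8 threads on a two-processor node is one way to observe the barrier effect the report describes.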