Objectives
This course studies efficient algorithms that exploit the full potential of
parallel computer technology. Emphasis is on techniques for extracting
maximum parallelism from numerical algorithms, especially those arising in the
solution of matrix problems and partial differential equations, and on the subsequent
mapping onto parallel computers. Example applications include image processing,
computational fluid dynamics, and structural analysis. Assignments will involve
programming on available parallel machines.
References
Introduction to Parallel Computing, by V. Kumar, A. Grama, A. Gupta, and G. Karypis,
Benjamin/Cummings, 1994.
Schedule
Three hours of lecture per week.
Outline
Parallel Architecture and Performance Models
Vector processors, distributed memory, SMP, NUMA, Beowulf clusters. Interconnection topologies. Bandwidth, latency, speedup, Amdahl's law.
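For orientation on the performance models: Amdahl's law bounds the speedup \(S(p) = T_1 / T_p\) attainable on \(p\) processors when a fraction \(f\) of the work is inherently serial,
\[ S(p) = \frac{1}{f + (1-f)/p} \le \frac{1}{f}, \]
so even as \(p \to \infty\) the speedup cannot exceed \(1/f\).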
Message Passing/Shared Memory Programming
High-level description with examples and implementations. MPI, PVM, OpenMP, automatic parallelizing compilers, threads.
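As a first taste of the message-passing style, a minimal MPI sketch in C (an illustrative example, not a course assignment): each process contributes its rank to a global sum.

    /* Minimal MPI sketch: every process contributes its rank,
       and rank 0 prints the global sum.  Illustrative only. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank, size, sum;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        /* Combine every process's value into one sum on rank 0. */
        MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("sum of ranks 0..%d = %d\n", size - 1, sum);
        MPI_Finalize();
        return 0;
    }

Such a program is compiled with mpicc and launched with, e.g., mpirun -np 4 ./a.out.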
Matrix Computations
Dense and sparse matrix multiplication, Gaussian elimination, tridiagonal solvers. Data mapping onto parallel computers. Jacobi, Gauss-Seidel, SOR, Krylov subspace methods.
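To illustrate why Jacobi-type iterations map so naturally onto parallel machines, a sketch of one sweep in C (the function name and the row-major storage of A are assumptions for illustration):

    /* One Jacobi sweep for a dense n-by-n system Ax = b.
       Each x_new[i] depends only on the previous iterate x_old,
       so the outer loop is embarrassingly parallel (OpenMP shown).
       Illustrative sketch only. */
    void jacobi_sweep(int n, const double *A, const double *b,
                      const double *x_old, double *x_new) {
        #pragma omp parallel for
        for (int i = 0; i < n; i++) {
            double s = b[i];
            for (int j = 0; j < n; j++)
                if (j != i)
                    s -= A[i * n + j] * x_old[j];
            x_new[i] = s / A[i * n + i];
        }
    }

Gauss-Seidel, by contrast, uses components already updated within the sweep, which serializes the loop unless the unknowns are reordered (e.g., red-black coloring).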
Fast Fourier Transform
Butterfly algorithm, fast multipole methods.
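The radix-2 butterfly at the heart of the FFT combines the half-length transforms \(E_k\) and \(O_k\) of the even- and odd-indexed samples of a length-\(n\) sequence (with \(\omega_n = e^{-2\pi i/n}\)):
\[ X_k = E_k + \omega_n^k O_k, \qquad X_{k+n/2} = E_k - \omega_n^k O_k, \qquad k = 0, \dots, n/2 - 1, \]
which yields the familiar \(O(n \log n)\) recursion.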
Graph Partitioning
Bisection, spectral partitioning, Metis, ParMetis.
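As one concrete formulation of the spectral approach: with graph Laplacian \(L = D - A\) (degree matrix \(D\), adjacency matrix \(A\)), spectral bisection splits the vertices by the signs of the entries of the Fiedler vector \(v_2\), the eigenvector for the second-smallest eigenvalue of \(L\):
\[ L v_2 = \lambda_2 v_2, \qquad V_+ = \{\, i : (v_2)_i \ge 0 \,\}, \quad V_- = \{\, i : (v_2)_i < 0 \,\}. \]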
Domain Decomposition Methods
Algorithm decomposition: additive/multiplicative Schwarz, overlapping/nonoverlapping methods.
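For the flavor of these methods: with restriction operators \(R_i\) onto the subdomains and local matrices \(A_i = R_i A R_i^T\), the one-level additive Schwarz preconditioner is
\[ M_{AS}^{-1} = \sum_{i=1}^{N} R_i^T A_i^{-1} R_i, \]
whose independent subdomain solves run in parallel; the multiplicative variant applies them sequentially, much as Gauss-Seidel relates to Jacobi.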