University of California, Berkeley
Abstract: Algorithms have two costs: arithmetic and communication — i.e. moving data between levels of a memory hierarchy or processors over a network. Communication costs (measured in time or energy per operation) already greatly exceed arithmetic costs, and the gap is growing over time following technological trends. Thus our goal is to design algorithms that minimize communication.
We present algorithms that attain provable lower bounds on communication, and show large speedups compared to their conventional counterparts. These algorithms include direct and iterative linear algebra, for dense and sparse matrices, direct n-body simulations, and some machine learning algorithms. Several of these algorithms exhibit perfect strong scaling, in both time and energy: run time (resp. energy) for a fixed problem size drops proportionally to the number of processors p (resp. is independent of p).
Finally, using recent extensions of the Holder-Brascamp-Lieb inequalities, we show how to generalize our approach to algorithms involving arbitrary loop nests and array accesses, assuming only that array subscripts are affine functions of the loop indices.
Bio: James Demmel received his BS in Mathematics from Caltech in 1975 and his PhD in Computer Science from UC Berkeley in 1983. After spending six years on the faculty of the Courant Institute, New York University, he joined the Computer Science Division and Mathematics Department at Berkeley in 1990, where he holds a joint appointment.