Rich's slides: PDF (32 MiB)
Your public critiques (and praise) are welcome: @hpcgarage
Additional reference materials:
- Choi, Dukhan, Liu, and Vuduc, “Algorithmic time, energy, and power of candidate HPC building blocks.” IPDPS'14 preprint.
- Source code for microbenchmarks used in the above paper: Google Code
- Choi, Bedard, Fowler, and Vuduc, “A roofline model of energy.” IPDPS'13 preprint | “Extended remix” (tech report).
- Czechowski and Vuduc, “A theoretical framework for algorithm-architecture co-design.” IPDPS'13 preprint.
- Czechowski et al., “On the communication complexity of 3D FFTs and its implications for exascale.” ICS'12 preprint.
Highly relevant and notable work by others:
- S. Williams, A. Waterman, and D. Patterson, “Roofline: An Insightful Visual Performance Model for Floating-Point Programs and Multicore Architectures.” CACM (2009).
- G. Hager, J. Treibig, J. Habich, and G. Wellein, “Exploring performance and power properties of modern multicore chips via simple machine models.” CC:P&E (2013). Preprint: arXiv:1208.2908