Tag performance analysis


Rich is spending the week at ETH Zürich attending the Platform for Advanced Scientific Computing (PASC) Conference. Many thanks to two of the co-organizers, Markus Püschel and Olaf Schenk, for being great hosts!

If you are interested in getting a copy of Rich’s talk slides + related material, see: hpcgarage.org/pasc14

View from Villa Hatt

The view from Villa Hatt at ETH Zürich is not too shabby!


Piyush, Jee, and Rich are in Portland representing the lab at the SIAM Parallel Processing ’14 meeting. If you are also there, please drop by our various events! List below; resources (papers, slides) are here: [link].


Rich just returned from the Workshop on Visualization and Analysis of Performance of Large-scale Software (VAPLS), co-located with IEEE Vis. The workshop featured a number of very compelling uses of visualization to aid performance analysis, which in the long run will help make performance engineering more productive, accessible, and best of all, fun!

Rich’s talk materials appear here: [www]

An example of abstractly visualizing communication volume on an AMR tree, taken from one of the other presenters (see VAPLS’13 website).

Dagstuhl 13401

This week Rich has the great fortune of attending Dagstuhl Seminar 13401: Automatic Application Autotuning for HPC Architectures. He’ll be summarizing Jee’s work (with critical assists by Marat) on the energy archline model. For his slides and pointers to relevant materials, look here: [www]

"Classical" wing of Schloss Dagstuhl


We’re in Seattle this week attending the US DOE Workshop on Modeling and Simulation of Exascale Systems and Applications, “ModSim’13” [www]. A copy of Rich’s talk and pointers to supplemental materials are available here: hpcgarage.org/modsim13

SIAM CSE’13 talks

Aparna and Jee are at SIAM CSE 2013 in Boston this week, giving talks on the fast multipole method and the energy roofline, respectively. (UPDATE: Click on the title for Jee’s talk to access the PDF slides.)

Tech report: Energy rooflines

Rooflines in time vs. "archlines" in energy

Check out our new technical report, which presents a thought-experiment on the question of whether engineering an algorithm to optimize time differs from doing so with respect to energy (e.g., Joules).

  • Jee Whan Choi, Richard Vuduc. “A roofline model of energy.” Technical report no. GT-CSE-12-01, Georgia Institute of Technology, School of Computational Science and Engineering, Atlanta, GA, USA, December 2012. [PDF (452 KiB)]


We describe an energy-based analogue of the time-based roofline model of Williams, Waterman, and Patterson (Comm. ACM, 2009). Our goal is to explain (in simple, analytic terms accessible to algorithm designers and performance tuners) how the time, energy, and power to execute an algorithm relate. The model considers an algorithm in terms of operations, concurrency, and memory traffic; and a machine in terms of the time and energy costs per operation or per word of communication. We confirm the basic form of the model experimentally. From this model, we suggest under what conditions we ought to expect an algorithmic time-energy trade-off, and show how algorithm properties may help inform power management.

Subject area: High-Performance Computing
Keywords: performance analysis; power and energy modeling; computational intensity; machine balance; roofline model
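
To make the abstract a bit more concrete, here is a minimal Python sketch of the kind of accounting it describes: an algorithm summarized by an operation count and a memory-traffic count, and a machine summarized by time and energy costs per operation and per word. The functional forms are assumptions for illustration, not quoted from the report: time is taken as the larger of compute time and memory time (perfect overlap), while energy is taken as the sum of the two contributions plus an optional constant-power term. All names (Machine, tau_flop, eps_mem, pi0, and so on) and all numbers are hypothetical and in arbitrary units.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    """Hypothetical machine description, in arbitrary units."""
    tau_flop: float   # time per operation
    tau_mem: float    # time per word of memory traffic
    eps_flop: float   # energy per operation
    eps_mem: float    # energy per word of memory traffic
    pi0: float = 0.0  # constant power drawn regardless of activity (assumed term)


def time_cost(m: Machine, ops: float, words: float) -> float:
    # Assumption: computation and memory traffic overlap perfectly,
    # so the slower of the two determines the running time.
    return max(ops * m.tau_flop, words * m.tau_mem)


def energy_cost(m: Machine, ops: float, words: float) -> float:
    # Assumption: energy cannot be overlapped, so the contributions add up,
    # plus constant power integrated over the running time.
    return ops * m.eps_flop + words * m.eps_mem + m.pi0 * time_cost(m, ops, words)


def time_balance(m: Machine) -> float:
    # Operations per word at which compute time equals memory time.
    return m.tau_mem / m.tau_flop


def energy_balance(m: Machine) -> float:
    # Operations per word at which compute energy equals memory energy.
    return m.eps_mem / m.eps_flop


# Made-up machine: moving a word costs 4x an operation in time, 8x in energy.
machine = Machine(tau_flop=1.0, tau_mem=4.0, eps_flop=1.0, eps_mem=8.0)

# Made-up algorithm with intensity 600 / 100 = 6 operations per word.
ops, words = 600.0, 100.0

t = time_cost(machine, ops, words)
e = energy_cost(machine, ops, words)
print(f"time = {t}, energy = {e}")
print(f"time balance = {time_balance(machine)}, energy balance = {energy_balance(machine)}")
```

With these made-up numbers, the algorithm's intensity (6 operations per word) sits above the time balance (4) but below the energy balance (8): it is compute-bound in time, yet memory traffic accounts for most of its energy. A gap of this kind between the two balance points is one way a time-energy trade-off could arise under the assumptions above.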

Note (January 14, 2013): The above link to the PDF now points to the official version on the Georgia Tech Library’s report repository. If you downloaded an earlier version (dated December 24, 2012), you may wish to re-download this newer version. (The only difference is the addition of Appendix B, made on December 31, 2012.)

Synthesis Lecture on GPUs

Our Morgan & Claypool Synthesis Lecture on GPGPU performance analysis is now available! (Most libraries should have proxy access; the book is $30 otherwise, DRM-free.)

Talk @ PPAM’11

I (Rich) am giving a talk today [PDF slides] at the International Conference on Parallel Processing and Applied Mathematics (PPAM), which is being held in the beautiful and historic town of Toruń, Poland. This is a great and well-organized meeting, and I look forward to the next meeting!