Category Publications


Jee and Rich are in Phoenix this week for IPDPS’14, where Jee will present his latest updates to his “energy roofline” model [Tue May 20, Session 11].

Rich also has the pleasure of serving on a panel at the High-Performance, Power-Aware Computing workshop. His position is that power-aware scheduling is really an optimal control problem. For his 1-slide summary, see: [link]

Outside our group, Georgia Tech has a very strong representation this year. Looking at the program, we count 9 papers, including two of the four “Best Papers.” Go, Buzz!

Jee chowing down on a pre-conference pulpo at Mariscos Sinaloa.


ISCA’14 paper to appear

Kent, in collaboration with Victor Lee among others at Intel, has an exciting new paper that will appear at ISCA’14. It teases apart how much of the improvement in energy efficiency comes from process technology and how much from microarchitectural changes. Congrats to Kent, Victor, and the rest of the Intel team. We’ll post a preprint when it’s ready; stay tuned!


Jee is back from GPGPU-7, where he presented some new work on a hybrid CPU+GPU implementation of the kernel-independent fast multipole method, or FMM.

  • Jee Choi, Aparna Chandramowlishwaran, Kamesh Madduri, Richard Vuduc. “A CPU-GPU hybrid implementation and model-driven scheduling of the fast multipole method.” In Proc. 7th Wkshp. General-Purpose Processing using GPUs (GPGPU-7), Salt Lake City, UT, March 2014. [Paper|Slides]
Jee & Naila @ GPGPU-7

Representing Georgia Tech at GPGPU-7, Jee and Naila (from a different research lab)

Check us out at SC’13!

We have a number of activities happening at SC’13 — if you are there at the Denver Convention Center this year, please check us out!

In addition, Georgia Tech has its usual large presence at SC’13. See [link] for details.


(L-to-R) Piyush, Marat, and Jee at SC'13

(L-to-R) Piyush, Marat, and Jee at our lab pre-conference celebratory dinner. Have a great meeting, gang!

Sangmin @ ISSTA’13

Sangmin is in Lugano, Switzerland this week to present his new paper on Griffin at ISSTA, the 2013 International Symposium on Software Testing and Analysis. Griffin is a tool that aims to help programmers find bugs in their concurrent software. Congrats, Sangmin!

  • Sangmin Park, Mary Jean Harrold, and Richard Vuduc. “Griffin: Grouping suspicious memory-access patterns to improve understanding of concurrency bugs.” In Proceedings of the International Symposium on Software Testing and Analysis (ISSTA), Lugano, Switzerland, July 15-20, 2013.

    Griffin's "Bug Graph" concept

    A bug graph, one of Griffin’s aids for programmers who need to debug their concurrent software.


IPDPS’13 preprints

A notional space of processors.


Kent and Jee will each present a paper at IPDPS’13, to be held in Boston in May. Kent’s paper is about a unique take on high-level algorithm-architecture co-design; Jee’s paper is about his “roofline” energy model. If you plan to attend, be sure to see their talks! We’ve also posted preprints under Papers.

  • Jee Choi and Richard Vuduc. A roofline model of energy. In Proc. IEEE Int’l. Parallel and Distributed Processing Symp. (IPDPS), Boston, MA, USA, May 2013. [PDF | BibTeX]
  • Kenneth Czechowski and Richard Vuduc. A theoretical framework for algorithm-architecture co-design. In Proc. IEEE Int’l. Parallel and Distributed Processing Symp. (IPDPS), Boston, MA, USA, May 2013. [PDF | BibTeX]

Tech report: Energy rooflines

Rooflines in time vs. "archlines" in energy


Check out our new technical report, which presents a thought-experiment on the question of whether engineering an algorithm to optimize time differs from doing so with respect to energy (e.g., Joules).

  • Jee Whan Choi, Richard Vuduc. “A roofline model of energy.” Technical report no. GT-CSE-12-01, Georgia Institute of Technology, School of Computational Science and Engineering, Atlanta, GA, USA, December 2012. [PDF (452 KiB)]


We describe an energy-based analogue of the time-based roofline model of Williams, Waterman, and Patterson (Comm. ACM, 2009). Our goal is to explain, in simple, analytic terms accessible to algorithm designers and performance tuners, how the time, energy, and power to execute an algorithm relate. The model considers an algorithm in terms of operations, concurrency, and memory traffic; and a machine in terms of the time and energy costs per operation or per word of communication. We confirm the basic form of the model experimentally. From this model, we suggest under what conditions we ought to expect an algorithmic time-energy trade-off, and show how algorithm properties may help inform power management.

Subject area: High-Performance Computing
Keywords: performance analysis; power and energy modeling; computational intensity; machine balance; roofline model
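
As a rough illustration of the kind of model the abstract above describes, here is a minimal Python sketch. It assumes that time follows a roofline-style maximum of compute time and memory time (perfect overlap), while energy is additive (you pay for every operation and every word moved) plus a constant-power term paid over the run time. The `Machine` class, all function names, and the numbers in the demo are hypothetical and in arbitrary units; see the report itself for the actual model and measured parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    """Hypothetical machine description (illustrative names, arbitrary units)."""
    time_per_op: float      # time cost of one arithmetic operation
    time_per_word: float    # time cost of moving one word of memory traffic
    energy_per_op: float    # energy cost of one arithmetic operation
    energy_per_word: float  # energy cost of moving one word
    constant_power: float   # power drawn regardless of activity


def run_time(ops: float, words: float, m: Machine) -> float:
    """Assumed roofline-style time: compute and memory traffic overlap fully."""
    return max(ops * m.time_per_op, words * m.time_per_word)


def energy(ops: float, words: float, m: Machine) -> float:
    """Assumed energy: op and word costs add; constant power is paid over run time."""
    dynamic = ops * m.energy_per_op + words * m.energy_per_word
    return dynamic + m.constant_power * run_time(ops, words, m)


def average_power(ops: float, words: float, m: Machine) -> float:
    """Average power is simply energy divided by time."""
    return energy(ops, words, m) / run_time(ops, words, m)


def time_balance(m: Machine) -> float:
    """Intensity (ops per word) at which compute time equals memory time."""
    return m.time_per_word / m.time_per_op


def energy_balance(m: Machine) -> float:
    """Intensity (ops per word) at which compute energy equals memory energy."""
    return m.energy_per_word / m.energy_per_op


if __name__ == "__main__":
    # Made-up machine: moving a word costs 4x an op in time but 5x in energy.
    m = Machine(time_per_op=1.0, time_per_word=4.0,
                energy_per_op=2.0, energy_per_word=10.0,
                constant_power=0.5)
    for intensity in (1.0, 4.5, 8.0):
        ops, words = intensity, 1.0  # one word of traffic, `intensity` ops
        print(intensity, run_time(ops, words, m), energy(ops, words, m),
              average_power(ops, words, m))
```

With these made-up numbers the time balance is 4 and the energy balance is 5 operations per word. A kernel with intensity 4.5 is therefore compute-bound in time, yet it still spends more energy on memory traffic than on arithmetic. That gap between the two balance points is one way a time-energy trade-off could arise under these assumptions.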

Note (January 14, 2013): The above link to the PDF now points to the official version on the Georgia Tech Library’s report repository. If you downloaded an earlier version (dated December 24, 2012), you may wish to re-download this newer version. (The only difference is the addition of Appendix B, made on December 31, 2012.)

Synthesis Lecture on GPUs

Our Morgan & Claypool Synthesis Lecture on GPGPU performance analysis is now available! (Most libraries should have proxy access; the book is $30 otherwise, DRM-free.)

RC’12: gnitupmoC

Cong is in Copenhagen presenting our work on synthesizing program inverses for the case of programs with loops, at RC’12. UPDATE: Slides posted (see link below).

  • Cong Hou, Daniel Quinlan, David Jefferson, Richard Fujimoto, Richard Vuduc. “Loop synthesis for program inversion.” In Proc. 4th Wkshp. Reversible Computation (RC), Copenhagen, Denmark, July 2012. [PDF Preprint | PDF slides]

    Scene from Copenhagen - RC'12, by Cong Hou

    Photo by Cong Hou

Kent on exascale @ ICS’12

Kent is in Venice, Italy this week presenting our 3D FFT exascale projection paper at ICS’12 (Session 6).

  • K. Czechowski, C. McClanahan, C. Battaglino, K. Iyer, P.-K. Yeung, R. Vuduc. “On the communication complexity of the 3D FFT and its implications for exascale.” In Proc. ACM Int’l. Conf. Supercomputing (ICS), San Servolo Island, Venice, Italy, June 2012. doi:10.1145/2304576.2304604 [PDF Preprint | PDF slides]

UPDATE: Kent is joined by fellow Georgia Tech HPC students, Rob McColl and Oded Green, who are presenting their paper on a new parallel algorithm for merging on GPUs [doi:10.1145/2304576.2304621].

UPDATE 2: PDF slides are available (see reference above).

Georgia Tech @ ICS'12 in Venice, Italy

Georgia Tech @ ICS’12: (left-to-right) Kent (HPC Garage) and fellow Georgia Tech’ers, Rob and Oded (of David Bader’s HPC Lab)