Tag performance modeling


Jee is back from GPGPU-7, where he presented new work on a hybrid CPU+GPU implementation of the kernel-independent fast multipole method (FMM).

  • Jee Choi, Aparna Chandramowlishwaran, Kamesh Madduri, Richard Vuduc. “A CPU-GPU hybrid implementation and model-driven scheduling of the fast multipole method.” In Proc. 7th Wkshp. General-Purpose Processing using GPUs (GPGPU-7), Salt Lake City, UT, March 2014. [Paper|Slides]

Representing Georgia Tech at GPGPU-7, Jee and Naila (from a different research lab)


Piyush, Jee, and Rich are in Portland representing the lab at the SIAM Parallel Processing ’14 meeting. If you are also there, please drop by our various events! List below; resources (papers, slides) are here: [link].


Rich just returned from the Workshop on Visualization and Analysis of Performance of Large-scale Software (VAPLS), co-located with IEEE Vis. The workshop featured a number of very compelling uses of visualization to aid performance analysis, which in the long run will help make performance engineering more productive, more accessible, and, best of all, fun!

Rich’s talk materials appear here: [www]


An example of abstractly visualizing communication volume on an AMR tree, taken from one of the other presenters (see VAPLS’13 website).

Dagstuhl 13401

This week Rich has the great fortune of attending Dagstuhl Seminar 13401: Automatic Application Autotuning for HPC Architectures. He’ll be summarizing Jee’s work (with critical assists by Marat) on the energy archline model. For his slides and pointers to relevant materials, look here: [www]

"Classical" wing of Schloss Dagstuhl



We’re in Seattle this week attending the US DOE Workshop on Modeling and Simulation of Exascale Systems and Applications, “ModSim’13” [www]. A copy of Rich’s talk and pointers to supplemental materials are available here: hpcgarage.org/modsim13

Kent on exascale @ ICS’12

Kent is in Venice, Italy this week presenting our 3D FFT exascale projection paper at ICS’12 (Session 6).

  • K. Czechowski, C. McClanahan, C. Battaglino, K. Iyer, P.-K. Yeung, R. Vuduc. “On the communication complexity of the 3D FFT and its implications for exascale.” In Proc. ACM Int’l. Conf. Supercomputing (ICS), San Servolo Island, Venice, Italy, June 2012. doi:10.1145/2304576.2304604 [PDF Preprint | PDF slides]

UPDATE: Kent is joined by fellow Georgia Tech HPC students, Rob McColl and Oded Green, who are presenting their paper on a new parallel algorithm for merging on GPUs [doi:10.1145/2304576.2304621].

UPDATE 2: PDF slides are available (see reference above).


Georgia Tech @ ICS’12: (left-to-right) Kent (HPC Garage) and fellow Georgia Tech’ers, Rob and Oded (of David Bader’s HPC Lab)

Talk @ PPAM’11

I (Rich) am giving a talk today [PDF slides] at the International Conference on Parallel Processing and Applied Mathematics (PPAM), which is being held in the beautiful and historic town of Toruń, Poland. This is a great and well-organized meeting, and I look forward to the next one!

PPoPP’10 paper on GPU SpMV

Jee and Amik’s paper on autotuning sparse matrix-vector multiply for GPUs will appear at PPoPP 2010 in Bangalore, India. Great work, gang!

  • J. W. Choi, A. Singh, R. Vuduc. “Model-driven autotuning of sparse matrix-vector multiply on GPUs.” In Proc. ACM SIGPLAN Symp. Principles and Practice of Parallel Programming (PPoPP), Bangalore, India, January 2010. (accepted)

ICS’09 talk & paper

Sundar is speaking today at the ACM International Conference on Supercomputing (ICS) on his CPU/GPU stencil kernel work.

Citation & slides: Sundaresan Venkatasubramanian and Richard Vuduc. “Tuned and wildly asynchronous stencil kernels for hybrid CPU/GPU platforms.” In Proc. ACM Int’l. Conf. Supercomputing (ICS), Yorktown Heights, NY, USA, June 2009. [Paper | Slides]