Jee is back from GPGPU-7, where he presented new work on a hybrid CPU+GPU implementation of the kernel-independent fast multipole method (FMM).

  • Jee Choi, Aparna Chandramowlishwaran, Kamesh Madduri, Richard Vuduc. “A CPU-GPU hybrid implementation and model-driven scheduling of the fast multipole method.” In Proc. 7th Wkshp. General-Purpose Processing using GPUs (GPGPU-7), Salt Lake City, UT, March 2014. [Paper|Slides]
Jee & Naila @ GPGPU-7

Representing Georgia Tech at GPGPU-7, Jee and Naila (from a different research lab)


Jee is headed out to the NVIDIA GPU Technology Conference (GTC) [link]. If you are attending, please drop by his poster, “P0159: A Roofline Model of Energy” [PDF].

Synthesis Lecture on GPUs

Our Morgan & Claypool Synthesis Lecture on GPGPU performance analysis is now available! (Most libraries should have proxy access; the book is $30 otherwise, DRM-free.)

Visiting Tsubame 2.0

Rich is in Japan this week, shown here feeling small against the backdrop of Tsubame 2.0, the petascale GPU-based supercomputer at the Tokyo Institute of Technology. The cost-performance ratio, and the engineering design needed to get there, is impressive. Interestingly, every undergrad at Tokyo Tech gets an account on the system, which is a big step toward democratizing HPC. Many thanks to Prof. Satoshi Matsuoka for taking time out of his hectic schedule to offer a tour. (On a Japanese holiday, no less!)

Rich @ Tsubame 2.0

Rich @ Tsubame 2.0 at the Tokyo Institute of Technology. Photo courtesy of Prof. Satoshi Matsuoka, Tsubame’s lead architect.

Talk @ PPAM’11

I (Rich) am giving a talk today [PDF slides] at the International Conference on Parallel Processing and Applied Mathematics (PPAM), which is being held in the beautiful and historic town of Toruń, Poland. This is a great, well-organized meeting, and I look forward to the next one!


The HPC Garage is back from Seattle / Bellevue, where we participated in the AMD Fusion Developer Summit. Rich gave a panel talk, which previewed some of the lab’s ongoing work on algorithm-architecture co-design (PDF slides). Also, the HPC Garage band, “Casey and the Bloodhounds,” made its debut. YouTube video forthcoming, so stay (performance) tuned!

DiGPUFFT (“dig-puffed”): Distributed GPU FFTs

Chris has released DiGPUFFT (pronounced “dig-puffed”), a set of code patches that add GPU capabilities to P3DFFT, Dmitry Pekurovsky’s distributed-memory 3D FFT library. A technical report describing these patches and our performance results for the Keeneland system at ORNL is in the works; it should appear on the DiGPUFFT website later this month or early next.

MICRO’10 and HPCA’11 GPU tutorials.

Rich and his computer architecture collaborator, Hyesoon Kim, gave tutorials on performance analysis and tuning for GPGPU platforms at two of the big computer architecture conferences, MICRO 2010 (December, in ATL) and HPCA 2011 (February, in San Antonio). Slides from the MICRO 2010 tutorial are available, but stay tuned for the HPCA slides, which update that material for clarity and brevity.

And here’s a snap from the beautiful Riverwalk area of San Antonio.

Gordon Bell & HPC PhD, snagged!

Congrats to the Rahimian/Biros-led team, which won this year’s Gordon Bell Prize at SC! Aparna also won a George Michael Memorial HPC Fellowship, two of which are given out each year to rising-star PhD students.

Overall, a great week for Georgia Tech. Go, Buzz!

(L-to-R: Rich, Shravan Veerapaneni, George Biros, Abtin Rahimian, Logan Moon, Aparna Chandramowlishwaran)

Talk at CCGSC’10.

CCGSC'10 attendees (Flat Rock, NC)

Rich just returned from beautiful Flat Rock, North Carolina, where he had a chance to participate in the biennial CCGSC meeting. Evidently, the acronym expands differently each year; this year, it was Clouds, Clusters, and Grids for Scientific Computing. The full program and Rich’s talk slides appear here: