Tag: performance tuning

GPGPU-7

Jee is back from GPGPU-7, where he presented new work on a hybrid CPU+GPU implementation of the kernel-independent fast multipole method, or FMM.

  • Jee Choi, Aparna Chandramowlishwaran, Kamesh Madduri, Richard Vuduc. “A CPU-GPU hybrid implementation and model-driven scheduling of the fast multipole method.” In Proc. 7th Wkshp. General-Purpose Processing using GPUs (GPGPU-7), Salt Lake City, UT, March 2014. [Paper|Slides]
Jee & Naila @ GPGPU-7

Jee and Naila (from a different research lab), representing Georgia Tech at GPGPU-7

Two papers @ IPDPS’10

The HPC Garage will have two papers at the upcoming IPDPS conference, to be held here in Atlanta, April 19–23. Congratulations to Aparna, who led the two papers, as well as our colleagues at Intel (Kath Knobe) and Lawrence Berkeley National Laboratory (Sam Williams and Lenny Oliker) for their significant contributions!

Here’s a brief description of the papers …

The first paper is a detailed performance evaluation of the Concurrent Collections (CnC) parallel programming model on some of the latest multicore systems. This paper demonstrates the extraordinary potential of the CnC model and raises a number of questions about how the model should evolve for more complex programs. The second paper is the first extensive multicore optimization and tuning experiment for the kernel-independent fast multipole method (KIFMM). Surprisingly, this paper shows that an Intel Nehalem-based multicore system delivers performance competitive with the GPU implementation we showcased in our best-paper-nominated SC’09 submission (led by Lashuk and Biros).

  • A. Chandramowlishwaran, K. Knobe, and R. Vuduc. Performance evaluation of Concurrent Collections on high-performance multicore computing systems. In Proc. IEEE Int’l. Parallel and Distributed Processing Symp. (IPDPS), Atlanta, GA, USA, April 2010. (accepted).
  • A. Chandramowlishwaran, S. Williams, L. Oliker, I. Lashuk, G. Biros, and R. Vuduc. Optimizing and tuning the fast multipole method for state-of-the-art multicore architectures. In Proc. IEEE Int’l. Parallel and Distributed Processing Symp. (IPDPS), Atlanta, GA, USA, April 2010. (accepted).

Dagstuhl seminar

Attendees of the Dagstuhl Seminar 10191 on Program Composition and Optimization

Nerds from the software engineering and the autotuning/HPC communities attempt to understand each other’s language and find common research ground at the Dagstuhl Seminar on Program Composition and Optimization (10191), at which Rich was an attendee.

When in Rome…

Seunghwa Kang, a friend of the Garage, is off to Rome, Italy, for the IEEE International Parallel and Distributed Processing Symposium (IPDPS) to give a talk on his paper, “Understanding the design trade-offs among current multicore platforms for numerical computations.”

Citation & PDF: Seunghwa Kang, David A. Bader, and Richard Vuduc. “Understanding the design trade-offs among current multicore platforms for numerical computations.” In Proc. IEEE Int’l. Parallel and Distributed Processing Symp. (IPDPS), Rome, Italy, May 2009. [PDF]