Tag: performance optimization

Two papers @ IPDPS’10

The HPC Garage will have two papers at the upcoming IPDPS conference, to be held here in Atlanta, April 19–23. Congratulations to Aparna, who led both papers, and to our colleagues at Intel (Kath Knobe) and Lawrence Berkeley National Laboratory (Sam Williams and Lenny Oliker) for their significant contributions!

Here’s a brief description of the papers …

The first paper is a detailed performance evaluation of the Concurrent Collections (CnC) parallel programming model on some of the latest multicore systems. This paper demonstrates the extraordinary potential of the CnC model, and raises a number of questions about how the model should evolve for more complex programs. The second paper is the first extensive multicore optimization and tuning experiment for the kernel-independent fast multipole method (KIFMM). Surprisingly, this paper shows that an Intel Nehalem-based multicore system is competitive with the GPU implementation we showcased in our best-paper-nominated SC’09 submission (led by Lashuk and Biros).

  • A. Chandramowlishwaran, K. Knobe, and R. Vuduc. Performance evaluation of Concurrent Collections on high-performance multicore computing systems. In Proc. IEEE Int’l. Parallel and Distributed Processing Symp. (IPDPS), Atlanta, GA, USA, April 2010. (accepted).
  • A. Chandramowlishwaran, S. Williams, L. Oliker, I. Lashuk, G. Biros, and R. Vuduc. Optimizing and tuning the fast multipole method for state-of-the-art multicore architectures. In Proc. IEEE Int’l. Parallel and Distributed Processing Symp. (IPDPS), Atlanta, GA, USA, April 2010. (accepted).

Dagstuhl seminar.

Attendees of the Dagstuhl Seminar 10191 on Program Composition and Optimization

Nerds from the software engineering and the autotuning/HPC communities attempt to understand each other’s language and find common research ground at the Dagstuhl Seminar on Program Composition and Optimization (10191), at which Rich was an attendee.

Top ParCo’09 Q1 download

Our co-authored paper on multicore optimizations for sparse matrix-vector multiply was the “hottest” (most downloaded) Journal of Parallel Computing (ParCo) article in Q1 2009.

[ParCo Top 25 List, Q1 2009] S. Williams, L. Oliker, R. Vuduc, J. Shalf, K. Yelick, J. Demmel. “Optimization of sparse matrix-vector multiplication on emerging multicore platforms.” J. Par. Comput. (ParCo) 35(3), pp. 178–194, March 2009. doi:10.1016/j.parco.2008.12.006

ICS’09 talk & paper.

Sundar is speaking today at the ACM International Conference on Supercomputing (ICS) on his CPU/GPU stencil kernel work.

Citation & slides: Sundaresan Venkatasubramanian and Richard Vuduc. “Tuned and wildly asynchronous stencil kernels for hybrid CPU/GPU platforms.” In Proc. ACM Int’l. Conf. Supercomputing (ICS), Yorktown Heights, NY, USA, June 2009. [Paper | Slides]

When in Rome…

Seunghwa Kang, a friend of the Garage, is off to Rome, Italy, for the IEEE International Parallel and Distributed Processing Symposium (IPDPS) to give a talk on his paper, “Understanding the design trade-offs among current multicore platforms for numerical computations.”

Citation & PDF: Seunghwa Kang, David A. Bader, and Richard Vuduc. “Understanding the design trade-offs among current multicore platforms for numerical computations.” In Proc. IEEE Int’l. Parallel and Distributed Processing Symp. (IPDPS), Rome, Italy, May 2009. [PDF]