HPC Garage

Publications




Diagnosis, tuning, and redesign for multicore performance: A case study of the fast multipole method

Aparna Chandramowlishwaran, Kamesh Madduri, and Richard Vuduc. "Diagnosis, tuning, and redesign for multicore performance: A case study of the fast multipole method." In Proc. ACM/IEEE Conf. Supercomputing (SC), New Orleans, LA, USA, November 2010. (to appear). [PDF]
 

On the limits of GPU acceleration

Richard Vuduc, Aparna Chandramowlishwaran, Jee Whan Choi, Murat Efe Guney, and Aashay Shringarpure. "On the limits of GPU acceleration." In Proc. USENIX Wkshp. Hot Topics in Parallelism (HotPar), Berkeley, CA, USA, June 2010. [PDF]
 

Falcon: Fault Localization for Concurrent Programs

S. Park, R. Vuduc, M. J. Harrold. "Falcon: Fault Localization for Concurrent Programs." In Proc. ACM/IEEE Int'l. Conf. Software Eng. (ICSE), Cape Town, South Africa, May 2010. [PDF]
 

Performance evaluation of Concurrent Collections on high-performance multicore computing systems

A. Chandramowlishwaran, K. Knobe, and R. Vuduc. “Performance evaluation of Concurrent Collections on high-performance multicore computing systems.” In Proc. IEEE Int'l. Parallel and Distributed Processing Symp. (IPDPS), Atlanta, GA, USA, April 2010. (to appear). Winner, Best Paper (software track). [PDF]
 

A massively parallel adaptive fast multipole method on heterogeneous architectures

Ilya Lashuk, Aparna Chandramowlishwaran, Harper Langston, Tuan-Anh Nguyen, Rahul Sampath, Aashay Shringarpure, Richard Vuduc, Lexing Ying, Denis Zorin, and George Biros. “A massively parallel adaptive fast multipole method on heterogeneous architectures.” In Proc. ACM/IEEE Conf. Supercomputing (SC), Portland, OR, USA, November 2009. Nominee, best paper. [DOI]
 

Fast sensitivity computations for numerical optimization

Nitin Arora, Ryan Russell, and Richard Vuduc. “Fast sensitivity computations for numerical optimizations.” In Proc. AAS/AIAA Astrodynamics Specialist Conference, Pittsburgh, PA, USA, August 2009. (unrefereed).
 

Understanding the design trade-offs among current multicore systems for numerical computations

Seunghwa Kang, David Bader, and Richard Vuduc. “Understanding the design trade-offs among current multicore systems for numerical computations.” In Proc. IEEE Int'l. Parallel and Distributed Processing Symp. (IPDPS), Rome, Italy, May 2009. [PDF]
 

Optimizing sparse matrix-vector multiply on emerging multicore platforms


Sam Williams, Leonid Oliker, Richard Vuduc, John Shalf, Katherine Yelick, and James Demmel. “Optimizing sparse matrix-vector multiply on emerging multicore platforms.” Journal of Parallel Computing (ParCo), 35(3):178-194, March 2009. [PDF]

Earlier version:

  • Sam Williams, Lenny Oliker, Richard Vuduc, John Shalf, Katherine Yelick, and James Demmel. “Optimization of sparse matrix-vector multiply on emerging multicore platforms.” In Proc. ACM/IEEE Conf. Supercomputing (SC), Reno, NV, USA, November 2007. [PDF]
 

When cache blocking sparse matrix vector multiply works and why


Rajesh Nishtala, Richard Vuduc, James W. Demmel, and Katherine A. Yelick. “When cache blocking sparse matrix vector multiply works and why.” Applicable Algebra in Engineering, Communication, and Computing: Special Issue on Computational Linear Algebra and Sparse Matrix Computations, March 2007. [PDF]

Extended technical report:

  • Rajesh Nishtala, Richard Vuduc, James W. Demmel, and Katherine A. Yelick. “Performance modeling and analysis of cache blocking in sparse matrix-vector multiply.” Technical Report UCB/CSD-04-1335, U.C. Berkeley, June 2004. [PDF]
 

Techniques for specifying bug patterns

Dan Quinlan, Richard Vuduc, and Ghassan Misherghi. “Techniques for specifying bug patterns.” In Proc. Parallel and Distributed Testing and Debugging (PADTAD) at ISSTA, London, England, July 2007. [PDF]
 

Analyzing and visualizing whole program architectures

Thomas Panas, Dan Quinlan, and Richard Vuduc. “Analyzing and visualizing whole program architectures.” In Proc. 3rd Workshop on Aerospace Software Engineering (AeroSE) at the Int'l Conf. on Software Engineering (ICSE), Minneapolis, MN, USA, May 2007.
 

Effective source-to-source outlining to support whole program empirical optimization

Chunhua Liao, Daniel J. Quinlan, Richard Vuduc, and Thomas Panas. “Effective source-to-source outlining to support whole program empirical optimization.” In Proc. Int'l. Wkshp. Languages and Compilers for Parallel Computing (LCPC), Newark, DE, USA, October 2009.
 

Petascale direct numerical simulation of blood flow on 200k cores and heterogeneous architectures

Abtin Rahimian, Ilya Lashuk, Aparna Chandramowlishwaran, Dhairya Malhotra, Logan Moon, Rahul Sampath, Aashay Shringarpure, Shravan Veerapaneni, Jeffrey Vetter, Richard Vuduc, Denis Zorin, and George Biros. "Petascale direct numerical simulation of blood flow on 200k cores and heterogeneous architectures." In Proc. ACM/IEEE Conf. Supercomputing (SC), New Orleans, LA, USA, November 2010. (to appear). Finalist, Gordon Bell Prize.
 

Toward interactive statistical modeling

Sooraj Bhat, Ashish Agarwal, Alexander Gray, and Richard Vuduc. "Toward interactive statistical modeling." Procedia Computer Science, 1(1):1829-1838, May-June 2010. Proc. Int'l. Conf. Computational Science (ICCS), Wkshp. Automated Program Generation for Computational Science (APGCS). [PDF]
 

Optimizing and tuning the fast multipole method for state-of-the-art multicore architectures


A. Chandramowlishwaran, S. Williams, L. Oliker, I. Lashuk, G. Biros, and R. Vuduc. “Optimizing and tuning the fast multipole method for state-of-the-art multicore architectures.” In Proc. IEEE Int'l. Parallel and Distributed Processing Symp. (IPDPS), Atlanta, GA, USA, April 2010. [PDF]

 

Model-driven autotuning of sparse matrix-vector multiply on GPUs


J. Choi, A. Singh, and R. Vuduc. “Model-driven autotuning of sparse matrix-vector multiply on GPUs.” In Proc. Symp. Principles and Practice of Parallel Programming (PPoPP), Bangalore, India, Jan. 2010. [DOI | PDF]

 

Direct n-body kernels for multicore platforms

Nitin Arora, Aashay Shringarpure, and Richard Vuduc. “Direct n-body kernels for multicore platforms.” In Proc. Int'l. Conf. Parallel Processing (ICPP), Vienna, Austria, September 2009.
 

Tuned and wildly asynchronous stencil kernels for hybrid CPU/GPU platforms

Sundaresan Venkatasubramanian and Richard Vuduc. “Tuned and wildly asynchronous stencil kernels for hybrid CPU/GPU platforms.” In Proc. ACM Int'l. Conf. Supercomputing (ICS), New York, NY, USA, June 2009. [PDF | Slides]
 

Optimization and autotuning of 3D FFTs on the Cray XT4

Manisha Gajbe, Andrew Canning, John Shalf, Lin-Wang Wang, Harvey Wasserman, and Richard Vuduc. “Optimization and auto-tuning of 3D FFTs on the Cray XT4.” In Proc. Cray User's Group (CUG) Meeting, Atlanta, Georgia, USA, May 2009. (unrefereed)
 

Numerical algorithms with tunable parallelism

Aparna Chandramowlishwaran, Abhinav Karhu, Ketan Umare, and Richard Vuduc. “Numerical algorithms with tunable parallelism.” In Proc. Workshop on Software Tools for Multicore Systems (STMCS) at ACM/IEEE CGO, Boston, MA, USA, April 2008.
 

POET: Parameterized Optimizations for Empirical Tuning

Qing Yi, Keith Seymour, Haihang You, Richard Vuduc, and Dan Quinlan. “POET: Parameterized Optimizations for Empirical Tuning.” In Proc. Workshop on Performance Optimization of High-Level Languages and Libraries (POHLL) at IPDPS, Long Beach, CA, USA, March 2007. [PDF]
 

Communicating software architecture using a unified single-view visualization

Thomas Panas, Tom Epperly, Dan Quinlan, Andreas Sæbjørnsen, and Richard Vuduc. “Communicating software architecture using a unified single-view visualization.” In Proc. 12th Int'l Conf. on Engineering of Complex Computer Systems (ICECCS), Auckland, New Zealand, July 2007.
 

Tool support for inspecting the code quality of HPC applications

Thomas Panas, Dan Quinlan, and Richard Vuduc. “Tool support for inspecting the code quality of HPC applications.” In Proc. 3rd Workshop on Software Engineering for High-Performance Computing Applications (SE-HPC) at the Int'l Conf. on Software Engineering (ICSE), Minneapolis, MN, USA, May 2007. [PDF]