2008
- A. Bhatele, L.V. Kale. Application-specific Topology-aware Mapping for Three Dimensional Topologies, Workshop on Large-Scale Parallel Processing. Miami, FL, April 18, 2008.
- L.V. Kale, K. Pattabiraman, C.W. Lee. Basic Charm++ and Virtualization Tutorial, 6th Annual Workshop on Charm++ and its Applications. Champaign-Urbana, IL, May 1-3, 2008.
- S. Stone, et al. Accelerating Advanced MRI Reconstructions on GPUs, ACM Computing Frontiers. Italy, May 5-7, 2008.
- C. Rodrigues, et al. GPU Acceleration of Cutoff Pair Potentials for Molecular Modeling Applications, ACM Computing Frontiers. Italy, May 5-7, 2008.
- L.V. Kale. Some Essential Techniques for Developing Efficient Petascale Applications, SciDAC'2008. Seattle, WA, July 13-17, 2008.
- F. Gioachin, L.V. Kale. Memory Tagging in Charm++, Workshop on Parallel and Distributed Systems: Testing, Analysis, and Debugging (PADTAD 2008). Seattle, July 20-21, 2008.
- S. Patel, W.W. Hwu. Accelerator Architectures, Guest Editors' Introduction, IEEE Micro. July/August 2008, pp. 4-12.
- W.W. Hwu, K. Keutzer, T. Mattson. The Concurrency Challenge, IEEE Design and Test of Computers. July/August 2008, pp. 312-320.
- J.A. Stratton, S.S. Stone, W.W. Hwu. MCUDA: An Efficient Implementation of CUDA Kernels for Multi-Core CPUs, 21st International Workshop on Languages and Compilers for Parallel Computing. Edmonton, Canada, July 30 - August 2, 2008.
- S. Ueng, M. Lathara, S. Baghsorkhi, W. Hwu. CUDA-lite: Reducing GPU Programming Complexity, 21st International Workshop on Languages and Compilers for Parallel Computing. Edmonton, Canada, July 30 - August 2, 2008.
- L. Owens, S. Anand, L. Kelly-Wilson. GLCPC Virtual School of Computational Science & Engineering: Summer School on Accelerators and GPUs: Final Analytic Report, Survey Research Laboratory, University of Illinois at Chicago. October 2008.
- E. Wah, E. Johnson, L. Auvil, U. Thakkar, W. Hwu, D. Kirk, T.H. Dunning, S.C. Glotzer. Visualization and analysis of GPU summer school applicants and participants, 4th IEEE International Conference on e-Science. Indianapolis, IN, December 10, 2008.
- E. Wah, E. Johnson. Data Visualization and Analysis of CIC Graduate Student TeraGrid Resource Usage, 4th IEEE International Conference on e-Science. Indianapolis, IN, December 10, 2008.
2009
- L.V. Kale. BigSim: Simulating PetaFLOPS Supercomputers, NCSA's CI-Tutor.
- L.V. Kale. Introduction to Performance Tools, NCSA's CI-Tutor.
- J. Phillips, J. Stone. Probing Biomolecular Machines with Graphics Processors, Queue. vol. 7, no. 9, 2009.
- T. Gamblin. Scalable Performance Measurement and Analysis, PhD dissertation. 2009.
- A. Bhatele, L.V. Kale. Quantifying Network Contention on Large Parallel Machines, Parallel Processing Letters. 19:4 (2009).
- R. Fowler, L. Adhianto, B. de Supinski, M. Fagan, T. Gamblin, M. Krentel, J. Mellor-Crummey, M. Schulz, N. Tallent. Frontiers of performance analysis on leadership-class systems, Journal of Physics: Conference Series. 180, 2009.
- A. Pant, H. Jafri, V. Kindratenko. Phoenix: A Runtime Environment for High Performance Computing on Chip Multiprocessors, Proc. 17th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP'09). Weimar, Germany, February 18-20, 2009. pp. 119-126.
- W. Kramer. Update on the Blue Waters Project, HPC Asia 2009. Kaohsiung, Taiwan, March 3, 2009.
- J.E. Stone, J. Saam, D.J. Hardy, K.L. Vandivort, W.W. Hwu, K. Schulten. High Performance Computation and Interactive Display of Molecular Orbitals on GPUs and Multi-Core CPUs, Proc. 2nd Workshop on General-Purpose Computation on Graphics Processing Units (GPGPU-2). Washington, D.C., March 8, 2009.
- D. Roeh, V. Kindratenko, R. Brunner. Accelerating Cosmological Data Analysis with Graphics Processors, Proc. 2nd Workshop on General-Purpose Computation on Graphics Processing Units (GPGPU-2). Washington, D.C., March 8, 2009.
- M. Showerman, W.W. Hwu, J. Enos, A. Pant, V. Kindratenko, C. Steffen, R. Pennington. QP: A Heterogeneous Multi-Accelerator Cluster, 10th LCI International Conference on High-Performance Cluster Computing. Boulder, CO, March 10-12, 2009.
- W. Kramer. Fault Tolerance and Large Systems – Some Experiences and Insights, Fault Tolerance for Extreme-Scale Computing. Albuquerque, NM, March 19-20, 2009.
- B. Gropp, M. Snir. A Proposal for a Capability Centers Consortium, International Exascale Software Project. Santa Fe, NM, April 7-8, 2009.
- W. Kramer. Introduction to the Blue Waters Project, IBM SP-XXL and SciComp 09. Barcelona, Spain, May 20, 2009.
- J. Overbey, S. Negara, R. Johnson. Refactoring and the Evolution of Fortran, Proceedings of the 2009 ICSE Workshop on Software Engineering for Computational Science and Engineering (SECSE '09). Vancouver, Canada, May 23, 2009.
- F. Gioachin, L.V. Kale. Dynamic High-Level Scripting in Parallel Applications, 23rd IEEE International Parallel and Distributed Processing Symposium (IPDPS '09). Rome, Italy, May 25-29, 2009.
- A. Bhatele, L.V. Kale. An Evaluative Study on the Effect of Contention on Message Latencies in Large Supercomputers, Large-Scale Parallel Processing Workshop (LSPP). Rome, Italy, May 29, 2009.
- A. Bhatele, L.V. Kale, S. Kumar. Dynamic Topology Aware Load Balancing Algorithms for MD Applications, International Conference on Supercomputing. Yorktown Heights, NY, June 9, 2009.
- R. Fowler. Frontiers of Performance Analysis on Leadership Class Systems, SciDAC 2009. San Diego, CA, June 14-18, 2009.
- E. Schnetter. Introduction to the Cactus Framework, TeraGrid '09 Conference. Arlington, VA, June 22-25, 2009.
- F. Gioachin, L.V. Kale. Scalable Interaction with Parallel Applications, TeraGrid '09 Conference. Arlington, VA, June 22-25, 2009.
- F. Cappello, A. Geist, B. Gropp, S. Kale, B. Kramer, M. Snir. Towards Exascale Resilience, International Exascale Software Project. Paris, France, June 28-29, 2009.
- W. Kramer, D. Skinner. An Exascale Approach to Software and Hardware Design, International Exascale Software Project. Paris, France, June 28-29, 2009.
- W. Kramer, D. Skinner. Consistent Application Performance at Exascale, International Exascale Software Project. Paris, France, June 28-29, 2009.
- G. Shi, J. Enos, M. Showerman, V. Kindratenko. On Testing GPU Memory for Hard and Soft Errors, Proc. Symposium on Application Accelerators in HPC (SAAHPC'09). Urbana, IL, July 29, 2009.
- V. Kindratenko, J. Enos, G. Shi, M. Showerman, G. Arnold, J. Stone, J. Phillips, W. Hwu. GPU Clusters for High-Performance Computing, Proc. Workshop on Parallel Programming on Accelerator Clusters (PPAC'09). New Orleans, LA, August 31, 2009.
- J. Overbey, R. Johnson. Refactoring and Programming Language Evolution, Onward! Conference. Orlando, FL, October 25-29, 2009.
- R. Fiedler, R. Wilhelmson, W. Kramer, B. Bode. Blue Waters: Application-Driven System Design For Sustained Petascale Performance, IEEE Computer. 42:11 (2009), p. 29.
- A. Maccabe, H. Falter, W. Kramer. Resource Management, International Journal of High Performance Computing Applications. November 2009, vol. 23, pp. 347-349.
- F. Cappello, A. Geist, W. Gropp, L. Kale, B. Kramer, M. Snir. Toward Exascale Resilience, International Journal of High Performance Computing Applications. November 2009, vol. 23, pp. 374-388.
- W. Kramer, D. Skinner. An Exascale Approach to Software and Hardware Design, International Journal of High Performance Computing Applications. November 2009, vol. 23, pp. 389-391.
- W. Kramer, D. Skinner. Consistent Application Performance at the Exascale, International Journal of High Performance Computing Applications. November 2009, vol. 23, pp. 392-394.
- D. Kunzman, L. Kale. Towards a Framework for Abstracting Accelerators in Parallel Applications: Experience with Cell, Supercomputing 2009 Conference (SC09). November 14-20, 2009.
- I. Dooley, C. Lee, L. Kale. Continuous Performance Monitoring for Large-Scale Parallel Applications, 16th Annual IEEE International Conference on High Performance Computing (HiPC 2009). Cochin, India, December 16-19, 2009.
- W. Kramer. How Blue Waters is Addressing the Challenges and Opportunities of Peta-scale Computing, Keynote at the 25th SARA Superdag. Amsterdam, The Netherlands, December 3, 2009.
2010
- J. Overbey, M. Fotzler, A. Kasza, R. Johnson. A Collection of Refactoring Specifications for Fortran 95, ACM Fortran Forum. 29:3 (2010), pp. 11-25.
- F. Gioachin. Debugging Large Scale Applications with Virtualization, PhD Thesis. 2010.
- T. Hoefler. Software and Hardware Techniques for Power-Efficient HPC Networking, Computing in Science and Engineering (CiSE). 12:6 (2010), pp. 30-37.
- A.B. Shiflet, G.W. Shiflet. Testing the Waters with Undergraduates (If you lead students to HPC, they will drink), Journal of Computational Science Education. 1:1 (2010), pp. 33-37.
- D. Guo, W. Gropp. Optimizing Sparse Data Structures for Matrix-Vector Multiply, International Journal of High Performance Computing Applications. 25:115 (2011).
- I. Gelado, J. Stone, J. Cabezas, S. Patel, N. Navarro, W. Hwu. An Asymmetric Distributed Shared Memory Model for Heterogeneous Parallel Systems, 15th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2010). Pittsburgh, PA, March 13-17, 2010.
- W. Kramer. The Crisis in Massive Storage, Keynote at the 26th IEEE Symposium on Massive Storage Systems and Technologies (MSST2010). Incline Village, NV, May 4, 2010.
- E. Meneses, C. Mendes, L. Kale. Team-based Message Logging: Preliminary Results, 3rd Workshop on Resiliency in High Performance Computing in Clusters, Clouds, and Grids (Resilience 2010). Melbourne, Australia, May 17, 2010.
- T. Gamblin, B. de Supinski, M. Schulz, R. Fowler, D.A. Reed. Clustering Performance Data Efficiently at Massive Scales, 24th International Conference on Supercomputing. Tsukuba, Japan, June 1-4, 2010.
- T. Hoefler, T. Schneider, A. Lumsdaine. LogGOPSim - Simulating Large-Scale Applications in the LogGOPS Model, Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing. Chicago, Illinois, June 20-25, 2010.
- F. Gioachin, G. Zheng, L.V. Kale. Robust Non-Intrusive Record-Replay with Processor Extraction, Workshop on Parallel and Distributed Systems: Testing, Analysis, and Debugging (PADTAD - VIII). Trento, Italy, July 13, 2010.
- E. Seidel, G. Allen, S. Brandt, F. Löffler, E. Schnetter. Simplifying Complex Software Assembly: The Component Retrieval Language and Implementation, TeraGrid '10 Conference. Pittsburgh, PA, August 2-5, 2010.
- C. Mei, G. Zheng, F. Gioachin, L.V. Kale. Optimizing a Parallel Runtime System for Multicore Clusters: A Case Study, TeraGrid '10 Conference. Pittsburgh, PA, August 2-5, 2010.
- B. Arimilli, R. Arimilli, V. Chung, S. Clark, W. Denzel, B. Drerup, T. Hoefler, J. Joyner, J. Lewis, J. Li, N. Ni, R. Rajamony. The PERCS High-Performance Interconnect, Proceedings of 18th Symposium on High-Performance Interconnects (HOTI 2010). Mountain View, CA, August 18-20, 2010.
- S. Negara, G. Zheng, K-C. Pan, N. Negara, R. Johnson, L. Kale, P. Ricker. Automatic MPI to AMPI Program Transformation using Photran, 3rd Workshop on Productivity and Performance (PROPER 2010). Naples, Italy, August 30, 2010.
- T. Hoefler. Bridging Performance Analysis Tools and Analytic Performance Modeling for HPC, 3rd Workshop on Productivity and Performance (PROPER 2010). Naples, Italy, August 30, 2010.
- T. Hoefler, W. Gropp, R. Thakur, J.L. Traeff. Toward Performance Models of MPI Implementations for Understanding Application Scaling Issues, 17th EuroMPI Conference. Stuttgart, Germany, September 12-15, 2010.
- T. Hoefler, S. Gottlieb. Parallel Zero-Copy Algorithms for Fast Fourier Transform and Conjugate Gradient using MPI Datatypes, 17th EuroMPI Conference. Stuttgart, Germany, September 12-15, 2010.
- G. Zheng, E. Meneses, A. Bhatele, L.V. Kale. Hierarchical Load Balancing for Large Scale Supercomputers, 3rd International Workshop on Parallel Programming Models and Systems Software for High-end Computing (P2S2 2010). San Diego, CA, September 13, 2010.
- F. Gioachin, G. Zheng, L. Kale. Debugging Large Scale Applications in a Virtualized Environment, 23rd International Workshop on Languages and Compilers for Parallel Computing (LCPC2010). Houston, TX, October 7-9, 2010.
- F. Kjolstad, D. Dig, M. Snir. Bringing the HPC Programmer's IDE into the 21st Century through Refactoring, SPLASH 2010 Workshop on Concurrency for the Application Programmer (CAP'10). Reno, NV, October 18, 2010.
- E.R. Rodrigues, P.O.A. Navaux, J. Panetta, A. Fazenda, C.L. Mendes, L.V. Kale. A Comparative Analysis of Load Balancing Algorithms Applied to a Weather Forecast Model, 22nd Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). Itaipava, Brazil, October 27-30, 2010.
- P. Jetley, L. Wesolowski, F. Gioachin, L.V. Kale, T.R. Quinn. Scaling Hierarchical N-Body Simulations on GPU Clusters, Supercomputing 2010 Conference (SC10). New Orleans, LA, November 13-19, 2010.
- M.J. Garzaran, D. Padua, W.D. Gropp, S. Maleki. Program Optimization through Loop Vectorization, Supercomputing 2010 Conference (SC10). New Orleans, LA, November 13-19, 2010.
- T. Hoefler, T. Schneider, A. Lumsdaine. Characterizing the Influence of System Noise on Large-Scale Applications by Simulation, Supercomputing 2010 Conference (SC10) [SC10 Best Paper Award]. New Orleans, LA, November 13-19, 2010.
- T.G. Armstrong, Z. Zhang, D.S. Katz, M. Wilde, I.T. Foster. Scheduling Many-Task Workloads on Supercomputers: Dealing with Trailing Tasks, 3rd IEEE Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS2010). New Orleans, LA, November 15, 2010.
- G. Zheng, E. Bohm, L.V. Kale, et al. Simulating Large Scale Parallel Applications using Statistical Models for Sequential Execution Blocks, 16th International Conference on Parallel and Distributed Systems (ICPADS 2010). Shanghai, China, December 8-10, 2010.
- E. Rodrigues, P. Navaux, J. Panetta, C. Mendes, L. Kale. Optimizing an MPI Weather Forecasting Model via Processor Virtualization, 17th Annual International Conference on High Performance Computing (HiPC 2010). Goa, India, December 19-22, 2010.
- A. Bhatele, G. Gupta, L. Kale, I-H. Chung. Automated Mapping of Regular Communication Graphs on Mesh and Torus Interconnects, 17th Annual International Conference on High Performance Computing (HiPC 2010). Goa, India, December 19-22, 2010.
- I. Dooley, C. Mei, J. Lifflander, L. Kale. A Study of Memory-Aware Scheduling in Message Driven Parallel Programs, 17th Annual International Conference on High Performance Computing (HiPC 2010). Goa, India, December 19-22, 2010.
- N. Edmonds, T. Hoefler, A. Lumsdaine. A Space-Efficient Parallel Algorithm for Computing Betweenness Centrality in Distributed Memory, 17th Annual International Conference on High Performance Computing (HiPC 2010). Goa, India, December 19-22, 2010.
- N. Edmonds, J. Willock, T. Hoefler, A. Lumsdaine. Design of a Large-Scale Hybrid-Parallel Graph Library, 17th Annual International Conference on High Performance Computing (HiPC 2010). Goa, India, December 19-22, 2010.
2011
- B. Holt, D. Ernst. Accelerating Geophysics Simulation using CUDA, Journal of Computational Science Education. 2:1 (2011), pp. 21-27.
- C. Savoie, D. Mobley. Understanding the Structural and Functional Effects of Mutations in HIV-1 Protease Mutants Using 100ns Molecular Dynamics Simulations, Journal of Computational Science Education. 2:1 (2011), pp. 28-34.
- J. Dongarra, et al. The International Exascale Software Project Roadmap, International Journal of High Performance Computing Applications. 25:1 (2011), pp. 3-60.
- G. Zheng, A. Bhatele, E. Meneses, L. Kale. Periodic Hierarchical Load Balancing for Large Supercomputers, International Journal of High Performance Computing Applications. 25:4 (2011), pp. 371-385.
- A. Bhatele, L. Wesolowski, L.V. Kale. Architectural constraints to attain 1 Exaflop/s on three scientific application classes, 25th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2011). Anchorage, AK, May 16-20, 2011.
- T. Hoefler, M. Snir. Generic Topology Mapping Strategies for Large-scale Parallel Architectures, 25th International Conference on Supercomputing (ICS'11). Tucson, AZ, May 31 - June 4, 2011.
- T. Hoefler, M. Snir. Performance Engineering: A Must for Petaflops and Beyond, Third Workshop on Large-scale System and Application Performance (LSAP2011). San Jose, CA, June 8, 2011.
- A. Gainaru, F. Cappello, B. Kramer. Event log mining tool for large scale HPC systems, Euro-Par 2011 Conference. Bordeaux, France, August 29 - September 2, 2011.
- W. Gropp, T. Hoefler, R. Thakur, J.L. Traeff. Performance Expectations and Guidelines for MPI Derived Datatypes, 18th EuroMPI Conference. Santorini, Greece, September 18-21, 2011.
- T. Hoefler, M. Snir. Writing Parallel Libraries with MPI - Common Practice, Issues, and Extension, 18th EuroMPI Conference. Santorini, Greece, September 18-21, 2011.
- V. Venkatesan, M. Chaarawi, E. Gabriel, T. Hoefler. Design and Evaluation of Nonblocking Collective I/O Operations, 18th EuroMPI Conference. Santorini, Greece, September 18-21, 2011.
- T. Hoefler. Writing Parallel Libraries with MPI - The Good, the Bad, and the Ugly, 18th EuroMPI Conference. Santorini, Greece, September 18-21, 2011.
- W. Kramer. Challenges and Opportunities for Exascale Resource Management and How Today's Petascale Systems are Guiding the Way, Keynote at the SLURM User Group Meeting 2011. Phoenix, AZ, September 23, 2011.
- D.J. Kerbyson, K.J. Barker. Analyzing the Performance Bottlenecks of the Power7-IH Network, IEEE Cluster 2011. Austin, TX, September 26-30, 2011.
- P. Miller, C. Mei. Asynchronous Collective Output With Non-Dedicated Cores, IASDS11: Workshop on Interfaces and Architectures for Scientific Data Storage. Austin, TX, September 30, 2011.
- S. Maleki, D. Padua, M.J. Garzaran, Y. Gao, T. Wong. An Evaluation of Vectorizing Compilers, Parallel Architectures and Compilation Techniques (PACT). Galveston, TX, October 10-14, 2011.
- A. Gainaru, F. Cappello, J. Fullop, S. Trausan-Matu, W. Kramer. Adaptive Event Prediction Strategy with Dynamic Time Window for Large-Scale HPC Systems, 23rd ACM Symposium on Operating Systems Principles. Cascais, Portugal, October 23-26, 2011.
- L.V. Kale, B. Gropp, et al. Avoiding hot-spots on two-level direct networks, Supercomputing 2011 Conference (SC11). Seattle, WA, November 12-18, 2011.
- C. Mei, G. Zheng, E. Bohm, L.V. Kale, et al. Enabling and Scaling Biomolecular Simulations of 100 Million Atoms on Petascale Machines with a Multicore-optimized Message-driven Runtime, Supercomputing 2011 Conference (SC11). Seattle, WA, November 12-18, 2011.
- K.J. Barker, A. Hoisie, D.J. Kerbyson. An Early Performance Analysis of Power7-IH HPC Systems, Supercomputing 2011 Conference (SC11). Seattle, WA, November 12-18, 2011.
- O. Sarood, L. Kale. A 'Cool' Load Balancer for Parallel Applications, Supercomputing 2011 Conference (SC11). Seattle, WA, November 12-18, 2011.
- A. Langer, R. Venkataraman, G. Gupta, L. Kale, U. Palekar, S. Baker, M. Surina. Enabling Massive Parallelism for Stochastic Optimization, Supercomputing 2011 Conference (SC11). Seattle, WA, November 12-18, 2011.
- W. Kramer. How to Measure Useful, Sustained Performance, Supercomputing 2011 Conference (SC11). Seattle, WA, November 12-18, 2011.
- E.M. Heien, D. Kondo, A. Gainaru, D. Lapine, B. Kramer, F. Cappello. Modeling and Tolerating Heterogeneous Failures in Large Parallel Systems, Supercomputing 2011 Conference (SC11). Seattle, WA, November 12-18, 2011.
- T. Hoefler, W. Gropp, M. Snir, W. Kramer. Performance Modeling for Systematic Performance Tuning, Supercomputing 2011 Conference (SC11). Seattle, WA, November 12-18, 2011.
- E. Totoni, L. Kale. Optimizing All-to-All Algorithm for Blue Waters Using Simulation, Supercomputing 2011 Conference (SC11). Seattle, WA, November 12-18, 2011.
- G. Zheng, C.L. Mendes, L.V. Kale. Automatic Handling of Global Variables for Multi-threaded MPI Programs, IEEE 17th International Conference on Parallel and Distributed Systems (ICPADS'11). Tainan, Taiwan, December 7-9, 2011.
- E. Totoni, A. Bhatele, E. Bohm, N. Jain, C. Mendes, R. Mokos, G. Zheng, L. Kale. Simulation-based Performance Analysis and Tuning for the Planned Blue Waters System, IEEE 17th International Conference on Parallel and Distributed Systems (ICPADS'11). Tainan, Taiwan, December 7-9, 2011.
- P. Jetley, L. Kale. Optimizations for Message Driven Applications on Multicore Architectures, 18th Annual International Conference on High Performance Computing (HiPC 2011). Bangalore, India, December 18-21, 2011.
2012
- P. Miller, A. Becker, L. Kale. Using Shared Arrays in Message-Driven Parallel Programs, Parallel Computing. 38:1-2 (2012), pp. 66-74.
- K. Kharbas, D. Kim, T. Hoefler, F. Mueller. Assessing HPC Failure Detectors for MPI Jobs, 20th Euromicro International Conference on Parallel, Distributed and Network-Based Computing. Garching, Germany, February 15-17, 2012.
- T. Hoefler, T. Schneider. Communication-Centric Optimizations by Dynamically Detecting Collective Operations, 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'12). New Orleans, LA, February 25-29, 2012.
- F. Kjolstad, T. Hoefler, M. Snir. Automatic Datatype Generation and Optimization, 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'12). New Orleans, LA, February 25-29, 2012.
- A. Mittal, N. Jain, T. George, Y. Sabharwal, S. Kumar. Collective Algorithms for Sub-communicators, 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'12). New Orleans, LA, February 25-29, 2012.
- J. Fullop, A. Gainaru, J. Plutchak. Real Time Analysis and Event Prediction Engine, Cray User Group Meeting (CUG 2012). Stuttgart, Germany, April 29 - May 3, 2012.
- C.L. Mendes, L.V. Kale, et al. Adaptive and Dynamic Load Balancing for Weather Forecasting Models, Cray User Group Meeting (CUG 2012). Stuttgart, Germany, April 29 - May 3, 2012.
- J. Alameda, J. Overbey. The Eclipse Parallel Tools Platform: Toward an Integrated Development Environment for Improved Software Engineering on Crays, Cray User Group Meeting (CUG 2012). Stuttgart, Germany, April 29 - May 3, 2012.
- J. Muggli, B. Bode, T. Hoefler, W. Kramer, C. Mendes. Blue Waters Testing Environment, Cray User Group Meeting (CUG 2012). Stuttgart, Germany, April 29 - May 3, 2012.
- G. Bauer, T. Hoefler, W. Kramer, R. Fiedler. Analyses and Modeling of Applications Used to Demonstrate Sustained Petascale Performance on Blue Waters Testing Environment, Cray User Group Meeting (CUG 2012). Stuttgart, Germany, April 29 - May 3, 2012.
- K. Chadalavada, M. Gajbe. Understanding the Effects of Process Placement on Application Performance on an AMD Interlagos Processor, Cray User Group Meeting (CUG 2012). Stuttgart, Germany, April 29 - May 3, 2012.
- G. Shi, S. Gottlieb, M. Showerman. Tuning And Understanding MILC Performance In Cray XK6 GPU Clusters, Cray User Group Meeting (CUG 2012). Stuttgart, Germany, April 29 - May 3, 2012.
- Y. Sun. uGNI-based Charm++ Runtime for Cray Gemini Network, 10th Annual Workshop on Charm++ and its Applications. Champaign-Urbana, IL, May 7-9, 2012.
- G. Zheng. A Scalable Double In-memory Checkpoint and Restart Scheme Towards Exascale, 10th Annual Workshop on Charm++ and its Applications. Champaign-Urbana, IL, May 7-9, 2012.
- P. Miller. Advances in Charm++ from the 2011 HPC Challenge Competition, 10th Annual Workshop on Charm++ and its Applications. Champaign-Urbana, IL, May 7-9, 2012.
- G. Bauer, S. Gottlieb, T. Hoefler. Performance Modeling and Comparative Analysis of the MILC Lattice QCD Application su3_rmd, 2012 12th IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID). Ottawa, Canada, May 13-16, 2012.
- A. Gainaru, F. Cappello, W. Kramer. Taming of the Shrew: Modeling the Normal and Faulty Behavior of Large-scale HPC Systems, 26th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2012). Shanghai, China, May 21-25, 2012.
- J. Lifflander, P. Miller, R. Venkataraman, A. Arya, T. Jones, L. Kale. Dense LU Factorization on Multicore Supercomputer Nodes, 26th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2012). Shanghai, China, May 21-25, 2012.
- Y. Sun, G. Zheng, R. Olson, T. Jones, L. Kale. A uGNI-Based Asynchronous Message-driven Runtime System for Cray Supercomputers with Gemini Interconnect, 26th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2012). Shanghai, China, May 21-25, 2012.
- O. Sarood, P. Miller, E. Totoni, L.V. Kale. 'Cool' Load Balancing for High Performance Computing Data Centers, IEEE Transactions on Computers. June 12, 2012.
- D. Guo, W. Gropp. Adaptive Threads Distributions for SpMV on GPU, XSEDE12 Extreme Scaling Workshop. Chicago, IL, July 15-16, 2012.
- L. Oser. Alleviating the scaling problem of cosmological hydrodynamic simulations with HECA, XSEDE12 Extreme Scaling Workshop. Chicago, IL, July 15-16, 2012.
- M. el Mehdi Diouri, O. Guck, L. Lefevre, F. Cappello. Energy Considerations in Checkpointing and Fault Tolerance Protocol, 2nd Workshop on Fault-Tolerance for HPC at Extreme Scale (FTXS 2012). Boston, MA, June 25-28, 2012.
- L. Oser, J. P. Ostriker, K. Nagamine, G. Bryan, R. Cen, T. Naab, M. Gajbe. Alleviating the scaling problem of cosmological hydrodynamic simulations with HECA, XSEDE12 Extreme Scaling Workshop. Chicago, IL, July 15-16, 2012.
- A. Canning, J. Shalf, N. J. Wright, S. Anderson, M. Gajbe. A Hybrid MPI/OpenMP 3d FFT for Plane Wave First-principles Materials Science Codes, 9th International Conference on Scientific Computing (CSC'12). Las Vegas, NV, July 16-19, 2012.
- K. Nagamine, L. Oser, J. P. Ostriker, M. Gajbe, G. Bryan, R. Cen, T. Naab. Weak Scaling Results from the GADGET-3 Cosmological Smoothed Particle Hydrodynamics Simulations, XSEDE12. Chicago, IL, July 16-20, 2012.
- W. Kramer. What Blue Waters is Already Teaching Us about Petascale Computing, Keynote at the Second Annual Front Range High Performance Computing Symposium. Fort Collins, CO, August 12-13, 2012.
- W. Kramer. Top500 Versus Sustained Performance – Or the Top Problems with the TOP500 List – And What to Do About Them, 21st International Conference On Parallel Architectures And Compilation Techniques (PACT12). Minneapolis, MN, September 19-23, 2012.
- A. Langer, R. Venkataraman, L. Kale. Scalable Algorithms for Constructing Balanced Spanning Trees on System-ranked Process Groups, 19th EuroMPI Conference. Vienna, Austria, September 23-26, 2012.
- Y. Sun, G. Zheng, C. Mei, E.J. Bohm, L.V. Kale, J.C. Phillips, T.R. Jones. Optimizing Fine-Grained Communication in a Biomolecular Simulation Application on Cray XK6, Supercomputing 2012 Conference (SC12). Salt Lake City, UT, November 10-16, 2012.
- A. Gainaru, F. Cappello, W. Kramer, M. Snir. Fault Prediction Under the Microscope - A Closer Look Into HPC Systems, Supercomputing 2012 Conference (SC12). Salt Lake City, UT, November 10-16, 2012.
- T. Hoefler, T. Schneider. Optimization Principles for Collective Neighborhood Communications, Supercomputing 2012 Conference (SC12). Salt Lake City, UT, November 10-16, 2012.
