


SC 2013: Denver, CO, USA
- William Gropp, Satoshi Matsuoka:
International Conference for High Performance Computing, Networking, Storage and Analysis, SC'13, Denver, CO, USA - November 17 - 21, 2013. ACM 2013, ISBN 978-1-4503-2378-9
ACM Gordon Bell finalists
- Peter W. J. Staar, Thomas A. Maier, Michael S. Summers, Gilles Fourestey, Raffaele Solcà, Thomas C. Schulthess:
Taking a quantum leap in time to solution for simulations of high-Tc superconductors. 1:1-1:11
- Massimo Bernaschi, Mauro Bisson, Massimiliano Fatica, Simone Melchionna:
20 petaflops simulation of proteins suspensions in crowding conditions. 2:1-2:11
- Diego Rossinelli, Babak Hejazialhosseini, Panagiotis E. Hadjidoukas, Costas Bekas, Alessandro Curioni, Adam Bertsch, Scott Futral, Steffen J. Schmidt, Nikolaus A. Adams, Petros Koumoutsakos:
11 PFLOP/s simulations of cloud cavitation collapse. 3:1-3:13
- Peter A. Boyle, Michael I. Buchoff, Norman H. Christ, Taku Izubuchi, Chulwoo Jung, Thomas C. Luu, Robert D. Mawhinney, Chris Schroeder, Ron Soltz, Pavlos Vranas, Joseph Wasem:
The origin of mass. 4:1-4:10
- Michael Bussmann, Heiko Burau, Thomas E. Cowan, Alexander Debus, Axel Huebl, Guido Juckeland, Thomas Kluge, Wolfgang E. Nagel, Richard Pausch, Felix Schmitt, Ulrich Schramm, Joseph Schuchart, René Widera:
Radiative signatures of the relativistic Kelvin-Helmholtz instability. 5:1-5:12
- Salman Habib, Vitali A. Morozov, Nicholas Frontiere, Hal Finkel, Adrian Pope, Katrin Heitmann:
HACC: extreme scaling and performance across diverse architectures. 6:1-6:10
Fault-tolerant computing
- Xiang Ni, Esteban Meneses, Nikhil Jain, Laxmikant V. Kalé:
ACR: automatic checkpoint/restart for soft and hard error protection. 7:1-7:12
- Thomas Ropars, Tatiana V. Martsinkevich, Amina Guermouche, André Schiper, Franck Cappello:
SPBC: leveraging the characteristics of MPI HPC applications for scalable checkpointing. 8:1-8:12
- Ke Wang, Abhishek Kulkarni, Michael Lang, Dorian C. Arnold, Ioan Raicu:
Using simulation to explore distributed key-value stores for extreme-scale system services. 9:1-9:12
GPU programming
- Michael Goldfarb, Youngjoon Jo, Milind Kulkarni:
General transformations for GPU execution of tree traversals. 10:1-10:12
- Alberto Magni, Christophe Dubach, Michael F. P. O'Boyle:
A large-scale cross-architecture evaluation of thread-coarsening. 11:1-11:11
- Nishkam Ravi, Yi Yang, Tao Bao, Srimat T. Chakradhar:
Semi-automatic restructuring of offloadable tasks for many-core accelerators. 12:1-12:12
Load balancing
- Pai-Wei Lai, Kevin Stock, Samyam Rajbhandari, Sriram Krishnamoorthy, P. Sadayappan:
A framework for load balancing of tensor contraction expressions via dynamic task partitioning. 13:1-13:10
- Md. Kamruzzaman, Steven Swanson, Dean M. Tullsen:
Load-balanced pipeline parallelism. 14:1-14:12
- Harshitha Menon, Laxmikant V. Kalé:
A distributed dynamic load balancer for iterative applications. 15:1-15:11
MPI performance and debugging
- Tobias Hilbrich, Bronis R. de Supinski, Wolfgang E. Nagel, Joachim Protze, Christel Baier, Matthias S. Müller:
Distributed wait state tracking for runtime MPI deadlock detection. 16:1-16:12
- Nilesh Mahajan, Uday Pitambare, Arun Chauhan:
Globalizing selectively: shared-memory efficiency with address-space separation. 17:1-17:12
- Andrew Friedley, Greg Bronevetsky, Torsten Hoefler, Andrew Lumsdaine:
Hybrid MPI: efficient message passing for multi-core systems. 18:1-18:11
Memory hierarchy
- Richard M. Yoo, Christopher J. Hughes, Konrad Lai, Ravi Rajwar:
Performance evaluation of Intel® transactional synchronization extensions for high-performance computing. 19:1-19:11
- Jongsoo Park, Richard M. Yoo, Daya Shanker Khudia, Christopher J. Hughes, Daehyun Kim:
Location-aware cache management for many-core processors with deep cache hierarchy. 20:1-20:12
- Doe Hyun Yoon, Jichuan Chang, Robert S. Schreiber, Norman P. Jouppi:
Practical nonvolatile multilevel-cell phase change memory. 21:1-21:12
Memory resilience
- Vilas Sridharan, Jon Stearley, Nathan DeBardeleben, Sean Blanchard, Sudhanva Gurumurthi:
Feng shui of supercomputer memory: positional effects in DRAM and SRAM faults. 22:1-22:11
- Bharan Giridhar, Michael Cieslak, Deepankar Duggal, Ronald G. Dreslinski, Hsing Min Chen, Robert Patti, Betina Hold, Chaitali Chakrabarti, Trevor N. Mudge, David T. Blaauw:
Exploring DRAM organizations for energy-efficient and resilient exascale memories. 23:1-23:12
- Xun Jian, Henry Duwe, John Sartori, Vilas Sridharan, Rakesh Kumar:
Low-power, low-storage-overhead chipkill correct via multi-line error correction. 24:1-24:12
Optimizing numerical code
- Qian Wang, Xianyi Zhang, Yunquan Zhang, Qing Yi:
AUGEM: automatically generate high performance dense linear algebra kernels on x86 CPUs. 25:1-25:12
- Wai Teng Tang, Wen Jun Tan, Rajarshi Ray, Yi Wen Wong, Weiguang Chen, Shyh-Hao Kuo, Rick Siow Mong Goh, Stephen John Turner, Weng-Fai Wong:
Accelerating sparse matrix-vector multiplication on GPUs using bit-representation-optimized schemes. 26:1-26:12
- Cindy Rubio-González, Cuong Nguyen, Hong Diep Nguyen, James Demmel, William Kahan, Koushik Sen, David H. Bailey, Costin Iancu, David Hough:
Precimonious: tuning assistant for floating-point precision. 27:1-27:12
Parallel performance tools
- Xu Liu, John M. Mellor-Crummey:
A data-centric profiler for parallel programs. 28:1-28:12
- Germán Llort, Harald Servat, Juan Gonzalez, Judit Giménez, Jesús Labarta:
On the usefulness of object tracking techniques in performance analysis. 29:1-29:11
- Sanath Jayasena, Saman P. Amarasinghe, Asanka Abeyweera, Gayashan Amarasinghe, Himeshi De Silva, Sunimal Rathnayake, Xiaoqiao Meng, Yanbin Liu:
Detection of false sharing using machine learning. 30:1-30:9
Parallel programming models and compilation
- Zhao Zhang, Daniel S. Katz, Timothy G. Armstrong, Justin M. Wozniak, Ian T. Foster:
Parallelizing the execution of sequential scripts. 31:1-31:12
- Hans Vandierendonck, Kallia Chronaki, Dimitrios S. Nikolopoulos:
Deterministic scale-free pipeline parallelism with hyperqueues. 32:1-32:12
- Uday Bondhugula:
Compiling affine loop nests for distributed-memory parallel architectures. 33:1-33:12
Performance analysis of applications at large scale
- Jongsoo Park, Ganesh Bikshandi, Karthikeyan Vaidyanathan, Ping Tak Peter Tang, Pradeep Dubey, Daehyun Kim:
Tera-scale 1D FFT with low-communication algorithm and Intel® Xeon Phi™ coprocessors. 34:1-34:12
- Christian Godenschwager, Florian Schornbaum, Martin Bauer, Harald Köstler, Ulrich Rüde:
A framework for hybrid parallel flow simulations with a trillion cells in complex geometries. 35:1-35:12
- Xin Yuan, Santosh Mahapatra, Wickus Nienaber, Scott Pakin, Michael Lang:
A new routing scheme for Jellyfish and its performance with HPC workloads. 36:1-36:11
Performance management of HPC systems
- Alex D. Breslow, Ananta Tiwari, Martin Schulz, Laura Carrington, Lingjia Tang, Jason Mars:
Enabling fair pricing on HPC systems with node sharing. 37:1-37:12
- Mingliang Liu, Ye Jin, Jidong Zhai, Yan Zhai, Qianqian Shi, Xiaosong Ma, Wenguang Chen:
ACIC: automatic cloud I/O configurator for HPC applications. 38:1-38:12
- Shaolei Ren, Yuxiong He:
COCA: online distributed resource management for cost minimization and carbon neutrality in data centers. 39:1-39:12
System-wide application performance assessments
- Nikola Rajovic, Paul M. Carpenter, Isaac Gelado, Nikola Puzovic, Alex Ramírez, Mateo Valero:
Supercomputing with commodity CPUs: are mobile SoCs ready for HPC? 40:1-40:12
- Abhinav Bhatele, Kathryn M. Mohror, Steve H. Langer, Katherine E. Isaacs:
There goes the neighborhood: performance degradation due to nearby jobs. 41:1-41:12
- Xiaobing Li, Yandong Wang, Yizheng Jiao, Cong Xu, Weikuan Yu:
CooMR: cross-task coordination for efficient data management in MapReduce programs. 42:1-42:11
Tools for scalable analysis
- Milind Chabbi, Karthik Murthy, Michael W. Fagan, John M. Mellor-Crummey:
Effective sampling-driven performance tools for GPU-accelerated supercomputers. 43:1-43:12
- Dong Li, Zizhong Chen, Panruo Wu, Jeffrey S. Vetter:
Rethinking algorithm-based fault tolerance with a cooperative software-hardware approach. 44:1-44:12
- Alexandru Calotoiu, Torsten Hoefler, Marius Poke, Felix Wolf:
Using automated performance modeling to find scalability bugs in complex codes. 45:1-45:12
Data management in the cloud
- Kisung Lee, Ling Liu:
Efficient data partitioning model for heterogeneous graphs in the cloud. 46:1-46:12
- Yu Su, Yi Wang, Gagan Agrawal, Rajkumar Kettimuthu:
SDQuery DSI: integrating data management support with a wide area data transfer protocol. 47:1-47:12
- Yufei Ren, Tan Li, Dantong Yu, Shudong Jin, Thomas G. Robertazzi:
Design and performance evaluation of NUMA-aware RDMA-based end-to-end data transfer systems. 48:1-48:10
Graph partitioning and data clustering
- Md. Mostofa Ali Patwary, Diana Palsetia, Ankit Agrawal, Wei-keng Liao, Fredrik Manne, Alok N. Choudhary:
Scalable parallel OPTICS data clustering using graph algorithmic techniques. 49:1-49:12
- Erik G. Boman, Karen D. Devine, Sivasankaran Rajamanickam:
Scalable matrix computations on large scale-free graphs using 2D graph partitioning. 50:1-50:12
- Shad Kirmani, Padma Raghavan:
Scalable parallel graph partitioning. 51:1-51:10
Inter-node communication
- George Michelogiannakis, Nan Jiang, Daniel Becker, William J. Dally:
Channel reservation protocol for over-subscribed channels and destinations. 52:1-52:12
- Robert Gerstenberger, Maciej Besta, Torsten Hoefler:
Enabling highly-scalable remote memory access programming with MPI-3 one sided. 53:1-53:12
- Sreeram Potluri, Devendar Bureddy, Khaled Hamidouche, Akshay Venkatesh, Krishna Chaitanya Kandalla, Hari Subramoni, Dhabaleswar K. Panda:
MVAPICH-PRISM: a proxy-based communication framework using InfiniBand and SCIF for Intel MIC clusters. 54:1-54:11
Cloud resource management and scheduling
- Kefeng Deng, Junqiang Song, Kaijun Ren, Alexandru Iosup:
Exploring portfolio scheduling for long-term execution of scientific workloads in IaaS clouds. 55:1-55:12
- Shuangcheng Niu, Jidong Zhai, Xiaosong Ma, Xiongchao Tang, Wenguang Chen:
Cost-effective cloud HPC resource provisioning by building semi-elastic virtual clusters. 56:1-56:12
- Alok Gautam Kumbhare, Yogesh Simmhan, Viktor K. Prasanna:
Exploiting application dynamism and cloud elasticity for continuous dataflows. 57:1-57:12
Energy management
- Osman Sarood, Esteban Meneses, Laxmikant V. Kalé:
A 'cool' way of improving the reliability of HPC machines. 58:1-58:12
- Indrani Paul, Vignesh T. Ravi, Srilatha Manne, Manish Arora, Sudhakar Yalamanchili:
Coordinated energy management in heterogeneous processors. 59:1-59:12
- Xu Yang, Zhou Zhou, Sean Wallace, Zhiling Lan, Wei Tang, Susan Coghlan, Michael E. Papka:
Integrating dynamic pricing of electricity into energy aware scheduling for HPC systems. 60:1-60:11
Extreme-scale applications
- Myoungkyu Lee, Nicholas Malaya, Robert D. Moser:
Petascale direct numerical simulation of turbulent channel flow on up to 786K cores. 61:1-61:11
- Iván Bermejo-Moreno, Julien Bodart, Johan Larsson, Blaise M. Barney, Joseph W. Nichols, Steve Jones:
Solving the compressible Navier-Stokes equations on up to 1.97 million cores and 4.1 trillion grid points. 62:1-62:10
- Peter Johnsen, Mark Straka, Melvyn Shapiro, Alan Norton, Thomas Galarneau:
Petascale WRF simulation of Hurricane Sandy deployment of NCSA's Cray XE6 Blue Waters. 63:1-63:7
Fault tolerance and migration in the cloud
- Sheng Di, Yves Robert, Frédéric Vivien, Derrick Kondo, Cho-Li Wang, Franck Cappello:
Optimization of cloud task processing with checkpoint-restart mechanism. 64:1-64:12
- Kaveh Razavi, Thilo Kielmann:
Scalable virtual machine deployment using VM image caches. 65:1-65:12
- Jihun Kim, Dongju Chae, Jangwoo Kim, Jong Kim:
Guide-copy: fast and silent migration of virtual machine for datacenters. 66:1-66:12
IO tuning
- Sidharth Kumar, Avishek Saha, Venkatram Vishwanath, Philip H. Carns, John A. Schmidt, Giorgio Scorzelli, Hemanth Kolla, Ray W. Grout, Robert Latham, Robert B. Ross, Michael E. Papka, Jacqueline Chen, Valerio Pascucci:
Characterization and modeling of PIDX parallel I/O for performance optimization. 67:1-67:12
- Babak Behzad, Huong Vu Thanh Luu, Joseph Huchette, Surendra Byna, Prabhat, Ruth A. Aydt, Quincey Koziol, Marc Snir:
Taming parallel I/O complexity with auto-tuning. 68:1-68:12
- Da Zheng, Randal C. Burns, Alexander S. Szalay:
Toward millions of file system IOPS on low-cost, commodity hardware. 69:1-69:12
Physical frontiers
- Yifeng Cui, Efecan Poyraz, Kim B. Olsen, Jun Zhou, Kyle Withers, Scott Callaghan, Jeff Larkin, Clark C. Guest, Dong Ju Choi, Amit Chourasia, Zheqiang Shi, Steven M. Day, Philip Maechling, Thomas H. Jordan:
Physics-based seismic hazard analysis on petascale heterogeneous supercomputers. 70:1-70:12
- Manaschai Kunaseth, Rajiv K. Kalia, Aiichiro Nakano, Ken-ichi Nomura, Priya Vashishta:
A scalable parallel algorithm for dynamic range-limited n-tuple computation in many-body molecular dynamics simulation. 71:1-71:12
- Michael S. Warren:
2HOT: an improved parallel hashed oct-tree n-body algorithm for cosmological simulation. 72:1-72:12
Optimizing data movement
- Joe B. Buck, Noah Watkins, Greg Levin, Adam Crume, Kleoni Ioannidou, Scott A. Brandt, Carlos Maltzahn, Neoklis Polyzotis, Aaron Torres:
SIDR: structure-aware intelligent data routing in Hadoop. 73:1-73:12
- Tong Jin, Fan Zhang, Qian Sun, Hoang Bui, Manish Parashar, Hongfeng Yu, Scott Klasky, Norbert Podhorszki, Hasan Abbasi:
Using cross-layer adaptations for dynamic data management in large scale coupled scientific workflows. 74:1-74:12
- Myoungsoo Jung, Ellis Herbert Wilson, Wonil Choi, John Shalf, Hasan Metin Aktulga, Chao Yang, Erik Saule, Ümit V. Çatalyürek, Mahmut T. Kandemir:
Exploring the future of out-of-core computing with compute-local non-volatile memory. 75:1-75:11
In-situ data analytics and reduction
- Daniel E. Laney, Steven Langer, Christopher Weber, Peter Lindstrom, Al Wegener:
Assessing the effects of data compression in simulations using physically motivated metrics. 76:1-76:12
- Marc Gamell, Ivan Rodero, Manish Parashar, Janine Bennett, Hemanth Kolla, Jacqueline Chen, Peer-Timo Bremer, Aaditya G. Landge, Attila Gyulassy, Patrick S. McCormick, Scott Pakin, Valerio Pascucci, Scott Klasky:
Exploring power behaviors and trade-offs of in-situ data analytics. 77:1-77:12
- Fang Zheng, Hongfeng Yu, Can Hantas, Matthew Wolf, Greg Eisenhauer, Karsten Schwan, Hasan Abbasi, Scott Klasky:
GoldRush: resource efficient in situ scientific data analytics using fine-grained interference aware execution. 78:1-78:12
Preconditioners and unstructured meshes
- James King, Robert M. Kirby:
A scalable, efficient scheme for evaluation of stencil computations over unstructured meshes. 79:1-79:12
- Pierre Jolivet, Frédéric Hecht, Frédéric Nataf, Christophe Prud'homme:
Scalable domain decomposition preconditioners for heterogeneous elliptic problems. 80:1-80:11
- Long Qu, Laura Grigori, Frédéric Nataf:
Parallel design and performance of nested filtering factorization preconditioner. 81:1-81:12
Engineering scalable applications
- Bei Wang, Stéphane Ethier, William M. Tang, Timothy J. Williams, Khaled Z. Ibrahim, Kamesh Madduri, Samuel Williams, Leonid Oliker:
Kinetic turbulence simulations at extreme scale on leadership-class systems. 82:1-82:12
- Florian Wende, Thomas Steinke:
Swendsen-Wang multi-cluster algorithm for the 2D/3D Ising model on Xeon Phi and GPU. 83:1-83:12
- Benjamin Welton, Evan Samanas, Barton P. Miller:
Mr. Scan: extreme scale density-based clustering using a tree-based network of GPGPU nodes. 84:1-84:11
Improving large-scale computation and data resources
- Eli Dart, Lauren Rotman, Brian Tierney, Mary Hester, Jason Zurawski:
The Science DMZ: a network design pattern for data-intensive science. 85:1-85:10
- James C. Browne, Robert L. DeLeon, Charng-Da Lu, Matthew D. Jones, Steven M. Gallo, Amin Ghadersohi, Abani K. Patra, William L. Barth, John L. Hammond, Thomas R. Furlani, Robert T. McLay:
Enabling comprehensive data-driven system management for large computational facilities. 86:1-86:11
- Jay F. Lofstead, Robert Ross:
Insights for exascale IO APIs from building a petascale IO API. 87:1-87:12
Matrix computations
- Yulu Jia, George Bosilca, Piotr Luszczek, Jack J. Dongarra:
Parallel reduction to Hessenberg form with algorithm-based fault tolerance. 88:1-88:11
- Oded Green, Yitzhak Birk:
A computationally efficient algorithm for the 2D covariance method. 89:1-89:12
- Azzam Haidar, Jakub Kurzak, Piotr Luszczek:
An improved parallel singular value algorithm and its implementation for multicore hardware. 90:1-90:12
Sorting and graph algorithms
- Md. Maksudul Alam, Maleq Khan, Madhav V. Marathe:
Distributed-memory parallel algorithms for generating massive scale-free networks using preferential attachment model. 91:1-91:12
- Sungpack Hong, Nicole C. Rodia, Kunle Olukotun:
On fast parallel detection of strongly connected components (SCC) in small-world graphs. 92:1-92:11
- Hari Sundar, Dhairya Malhotra, Karl W. Schulz:
Algorithms for high-throughput disk-to-disk sorting. 93:1-93:10
Application performance characterization
- Subhash Saini, Haoqiang Jin, Dennis C. Jespersen, Huiyu Feng, M. Jahed Djomehri, William Arasin, Robert Hood, Piyush Mehrotra, Rupak Biswas:
An early performance evaluation of many integrated core architecture based SGI rackable computing system. 94:1-94:12
- Nikhil Jain, Abhinav Bhatele, Michael P. Robson, Todd Gamblin, Laxmikant V. Kalé:
Predicting application performance using supervised learning on communication features. 95:1-95:12
- Qingyu Meng, Alan Humphrey, John A. Schmidt, Martin Berzins:
Investigating applications portability with the Uintah DAG-based runtime system on PetaScale supercomputers. 96:1-96:12
