Stephen W. Keckler - Publications

Affiliations: 
Electrical & Computer Engineering, University of Texas at Austin, Austin, Texas, U.S.A.
Area:
Electronics and Electrical Engineering, Computer Science
Website:
https://www.cs.utexas.edu/~skeckler/

68 high-probability publications.

Year Citation  Score
2020 Zimmer B, Venkatesan R, Shao YS, Clemons J, Fojtik M, Jiang N, Keller B, Klinefelter A, Pinckney N, Raina P, Tell SG, Zhang Y, Dally WJ, Emer JS, Gray CT, ... Keckler SW, et al. A 0.32–128 TOPS, Scalable Multi-Chip-Module-Based Deep Neural Network Inference Accelerator With Ground-Referenced Signaling in 16 nm Ieee Journal of Solid-State Circuits. 55: 920-932. DOI: 10.1109/Jssc.2019.2960488  0.754
2019 Crago NC, Stephenson M, Keckler SW. Exposing Memory Access Patterns to Improve Instruction and Memory Efficiency in GPUs Acm Transactions On Architecture and Code Optimization. 15: 45. DOI: 10.1145/3280851  0.517
2018 Voitsechov D, Zulfiqar A, Stephenson M, Gebhart M, Keckler SW. Software-Directed Techniques for Improved GPU Register File Utilization Acm Transactions On Architecture and Code Optimization. 15: 38. DOI: 10.1145/3243905  0.448
2016 Agarwal N, Nellans D, Ebrahimi E, Wenisch TF, Danskin J, Keckler SW. Selective GPU caches to eliminate CPU-GPU HW cache coherence Proceedings - International Symposium On High-Performance Computer Architecture. 2016: 494-506. DOI: 10.1109/HPCA.2016.7446089  0.454
2016 Zheng T, Nellans D, Zulfiqar A, Stephenson M, Keckler SW. Towards high performance paged memory for GPUs Proceedings - International Symposium On High-Performance Computer Architecture. 2016: 345-357. DOI: 10.1109/HPCA.2016.7446077  0.446
2015 Jog A, Kayiran O, Kesten T, Pattnaik A, Bolotin E, Chatterjee N, Keckler SW, Kandemir MT, Das CR. Anatomy of GPU memory system for multi-application execution Acm International Conference Proceeding Series. 5: 223-234. DOI: 10.1145/2818950.2818979  0.41
2015 Rogers TG, Johnson DR, O'Connor M, Keckler SW. A variable warp size architecture Proceedings - International Symposium On Computer Architecture. 13: 489-501. DOI: 10.1145/2749469.2750410  0.402
2015 Agarwal N, Nellans D, Stephenson M, O'Connor M, Keckler SW. Page placement strategies for GPUs within heterogeneous memory systems International Conference On Architectural Support For Programming Languages and Operating Systems - Asplos. 2015: 607-618. DOI: 10.1145/2694344.2694381  0.339
2015 Bolotin E, Nellans D, Villa O, O'Connor M, Ramirez A, Keckler SW. Designing Efficient Heterogeneous Memory Architectures Ieee Micro. 35: 60-68. DOI: 10.1109/Mm.2015.72  0.534
2015 Lee Y, Grover V, Krashinsky R, Stephenson M, Keckler SW, Asanovic K. Exploring the design space of SPMD divergence management on data-parallel architectures Proceedings of the Annual International Symposium On Microarchitecture, Micro. 2015: 101-113. DOI: 10.1109/MICRO.2014.48  0.3
2015 Keckler SW. Increasing interconnection network throughput with virtual channels Computer. 48: 10. DOI: 10.1109/Mc.2015.191  0.389
2015 Pekhimenko G, Bolotin E, O'Connor M, Mutlu O, Mowry TC, Keckler SW. Toggle-Aware Compression for GPUs Ieee Computer Architecture Letters. 14: 164-168. DOI: 10.1109/Lca.2015.2430853  0.447
2015 Hestness J, Keckler SW, Wood DA. GPU computing pipeline inefficiencies and optimization opportunities in heterogeneous CPU-GPU processors Proceedings - 2015 Ieee International Symposium On Workload Characterization, Iiswc 2015. 87-97. DOI: 10.1109/IISWC.2015.15  0.483
2014 Keckler SW. Rethinking caches for throughput processors: technical perspective Communications of the Acm. 57: 90-90. DOI: 10.1145/2682585  0.437
2014 Keckler SW. Rethinking caches for throughput processors Communications of the Acm. 57: 90. DOI: 10.1145/2682583  0.447
2014 Huh J, Kim C, Shafi H, Zhang L, Burger D, Keckler SW. Author retrospective for a NUCA substrate for flexible CMP cache sharing Proceedings of the International Conference On Supercomputing. 74-76. DOI: 10.1145/2591635.2591667  0.373
2014 Jog A, Bolotin E, Guz Z, Parker M, Keckler SW, Kandemir MT, Das CR. Application-aware memory system for fair and efficient execution of concurrent GPGPU applications Acm International Conference Proceeding Series. 1-8. DOI: 10.1145/2576779.2576780  0.456
2014 Huh J, Kim C, Shafi H, Zhang L, Burger D, Keckler SW. A NUCA substrate for flexible CMP cache sharing Proceedings of the International Conference On Supercomputing. 380-389. DOI: 10.1109/Tpds.2007.1091  0.508
2014 Govindan MSS, Robatmili B, Li D, Maher BA, Smith A, Keckler SW, Burger D. Scaling power and performance via processor composability Ieee Transactions On Computers. 63: 2025-2038. DOI: 10.1109/Tc.2013.48  0.341
2014 Keckler SW. 2014 International symposium on computer architecture influential paper award Ieee Micro. 34: 95-96. DOI: 10.1109/Mm.2014.91  0.335
2014 Hestness J, Keckler SW, Wood DA. A comparative analysis of microarchitecture effects on CPU and GPU memory system behavior Iiswc 2014 - Ieee International Symposium On Workload Characterization. 150-160. DOI: 10.1109/IISWC.2014.6983054  0.459
2013 Lee Y, Krashinsky R, Grover V, Keckler SW, Asanovic K. Convergence and scalarization for data-parallel architectures Proceedings of the 2013 Ieee/Acm International Symposium On Code Generation and Optimization, Cgo 2013. DOI: 10.1109/CGO.2013.6494995  0.426
2012 Gebhart M, Johnson DR, Tarjan D, Keckler SW, Dally WJ, Lindholm E, Skadron K. A hierarchical thread scheduler and register file for energy-efficient throughput processors Acm Transactions On Computer Systems. 30. DOI: 10.1145/2166879.2166882  0.47
2012 Grot B, Hestness J, Keckler SW, Mutlu O. A QoS-enabled on-die interconnect fabric for kilo-node chips Ieee Micro. 32: 17-25. DOI: 10.1109/Mm.2012.18  0.441
2012 Gebhart M, Keckler SW, Khailany B, Krashinsky R, Dally WJ. Unifying primary cache, scratch, and register file memories in a throughput processor Proceedings - 2012 Ieee/Acm 45th International Symposium On Microarchitecture, Micro 2012. 96-106. DOI: 10.1109/MICRO.2012.18  0.482
2012 Grot B, Keckler SW, Mutlu O. Topology-aware quality-of-service support in highly integrated chip multiprocessors Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 6161: 357-375. DOI: 10.1007/978-3-642-24322-6_28  0.313
2011 Gebhart M, Keckler SW, Dally WJ. A compile-time managed multi-level register file hierarchy Proceedings of the Annual International Symposium On Microarchitecture, Micro. 465-476. DOI: 10.1145/2155620.2155675  0.32
2011 Grot B, Hestness J, Keckler SW, Mutlu O. Kilo-NOC: A heterogeneous network-on-chip architecture for scalability and service guarantees Proceedings - International Symposium On Computer Architecture. 401-412. DOI: 10.1145/2000064.2000112  0.332
2011 Keckler SW, Dally WJ, Khailany B, Garland M, Glasco D. GPUs and the future of parallel computing Ieee Micro. 31: 7-17. DOI: 10.1109/Mm.2011.89  0.744
2010 Hestness J, Grot B, Keckler SW. Netrace: Dependency-driven trace-based network-on-chip simulation 3rd International Workshop On Network On Chip Architectures, Nocarc 2010, in Conjunction With the 43rd Annual Ieee/Acm International Symposium On Microarchitecture, Micro-43. 31-36. DOI: 10.1145/1921249.1921258  0.327
2009 Grot B, Keckler SW, Mutlu O. Preemptive virtual clock: A flexible, efficient, and cost-effective QOS scheme for networks-on-chip Proceedings of the Annual International Symposium On Microarchitecture, Micro. 268-279. DOI: 10.1145/1669112.1669149  0.415
2009 Grot B, Hestness J, Keckler SW, Mutlu O. Express cube topologies for on-chip interconnects Proceedings - International Symposium On High-Performance Computer Architecture. 163-174. DOI: 10.1109/HPCA.2009.4798251  0.312
2008 Gulati DP, Kim C, Sethumadhavan S, Keckler SW, Burger D. Multitasking workload scheduling on flexible-core chip multiprocessors Parallel Architectures and Compilation Techniques - Conference Proceedings, Pact. 187-196. DOI: 10.1145/1399972.1399981  0.443
2008 Roesner F, Burger D, Keckler SW. Counting dependence predictors Proceedings - International Symposium On Computer Architecture. 215-226. DOI: 10.1109/ISCA.2008.6  0.305
2008 Diamond J, Robatmili B, Keckler SW, Van De Geijn R, Goto K, Burger D. High performance dense linear algebra on a spatially distributed processor Proceedings of the Acm Sigplan Symposium On Principles and Practice of Parallel Programming, Ppopp. 63-72.  0.347
2007 Mudigonda J, Vin HM, Keckler SW. Reconciling performance and programmability in networking systems Acm Sigcomm 2007: Conference On Computer Communications. 73-84. DOI: 10.1145/1282380.1282390  0.5
2007 Owens JD, Dally WJ, Ho R, Jayasimha DN, Keckler SW, Peh LS. Research challenges for on-chip interconnection networks Ieee Micro. 27: 96-108. DOI: 10.1109/Mm.2007.91  0.621
2007 Gratz P, Kim C, Sankaralingam K, Hanson H, Shivakumar P, Keckler SW, Burger D. On-chip interconnection networks of the TRIPS chip Ieee Micro. 27: 41-50. DOI: 10.1109/Mm.2007.90  0.771
2007 Kim C, Sethumadhavan S, Gulati D, Burger D, Govindan MS, Ranganathan N, Keckler SW. Composable lightweight processors Proceedings of the Annual International Symposium On Microarchitecture, Micro. 381-393. DOI: 10.1109/MICRO.2007.41  0.384
2006 Agaram KK, Keckler SW, Lin C, McKinley KS. Decomposing memory performance: Data structures and phases International Symposium On Memory Management, Ismm. 2006: 95-103. DOI: 10.1145/1133956.1133970  0.713
2006 Sankaralingam K, Nagarajan R, McDonald R, Desikan R, Drolia S, Govindan MS, Gratz P, Gulati D, Hanson H, Kim C, Liu H, Ranganathan N, Sethumadhavan S, Sharif S, Shivakumar P, ... Keckler SW, et al. Distributed microarchitectural protocols in the TRIPS prototype processor Proceedings of the Annual International Symposium On Microarchitecture, Micro. 480-491. DOI: 10.1109/MICRO.2006.19  0.749
2006 Smith A, Nagarajan R, Sankaralingam K, McDonald R, Burger D, Keckler SW, McKinley KS. Dataflow predication Proceedings of the Annual International Symposium On Microarchitecture, Micro. 89-100. DOI: 10.1109/MICRO.2006.17  0.66
2006 Gratz P, Kim C, McDonald R, Keckler SW, Burger D. Implementation and evaluation of on-chip network architectures Ieee International Conference On Computer Design, Iccd 2006. 477-484. DOI: 10.1109/ICCD.2006.4380859  0.359
2006 Sethumadhavan S, McDonald R, Desikan R, Burger D, Keckler SW. Design and implementation of the TRIPS primary memory system Ieee International Conference On Computer Design, Iccd 2006. 470-476. DOI: 10.1109/ICCD.2006.4380858  0.423
2006 Agaram KK, Keckler SW, Lin C, McKinley KS. The memory behavior of data structures in C SPEC CPU2000 benchmarks 2006 Spec Benchmark Workshop.  0.716
2006 Nagarajan R, Xia C, McDonald RG, Burger D, Keckler SW. Critical path analysis of the TRIPS architecture Ispass 2006: Ieee International Symposium On Performance Analysis of Systems and Software, 2006. 2006: 37-47.  0.345
2004 Sankaralingam K, Nagarajan R, Liu H, Kim C, Huh J, Ranganathan N, Burger D, Keckler SW, McDonald RG, Moore CR. TRIPS: A polymorphous architecture for exploiting ILP, TLP, and DLP Acm Transactions On Architecture and Code Optimization. 1: 62-93. DOI: 10.1145/980152.980156  0.71
2004 Desikan R, Sethumadhavan S, Burger D, Keckler SW. Scalable selective re-execution for EDGE architectures Operating Systems Review (Acm). 38: 120-132. DOI: 10.1145/1037949.1024408  0.387
2004 Nagarajan R, Kushwaha SK, Burger D, McKinley KS, Lin C, Keckler SW. Static Placement, Dynamic Issue (SPDI) scheduling for EDGE architectures Parallel Architectures and Compilation Techniques - Conference Proceedings, Pact. 74-84. DOI: 10.1109/PACT.2004.1342543  0.34
2004 Sethumadhavan S, Desikan R, Burger D, Moore CR, Keckler SW. Scalable hardware memory disambiguation for high-ILP processors Ieee Micro. 24: 118-127. DOI: 10.1109/MM.2004.87  0.38
2004 Burger D, Keckler SW, McKinley KS, Dahlin M, John LK, Lin C, Moore CR, Burrill J, McDonald RG, Yoder W. Scaling to the end of silicon with EDGE architectures Computer. 37: 44-55. DOI: 10.1109/Mc.2004.65  0.352
2004 Desikan R, Sethumadhavan S, Burger D, Keckler SW. Scalable selective re-execution for EDGE architectures 11th International Conference On Architectural Support For Programming Languages and Operating Systems, Asplos Xi. 120-132.  0.387
2003 Hanson H, Hrishikesh MS, Agarwal V, Keckler SW, Burger D. Static energy reduction techniques for microprocessor caches Ieee Transactions On Very Large Scale Integration (Vlsi) Systems. 11: 303-313. DOI: 10.1109/Tvlsi.2003.812370  0.702
2003 Kim C, Burger D, Keckler SW. Nonuniform Cache Architectures for Wire-Delay Dominated On-Chip Caches Ieee Micro. 23. DOI: 10.1109/Mm.2003.1261393  0.5
2003 Sankaralingam K, Nagarajan R, Liu H, Kim C, Huh J, Burger D, Keckler SW, Moore CR. Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture Conference Proceedings - Annual International Symposium On Computer Architecture, Isca. 422-433. DOI: 10.1109/Mm.2003.1261386  0.707
2003 Sankaralingam K, Keckler SW, Mark WR, Burger D. Universal mechanisms for data-parallel architectures Proceedings of the Annual International Symposium On Microarchitecture, Micro. 2003: 303-314. DOI: 10.1109/MICRO.2003.1253204  0.458
2003 Shivakumar P, Keckler SW, Moore CR, Burger D. Exploiting microarchitectural redundancy for defect tolerance Proceedings - Ieee International Conference On Computer Design: Vlsi in Computers and Processors. 481-488. DOI: 10.1109/ICCD.2012.6378613  0.604
2003 Keckler SW, Burger D, Moore CR, Nagarajan R, Sankaralingam K, Agarwal V, Hrishikesh MS, Ranganathan N, Shivakumar P. A wire-delay scalable microprocessor architecture for high performance systems Digest of Technical Papers - Ieee International Solid-State Circuits Conference.  0.742
2003 Sankaralingam K, Singh VA, Keckler SW, Burger D. Routed Inter-ALU Networks for ILP scalability and performance Proceedings - Ieee International Conference On Computer Design: Vlsi in Computers and Processors. 170-177.  0.653
2002 Kim C, Burger D, Keckler SW. An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches International Conference On Architectural Support For Programming Languages and Operating Systems - Asplos. 211-222. DOI: 10.1145/635508.605420  0.376
2002 Shivakumar P, Kistler M, Keckler SW, Burger D, Alvisi L. Modeling the effect of technology trends on the soft error rate of combinational logic Proceedings of the 2002 International Conference On Dependable Systems and Networks. 389-398. DOI: 10.1109/DSN.2002.1028924  0.602
2002 Hrishikesh MS, Jouppi NP, Farkas KI, Burger D, Keckler SW, Shivakumar P. The optimal logic depth per pipeline stage is 6 to 8 FO4 inverter delays Conference Proceedings - Annual International Symposium On Computer Architecture, Isca. 14-24.  0.599
2001 Agaram K, Keckler SW, Burger D. A characterization of speech recognition on modern computer systems 2001 Ieee International Workshop On Workload Characterization, Wwc 2001. 45-53. DOI: 10.1109/WWC.2001.990743  0.355
2001 Nagarajan R, Sankaralingam K, Burger D, Keckler SW. A design space evaluation of grid processor architectures Proceedings of the Annual International Symposium On Microarchitecture. 40-51.  0.725
2000 Carter NP, Dally WJ, Lee WS, Keckler SW, Chang A. Processor mechanisms for software shared memory Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 1940: 120-133.  0.412
1999 Keckler SW, Chang A, Lee WS, Chatterjee S, Dally WJ. Concurrent event handling through multithreading Ieee Transactions On Computers. 48: 903-916. DOI: 10.1109/12.795220  0.367
1998 Lee WS, Dally WJ, Keckler SW, Carter NP, Chang A. An efficient, protected message interface Computer. 31: 69-75. DOI: 10.1109/2.730739  0.345
1997 Fillo M, Keckler SW, Dally WJ, Carter NP, Chang A, Gurevich Y, Lee WS. The M-machine multicomputer International Journal of Parallel Programming. 25: 183-212. DOI: 10.1007/Bf02700035  0.434