Richard S. Sutton - Publications

Affiliations:

University of Alberta, Edmonton, Alberta, Canada

Area:

Reinforcement Learning

Website:

http://www.cs.ualberta.ca/~sutton/index.html


Year	Citation	Score
2022	Rafiee B, Abbas Z, Ghiassian S, Kumaraswamy R, Sutton RS, Ludvig EA, White A. From eye-blinks to state construction: Diagnostic benchmarks for online representation learning. Adaptive Behavior. 31: 3-19. PMID 36618906 DOI: 10.1177/10597123221085039	0.721
2020	Dalrymple AN, Roszko DA, Sutton RS, Mushahwar VK. Pavlovian control of intraspinal microstimulation to produce over-ground walking. Journal of Neural Engineering. PMID 32348970 DOI: 10.1088/1741-2552/Ab8E8E	0.687
2020	De Asis K, Chan A, Pitis S, Sutton R, Graves D. Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning Proceedings of the Aaai Conference On Artificial Intelligence. 34: 3741-3748. DOI: 10.1609/aaai.v34i04.5784	0.413
2018	Travnik JB, Mathewson KW, Sutton RS, Pilarski PM. Reactive Reinforcement Learning in Asynchronous Environments. Frontiers in Robotics and Ai. 5: 79. PMID 33500958 DOI: 10.3389/frobt.2018.00079	0.315
2015	Edwards AL, Dawson MR, Hebert JS, Sherstan C, Sutton RS, Chan KM, Pilarski PM. Application of real-time machine learning to myoelectric prosthesis control: A case series in adaptive switching. Prosthetics and Orthotics International. PMID 26423106 DOI: 10.1177/0309364615605373	0.439
2014	Kehoe EJ, Ludvig EA, Sutton RS. Time course of the rabbit's conditioned nictitating membrane movements during acquisition, extinction, and reacquisition. Learning & Memory (Cold Spring Harbor, N.Y.). 21: 585-90. PMID 25320350 DOI: 10.1101/Lm.034504.114	0.651
2014	Modayil J, White A, Sutton RS. Multi-timescale nexting in a reinforcement learning robot Adaptive Behavior. 22: 146-160. DOI: 10.1177/1059712313511648	0.464
2014	Mahmood AR, Van Hasselt H, Sutton RS. Weighted importance sampling for off-policy learning with linear function approximation Advances in Neural Information Processing Systems. 4: 3014-3022.	0.33
2014	Sutton RS, Mahmood AR, Precup D, Van Hasselt H. A new Q(λ) with interim forward view and Monte Carlo equivalence 31st International Conference On Machine Learning, Icml 2014. 3: 1973-1988.	0.447
2013	Pilarski PM, Dick TB, Sutton RS. Real-time prediction learning for the simultaneous actuation of multiple prosthetic joints. Ieee ... International Conference On Rehabilitation Robotics : [Proceedings]. 2013: 6650435. PMID 24187253 DOI: 10.1109/ICORR.2013.6650435	0.39
2013	Kehoe EJ, Ludvig EA, Sutton RS. Timing and cue competition in conditioning of the nictitating membrane response of the rabbit (Oryctolagus cuniculus). Learning & Memory (Cold Spring Harbor, N.Y.). 20: 97-102. PMID 23325726 DOI: 10.1101/Lm.028183.112	0.641
2013	Pilarski PM, Dawson MR, Degris T, Carey J, Chan KM, Hebert JS, Sutton RS. Adaptive artificial limbs: A real-time approach to prediction and anticipation Ieee Robotics and Automation Magazine. 20: 53-64. DOI: 10.1109/MRA.2012.2229948	0.374
2012	Ludvig EA, Sutton RS, Kehoe EJ. Evaluating the TD model of classical conditioning. Learning & Behavior. 40: 305-19. PMID 22927003 DOI: 10.3758/S13420-012-0082-6	0.726
2012	Modayil J, White A, Pilarski PM, Sutton RS. Acquiring a broad range of empirical knowledge in real time by temporal-difference learning Conference Proceedings - Ieee International Conference On Systems, Man and Cybernetics. 1903-1910. DOI: 10.1109/ICSMC.2012.6378016	0.385
2012	Silver D, Sutton RS, Müller M. Temporal-difference search in computer Go Machine Learning. 87: 183-219. DOI: 10.1007/S10994-012-5280-0	0.641
2012	Degris T, Pilarski PM, Sutton RS. Model-Free reinforcement learning with continuous action in practice Proceedings of the American Control Conference. 2177-2182.	0.351
2011	Pilarski PM, Dawson MR, Degris T, Fahimi F, Carey JP, Sutton RS. Online human training of a myoelectric prosthesis controller via actor-critic reinforcement learning. Ieee ... International Conference On Rehabilitation Robotics : [Proceedings]. 2011: 5975338. PMID 22275543 DOI: 10.1109/ICORR.2011.5975338	0.33
2011	Sutton RS, Modayil J, Delp M, Degris T, Pilarski PM, White A, Precup D. Horde: A scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction 10th International Conference On Autonomous Agents and Multiagent Systems 2011, Aamas 2011. 2: 713-720.	0.61
2010	Kehoe EJ, Ludvig EA, Sutton RS. Timing in trace conditioning of the nictitating membrane response of the rabbit (Oryctolagus cuniculus): scalar, nonscalar, and adaptive features. Learning & Memory (Cold Spring Harbor, N.Y.). 17: 600-4. PMID 21075900 DOI: 10.1101/Lm.1942210	0.646
2010	Maei HR, Szepesvari C, Bhatnagar S, Sutton RS. Toward off-policy learning control with function approximation Icml 2010 - Proceedings, 27th International Conference On Machine Learning. 719-726.	0.784
2010	Maei HR, Sutton RS. GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces Artificial General Intelligence - Proceedings of the Third Conference On Artificial General Intelligence, Agi 2010. 91-96.	0.784
2009	Kehoe EJ, Ludvig EA, Sutton RS. Magnitude and timing of conditioned responses in delay and trace classical conditioning of the nictitating membrane response of the rabbit (Oryctolagus cuniculus). Behavioral Neuroscience. 123: 1095-101. PMID 19824776 DOI: 10.1037/A0017112	0.638
2009	Kehoe EJ, Olsen KN, Ludvig EA, Sutton RS. Scalar timing varies with response magnitude in classical conditioning of the nictitating membrane response of the rabbit (Oryctolagus cuniculus). Behavioral Neuroscience. 123: 212-7. PMID 19170446 DOI: 10.1037/A0014122	0.652
2009	Sutton RS, Maei HR, Precup D, Bhatnagar S, Silver D, Szepesvári C, Wiewiora E. Fast gradient-descent methods for temporal-difference learning with linear function approximation Proceedings of the 26th International Conference On Machine Learning, Icml 2009. 993-1000. DOI: 10.1145/1553374.1553501	0.791
2009	Bhatnagar S, Sutton RS, Ghavamzadeh M, Lee M. Natural actor-critic algorithms Automatica. 45: 2471-2482. DOI: 10.1016/j.automatica.2009.07.008	0.515
2009	Maei HR, Szepesvari C, Bhatnagar S, Precup D, Silver D, Sutton RS. Convergent temporal-difference learning with arbitrary smooth function approximation Advances in Neural Information Processing Systems 22 - Proceedings of the 2009 Conference. 1204-1212.	0.806
2009	Ludvig EA, Sutton RS, Verbeek E, Kehoe EJ. A computational model of hippocampal function in trace conditioning Advances in Neural Information Processing Systems 21 - Proceedings of the 2008 Conference. 993-1000.	0.611
2009	Sutton RS, Szepesvári C, Maei HR. A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation Advances in Neural Information Processing Systems 21 - Proceedings of the 2008 Conference. 1609-1616.	0.777
2008	Ludvig EA, Sutton RS, Kehoe EJ. Stimulus representation and the timing of reward-prediction errors in models of the dopamine system. Neural Computation. 20: 3034-54. PMID 18624657 DOI: 10.1162/Neco.2008.11-07-654	0.666
2008	Kehoe EJ, Ludvig EA, Dudeney JE, Neufeld J, Sutton RS. Magnitude and timing of nictitating membrane movements during classical conditioning of the rabbit (Oryctolagus cuniculus). Behavioral Neuroscience. 122: 471-6. PMID 18410186 DOI: 10.1037/0735-7044.122.2.471	0.645
2008	Cutumisu M, Szafron D, Bowling M, Sutton RS. Agent learning using action-dependent learning rates in computer role-playing games Proceedings of the 4th Artificial Intelligence and Interactive Digital Entertainment Conference, Aiide 2008. 22-29.	0.36
2008	Silver D, Sutton RS, Müller M. Sample-based learning and search with permanent and transient memories Proceedings of the 25th International Conference On Machine Learning. 968-975.	0.575
2007	Sutton RS, Koop A, Silver D. On the role of tracking in stationary environments Acm International Conference Proceeding Series. 227: 871-878. DOI: 10.1145/1273496.1273606	0.46
2007	Silver D, Sutton R, Müller M. Reinforcement learning of local shape in the game of go Ijcai International Joint Conference On Artificial Intelligence. 1053-1058.	0.355
2006	Geramifard A, Bowling M, Sutton RS. Incremental least-squares temporal difference learning Proceedings of the National Conference On Artificial Intelligence. 1: 356-361.	0.434
2005	Stone P, Sutton RS, Kuhlmann G. Reinforcement learning for RoboCup soccer keepaway Adaptive Behavior. 13: 165-188. DOI: 10.1177/105971230501300301	0.499
2005	Rafols EJ, Ring MB, Sutton RS, Tanner B. Using predictive representations to improve generalization in reinforcement learning Ijcai International Joint Conference On Artificial Intelligence. 835-840.	0.356
2005	Precup D, Sutton RS, Paduraru C, Koop A, Singh S. Off-policy learning with options and recognizers Advances in Neural Information Processing Systems. 1097-1104.	0.618
2002	Stone P, Sutton RS. Keepaway soccer: A machine learning testbed Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2377: 214-223.	0.32
2001	Stone P, Sutton RS, Singh S. Reinforcement learning for 3 vs. 2 keepaway Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2019: 249-258.	0.393
2000	Sutton RS, McAllester D, Singh S, Mansour Y. Policy gradient methods for reinforcement learning with function approximation Advances in Neural Information Processing Systems. 1057-1063.	0.387
1999	Sutton RS, Precup D, Singh S. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning Artificial Intelligence. 112: 181-211. DOI: 10.1016/S0004-3702(99)00052-1	0.675
1999	Sutton RS. Open theoretical questions in reinforcement learning Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 1572: 11-17.	0.388
1999	Sutton RS, Singh S, Precup D, Ravindran B. Improved switching among temporally abstract actions Advances in Neural Information Processing Systems. 1066-1072.	0.476
1999	Moll R, Barto AG, Perkins TJ, Sutton RS. Learning instance-independent value functions to enhance local search Advances in Neural Information Processing Systems. 1017-1023.	0.745
1999	Sutton RS. Reinforcement learning: Past, present and future? Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 1585: 195-197.	0.392
1998	Sutton R, Barto A. Reinforcement Learning: An Introduction Ieee Transactions On Neural Networks. 9: 1054-1054. DOI: 10.1109/TNN.1998.712192	0.397
1998	Precup D, Sutton RS. Multi-time models for temporally abstract planning Advances in Neural Information Processing Systems. 1050-1056.	0.495
1998	Precup D, Sutton RS, Singh S. Theoretical results on reinforcement learning with temporally abstract options Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 1398: 382-393.	0.409
1997	Santamaría JC, Ram A, Sutton RS. Experiments with reinforcement learning in problems with continuous state and action spaces Adaptive Behavior. 6: 163-217. DOI: 10.1177/105971239700600201	0.432
1997	Barto AG, Sutton RS. Chapter 19 Reinforcement learning in artificial intelligence Advances in Psychology. 121: 358-386. DOI: 10.1016/S0166-4115(97)80105-7	0.677
1996	Singh SP, Sutton RS. Reinforcement learning with replacing eligibility traces Machine Learning. 22: 123-158. DOI: 10.1007/Bf00114726	0.466
1992	Sutton RS, Barto AG, Williams RJ. Reinforcement Learning is Direct Adaptive Optimal Control Ieee Control Systems. 12: 19-22. DOI: 10.1109/37.126844	0.676
1992	Sutton RS. Introduction: The challenge of reinforcement learning Machine Learning. 8: 225-227. DOI: 10.1007/BF00992695	0.475
1991	Sutton RS. Dyna, an integrated architecture for learning, planning, and reacting Intelligence/Sigart Bulletin. 2: 160-163. DOI: 10.1145/122344.122377	0.46
1991	Sutton RS. Planning by Incremental Dynamic Programming Machine Learning. 353-357. DOI: 10.1016/B978-1-55860-200-7.50073-8	0.304
1990	Whitehead SD, Sutton RS, Ballard DH. Advances in reinforcement learning and their implications for intelligent control . 1289-1297.	0.37
1988	Sutton RS. Learning to Predict by the Methods of Temporal Differences Machine Learning. 3: 9-44. DOI: 10.1023/A:1022633531479	0.453
1988	Franklin JA, Sutton RS, Anderson CW. Application of connectionist learning methods to manufacturing process monitoring . 709-712.	0.314
1986	Moore JW, Desmond JE, Berthier NE, Blazis DE, Sutton RS, Barto AG. Simulation of the classically conditioned nictitating membrane response by a neuron-like adaptive element: response topography, neuronal firing, and interstimulus intervals. Behavioural Brain Research. 21: 143-54. PMID 3755947 DOI: 10.1016/0166-4328(86)90092-6	0.587
1983	Barto AG, Sutton RS, Anderson CW. Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems Ieee Transactions On Systems, Man and Cybernetics. 834-846. DOI: 10.1109/TSMC.1983.6313077	0.609
1982	Barto AG, Anderson CW, Sutton RS. Synthesis of nonlinear control surfaces by a layered associative search network. Biological Cybernetics. 43: 175-85. PMID 7093360 DOI: 10.1007/BF00319977	0.606
1982	Barto AG, Sutton RS. Simulation of anticipatory responses in classical conditioning by a neuron-like adaptive element. Behavioural Brain Research. 4: 221-35. PMID 6277346 DOI: 10.1016/0166-4328(82)90001-8	0.572
1982	Barto AG, Sutton RS, Anderson CW. SPATIAL LEARNING SIMULATION SYSTEMS . 204-206.	0.315
1981	Barto AG, Sutton RS. Landmark learning: an illustration of associative search. Biological Cybernetics. 42: 1-8. PMID 7326277 DOI: 10.1007/BF00335152	0.622
1981	Sutton RS, Barto AG. Toward a modern theory of adaptive networks: expectation and prediction. Psychological Review. 88: 135-70. PMID 7291377 DOI: 10.1037/0033-295X.88.2.135	0.618
1979	Barto AG, Sutton RS, Brouwer PS. Associative search network: A reinforcement learning associative memory Biological Cybernetics. 40: 201-211. DOI: 10.1007/BF00453370	0.581