Year |
Citation |
Score |
2020 |
Dalrymple AN, Roszko DA, Sutton RS, Mushahwar VK. Pavlovian control of intraspinal microstimulation to produce over-ground walking. Journal of Neural Engineering. PMID 32348970 DOI: 10.1088/1741-2552/Ab8E8E |
0.709 |
|
2015 |
Edwards AL, Dawson MR, Hebert JS, Sherstan C, Sutton RS, Chan KM, Pilarski PM. Application of real-time machine learning to myoelectric prosthesis control: A case series in adaptive switching. Prosthetics and Orthotics International. PMID 26423106 DOI: 10.1177/0309364615605373 |
0.402 |
|
2014 |
Kehoe EJ, Ludvig EA, Sutton RS. Time course of the rabbit's conditioned nictitating membrane movements during acquisition, extinction, and reacquisition. Learning & Memory (Cold Spring Harbor, N.Y.). 21: 585-90. PMID 25320350 DOI: 10.1101/Lm.034504.114 |
0.659 |
|
2014 |
Modayil J, White A, Sutton RS. Multi-timescale nexting in a reinforcement learning robot. Adaptive Behavior. 22: 146-160. DOI: 10.1177/1059712313511648 |
0.429 |
|
2014 |
Sutton RS, Mahmood AR, Precup D, Van Hasselt H. A new Q(λ) with interim forward view and Monte Carlo equivalence. 31st International Conference On Machine Learning, ICML 2014. 3: 1973-1988. |
0.459 |
|
2013 |
Kehoe EJ, Ludvig EA, Sutton RS. Timing and cue competition in conditioning of the nictitating membrane response of the rabbit (Oryctolagus cuniculus). Learning & Memory (Cold Spring Harbor, N.Y.). 20: 97-102. PMID 23325726 DOI: 10.1101/Lm.028183.112 |
0.652 |
|
2013 |
Pilarski PM, Dawson MR, Degris T, Carey J, Chan KM, Hebert JS, Sutton RS. Adaptive artificial limbs: A real-time approach to prediction and anticipation. IEEE Robotics and Automation Magazine. 20: 53-64. DOI: 10.1109/MRA.2012.2229948 |
0.349 |
|
2012 |
Ludvig EA, Sutton RS, Kehoe EJ. Evaluating the TD model of classical conditioning. Learning & Behavior. 40: 305-19. PMID 22927003 DOI: 10.3758/S13420-012-0082-6 |
0.712 |
|
2012 |
Silver D, Sutton RS, Müller M. Temporal-difference search in computer Go. Machine Learning. 87: 183-219. DOI: 10.1007/S10994-012-5280-0 |
0.629 |
|
2011 |
Sutton RS, Modayil J, Delp M, Degris T, Pilarski PM, White A, Precup D. Horde: A scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction. 10th International Conference On Autonomous Agents and Multiagent Systems 2011, AAMAS 2011. 2: 713-720. |
0.59 |
|
2010 |
Kehoe EJ, Ludvig EA, Sutton RS. Timing in trace conditioning of the nictitating membrane response of the rabbit (Oryctolagus cuniculus): scalar, nonscalar, and adaptive features. Learning & Memory (Cold Spring Harbor, N.Y.). 17: 600-4. PMID 21075900 DOI: 10.1101/Lm.1942210 |
0.656 |
|
2010 |
Maei HR, Szepesvari C, Bhatnagar S, Sutton RS. Toward off-policy learning control with function approximation. ICML 2010 - Proceedings, 27th International Conference On Machine Learning. 719-726. |
0.773 |
|
2010 |
Maei HR, Sutton RS. GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces. Artificial General Intelligence - Proceedings of the Third Conference On Artificial General Intelligence, AGI 2010. 91-96. |
0.772 |
|
2009 |
Kehoe EJ, Ludvig EA, Sutton RS. Magnitude and timing of conditioned responses in delay and trace classical conditioning of the nictitating membrane response of the rabbit (Oryctolagus cuniculus). Behavioral Neuroscience. 123: 1095-101. PMID 19824776 DOI: 10.1037/A0017112 |
0.65 |
|
2009 |
Kehoe EJ, Olsen KN, Ludvig EA, Sutton RS. Scalar timing varies with response magnitude in classical conditioning of the nictitating membrane response of the rabbit (Oryctolagus cuniculus). Behavioral Neuroscience. 123: 212-7. PMID 19170446 DOI: 10.1037/A0014122 |
0.662 |
|
2009 |
Sutton RS, Maei HR, Precup D, Bhatnagar S, Silver D, Szepesvári C, Wiewiora E. Fast gradient-descent methods for temporal-difference learning with linear function approximation. Proceedings of the 26th International Conference On Machine Learning, ICML 2009. 993-1000. DOI: 10.1145/1553374.1553501 |
0.778 |
|
2009 |
Ludvig EA, Sutton RS, Verbeek E, Kehoe EJ. A computational model of hippocampal function in trace conditioning. Advances in Neural Information Processing Systems 21 - Proceedings of the 2008 Conference. 993-1000. |
0.624 |
|
2009 |
Sutton RS, Szepesvári C, Maei HR. A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation. Advances in Neural Information Processing Systems 21 - Proceedings of the 2008 Conference. 1609-1616. |
0.77 |
|
2009 |
Maei HR, Szepesvari C, Bhatnagar S, Precup D, Silver D, Sutton RS. Convergent temporal-difference learning with arbitrary smooth function approximation. Advances in Neural Information Processing Systems 22 - Proceedings of the 2009 Conference. 1204-1212. |
0.793 |
|
2009 |
Sutton RS, Maei HR, Precup D, Bhatnagar S, Silver D, Szepesvári C, Wiewiora E. Fast gradient-descent methods for temporal-difference learning with linear function approximation. Proceedings of the 26th International Conference On Machine Learning, ICML 2009. 993-1000. |
0.792 |
|
2008 |
Ludvig EA, Sutton RS, Kehoe EJ. Stimulus representation and the timing of reward-prediction errors in models of the dopamine system. Neural Computation. 20: 3034-54. PMID 18624657 DOI: 10.1162/Neco.2008.11-07-654 |
0.666 |
|
2008 |
Kehoe EJ, Ludvig EA, Dudeney JE, Neufeld J, Sutton RS. Magnitude and timing of nictitating membrane movements during classical conditioning of the rabbit (Oryctolagus cuniculus). Behavioral Neuroscience. 122: 471-6. PMID 18410186 DOI: 10.1037/0735-7044.122.2.471 |
0.653 |
|
2008 |
Silver D, Sutton RS, Müller M. Sample-based learning and search with permanent and transient memories. Proceedings of the 25th International Conference On Machine Learning. 968-975. |
0.567 |
|
2008 |
Cutumisu M, Szafron D, Bowling M, Sutton RS. Agent learning using action-dependent learning rates in computer role-playing games. Proceedings of the 4th Artificial Intelligence and Interactive Digital Entertainment Conference, AIIDE 2008. 22-29. |
0.312 |
|
2007 |
Sutton RS, Koop A, Silver D. On the role of tracking in stationary environments. ACM International Conference Proceeding Series. 227: 871-878. DOI: 10.1145/1273496.1273606 |
0.481 |
|
2005 |
Stone P, Sutton RS, Kuhlmann G. Reinforcement learning for RoboCup soccer keepaway. Adaptive Behavior. 13: 165-188. DOI: 10.1177/105971230501300301 |
0.451 |
|
2005 |
Precup D, Sutton RS, Paduraru C, Koop A, Singh S. Off-policy learning with options and recognizers. Advances in Neural Information Processing Systems. 1097-1104. |
0.596 |
|
1999 |
Sutton RS, Precup D, Singh S. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence. 112: 181-211. DOI: 10.1016/S0004-3702(99)00052-1 |
0.645 |
|
1999 |
Sutton RS, Singh S, Precup D, Ravindran B. Improved switching among temporally abstract actions. Advances in Neural Information Processing Systems. 1066-1072. |
0.485 |
|
1999 |
Moll R, Barto AG, Perkins TJ, Sutton RS. Learning instance-independent value functions to enhance local search. Advances in Neural Information Processing Systems. 1017-1023. |
0.73 |
|
1998 |
Precup D, Sutton RS. Multi-time models for temporally abstract planning. Advances in Neural Information Processing Systems. 1050-1056. |
0.504 |
|
1997 |
Barto AG, Sutton RS. Chapter 19 Reinforcement learning in artificial intelligence. Advances in Psychology. 121: 358-386. DOI: 10.1016/S0166-4115(97)80105-7 |
0.557 |
|
1992 |
Sutton RS, Barto AG, Williams RJ. Reinforcement Learning is Direct Adaptive Optimal Control. IEEE Control Systems. 12: 19-22. DOI: 10.1109/37.126844 |
0.557 |
|
1986 |
Moore JW, Desmond JE, Berthier NE, Blazis DE, Sutton RS, Barto AG. Simulation of the classically conditioned nictitating membrane response by a neuron-like adaptive element: response topography, neuronal firing, and interstimulus intervals. Behavioural Brain Research. 21: 143-54. PMID 3755947 DOI: 10.1016/0166-4328(86)90092-6 |
0.583 |
|
1983 |
Barto AG, Sutton RS, Anderson CW. Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems. IEEE Transactions On Systems, Man and Cybernetics. 834-846. DOI: 10.1109/TSMC.1983.6313077 |
0.512 |
|
1982 |
Barto AG, Anderson CW, Sutton RS. Synthesis of nonlinear control surfaces by a layered associative search network. Biological Cybernetics. 43: 175-85. PMID 7093360 DOI: 10.1007/BF00319977 |
0.557 |
|
1982 |
Barto AG, Sutton RS. Simulation of anticipatory responses in classical conditioning by a neuron-like adaptive element. Behavioural Brain Research. 4: 221-35. PMID 6277346 DOI: 10.1016/0166-4328(82)90001-8 |
0.557 |
|
1981 |
Barto AG, Sutton RS. Landmark learning: an illustration of associative search. Biological Cybernetics. 42: 1-8. PMID 7326277 DOI: 10.1007/BF00335152 |
0.557 |
|
1981 |
Sutton RS, Barto AG. Toward a modern theory of adaptive networks: expectation and prediction. Psychological Review. 88: 135-70. PMID 7291377 DOI: 10.1037/0033-295X.88.2.135 |
0.557 |
|
1979 |
Barto AG, Sutton RS, Brouwer PS. Associative search network: A reinforcement learning associative memory. Biological Cybernetics. 40: 201-211. DOI: 10.1007/BF00453370 |
0.557 |
|