
Andrew G. Barto - US grants
Affiliations: University of Massachusetts Amherst, Amherst, MA
Area: Reinforcement Learning
Website: https://people.cs.umass.edu/~barto/

We are testing a new system for linking grants to scientists. The funding information displayed below comes from the NIH Research Portfolio Online Reporting Tools and the NSF Award Database. The grant data on this page is limited to grants awarded in the United States and is therefore partial. It can nonetheless be used to understand how funding patterns influence mentorship networks and vice versa, which has deep implications for how research is done.
High-probability grants
According to our matching algorithm, Andrew G. Barto is the likely recipient of the following grants.

Years | Recipients | Code | Title / Keywords | Matching score |
---|---|---|---|---|
1984 — 1986 | Barto, Andrew; Moore, John [⬀] | N/A |
Adaptive Element Models of Classical Conditioning @ University of Massachusetts Amherst |
1 |
1987 — 1989 | Utgoff, Paul [⬀]; Barto, Andrew | N/A |
Learning Efficient Recognizers For Analytically Derived Concepts (Computer and Information Science) @ University of Massachusetts Amherst This research develops artificial intelligence methods for the automatic formulation and use of concepts by computer systems. In particular, techniques which enable a machine to develop concepts based on its ability to "explain" instances presented to it are combined with methods to rapidly "classify" new instances into one of its learned categories. The goal is to convert the machine's inefficient but correct "explanation" procedures into efficient classification routines. Methods used will include both symbol-processing strategies and newer, "connectionist" approaches. The importance of this research is that exploratory artificial intelligence programs, now capable of some limited learning, must be substantially improved. In particular, artificial intelligence systems must be developed to both learn new categories (concepts) efficiently and apply these categories rapidly when presented with a high rate of data and new experience. |
1 |
1988 | Barto, Andrew | N/A |
@ University of Massachusetts Amherst Computational neuroscience is a subfield that is growing quickly at the present time. Putting what we know about the functioning of the nervous system into quantitative terms is becoming more appealing now that computers are affordable and easy to work with, and sufficient data has been gathered about the systems in question. Mathematical models of various parts of the nervous system are currently being pursued by physiologists well trained in engineering methodology. But now engineers have taken an interest in modelling the various systems of the brain from their own perspective. This action is to provide partial support for an international conference which will bring together neurophysiologists and neuroengineers from various countries to discuss important issues regarding modelling approaches in neuroscience. |
1 |
1988 — 1990 | Barto, Andrew | N/A |
@ University of Massachusetts Amherst This award will support collaborative research between Dr. Andrew Barto, University of Massachusetts, and Professor Horace Barlow, Physiological Laboratory, Cambridge University, England. Dr. Barto will spend a six month sabbatical at the King's College Research Center at Cambridge. He will participate in the Research Center's Program on Biological Information Processing and will collaborate with Dr. Barlow on formulating and evaluating perceptual principles that are linked to the facilitation of goal-directed learning. The investigators propose to investigate how encoding schemes for sensory input, depending minimally on the specific behavior to be learned, can facilitate that learning. They will examine a variety of existing and novel coding principles with respect to their implications for goal-directed learning. They will focus particularly on the ideas being developed by Barlow on minimum entropy encoding principles. The investigators plan to compare and contrast these principles theoretically and, where need arises, via computational experiments. They will also relate their theoretical findings to data relevant to the coding principles exhibited by populations of neurons. Dr. Barto is a recognized leader in the field of real-time computational learning problems and Dr. Barlow has been for many years a preeminent researcher in brain perception. The results of this research should make an important contribution to understanding the theoretical link between biological and artificial neural networks. |
1 |
1988 — 1993 | Barto, Andrew; Moore, John [⬀] | N/A |
Neural Stimulus Representations and Computational Learning Models @ University of Massachusetts Amherst Learning in animals involves complex brain processes that integrate sensory information into a coordinated set of actions: it is something that animals do very well, but man-made thinking machines (computers) do rather poorly by comparison. In order to improve computer performance, many scientists and engineers have turned to the brain for theoretical insights into the processes of learning. These insights, when applied to computer technology, have become increasingly important in applications ranging from industrial robotics to process control. Studies of animal behavior, dating from the early years of this century, have provided a rich scientific literature for evaluating theories of learning. These are normally expressed in terms of mathematical relationships between environmental events (stimuli) and actions (responses). State-of-the-art learning theories are known as "real-time computational learning models" because they can readily be translated into computer programs for application to technology. Drs. Moore and Barto are using a well-characterized associative learning paradigm (that is, learning that one event precedes another event), classical conditioning of the nictitating membrane response in the intact rabbit. This preparation serves them as a laboratory benchmark for evaluating their recently proposed mathematical learning model, and holds much promise both for clarifying brain processes underlying learning, and for enhancing and advancing computer technology. The ultimate goal of this research is to discover how information is processed so efficiently by the brain. These investigators will incorporate this knowledge into a mathematical model, in a way that best describes the relationship between a specific behavior and the brain mechanisms underlying that behavior. This new information can then be applied directly to information acquisition by the latest generation of computer systems. |
1 |
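The "real-time computational learning models" mentioned above belong to the family of temporal-difference (TD) learning rules that Barto and colleagues helped develop. The sketch below is a minimal, assumption-laden illustration of how such a rule drives associative weights with a prediction-error signal; the stimulus trace, trial structure, and parameters are invented, and this is not the grant's actual model.

```python
import numpy as np

def td_conditioning(cs_trace, us, alpha=0.1, gamma=0.95, n_trials=200):
    """Toy temporal-difference model of classical conditioning.

    cs_trace: (T, n_features) stimulus representation at each step of a trial
    us: (T,) unconditioned stimulus (reward) at each step
    Returns the learned associative weights, one per stimulus feature.
    """
    w = np.zeros(cs_trace.shape[1])
    for _ in range(n_trials):
        v_prev, x_prev = 0.0, np.zeros_like(w)
        for t in range(cs_trace.shape[0]):
            x = cs_trace[t]
            v = w @ x                           # prediction of the upcoming US
            delta = us[t] + gamma * v - v_prev  # TD (prediction) error
            w += alpha * delta * x_prev         # credit the preceding stimulus
            v_prev, x_prev = v, x
    return w

# Invented trial: a CS is on at steps 2-4 and reliably precedes a US at step 5.
T = 8
cs = np.zeros((T, 1)); cs[2:5, 0] = 1.0
us = np.zeros(T); us[5] = 1.0
print(td_conditioning(cs, us))   # the CS weight becomes positive (a learned association)
```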
1989 — 1993 | Barto, Andrew; Hollot, Christopher (co-PI) [⬀]; Ydstie, Erik | N/A |
Neural Networks For Adaptive Control @ University of Massachusetts Amherst This project is an interdisciplinary cooperation between a leading neural network researcher, a computer scientist, and a chemical engineer, to develop and test advanced concepts for applying neural networks to control systems. The methods under study promise, in the long term, an ability to handle noisy nonlinear systems beyond the reach of more conventional forms of control theory. Practical applications to chemical engineering problems will be attempted, in a realistic context. |
1 |
1992 — 1997 | Barto, Andrew; Ydstie, Erik | N/A |
Reinforcement Learning Algorithms Based On Dynamic Programming @ University of Massachusetts Amherst This project will investigate aspects of a class of reinforcement learning algorithms based on dynamic programming (DP). Although these algorithms have been widely studied and have been experimented with in many applications, their theory is not developed enough to permit a clear understanding of the classes of problems for which they may be the methods of choice, or to guide their application. Research at the University of Massachusetts has made considerable recent progress in relating these methods to the most closely related conventional methods and in understanding the factors that influence their performance, both successful and unsuccessful. These methods may provide the only computationally feasible approaches to very large and analytically intractable sequential decision problems. The objectives of this project are: 1) to continue development of DP-based reinforcement learning methods and their theory, 2) to investigate their computational complexity, and 3) to define the characteristics of problems for which they are best suited. |
1 |
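As a rough illustration of the class of methods the abstract above describes, the sketch below implements one-step Q-learning, a reinforcement learning algorithm that performs sample-based backups of the dynamic-programming (Bellman optimality) equation. The chain environment, rewards, and parameters are invented for illustration and are not taken from the project.

```python
import random

# Invented chain MDP: states 0..4, actions 0 (left) / 1 (right); reaching
# state 4 yields reward 1 and ends the episode.
N_STATES, ACTIONS, GOAL = 5, (0, 1), 4

def step(state, action):
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

def q_learning(episodes=500, alpha=0.1, gamma=0.9, eps=0.1):
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = (random.choice(ACTIONS) if random.random() < eps
                 else max(ACTIONS, key=lambda act: Q[s][act]))
            s2, r, done = step(s, a)
            # Sample backup of the Bellman optimality equation (asynchronous DP).
            target = 0.0 if done else max(Q[s2])
            Q[s][a] += alpha * (r + gamma * target - Q[s][a])
            s = s2
    return Q

print(q_learning())   # Q-values grow toward the goal end of the chain
```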
1995 — 1998 | Grupen, Roderic [⬀]; Barto, Andrew | N/A |
A Control Basis For Learning and Skill Acquisition @ University of Massachusetts Amherst This research addresses principles for organizing predictable and flexible behavior in complex sensorimotor systems operating in unstructured environments. The central hypothesis of the work is that large classes of correct behavior can be constructed at run-time through the use of a small set of properly designed control primitives (the control basis) and that the skillful use of these sensorimotor primitives generalizes well to other tasks. A control basis is designed to represent a broad class of tasks, to structure the exploration of control policies to avoid irrelevant or unsafe alternatives, and to facilitate adaptive optimal compensation. The Discrete Event Dynamic Systems (DEDS) framework is used to characterize the control basis and to prune inappropriate control composition policies. Dynamic programming (DP) techniques are used to explore safe composition policies in which sensory and motor resources are bound to elements of the control basis to maximize the expected future payoff. An adaptive compensation policy is designed to extend the control basis and to incrementally approximate optimal control policies. A program of theoretical development and empirical analysis is undertaken to demonstrate the utility of this approach in robotics and machine learning applications. |
1 |
1995 — 1998 | Barto, Andrew | N/A |
Multiple Time Scale Reinforcement Learning @ University of Massachusetts Amherst This project will investigate a new approach to learning models of dynamical systems for use in advanced reinforcement learning architectures. It will develop a method by which a reinforcement learning system can learn multiple time scale models and use them as the basis for hierarchical learning and planning. The project's objectives are to develop the mathematical theory of this approach, to examine its relationship to control theory and to behavioral and neural models of learning, and to demonstrate its effectiveness in a number of simulated learning tasks. The significance of the project for engineering is that the use of TD models will be a major generalization of RL architectures, making them much more widely applicable. It also has the possibility of establishing the utility of TD models for system identification in more conventional adaptive control. The project also has implications for our understanding of animal learning, being able to model indirect and direct associations and their interactions in a mathematically principled way. An additional impact of this research will be to strengthen links between engineering, artificial intelligence, and biological studies of learning, thereby contributing to all three areas by facilitating a transfer of concepts and methods. |
1 |
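One simple reading of "multiple time scale models" is a set of TD predictions learned in parallel with different discount factors, each summarizing the future at a different temporal horizon. The sketch below is a minimal illustration under that assumption, using an invented random-walk environment; it is not the specific TD-model formulation the project developed.

```python
import numpy as np

def multi_timescale_td(n_states=5, gammas=(0.5, 0.9, 0.99),
                       alpha=0.05, steps=20000, seed=0):
    """TD(0) predictions of discounted future reward at several discount
    factors ("time scales"), learned in parallel from one stream of experience.
    The environment is an invented circular random walk: reward 1 arrives
    whenever the walk re-enters state 0."""
    rng = np.random.default_rng(seed)
    V = np.zeros((len(gammas), n_states))   # one value table per time scale
    s = 0
    for _ in range(steps):
        s2 = (s + rng.choice((-1, 1))) % n_states
        r = 1.0 if s2 == 0 else 0.0
        for i, g in enumerate(gammas):
            V[i, s] += alpha * (r + g * V[i, s2] - V[i, s])
        s = s2
    return {g: V[i].round(2) for i, g in enumerate(gammas)}

print(multi_timescale_td())   # longer horizons (larger gamma) give larger, flatter predictions
```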
1997 — 2001 | Clifton, Rachel (co-PI) [⬀]; Sutton, Richard; Barto, Andrew; Berthier, Neil [⬀] | N/A |
Learning and Intelligent Systems: Developmental Motor Control in Real and Artificial Systems @ University of Massachusetts Amherst This project is being funded through the Learning and Intelligent Systems (LIS) initiative. A key aim of this initiative is to understand how highly complex intelligent systems could arise from simple initial knowledge through interactions with the environment. The best real-world example of such a system is the human infant, who progresses from relatively simple abilities at birth to quite sophisticated abilities by two years of age. This research focuses on the development of reaching by infants because (a) only rudimentary reaching ability is present at birth; (b) older infants use their arms in a sophisticated way to exploit and explore the world; and (c) the problems facing the infant are similar to those an artificial system would face. The project brings together two computer scientists who are experts on learning control algorithms and neural networks, and two psychologists who are experts on the behavioral and neural aspects of infant reaching, to investigate and test various algorithms by which infants might gain control over their arms. The proposed research focuses on the control strategies that infants use in executing reaches, how infants develop appropriate and adaptive modes of reaching, the mechanisms by which infants improve their ability to reach with age, the role of sensory information in controlling the reach, and how such knowledge might be stored in psychologically appropriate and computationally powerful ways. Preliminary results suggest that computational models that are appropriate for modeling the development of human reaching are different in significant ways from traditional computational models. Understanding the mechanisms by which intelligence can develop through learning can have significant impact in many scientific and engineering domains, because building such systems would be simpler and faster than engineering a system whose intelligence is fully specified by the engineer, and because systems based on interactive learning could rapidly adapt to changing environmental conditions. |
1 |
1997 — 2004 | Cohen, Paul; Beal, Carole; Clifton, Rachel (co-PI) [⬀]; Grupen, Roderic [⬀]; Barto, Andrew; Berthier, Neil (co-PI) [⬀] | N/A |
A Facility for Cross Disciplinary Research on Sensorimotor Development in Humans and Machines @ University of Massachusetts Amherst (CDA-9703217; Grupen, Roderic A.) This award is for supporting research activities in Computer Science and Psychology at UMass in assembling an infrastructure for experimental work on the development of conceptual structure from sensorimotor activity. An interactionist theory is advanced in which the origin of knowledge is interactive behavior in an environment. By this account, the nature of the environment and the agent's native resources (sensors, effectors, and control) lead directly to appropriate conceptual structures in natural and artificial systems. The central claim of this research is that the first task facing an intelligent, embodied agent is coordinated sensory and motor interaction with its environment and that this task leads to policies and abstractions that influence the subsequent acquisition of higher cognitive abilities. An interdisciplinary team specializing in robotics, cognitive development, and motor development, learning, planning and language leads the effort. The infrastructure incorporates robot hands and arms, binocular vision, binaural audition, haptic and kinesthetic information in a common framework to provide a rich sensory and motor encoding of interaction with the world. In addition to the robotics facilities, the infrastructure includes tools for gathering precise, quantitative observations of postures and rates of movement in human subjects. These facilities are designed to support analogs of nontrivial human processes so that computational models of development may be compared to data from infant subjects. |
1 |
1999 — 2003 | Barto, Andrew; Moore, John (co-PI) [⬀] | N/A |
KDI: Temporal Abstraction in Reinforcement Learning @ University of Massachusetts Amherst This project investigates a new approach to learning, planning, and representing knowledge at multiple levels of temporal abstraction. It develops methods by which an artificial reinforcement learning system can model and reason about persistent courses of action and perceive its environment in corresponding terms, and it develops and examines the validity of models of animal behavior related to this approach. The project's objectives are to develop the mathematical theory of the approach, to refine, extend, and conduct validation studies of related models of animal behavior, to examine the theory's relationship to control theory and artificial intelligence, and to demonstrate its effectiveness in a number of simulated learning tasks. |
1 |
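Temporal abstraction of the kind described above is commonly formalized with temporally extended actions ("options") and SMDP-style value backups, in which the bootstrap term is discounted by the option's duration. The sketch below is a minimal illustration on an invented corridor task with hand-picked landmark options; the environment, option set, and parameters are assumptions, not the project's actual formulation.

```python
import random

# Invented corridor: states 0..9, goal at 9, step cost -0.01, goal reward 1.
N, GOAL, GAMMA = 10, 9, 0.95

def run_option(s, target):
    """A temporally extended action ("option"): walk step by step toward
    `target`, terminating at the target or at the goal.
    Returns (next state, accumulated discounted reward, duration)."""
    r_total, discount, k = 0.0, 1.0, 0
    while s not in (target, GOAL):
        s += 1 if target > s else -1
        r_total += discount * (1.0 if s == GOAL else -0.01)
        discount *= GAMMA
        k += 1
    return s, r_total, k

def smdp_q_learning(landmarks=(0, 3, 6, 9), episodes=300, alpha=0.1, eps=0.1):
    Q = {(s, o): 0.0 for s in range(N) for o in landmarks}
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            avail = [o for o in landmarks if o != s]       # skip degenerate options
            o = (random.choice(avail) if random.random() < eps
                 else max(avail, key=lambda opt: Q[(s, opt)]))
            s2, r, k = run_option(s, o)
            backup = 0.0 if s2 == GOAL else max(Q[(s2, opt)] for opt in landmarks)
            # SMDP backup: the bootstrap term is discounted by gamma**k,
            # where k is how long the option ran.
            Q[(s, o)] += alpha * (r + (GAMMA ** k) * backup - Q[(s, o)])
            s = s2
    return Q

Q = smdp_q_learning()
print(max(val for (s, _), val in Q.items() if s == 0))   # value of the start state
```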
2000 — 2002 | Barto, Andrew; Engelbrecht, Sascha | N/A |
Lyapunov Methods For Reinforcement Learning @ University of Massachusetts Amherst 0070102 |
1 |
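The grant title points to Lyapunov-based methods for reinforcement learning. A core idea in such approaches is to restrict the learner's choices to actions or controllers that descend a designer-supplied Lyapunov function, so exploration cannot drive the system away from its goal. The sketch below illustrates that general idea on an invented chain task with an invented cost structure; it is only an illustration, not the project's formulation.

```python
import random

# Invented chain task: states 0..10, goal at 0. The Lyapunov function L(s) = s
# (distance to the goal) is never allowed to increase, which keeps learning safe.
GOAL, N = 0, 11
ACTIONS = (-2, -1, +1)          # candidate moves

def lyapunov(s):
    return s

def safe_actions(s):
    """Only actions that stay in bounds and strictly decrease L are allowed."""
    return [a for a in ACTIONS if 0 <= s + a < N and lyapunov(s + a) < lyapunov(s)]

def constrained_q_learning(episodes=200, alpha=0.2, gamma=0.95, eps=0.2):
    Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}
    for _ in range(episodes):
        s = N - 1
        while s != GOAL:
            acts = safe_actions(s)
            a = (random.choice(acts) if random.random() < eps
                 else max(acts, key=lambda a_: Q[(s, a_)]))
            s2 = s + a
            r = 1.0 if s2 == GOAL else -0.1 * abs(a)   # invented cost structure
            target = 0.0 if s2 == GOAL else max(Q[(s2, a_)] for a_ in safe_actions(s2))
            Q[(s, a)] += alpha * (r + gamma * target - Q[(s, a)])
            s = s2
    return Q

print(constrained_q_learning()[(N - 1, -2)])   # learned value of a safe action
```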
2002 — 2005 | Barto, Andrew; Mahadevan, Sridhar (co-PI) [⬀] | N/A |
Dynamic Abstraction in Reinforcement Learning @ University of Massachusetts Amherst This project investigates reinforcement learning algorithms that use dynamic abstraction to exploit the spatial and temporal structure of complex environments to facilitate learning. The use of abstraction is one of the features of human intelligence that allows us to operate as effectively as we do in complex environments. We systematically ignore details that are not relevant to a task at hand, and we rapidly switch between abstractions when we focus on a succession of subtasks. For example, in planning everyday activities, such as driving to work, we abstract out irrelevant details such as the layout of objects inside the car, but when we actually drive, many of these details become relevant, such as the locations of the steering wheel and the accelerator. Different abstractions are appropriate for different tasks or subtasks, and the agent has to shift abstractions as it shifts to new tasks or to new subtasks. |
1 |
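The driving example above captures the key mechanism: different subtasks attend to different subsets of the state, and the agent switches abstractions as it switches subtasks. The sketch below is a purely illustrative rendering of that idea; the state variables and subtask names are invented placeholders, not the project's representation.

```python
# Hypothetical full state for the "driving to work" example in the abstract above.
full_state = {
    "car_position": (3, 7),
    "steering_wheel_angle": 0.12,
    "accelerator": 0.4,
    "objects_in_car": ["umbrella", "coffee mug"],
    "route_segment": "highway",
}

# Each subtask declares which state variables are relevant to it; everything
# else is abstracted away while that subtask is active.
ABSTRACTIONS = {
    "plan_route": ("car_position", "route_segment"),
    "steer_lane": ("steering_wheel_angle", "accelerator", "route_segment"),
}

def abstract_state(state, subtask):
    """Project the full state onto the variables the current subtask cares about."""
    keep = ABSTRACTIONS[subtask]
    return tuple((k, state[k]) for k in keep)

# Switching subtasks switches abstractions: the same world, two different views.
print(abstract_state(full_state, "plan_route"))
print(abstract_state(full_state, "steer_lane"))
```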
2004 — 2007 | Barto, Andrew | N/A |
Collaborative Research: Intrinsically Motivated Learning in Artificial Agents @ University of Massachusetts Amherst |
1 |
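Intrinsically motivated learning supplements or replaces the external reward with a reward the agent generates for itself, for example a novelty or prediction-error bonus that decays as parts of the environment become familiar. The sketch below uses a simple count-based novelty bonus inside tabular Q-learning as one crude surrogate for such a signal; the environment and bonus are invented and do not reflect the project's actual mechanism.

```python
import random
from collections import defaultdict

# Invented corridor with NO extrinsic reward: states 0..19, actions move left/right.
N, ACTIONS = 20, (-1, +1)

def intrinsically_motivated_exploration(steps=5000, alpha=0.3, gamma=0.9, eps=0.1):
    """Tabular Q-learning driven only by an intrinsic reward: a novelty bonus
    that decays as a state-action pair becomes familiar. The shrinking bonus
    keeps pushing the agent toward less-experienced parts of the world."""
    Q, counts = defaultdict(float), defaultdict(int)
    state_visits = [0] * N
    s = 0
    for _ in range(steps):
        a = (random.choice(ACTIONS) if random.random() < eps
             else max(ACTIONS, key=lambda act: Q[(s, act)]))
        s2 = min(N - 1, max(0, s + a))
        counts[(s, a)] += 1
        r_intrinsic = 1.0 / counts[(s, a)]          # crude novelty ("surprise") bonus
        target = max(Q[(s2, act)] for act in ACTIONS)
        Q[(s, a)] += alpha * (r_intrinsic + gamma * target - Q[(s, a)])
        s = s2
        state_visits[s] += 1
    return state_visits

print(intrinsically_motivated_exploration())   # distant states also accumulate visits
```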
2004 — 2008 | Barto, Andrew; Woolf, Beverly [⬀]; Mahadevan, Sridhar (co-PI) [⬀]; Arroyo, Ivon (co-PI) [⬀]; Fisher, Donald (co-PI) [⬀] | N/A |
Learning to Teach: the Next Generation of Intelligent Tutor Systems @ University of Massachusetts Amherst The primary objective of this project is to develop new methods for optimizing an automated pedagogical agent to improve its teaching efficiency through customization to individual students based on information about their responses to individual problems, student individual differences such as level of cognitive development, spatial ability, memory retrieval speed, long-term retention, effectiveness of alternative teaching strategies (such as visual vs. computational solution strategies), and degree of engagement with the tutor. An emphasis will be placed on using machine learning and computational optimization methods to automate the process of developing efficient Intelligent Tutoring Systems (ITS) for new subject domains. The approach is threefold. |
1 |
2007 — 2009 | Barto, Andrew | N/A |
SGER: Building Blocks For Creative Search @ University of Massachusetts Amherst This project will develop a formal framework based on optimization and reinforcement learning to model important features of creative processes. Large, ill-defined optimization problems that characterize situations where creativity comes into play require selectional, or generate-and-test, procedures that include both a smart generator and a smart tester. The generator responsible for generating structures to be evaluated should be able to generate structures that are novel while at the same time having a high probability of being successful. This project investigates new methods for injecting structured, knowledge-based novelty into the generation process. The tester, the process that evaluates alternatives, should be a good surrogate for the primary objective function, which is often not easily or inexpensively accessible. A smart tester uses a combination of a priori knowledge, knowledge accumulated from past creative activity, and information gained during the current creative activity to assess alternatives. The working hypothesis is that the synergy created by the interaction of a sufficiently smart generator and a sufficiently smart tester can account for important aspects of creative processes. |
1 |
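The generate-and-test framing above can be sketched as a loop in which a generator proposes structured variations of known-good material, a cheap surrogate tester screens them, and only a shortlist is evaluated against the expensive primary objective. The code below is a toy rendering of that loop over numeric candidates; the objectives, parameters, and names are invented placeholders, not the project's framework.

```python
import random

def creative_search(primary_objective, surrogate, n_rounds=50, pool=20, seed=0):
    """Generate-and-test loop: a generator proposes variations of the current
    best candidate (with occasional large, novelty-injecting jumps), a cheap
    surrogate tester ranks them, and only the shortlist is scored with the
    expensive primary objective."""
    rng = random.Random(seed)
    best = [rng.uniform(-5, 5) for _ in range(3)]          # a candidate is 3 numbers
    for _ in range(n_rounds):
        # "Smart" generator: mostly small perturbations, occasionally big jumps.
        proposals = [[x + rng.gauss(0, 0.5 if rng.random() < 0.8 else 3.0) for x in best]
                     for _ in range(pool)]
        # "Smart" tester: screen cheaply with the surrogate, keep the top few.
        shortlist = sorted(proposals, key=surrogate, reverse=True)[:3]
        # Expensive evaluation only on the shortlist.
        candidate = max(shortlist, key=primary_objective)
        if primary_objective(candidate) > primary_objective(best):
            best = candidate
    return best

# Toy illustration: the surrogate is a noisy, cheap stand-in for the true objective.
true_obj = lambda v: -sum((x - 2.0) ** 2 for x in v)
cheap_obj = lambda v: true_obj(v) + random.gauss(0, 0.5)
print(creative_search(true_obj, cheap_obj))   # candidates drift toward (2, 2, 2)
```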
2007 — 2011 | Barto, Andrew; Woolf, Beverly [⬀]; Arroyo, Ivon (co-PI) [⬀]; Fisher, Donald (co-PI) [⬀] | N/A |
@ University of Massachusetts Amherst Emotion and motivation are fundamental to learning; students with high intrinsic motivation often outperform students with low motivation. Yet affect and emotion are often ignored or marginalized with respect to classroom practice. This project will help redress the emotion versus cognition imbalance. The researchers will develop Affective Learning Companions, real-time computational agents that infer emotions and leverage this knowledge to increase student performance. The goal is to determine the affective state of a student, at any point in time, and to provide appropriate support to improve student learning in the long term. Emotion recognition methods include using hardware sensors and machine learning software to identify a student's state. Five independent affective variables are targeted (frustration, motivation, self-confidence, boredom and fatigue) within a research platform consisting of four sensors (skin conductance glove, pressure mouse, face recognition camera and posture sensing devices). Emotion feedback methods include using a variety of interventions (encouraging comments, graphics of past performance) varied according to type (explanation, hints, worked examples) and timing (immediately following an answer, after some elapsed time). The interventions will be evaluated as to which best increase performance and in which contexts. Machine learning optimization algorithms search for policies that further engage individual students who are involved in different affective and cognitive states. Animated agents are enhanced with appropriate gestures and empathetic feedback in relation to student achievement level and task complexity. Approximately 500 ethnically and economically diverse students in Massachusetts and Arizona will participate. |
1 |
2008 — 2009 | Grupen, Roderic [⬀]; Barto, Andrew | N/A |
SGER: Hierarchical Knowledge Representation in Robotics @ University of Massachusetts Amherst This SGER proposal concerns the accumulation and representation of skills and control knowledge by robots that interact with unstructured environments. There has been comparatively little work on representations that capture reusable knowledge in robotics, an issue that lies at the heart of many future applications. Thus, this SGER represents a potentially transformative technology and addresses significant gaps in the state of the art for which the payoff, despite the risk, is extremely high. We focus our one-year study on learning techniques that accumulate knowledge related to grasping and manipulation. We shall extend pilot studies and build prototypes for self-motivated learning techniques and generative models for manipulation and multi-body contact relationships. The approach relies on learning to discover and exploit structure over the course of several staged learning episodes: from sensory and motor knowledge concerning the robot itself, to controllable relationships between the robot and external bodies, to multi-body contacts involved in tasks like stacking and insertion. The project has three principal technological goals: to advance the state-of-the-art of robotic manipulation and knowledge representation; to extend machine learning methods toward intrinsically motivated, cumulative, and hierarchical learning; and to advance computational accounts of the longitudinal processes of sensorimotor and cognitive development in humans and machines. |
1 |
2012 — 2015 | Barto, Andrew | N/A |
CRCNS: Collaborative Research: Neural Correlates of Hierarchical Reinforcement Learning @ University of Massachusetts Amherst Research on human behavior has long emphasized its hierarchical structure: Simple actions group together into subtask sequences, and these in turn cohere to bring about higher-level goals. This hierarchical structure is critical to humans' unique ability to tackle complex, large-scale tasks, since it allows such tasks to be decomposed or broken down into more manageable parts. While some progress has been made toward understanding the origins and mechanisms of hierarchical behavior, key questions remain: How are task-subtask-action hierarchies initially assembled through learning? How does learning operate within such hierarchies, allowing adaptive hierarchical behavior to take shape? How do the relevant learning and action-selection processes play out in neural hardware? |
1 |
2012 — 2016 | Barto, Andrew | N/A |
NRI-Small: Collaborative Research: Multiple Task Learning From Unstructured Demonstrations @ University of Massachusetts Amherst This project develops techniques for the efficient, incremental learning of complex robotic tasks by breaking unstructured demonstrations into reusable component skills. A Bayesian model segments task demonstrations into simpler components and recognizes instances of repeated skills across demonstrations. Established methods from control engineering and reinforcement learning are leveraged and extended to allow for skill improvement from practice, in addition to learning from demonstration. The project aims to unify existing research on each of these ideas into a principled, integrated approach that addresses all of these problems jointly, with the goal of creating a deployment-ready, open-source system that transforms the way experts and novices alike interact with robots. |
1 |
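The abstract above centers on segmenting unstructured demonstrations into reusable component skills. As a much simpler stand-in for the Bayesian changepoint model it mentions, the sketch below splits a one-dimensional demonstrated trajectory wherever its velocity changes abruptly, i.e. wherever a single simple motion model stops explaining the data; everything about it is invented for illustration.

```python
import numpy as np

def segment_demonstration(trajectory, threshold=0.5):
    """Very rough stand-in for demonstration segmentation: start a new segment
    whenever the observed velocity changes abruptly. (The actual project used a
    Bayesian model; this is only an illustrative placeholder.)"""
    traj = np.asarray(trajectory, dtype=float)
    velocities = np.diff(traj)
    boundaries = [0]
    for t in range(1, len(velocities)):
        if abs(velocities[t] - velocities[t - 1]) > threshold:
            boundaries.append(t)
    boundaries.append(len(traj) - 1)
    return list(zip(boundaries[:-1], boundaries[1:]))

# A demonstration with two obviously different phases: move right, then hold still.
demo = list(np.linspace(0, 5, 26)) + [5.0] * 15
print(segment_demonstration(demo, threshold=0.1))   # two segments, split near step 25
```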