
James J. DiCarlo - US grants
Affiliations: Massachusetts Institute of Technology, Cambridge, MA, United States
Area: Visual Cortex
Website: http://web.mit.edu/dicarlo-lab/index.html
We are testing a new system for linking grants to scientists.
The funding information displayed below comes from the NIH Research Portfolio Online Reporting Tools and the NSF Award Database. The grant data on this page is limited to grants awarded in the United States and is thus partial. It can nonetheless be used to understand how funding patterns influence mentorship networks and vice versa, which has deep implications for how research is done.
You can help! If you notice any inaccuracies, please sign in and mark grants as correct or incorrect matches.
High-probability grants
According to our matching algorithm, James J. DiCarlo is the likely recipient of the following grants.

Years | Recipients | Code | Title / Keywords | Matching score |
---|---|---|---|---|
2004–2009 | Dicarlo, James J | R01 [Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies.] |
Visual Object Processing in the Inferotemporal Cortex @ Massachusetts Institute of Technology DESCRIPTION (provided by applicant): Visual object recognition is central to our behavior, and knowledge of the underlying brain mechanisms is critical to understanding human visual perception and memory. The key problem is creation of selectivity for object identity that tolerates changes in an object's retinal image, such as changes in position and size. The primate brain appears to construct this selectivity in the ventral visual stream because neuronal responses in the highest area of that stream--the anterior inferotemporal cortex (AIT)--show shape selectivity that can tolerate position and size changes. Yet, we do not understand these key neuronal properties--reports of AIT tolerance are limited and inconsistent, and recent studies show that it can be very restricted. Thus, the goals of this proposal are an understanding of key factors likely to determine AIT position and size tolerance, and to determine if AIT tolerance can explain behavioral tolerance. Our first aim is to systematically determine the position and size tolerance of AIT neuronal shape selectivity for a range of object sets and object training histories. We will establish the relationship of selectivity and AIT position and size tolerance, the interaction of AIT position and size tolerance, and the effect of object-specific training on these relationships. These data will establish neuronal tolerance at the highest level of the primate visual system and provide a much-needed foundation for further study. The mechanisms that might underlie position and size tolerance fall into two broad classes: (1) automatic generalization; and (2) tolerance learned by experiencing objects across changes in position and size. Our second aim is to determine if position- or size-specific object experience has substantial effects on the position or size tolerance of AIT shape selectivity. 
Because this has not been examined, any result would be extremely informative in constraining mechanisms and guiding future studies. Although it is thought that AIT tolerance underlies behavioral tolerance, this has not been systematically examined. Our third aim is to determine if the position and size tolerance of object identification can be explained by the tolerance of AIT neuronal shape selectivity. This is vital to understanding the link between high-level, ventral stream neuronal responses and visual object identification. |
1 |
2009–2012 | Dicarlo, James J | P30 [Activity Code Description: To support shared resources and facilities for categorical research by a number of investigators from different disciplines who provide a multidisciplinary approach to a joint research effort or from the same discipline who focus on a common research problem. The core grant is integrated with the center's component projects or program projects, though funded independently from them. This support, by providing more accessible resources, is expected to assure a greater productivity than from the separate projects and program projects.] |
@ Massachusetts Institute of Technology DESCRIPTION (provided by applicant): Five years of renewed support are requested for three Core modules in the Department of Brain and Cognitive Sciences at the Massachusetts Institute of Technology. Ours is a growing department which at present has ten NEI-supported investigators who carry out research on vision, the visual system and the oculomotor system. Due to major support from the McGovern family and the Picower Foundation, two major new centers have been formed at MIT (the McGovern Institute for Brain Research and the Picower Center for Learning and Memory) in which numerous new appointments will be made in the next five years, with several of them in vision and oculomotor control. Some of these appointments have already been made. Presently we have 19 grants from the National Eye Institute. The research carried out by the ten investigators of the Core Group includes the following areas of research: (1) neurophysiological studies of the visual and oculomotor systems, (2) anatomical studies of the visual and oculomotor systems, (3) developmental studies of the vision and visuomotor function, (4) psychophysical studies of visual functions in non-human primates, (5) psychophysical studies of visual functions in normal human and patient populations, (6) computational analyses of vision and eye movement, and (7) imaging studies to elucidate the roles various visual areas play in image processing. Two of the modules supported by this Core Grant, the Instrument Shop and the Electronics Shop, are in extensive use by the Core Group. During the past four years these investigators have been very productive and have published extensively as documented in this application. The Core Modules have made significant contributions to the overall research efforts of this group and have fostered interaction among investigators involved in vision and visuomotor research. 
Continued support of these two Core Facilities will be of great benefit to our future investigations. Additionally, we are requesting support for a new module to support ongoing and planned imaging studies conducted by eye researchers at MIT. If funded, this module will provide a central support facility for functional brain imaging by NEI-supported investigators in the BCS department. |
1 |
2010–2013 | Dicarlo, James; Cox, David | N/A |
@ Massachusetts Institute of Technology This project exploits advances in parallel computing hardware and a neuroscience-informed perspective to design next-generation computer vision algorithms that aim to match a human's ability to recognize objects. The human brain has superlative visual object recognition abilities -- humans can effortlessly identify and categorize tens of thousands of objects with high accuracy in a fraction of a second -- and a stronger connection between neuroscience and computer vision has driven new progress on machine algorithms. However, these models have not yet achieved robust, human-level object recognition in part because the number of possible "bio-inspired" model configurations is enormous. Powerful models hidden in this model class have yet to be systematically characterized and the correct biological model is not known. |
0.915 |
2010–2014 | Dicarlo, James J | R01 |
Construction of Invariant Shape Selectivity in the Ventral Visual Stream @ Massachusetts Institute of Technology DESCRIPTION (provided by applicant): A fundamental goal of perceptual neuroscience is to understand the neuronal representations that underlie our remarkable ability to perceive, recognize, and remember visual objects. In humans and non-human primates, these representations are produced by processing along the ventral visual stream, and conveyed by patterns of neuronal activity in its highest level -- the monkey inferior temporal cortex (IT). The key computational problem the ventral stream solves is that it produces an IT neuronal representation of visual images that conveys selectivity for object identity and category, with tolerance ("invariance") to changes in object position, size, pose, illumination and clutter. Indeed, although the shape selectivity properties of the ventral stream have received much study, we know very little about the mechanisms that construct that tolerance. The goal of this proposal is a mechanistic understanding of how the ventral visual stream constructs the tolerant ("invariant") visual shape selectivity that underlies our object recognition abilities. In Aim 1 we ask: does naturally-acquired temporally contiguous experience "instruct" the formation of tolerance in the ventral stream? We have recently discovered that the tolerance of IT neuronal shape selectivity can be strongly and rapidly sculpted by altered temporal contiguity of unsupervised visual object experience. In this aim, we will use a series of closely-related visual experience manipulations to systematically test and characterize the role of this plasticity in position, size, and pose tolerance learning. This will illuminate its role in instructing adult visual object representation, and set the stage for longer-term studies of how these powerful representations are assembled during early development. 
In Aim 2 we will take a comparative approach to ask how object information is transformed across two ventral stream areas (V4 vs. IT). Using the same monkeys, same task, and same visual stimuli, we will use neuronal population methods to ask: How is the tolerance of the IT representation changed from the V4 representation? Is V4 shape selectivity preserved in the IT representation? Does the sparseness of visual representation change from V4 to IT? How does tolerant shape selectivity evolve in real time? Together, these experiments will inform a central question: "How is the tolerant object selectivity in IT built from earlier visual representation?", and the results will provide strong constraints on computational models of the ventral visual stream and guide our understanding of cortical information transformation more generally. PUBLIC HEALTH RELEVANCE: Visual object recognition is fundamental to our well-being and our brain is remarkably good at solving this problem even though the same object can appear very differently to our eyes. The overarching goal of these experiments is a mechanistic understanding of how the visual system constructs the patterns of neuronal activity that solve this problem. This will lead to an understanding of the brain processes that allow us to see and evaluate the visual world (e.g. recognize and remember objects). |
1 |
2013–2014 | Dicarlo, James J | R21 [Activity Code Description: To encourage the development of new research activities in categorical program areas. (Support generally is restricted in level of support and in time.)] |
Time Delimited Neural Silencing to Dissect the Basis of Visual Object Perception @ Massachusetts Institute of Technology DESCRIPTION (provided by applicant): Visual object recognition is central to quality of life in health and disease, but it is not understood at a deep, mechanistic level. For example, while primate inferior temporal cortex (IT) is likely a key neuronal processing bottleneck, we still have only a dim understanding of its causal role at a fine spatial and temporal grain. This exploratory proposal (R21) aims to deploy, characterize, and behaviorally validate novel tools to produce spatially precise, temporally delimited silencing of neuronal activity in the IT cortex of the awake, behaving primate. Concretely, we want to choose a mm-scale location on a magnetic resonance image of a non-human primate brain, and then ask: what is the importance of normally-evoked neuronal activity at that location in supporting a given behavioral task? Rather than try to inject neuronal signals, our strategy is to develop methods to briefly (10-300 ms) block the neuronal activity that normally intervenes between visual stimulus onset and the animal's reaction time. To that end, this exploratory proposal has two synergistic aims: First, our preliminary results show that virally delivered optically-gated silencing molecules can indeed produce strong silencing of neuronal activity in IT cortex, but we have little understanding of the reliability, spatial extent and temporal limits of this silencing. Thus, we will (Aim 1) make x-ray targeted viral injections, followed by x-ray targeted optical fiber implantation, and spatially precise (~10 um) maps of neuronal silencing (or enhancement) effects in and around the optical fiber tip at multiple sites in IT cortex. The expected outcome is a spatiotemporal map of light-induced neuronal silencing around the optical fiber tip, and its dependence on light intensity, duration and latency. 
Second, we do not know if optical silencing of IT sub-regions leads to measurable behavioral effects on object recognition tasks. Thus, we have developed recognition tasks that are likely to be affected by IT silencing, and we have already discovered that pharmacological neuronal silencing at specific IT sub-regions (muscimol) leads to behavioral deficits in at least one recognition task (but not all such tasks). We now aim (Aim 2) to test the ability of optical silencing tools to produce behavioral deficits in that same task at those same locations. The expected outcome is a demonstration that optical silencing of IT sub-regions can produce specific behavioral recognition deficits, as well as a comparison with pharmacologically induced deficits. If successful, the proposed work will enable entirely new lines of interventional work to systematically test the causal role of IT cortex in visual object recognition, and will contribute to a still nascent, but promising toolbox of optical techniques for systems-level questions in primates. |
1 |
2013–2021 | Dicarlo, James J | P30 |
@ Massachusetts Institute of Technology DESCRIPTION (provided by applicant): Five years of renewed support are requested for three Core modules in the Department of Brain and Cognitive Sciences at the Massachusetts Institute of Technology: an Instrument Core, an Electronics Core, and an MR Imaging Core. We presently have twelve NEI-supported investigators who carry out research on vision, the visual system and the oculomotor system, with a total of 21 grants and two fellowships from NEI. Their research includes: (1) neurophysiological studies of the visual and oculomotor systems, (2) anatomical studies of the visual and oculomotor systems, (3) developmental studies of the vision and visuomotor function, (4) psychophysical studies of visual functions in non-human primates, (5) psychophysical studies of visual functions in normal human and patient populations, (6) computational analyses of vision and eye movement, and (7) imaging studies to elucidate the roles of various brain sub-regions in visual processing. Ours is a highly-productive, rapidly growing department with a longstanding focus on vision and eye-related research. The three Core Modules supported exclusively by this grant are the indispensable shared infrastructure of that productivity. Each Core has enabled the construction of novel devices and research methods that are not available off-the-shelf, facilitating interactions among the Core faculty as well as NEI investigators at other institutions. In addition, the Cores allow for rapid repair of equipment (often preventing significant down-time) and for training of students and postdoctoral fellows in design and fabrication constraints. We have a track-record of administering these Cores in ways that maximize their value, updating their resources and the training of their personnel as research demands have changed. NEI's continued support is essential to the operation and evolution of these Cores, which in turn are essential to our research. 
The collective value of the wide range of studies that depend upon these Cores represents an excellent return on NEI's investment. |
1 |
2014–2017 | Dicarlo, James J | T32 [Activity Code Description: To enable institutions to make National Research Service Awards to individuals selected by them for predoctoral and postdoctoral research training in specified shortage areas.] |
@ Massachusetts Institute of Technology DESCRIPTION (provided by applicant): The Brain and Cognitive Sciences Graduate Program at the Massachusetts Institute of Technology requests renewal of its major training grant. The department is organized to promote interdisciplinary training and research in neuroscience and behavior, approached with the experimental power of modern molecular and cellular neuroscience, systems neuroscience, and cognitive science, combined with the theoretical strength of computational neuroscience and artificial intelligence. Trainees begin laboratory work through lab rotations in the first two terms and subsequently join a laboratory, working on problems in learning and memory, neural development, vision, motor control or brain disorders and diseases. Required course work can be completed in two to three years, with a two-term sequence of core courses in the first year, a quantitative methods course, and a flexible array of graduate lecture courses and seminar classes. The qualifying exam consists of written and oral components of an interdisciplinary NIH/NSF style grant proposal. Annual research reports and annual committee meetings are required, and mark the student's progress in research through completion of a thesis. Multiple presentations at professional meetings and journal publications are typically expected of a dissertation. Most students continue in research careers, armed with skills that typically span multiple theoretical and experimental approaches comprising molecular/cellular neuroscience, systems neuroscience, cognitive neuroscience, psychophysics, behavior and computation. Trainees will, in general, have strong backgrounds in the natural sciences (e.g., undergraduate majors in biology, chemistry, physics, mathematics, or electrical engineering). Occasional trainees will already hold a master's degree in another field. 
Candidates for the graduate program will be chosen by the department Graduate Committee constituted for the purpose of overseeing this program and will be evaluated on the basis of interviews, talent for research as demonstrated by past performance, letters of recommendation, grades, and GRE scores. Funds are requested for five years to support 12 predoctoral trainees per year. |
1 |
2016 | Dicarlo, James J | R01 |
The Role of Inferior Temporal Cortex in Core Visual Object Recognition @ Massachusetts Institute of Technology Our goal is to understand how the brain accomplishes visual object recognition. Evidence obtained under this grant and from other labs suggests that each visual image is processed along the ventral visual cortical processing stream into a new pattern of neural activity at its top level -- the inferior temporal cortex (IT) -- that conveys explicit information about object identity, even in the face of substantial view uncertainty ("invariance"). That IT population representation is thought to be causally responsible for object recognition. But precisely how does the IT population account for a seemingly infinite number of object discriminations? What are the behaviorally critical "features" conveyed by IT? How many? How can they be described? Here we aim to build and test image-to-IT-to-behavior models that are predictively accurate over the entire domain of core visual object recognition behavior. Substantial prior work argues that we should start by testing and developing the IT 100.1f model family: all models in that family state that IT conveys ~100, image-computable "features" in its activity sampled at ~1 mm scale. How can we test and develop such models? First, this model family predicts that we can build and provide a single, low dimensional (<100) Euclidean embedding space to predict all basic and subordinate level object discrimination tasks (Aim 1). Second, the model family predicts that we can discover the particular aspects of IT activity (called IT "features") as those that, when weighted and summed, exactly predict behavioral object confusion of every image (Aim 2a). Third, the model family predicts that temporary suppression of individual, mm-scale portions of IT cortex will produce reliable, predictable patterns of behavioral disruption across all basic-level and subordinate level object tasks (Aim 3). 
Fourth, the model family posits that differences in IT neural tuning functions at spatial scales less than ~1 mm are irrelevant for core object discrimination behavior -- a prediction we will test with both recording (Aim 2a) and neural perturbation (Aim 3) experiments. Finally, the model family motivates our goal (Aim 2b) of characterizing the complete set of ~100 IT features with image-computable functions and with human shape adjectives. While substantial preliminary data support these predictions and goals, a complete model has not yet been built or tested. If these aims are accomplished, this work would transform our understanding by showing precisely how core object recognition is causally accounted for at the level of IT cortex, and by providing a model that would accurately predict how any image manipulation or direct IT neural intervention would alter any core object recognition behavior. |
1 |
2017–2018 | Dicarlo, James J | R21 |
Post-Natal Development of High-Level Visual Representation in Primates @ Massachusetts Institute of Technology View-invariant object recognition is a complex cognitive task that is critical to everyday functioning. A key neural correlate of high-level object recognition is inferior temporal (IT) cortex, a brain area present in both humans and non-human primates. Recent advances in visual systems neuroscience have begun to uncover how images are encoded in the adult IT object representation; however, the learning rules by which high level visual areas (especially IT) develop remain mysterious, with both the magnitude and qualitative nature of developmental changes remaining almost completely unknown -- in part because, over the last thirty years, there have been practically no studies of spiking neural responses in the higher ventral cortical areas of developing primates. There is thus a significant gap in our understanding of how visual development proceeds. This exploratory proposal aims to characterize how representation in higher primate visual cortex changes during development. We first aim (Aim 1) to implant chronic electrode arrays to record hundreds of IT neuronal sites in response to thousands of image stimuli in awake behaving juvenile macaques. These data will comprise a snapshot of the developing primate visual representation, and will be particularly powerful because we have already extensively measured adult monkey IT using the same stimuli and methods. By comparing juvenile and adult neuronal responses at both single site and population levels, we will obtain an unprecedentedly large-scale and detailed picture of the neural correlates of high-level visual development (Aim 2). Aims 1 and 2 are exploratory, but potentially transformative -- 
they will result in publicly available neuronal IT development benchmarks against which any proposed model of high level visual development can be rigorously tested, and will spur the development of those models in our lab and others. In that context, we will also seek (Aim 3) to improve known semi- and un-supervised learning rules from the computer vision and computational neuroscience literature, and to compare them to both recent high-performing (but biologically implausible) supervised models as well as to the rich developmental measurements obtained in Aims 1 and 2. Establishing experimental and surgical procedures for juvenile array recordings will create the future opportunity to observe changes in high level neural visual representations while experience is manipulated in early development, and will enable experiments in other sensory, motor, or decision making domains. If successful, the proposed work will yield a deeper understanding of the principles underlying visual cortex development, understanding which will in turn be helpful for treating neurodevelopmental disorders that implicate cortical circuits, including amblyopia and autism. |
1 |
2019–2021 | Dicarlo, James J | T32 |
Computationally Enabled Integrative Neuroscience @ Massachusetts Institute of Technology The Brain and Cognitive Sciences (BCS) Graduate Program at the Massachusetts Institute of Technology (MIT) proposes the "Computationally-Enabled Integrative Neuroscience" (CEIN) predoctoral training program. Tremendous advances in the field of neuroscience are beginning to enable an integrated understanding of the brain. A convergence of new tools and methods, from optogenetics to CLARITY to CRISPR to deep neural network modeling and machine learning, is providing the means to address problems that once seemed intractable. MIT's BCS department is ideally organized to promote interdisciplinary, integrative training and research in neuroscience and behavior, combining the empirical power of modern molecular, cellular, systems and behavioral methods, with the theoretical and model-building strength of computational neuroscience and artificial intelligence. With world-renowned faculty and access to state-of-the-art equipment, CEIN trainees will be poised to lead the next generation of basic and translational neuroscience. The proposed CEIN training program maintains BCS's longstanding strength of integrative training across levels of empirical analysis. In addition, the proposed program reflects significant evolution in our field: the increased importance of computation in both data analysis and complex model building, and the increased importance of professional skills for leadership. The CEIN training objectives are focused on three training pillars: (1) advancing empirical methods and concepts at multiple levels of neuroscience, (2) computational approaches to theory development and brain data analysis, and (3) professional skills such as grant-writing, oral presentations, and clinical connection. 
Supported by MIT's world-class facilities, resources, and faculty, predoctoral students will achieve these goals through comprehensive coursework, new modules for professional skills development, mentorship by experts in the field, and advanced research experience. The research of CEIN trainees will lead to profound new discoveries about brain function in health and its modes of failure in disease. Insights from CEIN laboratories will impact diagnosis and treatment of Alzheimer's Disease, Autism Spectrum Disorders, dyslexia, hearing loss, and many other disorders with increasing impact on health in the United States. The CEIN program is focused on training students in their first two years of graduate school. Funds are requested for five years to support 11 predoctoral trainees per year. The CEIN training program would be the only foundational neuroscience predoctoral training program at MIT. |
1 |
2019–2021 | Dicarlo, James J | P30 |
Core-Vision Processes: Administration Core @ Massachusetts Institute of Technology ADMINISTRATIVE CORE: Project Summary The Administrative Core ensures the three service Cores (Machine Core, Electronics Core and Imaging Core) are efficiently run, coordinated, communicated, and evolved as needed to meet the overall goals of the NEI Core grant. Most importantly, the Administrative Core supervises, mentors, and administratively supports the personnel in the service Cores to keep the Core personnel working toward the specific aims of each of the service Cores and provides opportunities for additional training of those personnel as needed for those aims. The Administrative Core ensures the availability and equitable use of the service Cores and develops and deploys software to assist in this effort. The Administrative Core upholds a culture of transparency and collaboration by communicating the ever-evolving Core services to keep all investigators up to date with the technologies and capabilities of each Core, communicating the projects in the service Cores, and facilitating dissemination of designs and know how. Finally, the Administrative Core tracks changes in research needs of the Core NEI Investigators and adapts the Cores to meet those ever-evolving needs. Because these efforts keep the Cores focused on their overall mission, all of the proposed activities were previously performed on an ad hoc basis. However, the formal addition of the Administrative Core in this renewal application will enhance these activities and thus enhance the impact of the overall Core. |
1 |
2019–2021 | Dicarlo, James J | P30 |
@ Massachusetts Institute of Technology MACHINE CORE: Project Summary The onsite Machine Core is critical to the research productivity of the NEI Core Investigators and the surrounding research community. The Machine Core designs and fabricates novel research apparatuses that cannot be easily obtained by other means. The Core has evolved to have modern capabilities in computer-aided design (CAD), 5-axis fabrication, and 3D printing, as well as significant expertise in (e.g.) using 3D magnetic resonance imaging data to customize established base designs (shared by all labs) into custom designs for each lab's needs and each specific animal subject. The Machine Core acts as a physical and intellectual meeting ground to migrate apparatus design and use knowledge from one lab to another. It is very adept at detecting synergistic research needs across labs and leveraging those for efficiency and collaboration. As our very active research community frequently needs repair or modification of equipment, the Machine Core has proven critical to avoiding disruptions or delays. One cannot overstate the importance of this on-site service -- it is practically impossible for each lab to keep redundant copies of all the equipment in their lab. The Machine Core plays an important role training our principal investigators, graduate students and postdocs to gain significant "hands-on" interaction that cannot occur remotely. These experiences inculcate researchers in computer aided design (CAD), design trade-offs (e.g. materials choices, tolerances), and the advantages and limitations of various fabrication methods, providing them confidence to try developing innovative methodologies. |
1 |
2021 — 2024 | Kanwisher, Nancy; Tenenbaum, Joshua (co-PI); Dicarlo, James |
N/A Activity Code Description: No activity code was retrieved: click on the grant title for more information |
@ Massachusetts Institute of Technology The last ten years have witnessed an astonishing revolution in AI, with deep neural networks suddenly approaching human-level performance on problems like recognizing objects in an image and words in an audio recording. But impressive as these feats are, they fall far short of human-like intelligence. The critical gap between current AI and human intelligence is that, beyond just classifying patterns of input, humans build mental models of the world. This project begins with the problem of physical scene understanding: how one extracts not just the identities and locations of objects in the visual world, but also the physical properties of those objects, their positions and velocities, their relationships to each other, the forces acting upon them, and the effects of forces that could be exerted on them. It is hypothesized that humans represent this information in a structured mental model of the physical world, and use that model to predict what will happen next, much as the physics engine in a video game generates physically plausible future states of virtual worlds. To test this idea, computational models of physical scene understanding will be built and tested for their ability to predict future states of the physical world in a variety of scenarios. Performance of these models will then be compared to humans and to more traditional deep network models, both in terms of their accuracy on each task, and their patterns of errors. Computational models that incorporate structured representations of the physical world will then be tested against standard convolutional neural networks in their ability to explain neural responses of the human brain (using fMRI) and the monkey brain (using direct neural recording). These computational models will provide the first explicit theories of how physical scene understanding might work in the human brain, at the same time advancing the ability of AI systems to solve the same problems. 
Because the ability to understand and predict the physical world is essential for planning any action, this work is expected to help advance many technologies that require such planning, from robotics to self-driving cars to brain-machine interfaces. Each of the participating labs will also expand their established track records of recruiting, training, and mentoring women and under-represented minorities at the undergraduate, graduate, and postdoctoral levels. Finally, the collaborating laboratories will continue and increase their involvement in the dissemination of science to the general public, via public talks, web sites, and outreach activities. |
0.915 |