2003
Oliva, Aude
R03. Activity Code Description: To provide research support specifically limited in time and amount for studies in categorical program areas. Small grants provide flexibility for initiating studies, which are generally preliminary, short-term projects and are non-renewable.
Perception and Categorization of Real World Scenes @ Michigan State University
DESCRIPTION (provided by applicant): The application addresses a fundamental question in visual cognition: how are human observers able to recognize the semantics of a complex real-world scene image at a glance? Since the seminal work of Mary Potter, a number of experimental studies have demonstrated that we identify a surprising amount of information from a single glance at a scene. We can recognize its semantic category (e.g., a street), some objects and regions (e.g., a red car on the left) and other characteristics of the space that the scene subtends in the real world (e.g., perspective). This information constitutes the "gist" of the scene and can be identified as quickly and as accurately as a single object. The principal aim of this project is to define the perceptual content of the image information acquired during a glance at scene photographs. In this application, we consider the case in which the image is conceptualized in short-term memory. We aim to develop an experimental paradigm that allows the comparison of the quantity of information common to pairs of scene images. The research program introduces an innovative image similarity measure that defines the exact quantity of spatial and spectral components common to images that share the same semantic category. The results of the proposed research program will provide researchers in visual cognition with knowledge about the quantity of image information that adult human observers, on average, see and remember during a brief exposure to a novel picture. The research should demonstrate that the quantity of information varies with the task the observer has to perform. More precisely, the study aims to explore the information that human observers may use when recognizing a scene at different levels of abstraction (its superordinate, basic, and subordinate levels of description).
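The abstract does not spell out the similarity measure itself. The following is a minimal illustrative sketch, assuming a measure that combines the correlation of global amplitude spectra (spectral components) with the correlation of coarse grid-averaged luminance layouts (spatial components); the function name, grid size, and equal weighting are assumptions for illustration, not the measure proposed in the project.

```python
# Hypothetical sketch of a spatial/spectral image-similarity measure.
# Not the measure proposed in the grant; it only illustrates quantifying
# the spectral and spatial components shared by two images.
import numpy as np

def spectral_spatial_similarity(img_a, img_b, grid=4):
    """Similarity in [0, 1] for two same-sized grayscale images,
    combining a spectral term and a coarse spatial-layout term."""
    # Spectral term: correlation of log amplitude spectra (global frequency content).
    amp_a = np.log1p(np.abs(np.fft.fft2(img_a)))
    amp_b = np.log1p(np.abs(np.fft.fft2(img_b)))
    spectral = np.corrcoef(amp_a.ravel(), amp_b.ravel())[0, 1]

    # Spatial term: correlation of grid-averaged luminance layouts.
    def coarse_layout(img):
        h, w = img.shape
        hs, ws = h // grid, w // grid
        return np.array([[img[i*hs:(i+1)*hs, j*ws:(j+1)*ws].mean()
                          for j in range(grid)] for i in range(grid)])
    spatial = np.corrcoef(coarse_layout(img_a).ravel(),
                          coarse_layout(img_b).ravel())[0, 1]

    # Map each correlation from [-1, 1] to [0, 1] and weight the two terms
    # equally (an arbitrary illustrative choice).
    return 0.5 * (spectral + 1) / 2 + 0.5 * (spatial + 1) / 2
```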
2006–2011
Oliva, Aude
N/A. Activity Code Description: No activity code was retrieved.
CAREER: Categorization and Identification of Visual Scenes @ Massachusetts Institute of Technology
CAREER: Categorization and Identification of Visual Scenes. PI: Aude Oliva.
One remarkable aspect of visual recognition is that humans are able to recognize the meaning (or "gist") of complex visual scenes within 1/20 of a second, independently of the number of objects in the scene. This phenomenon of rapid understanding can be experienced while looking at rapid sequences in television advertisements and quick cuts in modern movie trailers. How is this remarkable feat accomplished? Research over the last decade has made substantial progress toward understanding the mechanisms underlying single object recognition, but less progress has been made toward understanding scene recognition. For example, computer systems fall well short of human performance in tasks that require recognizing the gist of a scene. Dr. Aude Oliva has undertaken a novel approach to this challenging question by studying mechanisms of analysis that are global in nature, focusing on statistically robust features describing the spatial layout of the scene (e.g., its volume, its perspective, its level of clutter) rather than merely its components (e.g., the objects in a scene). With National Science Foundation support, Dr. Oliva will conduct a five-year CAREER award study to examine how a global approach to image analysis can explain humans' remarkable ability to recognize scenes and objects. Moreover, she will use this approach to define operational strategies for machine vision systems. This program of research will combine a number of methodologies, including behavioral experiments (psychophysics, eye tracking), cognitive neuroscience methods (event-related potentials), and computational modeling. Applications of this work might include scene and space recognition systems to assist drivers, automatic systems that could provide semantic descriptions of the contents of large image databases, and computer-assisted systems to aid the visually impaired in navigating through visual space. The educational mission proposed by Dr. Oliva includes laboratory training of graduate and undergraduate students in the cognitive and computational methods of scene understanding, as well as a new course on computational visual cognition, and a winter tutorial together with an annual symposium, both on scene understanding.
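As a rough illustration of what a "global" spatial-layout descriptor might look like, the sketch below pools oriented gradient energy over a coarse spatial grid. It is a simplified stand-in under stated assumptions (grid size, four orientation bands, plain gradient filters), not the actual model developed in this project.

```python
# Illustrative sketch of a global scene descriptor in the spirit of
# spatial-layout features: oriented gradient energy pooled over a coarse grid.
import numpy as np

def global_layout_descriptor(img, grid=4, n_orientations=4):
    """Coarse layout descriptor for a grayscale image:
    per-cell energy in a few gradient orientation bands."""
    gy, gx = np.gradient(img.astype(float))
    magnitude = np.hypot(gx, gy)
    angle = np.mod(np.arctan2(gy, gx), np.pi)           # orientations in [0, pi)
    bins = np.floor(angle / (np.pi / n_orientations)).astype(int)
    bins = np.clip(bins, 0, n_orientations - 1)

    h, w = img.shape
    hs, ws = h // grid, w // grid
    descriptor = np.zeros((grid, grid, n_orientations))
    for i in range(grid):
        for j in range(grid):
            cell_mag = magnitude[i*hs:(i+1)*hs, j*ws:(j+1)*ws]
            cell_bin = bins[i*hs:(i+1)*hs, j*ws:(j+1)*ws]
            for o in range(n_orientations):
                descriptor[i, j, o] = cell_mag[cell_bin == o].sum()
    return descriptor.ravel()                           # one vector per image
```

Descriptors of this kind can then be compared across images or fed to a classifier for scene categorization.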
2010–2014
Oliva, Aude
N/A. Activity Code Description: No activity code was retrieved.
RI: Small: Hierarchical Visual Scene Understanding @ Massachusetts Institute of Technology
Intelligent systems, both artificial and biological, must find effective ways to organize a complex visual world. The cross-disciplinary field of scene understanding is in need of a comprehensive framework in which to integrate cognitive, computational and neural approaches to the organization of knowledge.
This research program aims to create a framework for organizing knowledge of the visual environments that human and artificial systems encounter when navigating in the world or browsing visual databases. The aim is to determine which taxonomies are best suited for solving different visual tasks, and to use computer vision algorithms to organize visual environments as humans do. For example, semantic relationships between scenes are well captured by a hierarchical tree (e.g., a basilica is a type of church, which is a type of building), but functional similarities between different environments may be best represented as clusters (e.g., restaurants, kitchens and picnic areas clustered as places to eat; offices and internet cafés as places to work).
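A toy sketch of this contrast, assuming random placeholder feature vectors for a handful of scene categories, is given below: an agglomerative tree stands in for a taxonomy, and flat k-means clusters stand in for function-based groupings. The category names, cluster counts, and use of scipy/scikit-learn are illustrative assumptions only.

```python
# Illustrative contrast between two organizations of scene categories:
# a hierarchical tree versus flat clusters. Features are random placeholders
# standing in for real scene descriptors.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
categories = ["basilica", "church", "office", "kitchen", "restaurant", "street"]
features = rng.normal(size=(len(categories), 32))    # placeholder descriptors

# Taxonomy-like organization: an agglomerative tree over the categories.
tree = linkage(features, method="average")
tree_labels = fcluster(tree, t=3, criterion="maxclust")

# Function-like organization: flat clusters (e.g., "places to eat/work").
flat_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)

for name, t_lab, f_lab in zip(categories, tree_labels, flat_labels):
    print(f"{name:12s} tree-cluster={t_lab} flat-cluster={f_lab}")
```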
Because hierarchies and taxonomies provide a way of formalizing many types of contextual information (spatial, temporal, and semantic), they can be used to enhance the performance of computer vision systems at object and scene recognition, and aid in the development of smarter image search algorithms.
Besides serving as a unified benchmark for comparing different models and theories, this enterprise offers new teaching and applied tools for research and courses, which will be made available through websites and symposia.
2010–2012
Oliva, Aude
N/A. Activity Code Description: No activity code was retrieved.
Workshop: Frontiers in Computer Vision @ Massachusetts Institute of Technology
Computer vision started with the goal of building machines that can see like humans. Nowadays, computer vision has expanded to numerous applications such as image database search on the World Wide Web, computational photography, reconstruction of three-dimensional scenes, surveillance, assistive systems, vision for graphics, and nanotechnology. New domains and applications continue to arise as computer vision technology develops.
The goal of the Frontiers in Computer Vision workshop is to bring together national and international experts, from academia and industry, to identify the future impact of computer vision on the economic, social, educational and security needs of the nation, and to outline the scientific and technological challenges to be addressed: How can computer vision build on the success and enthusiasm of its growing community of participants? How can the academic community make connections to industry? How can the field better foster scholarship and improve communication, both within computer vision itself and with related disciplines and application areas? How can computer vision best interact with related fields? How can the importance and promise of computer vision be communicated to the general public?
The deliverables of the event include, among others, videos of the presentations, available on the Frontiers in Computer Vision website, together with a roadmap outlining the scientific and technological challenges to address.
2011–2012
Oliva, Aude
N/A. Activity Code Description: No activity code was retrieved.
NSF-ANR Workshop: US-French Collaboration in Computational Neuroscience @ Massachusetts Institute of Technology
US-French Collaboration in Computational Neuroscience, Paris, November 29-30, 2011
This award supports a US-French workshop, led by Aude Oliva and Alain Destexhe, on binational collaboration in computational neuroscience. The workshop builds on the interest of NSF, the Agence Nationale de la Recherche (ANR), and other agencies in collaborative research in this rapidly developing field. The workshop will explore the intellectual opportunities, the educational and economically relevant impacts, and the practical considerations needed for US-French collaboration to be successful. It will be attended by US and French researchers, as well as representatives from US and French funding organizations. A report from the workshop will be made available at http://www.nsf.gov/crcns.
2011–2015
Oliva, Aude
R01. Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies.
The Gist of the Space: A Space-Centered Approach to Visual Scene Perception @ Massachusetts Institute of Technology
Project Summary: Vision is central to our interactions with the world. Aside from recognizing faces and communicating with people, our daily activities are also organized around two fundamental tasks: recognizing our environment and navigating through it. The research program of Dr. Aude Oliva constitutes a new integration of behavioral, computational and cognitive neuroscience research on scene perception. A growing body of evidence from behavioral, imaging and computational investigations has shown that the perception of complex real-world scenes engages cognitive and neural mechanisms distinct from those engaged in object recognition. To date, however, this evidence has not resulted in a comprehensive framework for understanding scene processing. Here, the PI proposes to test the novel hypothesis that real-world scene analysis is performed in a network of distinctive brain regions, with each region specialized in representing a different level of scene information. Since scenes are inherently three-dimensional spaces, she will show that the brain capitalizes on information uniquely derived from the space encompassed by a scene, rather than on an exclusively object-based description. In other words, before knowing the gist of a scene, we analyze the gist of the space. Understanding the nature of the brain's representations of visual scenes is an enterprise that will push the development of fast and reliable rehabilitation strategies for individuals with visual and spatial impairments, and advance aid-based systems that rely on an understanding of visual space. Real-world scene recognition is an unsolved problem whose solution will have implications for neuroscience, computational vision, artificial intelligence, robotics and psychology.
2015–2018
Oliva, Aude; Torralba, Antonio (co-PI); Pantazis, Dimitrios (co-PI)
N/A. Activity Code Description: No activity code was retrieved.
NCS-FO: Algorithmically Explicit Neural Representation of Visual Memorability @ Massachusetts Institute of Technology
As Lewis Carroll famously wrote in Through the Looking-Glass, "It's a poor sort of memory that only works backwards." On this side of the mirror, we cannot remember visual events before they happen; however, our work will help predict what people will remember as they see an image or an event. Our team of investigators in cognitive science, human neuroscience and computer vision brings the synergistic expertise needed to determine how visual memories are encoded in the human brain at millisecond and millimeter resolution. Cognitive-level algorithms of memory would be a game changer for society, with applications ranging from accurate diagnostic tools to human-computer interfaces that foresee the needs of their users and compensate when cognition fails.
The project capitalizes on the spatiotemporal dynamics of memory encoding while providing a computational framework for determining the representations formed from perception to memory at the scale of the whole human brain. A fundamental function of cognition is the encoding of information, a dynamic and complex process underlying much of our successful interaction with the external environment. Here, we propose to combine three technologies to predict what makes an image memorable or forgettable: neuroimaging technologies that record where encoding happens in the human brain (spatial scale) and when it happens (temporal scale), and computational modeling of what types of computation are performed at the different stages of storage (computational scale). Characterizing the spatiotemporal dynamics of visual memorability, and determining the type of computation and representation a successful memorability system performs, is a crucial endeavor for both basic and applied science.
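On the computational side, a minimal sketch of memorability prediction, assuming synthetic features and scores in place of the project's actual data and models, would regress per-image memorability on image features and evaluate the prediction with a rank correlation:

```python
# Minimal sketch of memorability prediction: regress per-image memorability
# scores on image features and evaluate with Spearman rank correlation.
# Features and scores are synthetic placeholders, not the project's data.
import numpy as np
from scipy.stats import spearmanr
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 64))                  # placeholder image features
w = rng.normal(size=64)
y = X @ w + rng.normal(scale=0.5, size=500)     # placeholder memorability scores

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = Ridge(alpha=1.0).fit(X_tr, y_tr)
rho, _ = spearmanr(model.predict(X_te), y_te)
print(f"Spearman rank correlation on held-out images: {rho:.2f}")
```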