1990 — 1992
Yuille, Alan; Mumford, David (co-PI)
Mathematical Sciences: Feature Detection and Representation of Faces Using Deformable Templates
This one-year grant will permit Dr. Yuille and one student to extend their preliminary work in representation and visual recognition of human faces. Their current system can locate strong features such as eyes and mouths by using parametric adaptation of deformable templates. They are now applying robust statistical methods to identification of partially occluded or unresolved features, and will also study pyramid algorithms for matching a global face template. The goal is robust face recognition in the presence of noise and distractions.
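As a rough illustration of the deformable-template idea (a sketch of our own, not the project's actual system), the following fragment fits a circular "iris" template to a synthetic image by adapting the template parameters to minimize an intensity-based energy; all names and values are invented for the example.

```python
# Illustrative sketch only: parametric adaptation of a circular template,
# in the spirit of deformable-template feature detection.
import numpy as np
from scipy.optimize import minimize

# Synthetic 64x64 image: bright background with a dark disk (the "iris").
yy, xx = np.mgrid[0:64, 0:64]
image = np.ones((64, 64))
image[(xx - 40) ** 2 + (yy - 28) ** 2 < 8 ** 2] = 0.0
image += 0.05 * np.random.default_rng(0).standard_normal(image.shape)

def energy(params):
    """Energy = mean intensity inside the circle (a dark iris gives low energy)."""
    cx, cy, r = params
    inside = (xx - cx) ** 2 + (yy - cy) ** 2 < r ** 2
    if inside.sum() < 10:              # degenerate template: penalize
        return 1.0
    return image[inside].mean() + 0.001 * r   # weak prior against large radii

# Start from a rough guess and let the template parameters adapt.
result = minimize(energy, x0=[32.0, 32.0, 6.0], method="Nelder-Mead")
print("fitted (cx, cy, r):", result.x)       # should approach (40, 28, 8)
```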
1991 — 1993
Yuille, Alan
Parallel Image Smoothing and Segmentation Algorithms Appropriate For VLSI Implementation
This research focuses on two main themes: (i) the unification of image segmentation methods in a framework based on scale space and differential equations, and (ii) texture segmentation using Gabor filters and energy functions with discontinuities. It is intended that the resulting techniques will ultimately be suitable for VLSI implementation. The goal of (i) is to provide a mathematical framework within which many current approaches to image segmentation can be compared. The goal of (ii) is to develop a model for texture segmentation using the energy function approach with Gabor filters. Interactions between filters will allow the system to deal with slowly varying changes in the orientation of textures, such as occur in foreshortening.
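The following toy sketch (our own construction, with invented parameters) shows the basic ingredient of theme (ii): oriented Gabor filters whose squared, locally smoothed responses give a per-orientation texture energy that can be compared pixel by pixel.

```python
# Illustrative sketch: oriented texture "energy" from a tiny Gabor filter bank.
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(theta, wavelength=4.0, sigma=2.5, size=15):
    half = size // 2
    ky, kx = np.mgrid[-half:half + 1, -half:half + 1]
    xr = kx * np.cos(theta) + ky * np.sin(theta)    # rotated coordinate
    envelope = np.exp(-(kx**2 + ky**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * xr / wavelength)

# Synthetic two-texture image: vertical stripes left, horizontal stripes right.
y, x = np.mgrid[0:64, 0:64]
image = np.where(x < 32, np.sin(2 * np.pi * x / 4.0), np.sin(2 * np.pi * y / 4.0))

# Texture energy per orientation = squared filter response, locally smoothed.
energies = []
for theta in (0.0, np.pi / 2):
    resp = convolve2d(image, gabor_kernel(theta), mode="same", boundary="symm")
    energies.append(convolve2d(resp**2, np.ones((7, 7)) / 49.0, mode="same"))

labels = np.argmax(np.stack(energies), axis=0)      # crude per-pixel segmentation
print("left half mostly label 0:", (labels[:, :28] == 0).mean() > 0.9)
```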
1993 — 1997
Yuille, Alan; Clark, James; Gold, Albert
Data Fusion and Trajectory Control of Active Vision Systems
This is the first year of a three-year continuing award. The research aims to develop a principled methodology with which to specify motions of an active observer, commonly referred to as the "next look" problem. The work will provide algorithms and techniques that can be used in mobile robot systems to control the robot's motion and to efficiently acquire sensory information about the robot's environment. The work consists of three primary areas of research. The first involves the development, and implementation in a real-time robotic system, of active vision algorithms for extraction of 3D environmental information. This includes structure from controlled camera motion, and structure from controlled illuminant motion. The second aspect of the work involves the development of a Bayesian data integration technique which will allow the 3D information acquired by multiple active vision modules over time to be integrated into a dynamic world or robot environment map. The third aspect is the development of a methodology for determining optimal and sub-optimal trajectories of active observers, based on the criteria of minimizing the time and computation required to reduce a global measure of uncertainty to below a set threshold.
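The textbook special case of such Bayesian integration, fusing independent Gaussian estimates by inverse-variance weighting, can be sketched as follows; the three modules and all numbers are hypothetical.

```python
# Minimal sketch, under simplifying assumptions: fuse independent Gaussian
# depth estimates from several active-vision modules into one posterior.
import numpy as np

def fuse_gaussian(means, variances):
    """Product of independent Gaussian likelihoods (flat prior)."""
    w = 1.0 / np.asarray(variances)            # inverse-variance weights
    var_post = 1.0 / w.sum()
    mean_post = var_post * (w * np.asarray(means)).sum()
    return mean_post, var_post

# Depth of one scene point from stereo, controlled camera motion, and
# controlled illuminant motion (values invented for illustration).
mean, var = fuse_gaussian([2.10, 1.95, 2.30], [0.04, 0.01, 0.09])
print(f"fused depth: {mean:.3f} m, variance {var:.4f}")
# The fused variance is below every input variance, which is what lets an
# active observer drive global uncertainty beneath a set threshold over time.
```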
1994 — 1996
Yuille, Alan
Deformable Templates For Face Description, Recognition, Interpretation, and Learning
This is the first year of funding of a three-year continuing award (NSF 9317670) to investigate the problems of robust face recognition under lighting, albedo, geometry, and viewpoint variations by formulating new modeling and matching algorithms. The approach consists of deriving sets of models which, when interpolated, serve to handle wide variations in viewpoint. The model used consists of two parts: first, a geometric model related to spatial variations and individual and expression changes, and second, an imaging model, which handles variations in lighting and albedo. The research involves the following four goals: 1) the design, testing, and implementation of algorithms to match the model described against frontally-viewed faces; 2) the extension of deformable template techniques to tracking faces and their features and determining an interpretative mapping in terms of expressions; 3) the development of probabilistic methodologies for first learning from a database the prior distributions used by deformable templates, and later a learning method to determine the maximum a posteriori estimate of the model; and 4) the generalization of the model to detect multiple faces in complex visual scenes and perform robustly even in the presence of outliers and occluders.
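A minimal sketch of goal 3), assuming a linear-Gaussian observation model of our own choosing: a Gaussian prior over template parameters is "learned" from samples, and the maximum a posteriori estimate then follows in closed form.

```python
# Hedged sketch: MAP estimation of deformable-template parameters under a
# Gaussian prior learned from a (synthetic) face database.
import numpy as np

rng = np.random.default_rng(1)
# "Learned" prior: mean and covariance of template parameters over a database.
train = rng.normal([10.0, 4.0, 1.5], [1.0, 0.5, 0.2], size=(200, 3))
mu, cov = train.mean(axis=0), np.cov(train.T)

# Stand-in for image evidence: data = H @ params + Gaussian noise, so the
# MAP estimate has a closed form (posterior precision = prior + likelihood).
H = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]])
noise_var = 0.1
data = np.array([11.2, 5.9])

prec_prior = np.linalg.inv(cov)
prec_post = prec_prior + H.T @ H / noise_var
map_est = np.linalg.solve(prec_post, prec_prior @ mu + H.T @ data / noise_var)
print("MAP template parameters:", map_est)
```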
1998 — 2001
Yuille, Alan
Automated Detection of Informational Signs and Hazardous Objects: Visual Aids For the Blind @ Smith-Kettlewell Eye Research Foundation
This research will develop a framework for the rapid detection, location, and identification of visual targets in unconstrained real-world domains. This framework will lead to algorithms which can be implemented on portable computers with video input, with the goal of being used, for example, to enable the blind and visually impaired to navigate in real-world scenes. These requirements mean that the algorithms must be extremely efficient at extracting information from the input images. The approach will use statistical analysis of the targets and background, taking into account variations due to illumination and viewpoint, to determine probabilistic models for the appearance of the target and background. From these models, sets of tests and groups of tests will be determined. These tests will be designed to be maximally informative, based on statistical measures of error rates such as Chernoff Information, and to lead to fast implementations on portable PCs. The search strategy is based on the intuition of picking tests which maximize the expected gain in information about the target hypothesis. In practical problems, however, it will not always be possible to compute these expected information gains in real time. Therefore the search strategy will make use of a more general formulation in terms of A* algorithms, where the search is guided by heuristics tailored to the application domain. Expected information is one possible heuristic, but there are many others which are more easily computable and which can still give provable convergence to the optimal solution.
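A small sketch of the test-selection idea, assuming discrete response histograms (the distributions below are invented): Chernoff information ranks candidate tests by how well their responses separate target from background.

```python
# Illustrative sketch: rank candidate tests by Chernoff information between
# their response distributions under "target" vs. "background" hypotheses.
import numpy as np

def chernoff_information(p, q, grid=np.linspace(0.01, 0.99, 99)):
    """C(P,Q) = -min_l log sum_x p(x)^l q(x)^(1-l); larger = more informative."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    vals = [np.log(np.sum(p**l * q**(1 - l))) for l in grid]
    return -min(vals)

# Hypothetical response histograms of three candidate filters.
tests = {
    "edge filter":   ([0.7, 0.2, 0.1], [0.2, 0.3, 0.5]),
    "color filter":  ([0.4, 0.4, 0.2], [0.3, 0.4, 0.3]),
    "corner filter": ([0.6, 0.3, 0.1], [0.1, 0.2, 0.7]),
}
ranked = sorted(tests, key=lambda t: -chernoff_information(*tests[t]))
print("most informative first:", ranked)
```

Chernoff information is the asymptotic error exponent of the optimal Bayesian test, which is why it is a natural score for choosing maximally informative tests.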
2002 — 2004
Yuille, Alan L
Activity Code R01: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies.
Locating and Reading Informational Signs @ University of California Los Angeles
DESCRIPTION (provided by applicant): Our goal is to construct computer vision systems to enable the blind and severely visually impaired to detect and read informational text in city scenes. The informational text can be street signs, bus numbers, hospital signs, supermarket signs, and names of products (e.g., Kellogg's cornflakes). We will construct portable prototype computer vision systems implemented by digital cameras attached to personal, or hand-held, computers. The camera need only be pointed in the general direction of the text (so the text may occupy only one percent of the image). A speech synthesizer will read the text to the user. Blind and visually impaired users will test the device in the field (under supervision) and give feedback to improve the algorithms. We argue that this work will make a significant contribution to improving human health (rehabilitation). Computer vision is a rapidly maturing technology with immense potential to help the blind and visually impaired. Reports suggest that detecting and reading informational text is one of the main unsatisfied desires of these groups. Written signs and information in the environment are used for navigation, shopping, operating equipment, identifying buses, and many other purposes (to which a blind person does not otherwise have independent access). The blind and severely visually impaired make up a large fraction of the US population (3 million). Moreover, this proportion is expected to increase by a factor of two in the next ten years due to increased life expectancy. Our proposal is design-driven. It uses a new class of computer vision algorithms known as Data-Driven Markov Chain Monte Carlo (DDMCMC). The algorithms are used to: (i) search for text, and (ii) read it. Recent developments in digital cameras and portable/handheld computers make it practical to implement these algorithms in portable prototype systems. The three scientists in this proposal have the necessary expertise to accomplish it. Drs. Yuille and Zhu have backgrounds in computer vision, and Dr. Brabyn has experience in developing and testing engineering systems to help the blind and visually impaired. Our proposal falls within the scope of the Bioengineering initiative because we are applying techniques from the mathematical/engineering sciences to develop informatic approaches for patient rehabilitation. More specifically, our work will facilitate the development of portable devices to help the blind and visually impaired.
2003 — 2004
Yuille, Alan; Zhu, Song-Chun (co-PI)
SGER: Stochastic Algorithms For Visual Search and Recognition @ University of California-Los Angeles
SGER Proposal 0240148: DDMCMC Algorithms for Rapid Search and Detection. PI: A. L. Yuille, University of California, Los Angeles.
The proposed study is to design and implement a rapid visual search algorithm called DDMCMC, which uses data-driven feedforward visual cues to drive a Monte Carlo algorithm. This approach combines the speed of feedforward algorithms with the high-quality performance of top-down model instantiation algorithms. DDMCMC was developed for the image segmentation problem by the co-PI and was demonstrated to give fast and very accurate segmentation results on large datasets (with ground truth). DDMCMC is generalized to the search and detection task by using techniques such as AdaBoost learning to determine effective feedforward algorithms to drive the MCMC algorithm.
The broader impact is to develop computer vision algorithms to help the blind and visually impaired (though we will also use this grant to train graduate students). To achieve this, a pilot study of the DDMCMC algorithm will be developed for the specific task of searching for, and then reading, informational signs (the results can be communicated to a visually impaired user by a speech synthesizer). This study will be aided by researchers at the Smith-Kettlewell Eye Research Institute (SKERI), who include two blind engineers. The proposed algorithms will be designed and tested on image datasets taken by blind volunteers using head- or body-mounted cameras (to ensure the realism of our approach). Researchers at SKERI will also give feedback on the practicality of the approach and how it compares with alternative technologies for this task (none of which use computer vision). Computer vision has enormous potential to help the visually disabled, provided fast and effective algorithms are developed.
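The core DDMCMC idea, bottom-up cues driving top-down inference, can be sketched with an independence Metropolis-Hastings sampler whose proposal comes from a detector's scores. The one-dimensional "sign location" problem and all distributions below are invented for illustration.

```python
# Sketch under stated assumptions: data-driven proposals inside MCMC.
import numpy as np

rng = np.random.default_rng(0)
positions = np.arange(100)                    # candidate sign locations
posterior = np.exp(-0.5 * ((positions - 62) / 3.0) ** 2)   # top-down model
posterior /= posterior.sum()

# Bottom-up "AdaBoost-like" detector: roughly right, but biased and broad.
proposal = np.exp(-0.5 * ((positions - 55) / 10.0) ** 2)
proposal /= proposal.sum()

x = rng.choice(positions, p=proposal)
samples = []
for _ in range(5000):
    y = rng.choice(positions, p=proposal)     # data-driven proposal
    accept = min(1.0, (posterior[y] * proposal[x]) / (posterior[x] * proposal[y]))
    if rng.random() < accept:                 # Metropolis-Hastings correction
        x = y
    samples.append(x)
print("posterior mean ~62:", np.mean(samples[1000:]))
```

The proposal supplies speed (good candidates come up early) while the accept/reject step keeps the samples faithful to the top-down model, which is the combination the abstract describes.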
2005 — 2008
Yuille, Alan; Zhu, Song-Chun (co-PI)
Image Parsing: Integrating Generative and Discriminative Methods @ University of California-Los Angeles
Abstract for NSF award 0413214. PI: Alan L. Yuille.
The proposal will develop a computational framework for parsing images. Image parsing subsumes many standard computer vision tasks such as object detection and image segmentation. The approach is based on Bayesian inference using the Data-Driven Markov Chain Monte Carlo (DDMCMC) algorithm. It involves modeling the diverse visual patterns that occur in natural images by hierarchical generative models. It also requires an algorithm, DDMCMC, that is capable of performing inference on these models. The design principle of DDMCMC is to use discriminative models to guide the search through the parameters of the generative models. Specific applications of image parsing include the development of computer vision systems to help the visually disabled by detecting and reading text, and detecting other salient objects such as faces. Hence we expect this work to have broad impact in helping the visually disabled. Other applications include context-based image retrieval and automatic security systems. The work will also help train 2-3 graduate students in Computer Science and Statistics.
2006 — 2010
Yuille, Alan
Computational Theory of Motion Perception @ University of California-Los Angeles
This proposal will develop fundamental understanding of aspects of the human visual system. The goal is to understand how humans perceive motion; in other words, to understand what goes on inside a human's brain when he or she looks at a group of birds flying or snowflakes falling. The proposal is interdisciplinary and combines computational theory, psychophysics, and physiology. The computational theory provides a mathematical model for how humans process motion. The theory is implemented by computer algorithms, which are applied to a motion sequence of images, and which predict the estimated velocities of motion and other properties. These motion sequences of images are pseudo-realistic, in the sense that they appear to be natural images (e.g., of flying birds, or snow) but are instead artificial parameterized models (which enables us to design controlled experiments by altering the parameters). The psychophysical experiments compare the predictions of our theory with the performance of human subjects performing a range of motion estimation tasks. These experiments address issues such as: what types of motion stimuli are people expert at perceiving? The physiological experiments attempt to pin down where motion processing takes place in the brain. In particular, we study the activity of neural cells (neurons) in different parts of the brain during motion perception. It is anticipated that understanding how the human visual system processes motion will enable us to develop more robust and powerful computer vision algorithms which will have many technological applications (e.g., robotics and automated medical diagnosis). In addition, understanding how neurons perform computations is central to the entire enterprise of neuroscience in its attempt to give a scientific account of the brain mechanisms underlying our mental life. The grant will help encourage underrepresented groups by supporting a female postdoctoral researcher. The grant will include data sharing of the multi-electrode recordings of the physiological experiments and, in addition, we will make available the code for making novel pseudo-random stimuli. The proposal also has educational impact because it will help train a graduate student in interdisciplinary research, encompassing computer science, psychology, neuroscience, and statistics.
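One standard computational model in this area (a simplified version of Bayesian motion estimation with a slowness prior, offered as an illustration rather than as the model proposed here) combines noisy brightness-constancy constraints with a Gaussian prior favoring slow speeds:

```python
# Minimal toy sketch: posterior-mean velocity from local motion constraints
# plus a Gaussian "slow motion" prior.
import numpy as np

# Each row: spatial gradient (Ix, Iy) at one image point, tied to the temporal
# derivative It by brightness constancy Ix*vx + Iy*vy + It = 0 (values invented).
G = np.array([[1.0, 0.2], [0.1, 1.1], [0.9, -0.3], [0.2, 0.8]])
true_v = np.array([1.0, -0.5])
It = -G @ true_v + 0.05 * np.random.default_rng(2).standard_normal(4)

sigma2, tau2 = 0.05**2, 1.0**2        # measurement noise, prior variance
# With a prior N(0, tau2*I), the MAP velocity has a ridge-regression form.
A = G.T @ G / sigma2 + np.eye(2) / tau2
v_map = np.linalg.solve(A, -G.T @ It / sigma2)
print("estimated velocity:", v_map)   # slightly biased toward zero ("slow")
```

The systematic bias toward slow speeds is exactly the kind of model prediction that the psychophysical experiments can compare against human performance.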
2007 — 2009
Yuille, Alan
A Computational Theory of Motion Perception: Modeling the Statistics of the Environment @ University of California-Los Angeles
This project develops a novel extension to a computational theory of visual motion perception. The overall goal of the theory is to understand how humans perceive motion in their natural environment; in other words, to understand what goes on inside a person's brain when he or she sees birds flying, snowflakes falling, or other complex patterns of motion that occur in the natural visual world. Building on recent work modeling the appearance of a limited set of motion flow patterns, the present project explores a probabilistic approach, based on Bayesian ideal observers, to the representation, learning, and modeling of natural visual motion, and the use of learned probabilistic models in turn to synthesize pseudo-realistic stimuli. Pseudo-realistic stimuli are a novel class of visual stimuli, which have the appearance of natural visual stimuli but can be quantified and varied in a precisely controlled manner. Stimuli of this type have never been used before and offer the exciting prospect of experimentally understanding the behavior of visual systems when exposed to realistic but controlled stimuli. It is anticipated that understanding how the human visual system processes motion will enable development of more robust and powerful computer vision algorithms which will have many technological applications.
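A hedged sketch of how parameterized pseudo-realistic motion stimuli might be synthesized; the translation/rotation/expansion parameterization below is our own illustrative choice, not the project's actual generative model.

```python
# Illustrative sketch: a parameterized motion flow field whose parameters can
# be varied in a precisely controlled manner.
import numpy as np

def flow_field(shape, translation, rotation, expansion, noise, seed=0):
    rng = np.random.default_rng(seed)
    y, x = np.mgrid[0:shape[0], 0:shape[1]].astype(float)
    cx, cy = (shape[1] - 1) / 2, (shape[0] - 1) / 2
    dx, dy = x - cx, y - cy
    vx = translation[0] - rotation * dy + expansion * dx
    vy = translation[1] + rotation * dx + expansion * dy
    vx += noise * rng.standard_normal(shape)   # controlled stochastic component
    vy += noise * rng.standard_normal(shape)
    return vx, vy

# Varying (rotation, expansion, noise) yields controlled families of stimuli,
# e.g. swirling "snowfall-like" motion versus pure drift.
vx, vy = flow_field((32, 32), translation=(1.0, 0.0), rotation=0.05,
                    expansion=0.0, noise=0.2)
print("mean flow:", vx.mean(), vy.mean())
```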
2007 — 2008
Green, Mark (co-PI); Yuille, Alan
IPAM/Statistics Graduate Workshop @ University of California-Los Angeles
The proposed workshop, 'Probabilistic Models of Cognition: The Mathematics of Mind,' will bring together leaders from cognitive science, computer science, mathematics, and statistics who are interested in developing a common mathematical framework for all aspects of cognition, and review how it explains empirical phenomena such as vision, memory, reasoning, learning, planning, and language. This program is motivated by recent advances which offer the promise of modeling human cognition mathematically. The workshop will entail presentations by leading faculty-level lecturers and an audience of graduate students, postdoctoral researchers, and more senior researchers interested in focusing their efforts on probabilistic models of cognition and their applications. Attendees will represent a number of disciplines, including cognitive science, neuroscience, computer science, mathematics, physics, statistics, engineering, and education. Researchers interested in education should be equipped with a wide range of new computational, mathematical and statistical tools that can be used to improve educational technology, curriculum design and assessment, through the development of qualitatively more powerful models of human learning. Researchers in all of these fields, as well as basic cognitive-science researchers, should benefit immensely from interacting with each other and learning about this new generation of cognitive modeling approaches in an unprecedented interdisciplinary environment, with both basic and applied research themes represented among the lectures and discussions.
2009 — 2013
Yuille, Alan
RI: Small: Recursive Compositional Models For Vision @ University of California-Los Angeles
Detecting and recognizing objects in real-world images is a very challenging problem with many practical applications. The past few years have shown growing success at tasks such as detecting faces and text, and recognizing objects which have limited spatial variability.
Broadly speaking, the difficulty of detection and recognition increases with the variability of the objects: rigid objects are the easiest and deformable, articulated objects are the hardest. There is, for example, no computer vision system which can detect a highly deformable and articulated object such as a cat in realistic conditions, or read text in natural images. This project develops and evaluates computer vision technology for detecting and recognizing deformable, articulated objects.
The strategy is to represent objects by recursive compositional models (RCMs), which describe objects as compositions of subparts. Preliminary work has shown that these RCMs can be learnt with only limited supervision from natural images. In addition, inference algorithms have been developed which can rapidly detect and describe a limited class of objects. This project starts with single objects with fixed pose and viewpoint and proceeds to multiple objects, poses, and viewpoints. Theoretical analysis of these models gives insight into the performance and computational complexity of RCMs.
The expected results are a new technology for detecting and recognizing objects for the applications mentioned above. The results will be disseminated by peer-reviewed publications, webpage downloads, and university courses.
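A toy sketch of compositional scoring, with a single level of composition and invented evidence: two parts are combined under a spatial-relation penalty, and inference maximizes over their placements. Real RCM inference recurses this scheme over a deeper hierarchy; the 1-D setting and all scores below are ours.

```python
# Sketch under toy assumptions: score a two-part composition over candidate
# part locations, with a deformation penalty on their relative offset.
import numpy as np

rng = np.random.default_rng(3)
n = 50                                  # candidate 1-D locations
part_a = rng.random(n)                  # bottom-up evidence for part A
part_b = rng.random(n)                  # bottom-up evidence for part B

def pair_score(i, j, ideal_offset=10, stiffness=0.05):
    """Composition score: parts' evidence minus spatial deformation penalty."""
    return part_a[i] + part_b[j] - stiffness * (j - i - ideal_offset) ** 2

# For each placement of part A, the best partner placement of part B (the inner
# maximization of the recursion); then the best overall composition.
best_b = [max(range(n), key=lambda j: pair_score(i, j)) for i in range(n)]
i_star = max(range(n), key=lambda i: pair_score(i, best_b[i]))
print("best composition:", i_star, best_b[i_star],
      "score", pair_score(i_star, best_b[i_star]))
```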
2013 — 2018
Yuille, Alan
Collaborative Research: Visual Cortex On Silicon @ University of California-Los Angeles
The human vision system understands and interprets complex scenes for a wide range of visual tasks in real time while consuming less than 20 watts of power. This Expeditions-in-Computing project explores holistic design of machine vision systems that have the potential to approach and eventually exceed the capabilities of human vision systems. This will enable the next generation of machine vision systems to not only record images but also understand visual content. Such smart machine vision systems will have a multi-faceted impact on society, including visual aids for visually impaired persons, driver assistance for reducing automotive accidents, and augmented reality for enhanced shopping, travel, and safety. The transformative nature of the research will inspire and train a new generation of students in interdisciplinary work that spans the neuroscience, computing, and engineering disciplines.
While several machine vision systems today can each successfully perform one or a few human tasks (such as detecting human faces in point-and-shoot cameras), they are still limited in their ability to perform a wide range of visual tasks, to operate in complex, cluttered environments, and to provide reasoning for their decisions. In contrast, the mammalian visual cortex excels in a broad variety of goal-oriented cognitive tasks, and is at least three orders of magnitude more energy efficient than customized state-of-the-art machine vision systems. The proposed research envisions a holistic design of a machine vision system that will approach the cognitive abilities of the human cortex, by developing a comprehensive solution consisting of vision algorithms, hardware design, human-machine interfaces, and information storage. The project aims to understand the fundamental mechanisms used in the visual cortex to enable the design of new vision algorithms and hardware fabrics that can improve power, speed, flexibility, and recognition accuracies relative to existing machine vision systems. Towards this goal, the project proposes an ambitious inter-disciplinary research agenda that will (i) understand goal-directed visual attention mechanisms in the brain to design task-driven vision algorithms; (ii) develop vision theory and algorithms that scale in performance with increasing complexity of a scene; (iii) integrate complementary approaches in biological and machine vision techniques; (iv) develop a new genre of computing architectures inspired by advances in both the understanding of the visual cortex and the emergence of electronic devices; and (v) design human-computer interfaces that will effectively assist end-users while preserving privacy and maximizing utility. These advances will allow us to replace current-day cameras with cognitive visual systems that more intelligently analyze and understand complex scenes, and dynamically interact with users.
Machine vision systems that understand and interact with their environment in ways similar to humans will enable new transformative applications. The project will develop experimental platforms to: (1) assist visually impaired people; (2) enhance driver attention; and (3) augment reality to provide enhanced experience for retail shopping or a vacation visit, and enhanced safety for critical public infrastructure. This project will result in education and research artifacts that will be disseminated widely through a web portal and via online lecture delivery. The resulting artifacts and prototypes will enhance successful ongoing outreach programs to under-represented minorities and the general public, such as museum exhibits, science fairs, and a summer camp aimed at K-12 students. It will also spur similar new outreach efforts at other partner locations. The project will help identify and develop course material and projects directed at instilling interest in computing fields for students in four-year colleges. Partnerships with two Hispanic serving institutes, industry, national labs and international projects are also planned.
2018 — 2021
Yuille, Alan
Collaborative Research: CompCog: Achieving Analogical Reasoning Via Human and Machine Learning @ Johns Hopkins University
Despite recent advances in artificial intelligence, humans remain unmatched in their ability to think creatively. Intelligent machines can use massive data to learn to identify patterns that are similar to learned examples, but people can use very small amounts of data to discover deep similarities between situations that are superficially very different (e.g., engineers have devised a cooling system for buildings using principles adapted from termite mounds). This type of creative thinking depends on analogy: the ability to find and exploit resemblances based on relations among entities, rather than solely on superficial appearances. The present investigation aims to show how relations can be learned from examples (in the form of either texts or pictures) and then used to reason by analogy. The work integrates recent advances in machine learning with more human-like learning mechanisms. Improved analogy models will increase the power of computer-based information retrieval, allowing both text and pictures to serve as retrieval cues to search large databases for items that are analogous in relational structure. The large analogy datasets generated for the project will be made publicly available. More flexible search engines will help to automate creative tasks such as engineering design. Identifying the computational basis for relation learning and analogical reasoning will guide development of artificial intelligence systems by providing more efficient learning mechanisms. The research team is integrating research and education activities by using this project as a training opportunity in interdisciplinary research, encompassing psychology, statistics, computer science and mathematics.
The research will integrate advanced computational approaches with behavioral experiments on human relation learning and analogical reasoning, using both texts and pictures as inputs. The work is guided by cognitive theory on learning and reasoning, and exploits recent advances in the field of machine vision. The project includes the creation and validation of multiple databases of analogy problems. Experiments will be performed to establish human performance levels in a variety of tasks. Computational models will be developed by synergizing big-data learning through deep networks with small-data learning through Bayesian modeling. Models will be evaluated by comparison with human benchmarks. By addressing issues that arise in reasoning from natural inputs such as texts and pictures, the models to be developed will generalize to situations that people encounter in their daily life.
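One simple computational account of relational analogy (illustrative only; the embeddings below are made up, and no claim is made about the project's actual models) scores a:b :: c:d by comparing relation vectors, here taken as differences of embeddings:

```python
# Illustrative sketch: score analogies by cosine similarity of relation
# (difference) vectors in a toy embedding space.
import numpy as np

emb = {                                 # invented 3-d "word embeddings"
    "bird":   np.array([1.0, 0.2, 0.0]),
    "nest":   np.array([1.0, 0.2, 1.0]),
    "person": np.array([0.0, 1.0, 0.0]),
    "house":  np.array([0.0, 1.0, 1.0]),
    "car":    np.array([0.3, 0.7, 0.1]),
}

def analogy_score(a, b, c, d):
    """Cosine similarity of the two relation (difference) vectors."""
    r1, r2 = emb[b] - emb[a], emb[d] - emb[c]
    return float(r1 @ r2 / (np.linalg.norm(r1) * np.linalg.norm(r2)))

print(analogy_score("bird", "nest", "person", "house"))   # high: same relation
print(analogy_score("bird", "nest", "person", "car"))     # lower
```

A score of this kind depends on relations between entities rather than on the entities' own similarity, which is the distinction the abstract draws between analogical and superficial matching.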
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.