2009 — 2015
Corso, Jason; Krovi, Venkat
CAREER: Generalized Image Understanding With Probabilistic Ontologies and Dynamic Adaptive Graph Hierarchies
This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5).
From representation to learning to inference, effective use of high-level semantic knowledge in computer vision remains a challenge in bridging the signal-symbol gap. This research investigates the role of semantics in visual inference through the generalized image understanding problem: to automatically detect, localize, segment, and recognize the core high-level elements and how they interact in an image, and provide a parsimonious semantic description of the image.
Specifically, this research examines a unified methodology that integrates low- (e.g., pixels and features), mid- (e.g., latent structure), and high-level (e.g., semantics) elements for visual inference. Adaptive graph hierarchies induced directly from the images provide the core mathematical representation. A statistical interpretation of affinities between neighboring pixels and regions in the image drives this induction. Latent elements and structure are captured with multilevel Markov networks. A probabilistic ontology represents the core knowledge and uncertainty of the inferred structure and guides the ultimate semantic interpretation of the image. At each level, rigorous methods from computer science and statistics are connected to and combined with formal semantic methods from philosophy.
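To make the affinity-driven induction concrete, the following is a minimal sketch of computing statistical affinities between neighboring pixels. The Gaussian kernel on intensity differences is an assumption for illustration, a common choice in graph-based segmentation, not necessarily the statistical model used in this project:

```python
import numpy as np

def pixel_affinities(image, sigma=0.1):
    """Affinities between horizontally and vertically adjacent pixels
    of a grayscale image, using an (assumed) Gaussian kernel on the
    intensity difference: w(i, j) = exp(-(I_i - I_j)^2 / (2 sigma^2)).
    High affinity suggests two pixels belong to the same region and
    should be merged when coarsening the graph hierarchy."""
    img = np.asarray(image, dtype=float)
    horiz = np.exp(-(img[:, :-1] - img[:, 1:]) ** 2 / (2 * sigma ** 2))
    vert = np.exp(-(img[:-1, :] - img[1:, :]) ** 2 / (2 * sigma ** 2))
    return horiz, vert

# Two flat regions separated by a vertical intensity edge.
image = np.array([[0.0, 0.0, 1.0, 1.0],
                  [0.0, 0.0, 1.0, 1.0]])
horiz, vert = pixel_affinities(image)
# Affinity within a region is 1.0; across the edge it is near 0,
# so coarsening merges within regions but not across the boundary.
```

A hierarchy is then induced by repeatedly merging high-affinity neighbors and recomputing affinities between the resulting regions.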
A symbiotic education plan involving graduate and undergraduate mentoring and education, professional tutorial courses at the boundary of vision and ontology, and K-12 outreach is incorporated into the research plan. The research and education, disseminated broadly through both the applied science and semantics/philosophy literatures, lay a foundation on which to both utilize and automatically extract rich semantic information from images and other signal data for critical application areas such as internet vision, autonomous navigation, and ambient biometrics.
2009 — 2011
Corso, Jason; Hoffmann, Kenneth; Chaudhary, Vipin; Furlani, Thomas; Krovi, Venkat
II-New: Acquisition of BCI - a Biomedical Computing Infrastructure
The goal of this project is to acquire and integrate Biomedical Computing Infrastructure (BCI) capable of processing the increasingly high-resolution, large-volume, and high-frequency digital content generated within biomedical applications. The BCI will comprise: a large NVIDIA processor-based Tesla cluster with double precision Graphics Processing Units (GPUs) along with a multi-node NEC Nehalem-based cluster to drive the Tesla cluster via Infiniband; large shared memory multi-core computer nodes; and a large parallel high-performance solid-state disk farm.
Intellectual Merit: While parallel and grid computing are relatively well understood, the effective use of a cluster of massively multi-core GPUs with large memory and fast disk access has yet to be fully explored. Thus, the BCI seeks to facilitate deployment of this transformational computational paradigm in ongoing biomedical research projects between the University at Buffalo, SUNY and the Roswell Park Cancer Institute. These projects span the gamut of biomedical computing, from virtual surgery and intervention; image segmentation and labeling; computed tomography and reconstruction; and imaging biomarkers and computer-aided diagnosis; to nuclear molecular imaging.
Broader Impact: The BCI empowers a large group of multidisciplinary researchers to unlock the full potential of the digital content in the biomedical enterprise and to achieve faster, more reliable transfer of science from the lab to the clinic. In addition, a vibrant dissemination and outreach effort has been planned around the BCI, involving classes, tutorials, and workshops to engage students and researchers of all ages. Many of these activities, which form the foundation of the team's outreach efforts and range from high-school summer institutes to conference workshops, have already been initiated, and the web-portal (http://www.cse.buffalo.edu/~vipin/nsf/cri2009/) documents these efforts.
2014 — 2017
Corso, Jason
CI-New: Collaborative Research: Federated Data Set Infrastructure for Recognition Problems in Computer Vision
Broad access to image and video datasets has been responsible for much of the progress in computer vision recognition problems over the last decade. These common benchmarks have played a leading role in transforming recognition research from a black art into an experimental science. Progress, however, has stagnated; although datasets continue to grow, they are developed and annotated in isolation: e.g., a collection of sporting activities, a set of objects in images, etc. These isolated datasets suffer from task- and domain-specific bias, and knowledge transfer across them is extremely limited. This project is investigating and establishing a prototype architecture that federates across various recognition problems and modalities, by establishing a common namespace for entities, events, and annotations across the datasets. The project is also establishing a web-portal for the prototype federated dataset architecture and linking two existing recognition datasets into the prototype architecture. The resulting federated structure is truly greater than the sum of its parts, and can support new research that was not previously possible for the computer vision community and other related fields.
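The common-namespace idea can be illustrated with a small sketch. The dataset names, labels, and canonical-entity scheme below are invented for illustration, not the project's actual schema; the point is that dataset-specific labels resolve to shared entities, so one query spans otherwise isolated datasets:

```python
# Hypothetical mapping from (dataset, local label) to a shared
# canonical namespace of entities and events.
CANONICAL = {
    ("sports_dataset", "soccer-ball"): "entity/ball",
    ("object_dataset", "ball"):        "entity/ball",
    ("sports_dataset", "kicking"):     "event/kick",
}

def federated_query(entity, annotations):
    """Return all annotations, from any dataset, that resolve to the
    given canonical entity."""
    return [a for a in annotations
            if CANONICAL.get((a["dataset"], a["label"])) == entity]

annotations = [
    {"dataset": "sports_dataset", "label": "soccer-ball", "frame": 12},
    {"dataset": "object_dataset", "label": "ball", "image": "im4.jpg"},
    {"dataset": "sports_dataset", "label": "kicking", "frame": 12},
]
hits = federated_query("entity/ball", annotations)
# A video annotation and an image annotation, contributed in isolation,
# both resolve to the same canonical entity.
```

Without the shared namespace, the two "ball" annotations would remain siloed in their respective datasets and invisible to each other.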
As a first test scenario for this federated architecture, this project is investigating and constructing a new federated dataset of images and video annotated with various forms of associated text. Image and video content annotations span both the spatial and temporal dimensions while textual annotations reflecting depicted content range from complete free-form natural language descriptions, to more targeted phrases and referring expressions, to individual keyword lists. This dataset is being constructed to promote and enhance collaboration efforts between the vision and language communities by providing a new multi-modal annotated dataset with associated research competitions.
2015 — 2018
Corso, Jason
NRI: Collaborative Research: RobotSLang: Simultaneous Localization, Mapping, and Language Acquisition @ University of Michigan Ann Arbor
Humans and robots alike have a critical need to navigate through new environments to carry out everyday tasks. A parent and child may be touring a college campus; a robot may be searching for survivors after a building has collapsed. In this collaboration by faculty at two institutions, the PIs envision human and robotic partners sharing common perceptual-linguistic experiences and cooperating in mundane tasks like janitorial work and home care as well as in critical tasks like emergency response or search-and-rescue. But while mapping and navigation are now commonplace for mobile robots, when considering human-robot collaboration for even simple tasks one is confronted by a critical barrier: robots and people do not share a common language. Human language is rich in linguistic elements for describing our spatial environment, the objects and places within it, and navigable paths through it (e.g., "go down the hallway and enter the third door on the right."). Robots, on the other hand, inhabit a metric world of occupied and unoccupied discretized grid cells, wherein most objects are devoid of meaning (semantics). The PIs' goal in this project is to overcome this limitation by conjoining the well understood problem of simultaneous localization and mapping (SLAM) with that of language acquisition, in order to enable robots to learn to communicate with people in English about navigation tasks. The PIs will spur interest in this novel research area within the scientific community by means of an Amazing Race challenge problem modeled after the reality television show of the same name, which will place robots and human-robot teams in unknown environments and charge them with completing a specific task as quickly as possible. Other outreach activities will include visits to K-12 schools with demonstrations.
This work will focus on simultaneous localization, mapping, and language acquisition, a field of inquiry that remains largely unexplored. The crucial principle is that semantics are formulated as a cost function, which in turn specifies a joint distribution over many variables, including those capturing sensory input, language, the environment map, and robot motor control. The cost function and joint distribution support standard forms of inference, such as command following. More importantly, they support multidirectional inference over multiple variable sets jointly, such as simultaneous mapping and language interpretation. Within this multivariate optimization-based framework, the PIs plan a thorough experimental regimen spanning both synthetic and real-world datasets of challenging environments. The semantics of natural language will be grounded in spatial maps of the realistic visual world and in robot motor control, as robots navigate along particular paths or to particular destinations in (possibly novel) environments that are mapped not only geometrically but also with a linguistic underpinning for those paths and destinations. The language approach is compositional, using spatially-grounded representations of nouns (objects/places) and prepositions (relations between them); these representations will be modeled in the context of mapping. Furthermore, the PIs will consider realistic environments and adapt visual models of them according to the joint model. The PIs are aware of no other work that jointly models mapping, vision, and language acquisition.
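The cost-function formulation above can be sketched on a toy discrete example. The maps, word meanings, and cost terms below are assumptions for illustration, not the PIs' actual model; the sketch only shows how a single cost function defines a joint distribution that supports inference over map and language variables together:

```python
import itertools
import math

# Toy world: two candidate maps and two candidate word meanings.
MAPS = ["hallway", "open-room"]
MEANINGS = {"corridor": ["hallway"], "room": ["open-room"]}

def cost(world_map, word, observation):
    """Semantics as a cost function: a sensor term penalizes maps that
    disagree with the observation; a language term penalizes word
    meanings inconsistent with the map. Term weights are arbitrary."""
    c = 0.0 if world_map == observation else 2.0       # sensor term
    c += 0.0 if world_map in MEANINGS[word] else 3.0   # language term
    return c

def joint_posterior(observation):
    """p(map, word) proportional to exp(-cost), normalized over all pairs."""
    scores = {(m, w): math.exp(-cost(m, w, observation))
              for m, w in itertools.product(MAPS, MEANINGS)}
    z = sum(scores.values())
    return {k: v / z for k, v in scores.items()}

post = joint_posterior("hallway")
best = max(post, key=post.get)
# Multidirectional inference: a single sensor observation simultaneously
# sharpens the map estimate and the interpretation of "corridor".
```

Because every variable lives in one joint distribution, the same model can be conditioned in either direction: on language to infer the map, or on the map to infer what a word means.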
2016 — 2019
Corso, Jason
CI-New: Collaborative Research: COVE - Computer Vision Exchange for Data, Annotations and Tools @ University of Michigan Ann Arbor
The project provides discoverability, low overhead for use, reproducibility of research, and persistence for computer vision data. The project hence sets a direction toward which the computer vision community can collectively work in creating a dataset infrastructure that allows for transparency across individual datasets and annotations, experimental benchmarks with community-set corpora and metrics, and a web-based infrastructure to cultivate continued development of computer vision datasets. The availability of such an infrastructure, named COVE: Computer Vision Exchange for Data, Annotations and Tools, positions the computer vision and related communities to develop next-generation robust intelligence capabilities with great potential to benefit society. The project is integrated with education by supporting graduate and undergraduate students, and reaches middle school students through outreach activities.
The project is establishing COVE, a centralized community-run infrastructure to support the exchange of data and annotations as well as the software tools to manipulate them. The infrastructure is web-based and open-source, and provides open access to its contents. Stewardship over the contents is managed by the Investigators initially and subsequently by elected members of the computer vision community. There are two salient components of the infrastructure. First, a curation infrastructure provides back-end storage, querying, and data annotation and curation tools. To curate the federated data set, COVE uses widely known open-source tools such as Python, Bootstrap, and PostgreSQL; for curation of new annotations to incorporate into the exchange, the project relies heavily on crowd-sourcing. Second, a usage infrastructure (e.g., data structures and software) enables widespread and easy use by researchers and practitioners. The project develops APIs to allow easy programmable access to the federated data sets and tools through common software interfaces such as MATLAB and OpenCV.