2005 — 2008 |
Ram, Sudha |
N/A |
Investigating Data Provenance in the Context of New Product Design and Development
Information is one of the biggest assets for most enterprises. In today's information age, almost every enterprise decision is based on a detailed analysis of data recorded in diverse sources ranging from structured databases to the World Wide Web. To ensure that data retrieved from different sources is used appropriately and within context, it is imperative that the provenance of the data be recorded and made available to its users. Provenance refers to the knowledge that enables a piece of data to be interpreted correctly. It is the essential ingredient that ensures that users of data (for whom the data may or may not have been originally intended) understand the background of the data. This includes elements such as who (person) or what (process) created the data, where it came from, how it was transformed, the assumptions made in generating it, and the processes used to modify it. This research team will investigate the semantics of data provenance and will develop an ontology to represent those semantics, including ways to automate the capture of provenance. Using new product design and development as the real-world domain, a partnership will be formed with a large defense contracting company, Raytheon Missile Systems, located in Tucson, Arizona, to investigate these research issues. A testbed will be created to capture and use provenance and to evaluate the system's utility using a well-defined set of metrics. Raytheon has committed considerable resources in the form of personnel and access to software as needed for this research.
The intellectual merit of this proposal stems from the theoretical framework for understanding and representing the semantics of data provenance. This is considerably different from existing work on provenance, which has mainly explored the 'where' and 'why' of provenance. This work will pave the way for understanding the extent to which provenance can be automatically captured.
The project has the potential for broader impacts on society. Most importantly, the development of techniques to represent, capture and deploy provenance has the potential to revolutionize the Department of Defense product development industry and other domains as well. The ultimate goal is to enable the development of autonomic and interoperable enterprise data management systems.
|
1 |
2008 — 2014 |
Jorgensen, Richard; Andrews, Gregory (co-PI); Chandler, Vicki; Ram, Sudha; Stein, Lincoln; Stanzione, Daniel; Goff, Stephen |
N/A |
PSCIC Full Proposal: The iPlant Collaborative: A Cyberinfrastructure-Centered Community for a New Plant Biology
University of Arizona: Richard Jorgensen (PI), Gregory Andrews, Kobus Barnard, Susan Brown, Vicki Chandler, Nirav Merchant, Carolyn Napoli, Sudha Ram, Steven Rounsley; Arizona State University: Daniel Stanzione; Cold Spring Harbor Laboratory: Lincoln Stein, Doreen Ware; Purdue University: Rebecca Doerge; University of North Carolina at Wilmington: Ann Stapleton
The iPlant Collaborative (iPC) is a new type of organization, a cyberinfrastructure collaborative for the plant sciences, that enables new conceptual advances through integrative, computational thinking (i.e., thinking at multiple levels of abstraction using a systems-level approach to problem-solving). The iPC is fluid and dynamic, utilizing new computer, computational science and cyberinfrastructure solutions to address an evolving array of grand challenges in the plant sciences. It is community-driven, involving plant biologists, computer and information scientists and engineers, as well as experts from other disciplines, all working in integrated teams. The iPC brings together strengths in plant biology, bioinformatics, statistics, computer science and high-throughput computing, as well as innovative approaches to education, outreach, and the study of social networks.
Several key principles guided the development of the iPC. Specifically, the iPC (1) is a cyberinfrastructure collaborative rather than purely a cyberinfrastructure; (2) will enable multi-disciplinary teams to address grand challenges in plant science; (3) will be an entity that is by, for and of the community; (4) will train the next generation in computational thinking; and (5) is designed to be able to reinvent itself as needs and technologies change.
The driving force behind the iPC is the nature of the grand challenge questions in plant sciences, and all facets of the collaborative are organized around those selected questions. The act of selecting these questions will be community-driven, and to facilitate that, the Collaborative will host a series of workshops, each focused on a specific area of plant biology, but with participants cutting across the spectrum of the computational and biological sciences. The goal of each workshop will be to identify the grand challenge questions in that field, as well as the strategies and approaches that will be needed to answer the question(s). Self-forming Grand Challenge Teams from the community will then work with iPC personnel to develop a "Discovery Environment" (DE), which will be a cyberinfrastructure for open-access research and education focused on a grand challenge question. Over time, the DEs designed for different grand challenges will overlap and coalesce into a comprehensive cyberinfrastructure for discovery and learning.
The cyberinfrastructure created by the iPC will provide the community with two main capabilities: it will provide access to world-class physical infrastructure (for example, persistent storage and compute power via local and national resources), and it will provide services that promote interactions, communications and collaborations and that advance the understanding and use of computational thinking in plant biology. Through these capabilities, the iPC will catalyze progress in targeted areas of plant biology, and more broadly advance the whole of plant science through new, creative synthesis activities and by training the next generation of scientists in computational (and collaborative) thinking.
The broader impacts of the iPC project will not be limited merely to creating the tools for solution of currently intractable grand challenge questions, because at its core the iPC is actually a community building and educational enterprise designed to facilitate education and outreach. Grand Challenge teams and iPC staff will work together to educate students (K-12, undergraduate, and graduate, including members of underrepresented groups) through the use and development of Discovery Environments. Thus, education and outreach efforts will permeate the iPlant Collaborative.
|
1 |
2013 — 2017 |
Debray, Saumya (co-PI); Collberg, Christian; Ram, Sudha |
N/A |
TWC TTP: Small: Mitigating Insider Attacks in Provenance Systems
The digital provenance of a digital object gives a history of its life cycle including its creation, update, and access. It thus provides meta-level information about the sequence of events that lead up to the current version of the object, as well as its chain of custody. Such provenance information can be used for a variety of purposes, such as identifying the origins of a document, assessing the quality or reliability of data, and detecting undesirable actions such as forgery or unauthorized alteration of data. However, all of these practical uses of provenance information presuppose that the provenance system is secure, i.e. that provenance data is collected, processed, and stored in a manner that ensures its confidentiality and integrity. Without such guarantees, users can get an incorrect impression of document authenticity, potentially with significant real-world consequences.
This project investigates the design of secure provenance collection systems where the collected meta-data can be relied upon even under realistic insider attack models. Security, however, is not sufficient; a practical system must also be efficient even when large amounts of fine-grained provenance data need to be stored and processed. The project is aimed at addressing both issues through the following three objectives: (1) techniques for continuously updatable software tamperproofing to ensure the integrity of the system itself; (2) techniques for robust, continuous-marking, collusion-free text fingerprinting to mitigate document leakage; and (3) techniques for anonymous storage on untrusted storage servers to allow for efficient storage and access of fine-grained provenance data.
|
1 |
2017 — 2020 |
Mills, Barbara; Ram, Sudha |
N/A |
RIDIR: Collaborative Research: cyberSW: A Data Synthesis and Knowledge Discovery System for Long-Term Interdisciplinary Research on Southwest Social Change
This project will create cyberSW, an integrated knowledge discovery system that will significantly enhance interdisciplinary research on long-term social change at decadal to centennial scales. The project will integrate data on millions of objects from tens of thousands of prehispanic settlements across the U.S. Southwest, making it one of the largest digital archaeological repositories in the world. A major challenge in using archaeological data is that most relevant information is not digitally curated or synthesized beyond individual projects. A number of recent synthesis projects in the U.S. Southwest show the great potential of these data for addressing big questions in the social sciences, such as: What promotes the success or failure of some societies? How does migration transform social identities and create new social structures? And what are the relationships between environmental challenges and social changes? We will build on these prior projects to produce an integrated system that will allow users at different levels of expertise to readily view, analyze, and export data on past societies in the Southwest to address these and many other questions relevant to contemporary society.
Due to preservation and intensity of investigations, the U.S. Southwest has an unparalleled, high-quality archaeological record of human occupation and social change. Archaeological data are geospatially and temporally referenced and include variables such as population scale and movement, social diversity and inequality, technological innovations and diffusion, and climate. Through the creation of cyberSW, the proposed project will realize the cumulative research potential of these data by (1) merging several existing synthetic databases into one scalable, networked digital repository; (2) collecting additional data to fill in spatial, temporal, and material culture gaps; (3) analyzing those data and creating user-friendly online tools for data analysis and visualization; and (4) establishing a web portal for data visualization, analysis, and sharing that is available to both professional researchers and the general public. Making these data usable to researchers and to the public ensures that the findings of archaeological research are accessible, interpretable, and replicable. The online analytical tools will allow a wide range of individuals to conduct their own analyses, whether tribal members interested in their history, land managers responsible for public interpretation, students learning data manipulation and display, or social scientists grappling with the long-term questions about the human past. A Citizen Science component for registered volunteers will allow a bigger community to participate in transformative science.
|
1 |
2018 — 2021 |
Miller, Marc (co-PI); Ram, Sudha; Bethard, Steven (co-PI); Lopez Hoffman, Laura; Baldwin, Elizabeth |
N/A |
RIDIR: Collaborative Research: A Data Science Platform and Mechanisms for Its Sustainability
This research will design and develop a data science platform that provides access to documents. The platform will make available, for the first time, analytical tools that use natural language processing and data science to enable systematic research and inquiry by practitioners, project proponents, scholars, and the public to answer a host of critical questions. The research team will produce a blueprint for the platform and validate its functionality with a community of scholars and practitioners. The team will establish the platform structure, ingest data, and refine analytical tools to integrate documents across many repositories, link text to metadata even when text comes from one source and the metadata from another, infer from analysis of the text more detailed types of metadata that are not present in any of the existing repositories, and allow researchers, contractors, and policy analysts to pose complex questions and answer them via analysis of documents. The team will also build a user community to catalyze scholarship and application, and will develop long-term mechanisms to ensure the sustainability and continued growth, management, and use of the platform and its resources.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
1 |