1982 — 1985 |
Eddy, William; Fienberg, Stephen |
N/A (Activity Code Description: No activity code was retrieved) |
Acquisition of Mathematical Sciences Research Equipment @ Carnegie-Mellon University |
0.915 |
1982 — 1984 |
Fienberg, Stephen |
N/A |
Collaborative Research On the Transfer of Methodologies Between Large-Scale Survey and Social Experiments @ Carnegie-Mellon University |
0.915 |
1984 — 1987 |
Fienberg, Stephen |
N/A |
Collaborative Research On the Design and Analysis Parallels Between Sample Surveys and Randomized Experiments @ Carnegie-Mellon University |
0.915 |
1987 — 1988 |
Fienberg, Stephen |
N/A |
Collaborative Research On Survey Designs and Randomized Experiments @ Carnegie-Mellon University |
0.915 |
1996 — 1997 |
Kaye, David; Gastwirth, Joseph; Fienberg, Stephen |
N/A |
International Conference On Forensic Statistics, June 30 to July 3, 1996 At the University of Edinburgh, Scotland @ Carnegie-Mellon University
The International Conference on Forensic Statistics, scheduled for the University of Edinburgh June 30-July 3, 1996, is the third in a series of triennial conferences intended to provide a forum for the interaction of researchers in the areas of law, forensic science, social sciences, and statistics. Among the goals of this conference are (a) the fostering of interaction among research specialists in different disciplines, (b) advances in topics of current interest to the courts, and (c) the development of a firm scientific basis for the use of statistical evidence in the courts. Presentations scheduled include reports on the ethics of expert testimony, DNA fingerprinting and its forensic uses, inferring causality, statistical evidence of environmental harm, the use of econometric models in anti-trust litigation, and the judicial reception of meta-analysis for combining statistical information across scientific studies, especially those involving the harmful effects of exposure to drugs and environmental hazards. This travel grant supports the participation in the Conference by several leading American statisticians, forensic scientists, and social scientists as well as several promising young scientists who are currently beginning to pursue research in this interdisciplinary field and have sought the opportunity to present their work in an international context. Over the past four decades statistics and statistical methods have played increasingly important roles in the evaluation of forensic evidence and in the presentation of scientific evidence more broadly in the courts. American scientists have played a prominent role in this development and have led initiatives intended to improve the quality of statistics as evidence as well as to improve the interaction between statisticians and other scientists as they prepare materials for use in a legal context.
While expert testimony in the American legal system has unique features, important aspects of the science associated with the expert testimony transcend national boundaries. The Third International Conference on Forensic Statistics, to be held on June 30 to July 3, 1996, is a unique forum for the interaction of researchers in the areas of law, forensic science, social sciences, and statistics, and this travel grant supports the participation of several leading American researchers from these different fields of interest. Among the topics of current federal strategic interest featured in the conference presentations are biotechnology and its role in the manufacture of drugs as well as the evaluation of forensic evidence, and the assessment of harmful effects as a result of environmental exposure.
|
0.915 |
1997 — 2001 |
Fienberg, Stephen; Mitchell, Tom |
N/A |
Learning, Visualization, and the Analysis of Large-Scale Multiple-Media Data @ Carnegie-Mellon University
This project is being funded through the Learning and Intelligent Systems (LIS) Initiative, with funds partially provided by the MPS/OMA office. Scientific and engineering data now come in large amounts and in new forms. Although the problem of analyzing and learning from purely numerical data has been heavily studied, we currently lack principled methods for analyzing the multiple-media data sets that form the basis of many modern empirical studies. These modern data sets contain a mixture of numerical features, symbolic logic descriptions, images, text, sound, and other media. This project offers an interdisciplinary research effort to create the statistical foundations and practical machine learning algorithms needed to take advantage of the growing number of such multiple-media data sets. The research plan is to develop new approaches to this problem by working with several large-scale multiple-media databases of significant scientific and societal importance. This research will provide the theoretical foundations and practical algorithms for analyzing multiple-media data in a broad range of application domains. For example, many medical institutions now collect detailed patient records that can be analyzed to predict treatment outcomes for future patients. These medical records are typically multiple-media records consisting of numerical features (e.g., temperature), symbolic features (e.g., gender), images (e.g., x-rays), other instrument data (e.g., EKG), text (e.g., physicians' notes), and other data. Current data analysis algorithms simply ignore most of these available features, because we lack well-understood methods for analyzing such multiple-media data. The current research seeks to develop new approaches that will be able to utilize the full information collected in such data sets. The goal is to extend the foundations of data interpretation that form the basis for many experimental sciences and engineering disciplines.
|
0.915 |
2000 — 2003 |
Fienberg, Stephen |
N/A |
International Conference On the Foundation of Statistical Inference: Applications in the Medical and Social Sciences and in Industry and the Interface With Computer Science @ Carnegie-Mellon University
Abstract: DMS-0086688 PI: Fienberg, Stephen E
A special international conference, "Foundation of Statistical Inference: Applications in the Medical and Social Sciences and in Industry and the Interface with Computer Science", will be held in Israel at the conference center at Kiryat Anavim near Jerusalem, on December 17-21, 2000. This conference brings together leading experts in statistics and its applications to discuss both the issues in the foundations of inference and their relevance to the many and varied uses of statistics in the medical sciences, the social sciences, and in industry. A similar conference was held in December 1985. A special focus at this conference will be on the interface between statistics and computer science. Funding will provide travel support for approximately 20 U.S. participants, primarily new researchers and members of underrepresented groups, along with invited speakers. The KCS Program in the CISE directorate and the MMS Program in the SBE directorate are providing co-funding along with the Statistics Program in MPS.
|
0.915 |
2000 |
Fienberg, Stephen Elliott |
R03 (Activity Code Description: To provide research support specifically limited in time and amount for studies in categorical program areas. Small grants provide flexibility for initiating studies which are generally for preliminary short-term projects and are non-renewable.) |
Statistical Approaches For the Study of Disability @ Carnegie-Mellon University
This proposal addresses Research Objective 21, Data Collection In Population Aging. The principal investigator is an established statistical investigator with a broad background in the development and application of statistical models to demographic, health, and social science data. The proposal outlines the start of a methodological research program to enhance the study of chronic disability over time. It focuses on disability data arising from the National Long Term Care Survey and statistical methods currently used for their analysis. The long-term goal of the PI and his collaborators is to develop new statistical methodology for the analysis of survey-based longitudinal disability data and new statistical tools to preserve the confidentiality of the data sources while making them more broadly available for analysis by others. The principal investigator is a "New Investigator" under the NIH definition because he has not worked in this area before. Nonetheless, he is an established researcher with considerable experience in relevant aspects of statistical modeling and disclosure limitation methodology.
|
1 |
2002 — 2003 |
Fienberg, Stephen |
N/A |
Fifth International Conference On Forensic Statistics @ Carnegie-Mellon University
The Fifth International Conference on Forensic Statistics will be held at Isola di San Servolo, Venice, Italy, on August 30th to September 2nd, 2002. The conference is intended to allow statisticians, forensic scientists, lawyers, and scholars from related disciplines to discuss the many and varied uses of statistics and probability in legislative, administrative and judicial proceedings. Forensic Statistics, the application of statistics and probability to legal matters, is an expanding field. The current debates surrounding the presentation of evidence based on DNA profiling, epidemiological studies on the health effects of various drugs, and racial profiling are just three examples. There is a great need for statisticians to provide guidance to forensic scientists, lawyers, and the courts. By bringing together statisticians, forensic scientists, lawyers and other scholars, this fifth international conference will further the interdisciplinary understanding of Statistics and Probability applied to legal matters. Previous conferences in this series have been held at North Carolina State University, the University of Edinburgh, and Arizona State University. This proposal provides funds to ensure that U.S. participants will be able to hear and network with experts on this topic from around the world, and to provide access to the conference for selected new researchers and members of underrepresented groups.
|
0.915 |
2003 — 2005 |
Fienberg, Stephen Elliott |
R01 (Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies.) |
Modeling Longitudinal Disability Survey Data @ Carnegie-Mellon University
DESCRIPTION (provided by applicant): Survey data on disability among the elderly are available from several sources, most prominently the National Long Term Care Survey (NLTCS). The NLTCS began in 1982 and now extends over five waves through 1999, making it a rich source of information on possible changes in disability over time. But these data pose challenges for both statistical modeling and the protection of confidentiality of the information provided by survey respondents, especially when the data for individuals are linked across waves. Most statistical approaches used to analyze NLTCS data are based on disability scales that cannot account for the complexity of disability manifestations. Attempts to deal with such complexity include traditional multivariate methods for both discrete and continuous data, and approaches based on the grade of membership model. These methods typically either require heroic simplifying assumptions or need to be adapted. This project aims to develop new statistical models and approaches for the analysis of such survey data, including the role of sample weights in the use of these models. It also proposes to take a fresh look at the risk of inadvertent disclosure of information on NLTCS respondents and to develop new approaches to protect against disclosure while preserving access to the maximal amount of information in the data required for their proper analysis using the new models and methods.
|
1 |
2005 — 2006 |
Fienberg, Stephen |
N/A |
Workshop On Privacy and Confidentiality; July 19-26, 2005,Italy. @ Carnegie-Mellon University
This award supports travel and expenses for selected participants in a Workshop on Privacy and Confidentiality, to be held July 9-15, 2005 at the University of Bologna Residential Center in Bertinoro, Italy. The workshop comes at a time when governments and organizations are struggling to expand access to statistical and other databases while simultaneously protecting medical and other administrative records, and combating breaches of cyberinfrastructure security, especially those involving unauthorized record linkage and individual identification and harm. There has been a long tradition of confidentiality associated with statistical databases, but the ever-expanding cyberinfrastructure raises new and far more challenging questions about the protection of privacy associated with electronic databases involving individuals, families and other groups, and organizations. The goal of this workshop is to bring together leading privacy researchers from the statistics and computer science communities to share expertise and map out feasible research goals. It will (1) help shape an interdisciplinary methodologically-oriented intellectual agenda for the area of privacy and confidentiality by establishing commonly understood terminology, goals, and methodological description, (2) stimulate new interdisciplinary collaborations on the topic, and (3) influence the directions that confidentiality research takes in the separate disciplines. Specific areas of investigation include finding mathematically/statistically rigorous definitions of confidentiality, disclosure, and privacy that transcend specific problem and model details, finding minimal definitions of statistical utility, understanding the tension between privacy and utility, and understanding the role of auxiliary information ("extra" information known to the adversary) in defeating privacy objectives.
|
0.915 |
2006 — 2011 |
Fienberg, Stephen; Rinaldo, Alessandro (co-PI) |
N/A |
Confidentiality and Estimation For Large Sparse Multi-Dimensional Contingency Tables @ Carnegie-Mellon University
This research project deals with two crucial aspects of working with large sparse contingency tables: protecting the confidentiality of responses when data are shared with other researchers, and the implications of sparsity for maximum likelihood estimation in log-linear models. The first problem entails the evaluation of the disclosure risk associated with the partial release of information from a classified database, e.g., in the form of marginal tables involving subsets of variables. The second problem is concerned with developing general-purpose inferential methodologies for model selection and estimation/testing in log-linear model analysis that are appropriate for sparse categorical data. The links between these seemingly separate problems emanate from the common statistical and mathematical formalism of algebraic statistics. This research will produce new computational algorithms and sharable computer code for use by behavioral and social science researchers, as well as foundational methods and theory linking the problems of cell estimation using maximum likelihood and log-linear models and confidentiality protection. The expected outcomes of this activity will include: (1) more effective inferential procedures for the quantitative analysis and interpretation of behavioral and social science data and for the determination of the risk of disclosure; (2) statistical software for the analysis of categorical data targeted at a large audience of practitioners and researchers, which will be developed and freely distributed in the form of both computer source codes and modular, executable files; (3) more efficient numerical procedures for assessing the disclosure risk associated with the release of marginal totals.
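As a rough illustration of the disclosure-risk problem described above, the sketch below computes the classical Fréchet bounds on each cell of a two-way table when only its row and column totals are released. It is not the project's algorithm, only a minimal two-dimensional case; the function name `frechet_bounds` and the example totals are hypothetical. A cell whose lower and upper bounds are close together is nearly determined by the released margins, which is one simple signal of disclosure risk.

```python
def frechet_bounds(row_totals, col_totals):
    """Return (lower, upper) bounds for every cell of a two-way table,
    given only its released row and column totals (hypothetical helper)."""
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must share one grand total")
    return [
        [(max(0, r + c - n), min(r, c)) for c in col_totals]
        for r in row_totals
    ]


if __name__ == "__main__":
    # Illustrative margins only; not data from any real survey.
    for row in frechet_bounds([3, 7], [4, 6]):
        print(row)
```

With these illustrative margins, the cell in the second row and first column is bounded between 1 and 4, so the released totals already reveal that at least one respondent falls in that cell.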
Log-linear model analysis forms a well-established and powerful set of statistical tools for the study of categorical data, especially in the form of multi-dimensional cross-classifications or multi-way contingency tables. These models have proved to be essential for the analysis of data emanating from many areas of the social and behavioral sciences, as well as in other scientific areas. For example, in a typical sample survey, data are generated for several thousand individuals on a large number of categorical variables, measuring such information as employment, income, health status, etc. The resulting cross-classification of these variables is large, i.e., involving many thousands of cells, and sparse, i.e., most of the cell entries are either very small or contain zero counts. Similar problems arise in the study of social networks, in public health and medicine, and in the analysis of genetics databases. Recent developments in the mathematical area of algebraic geometry have provided a novel and powerful formalism for the representation of log-linear models relevant for such contingency table data. This project will use this mathematical formalism to focus on two different aspects of large sparse contingency tables: (1) Protecting the privacy of the data providers when data are shared with other users, while at the same time (2) Ensuring that such tables are useful for statistical analysis by developing new methods for log-linear model computation. The results of the project will improve access to data for secondary analysis and enhance the capacity of researchers and analysts to exploit the information in large sparse databases.
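To make the effect of sparsity on estimation concrete, here is a minimal sketch for the simplest log-linear model, independence in a two-way table, where the fitted counts have the closed form (row total × column total) / grand total. It is an illustrative special case, not the project's general methodology, and the function name `independence_fit` is hypothetical. When any row or column total is zero, some fitted counts are zero, their logarithms are undefined, and the maximum likelihood estimate of the log-linear parameters does not exist.

```python
def independence_fit(table):
    """Fit the independence log-linear model to a two-way table of counts.

    Returns (fitted_counts, mle_exists). For this model the MLE of the
    log-linear parameters exists only if every margin total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n == 0:
        raise ValueError("table has no observations")
    fitted = [[r * c / n for c in col_totals] for r in row_totals]
    mle_exists = all(t > 0 for t in row_totals + col_totals)
    return fitted, mle_exists


if __name__ == "__main__":
    dense = [[1, 2], [3, 4]]
    sparse = [[2, 0], [0, 0]]  # second row and second column are empty
    print(independence_fit(dense))
    print(independence_fit(sparse))
```

In larger tables and richer models the zero patterns that break estimation are much harder to detect than an empty margin, which is the kind of question the abstract refers to.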
|
0.915 |
2007 — 2008 |
Fienberg, Stephen |
N/A |
Travel Grant Proposal For Workshop On Data Confidentiality @ Carnegie-Mellon University
Proposal Number: 0741571. PI: Stephen E. Fienberg. Institution: Carnegie Mellon University.
Carnegie Mellon University will host a workshop on data confidentiality on September 6-7, 2007. Increasingly, organizations collect data that are, among other purposes, made available to researchers. Many kinds of data are needed by researchers, including:
- Census data
- Health data
- Network data, for example data collected from users' access to the Internet or general background data collected from routers or other network services.
Building on his extensive background involving security for statistical databases, the PI has assembled an impressive collection of experts. This workshop will explore ways to apply existing statistical methods for sanitization of data for these applications and, as needed, new methods suitable to other applications.
Before data can be released to researchers it is essential that data confidentiality policies be respected. For example, from a released data set associated with the Census Bureau, it should not be possible to associate a particular individual or household with particular sensitive data items, such as household income. Network data made available should not associate a packet, and its associated destination on the Internet or Web, with an individual or even an IP address.
Various techniques will be considered for the specification of data confidentiality requirements and for sanitization of the data to meet the requirements, including masking fields and data transformation techniques that preserve properties essential for the research use of the data.
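The following sketch illustrates, under stated assumptions, what masking fields and coarsening values might look like for a single record. It is not a method proposed by the workshop and it does not by itself guarantee confidentiality; the field names (`name`, `address`, `household_income`, `ip`), the bracket width, and the choice of a /24 prefix are all hypothetical.

```python
import ipaddress

# Hypothetical set of direct identifiers to drop entirely.
DIRECT_IDENTIFIERS = {"name", "address"}


def sanitize_record(record, income_bracket=25000):
    """Return a copy of `record` with identifiers removed and fields coarsened."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "household_income" in out:
        # Replace the exact income with the bracket that contains it.
        low = (out["household_income"] // income_bracket) * income_bracket
        out["household_income"] = f"{low}-{low + income_bracket - 1}"
    if "ip" in out:
        # Keep only the network prefix (IPv4 /24 assumed for illustration).
        out["ip"] = str(ipaddress.ip_network(f"{out['ip']}/24", strict=False))
    return out


if __name__ == "__main__":
    raw = {"name": "A. Person", "address": "1 Main St",
           "household_income": 61234, "ip": "192.168.1.57"}
    print(sanitize_record(raw))
```

Coarsening keeps some properties useful for research (income range, network neighborhood) while discarding exact values; whether that is enough depends on what auxiliary information an adversary holds.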
The technical topics to be covered in the workshop are as follows:
- Languages to express confidentiality needs
- Legal, regulatory, and policy issues
- Societal and economic impacts
- Limitations of the technology, particularly in the face of the limitations of cryptographic methods
- Applications, including network data, census data, other databases
- Education and training
- Needed infrastructure support
The PI will produce a report on the workshop's deliberations.
The proposed workshop will assemble Government, industry and academic organizations representing different disciplines with diverse needs for data, and will consider different techniques for data sanitization. Microsoft and IBM will be co-sponsoring the workshop. Microsoft's needs, for example, are to sanitize data gathered from use of their search engines and browsers.
|
0.915 |
2009 — 2010 |
Fienberg, Stephen |
N/A |
Participant Support For Workshop On Statistical Methods For the Analysis of Network Data in Dublin, Ireland. @ Carnegie-Mellon University
This award will support travel and expenses for young U.S. participants in a three-day Workshop on Statistical Methods for the Analysis of Network Data in Practice, to be held on June 15-17, 2009 in Dublin, Ireland. Many modern data analysis problems involve large data sets from social, biological and other networks. In these settings, traditional modeling assumptions are inappropriate; the analysis of these data must take into account the structure of relationships between the entities being measured. In keeping with this perspective, the Annals of Applied Statistics (AOAS) has been planning a special section to appear in late 2009 or early 2010 with papers on the modeling of network data. This workshop will pay special attention to model design and computational issues of model fitting and inference. The workshop will take advantage of the high quality papers that have already been submitted to AOAS to bring together statistical network modeling researchers from different communities, and encourage the submission of papers by young researchers and graduate students in a broad spectrum of disciplines. The workshop goal is to foster collaborations and intellectual exchange resulting in novel modeling approaches, diverse applications, and new research directions for network research and its application.
|
0.915 |
2010 — 2015 |
Fienberg, Stephen |
N/A |
CDI-Type II: Collaborative Research: Integrating Statistical and Computational Approaches to Privacy @ Carnegie-Mellon University
Data privacy is a fundamental problem of the modern information infrastructure. Increasing volumes of personal and sensitive data are collected and archived by health networks, government agencies, search engines, social networking websites, and other organizations. The social benefits of analyzing these databases are significant. At the same time, the release of information from sensitive data repositories can be devastating to the privacy of individuals and organizations. The challenge is to discover and release important characteristics of these databases without compromising the privacy of those whose data they contain. The main goal of this project is to design scalable computational techniques that are statistically sound, yield broadly useful data, and yet preserve privacy in the face of realistic external information. The project aims to integrate two essentially different approaches to the complex problem of data privacy. The reconciliation of these approaches raises a number of fundamental questions for statistical theory and cryptography, as well as methodological challenges that must be overcome to enable practical applications. This research is centered around three themes: (1) Integrating the computationally-focused, rigorous definitions of privacy emanating from computer science with notions of utility from statistics. (2) Developing cryptographic protocols for distributing privacy-preserving algorithms for valid statistical analysis among a group of servers so as to avoid pooling data in any single location. (3) Understanding the practical potential of the developed techniques by applying them to concrete problems in the behavioral and social sciences and analyzing important data sources from the official statistical community. The research will be carried out in collaboration with social scientists and industry researchers.
The project will increase awareness of data privacy issues and promote research on statistical disclosure limitation, cryptography and privacy-preserving data mining. Moreover, this research will transform the way statistical agencies, social scientists, medical researchers, and those in industry approach privacy, in particular how they collect, share and publish information. The integration of statistical and cryptographic methods in the form of ex ante provably secure procedures will provide the essential scientific fundamentals for official statistical agencies to fulfill their mission of useful data production, which the proliferation of digital information has endangered. Finally, the new techniques will permit opening the vault of industrial data, such as search logs and data on social networks, to statistical analysis, greatly expanding the research domain of the social and health sciences.
|
0.915 |
2011 — 2017 |
Eddy, William (co-PI); Fienberg, Stephen |
N/A |
NCRN-MN: Data Integration, Online Data Collection, and Privacy Protection For Census 2020 @ Carnegie-Mellon University
This project will conduct research on three basic issues of interest related to the conduct of censuses: privacy, costs, and response rates. The researchers will address the practical problems of ensuring confidentiality and privacy while still producing useful data for public and private purposes. In terms of cost issues, the researchers will investigate the use of administrative records to create a basic census frame, saving the duplicated effort of gathering that same information repeatedly, as well as other possible uses of administrative records as part of the census. They also will investigate the use of online data collection as a substitute for the traditional mail-out/mail-back census-taking. With respect to response rates, the researchers will conduct experiments that implement new ways of encouraging participation in an effort to reduce the decline in (or perhaps even increase) response rates.
This research will explore the potential for a significant reduction in the costs of conducting the 2020 census by demonstrating how information already collected by the government can serve as a starting point for the census in lieu of having the Census Bureau collect that information anew as part of the decennial census process. By learning to effectively use the Internet for censal data collection, the research should lead to a higher initial response rate, and hence lower census costs overall, and a more accurate count. Better methods for confidentiality protection and privacy notification not only will instill greater public confidence in the Census Bureau, but they also will contribute to better response rates and greater census accuracy. All of the technical statistical tools developed by the project will have other uses, both public and commercial. The project's educational and training initiatives aim to (1) prepare an educated citizenry on census and related matters, (2) use research issues under study at CMU and elsewhere as components in courses, and (3) train a new generation of students to enable them to work in agencies such as the Census Bureau in a diverse set of capacities, including the most technically demanding ones. This activity is supported by the NSF-Census Research Network funding opportunity.
|
0.915 |