2016 — 2018 |
Rudin, Cynthia |
CAREER: New Approaches for Ranking in Machine Learning
In numerous industries, decisions are based on large amounts of data, where a ranked list of possible actions determines how limited resources will be spent. Over the last decade, machine learning algorithms for ranking have been designed to address such prioritization problems. These algorithms rank a set of objects according to their probability of possessing a certain attribute; for example, we might rank a set of manholes in order of their probability of catching fire next year. However, current algorithms solve ranking problems approximately rather than exactly, and these approximate algorithms can be slow; furthermore, they do not account for many application-specific requirements.
The goals of this project include:
I) Finding exact solutions to ranking problems by developing a toolbox of algorithmic techniques based on mixed-integer optimization technology.
II) Finding solutions faster by showing a fundamental equivalence of ranking problems to easier classification problems that can be solved an order of magnitude faster.
III) Developing frameworks for new structured problems. The first framework pertains to ranking problems with a graph structure, which are relevant to the energy domain. The second framework handles a sequential prediction problem arising from recommender systems, with applications in the medical domain as well.
Through collaboration with industry, the proposed methods are being applied in several different areas, including the prevention of serious events (fires and explosions) on NYC's electrical grid.
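As a rough, hypothetical illustration of goal (I) above (none of this code comes from the project), the Python snippet below exhaustively searches a small grid of linear scoring weights for the one that exactly maximizes a pairwise ranking objective; at realistic scale, this kind of combinatorial search is what a mixed-integer formulation would hand to a solver.

    # Toy illustration (not the project's code): exhaustively search a coarse grid of
    # linear scoring weights for the one that exactly maximizes a pairwise ranking
    # objective. Realistic instances would be posed as a mixed-integer program instead.
    import itertools

    # Tiny synthetic dataset: (feature vector, label); label 1 means "caught fire next year".
    data = [((3.0, 1.0), 1), ((2.5, 0.5), 1), ((1.0, 2.0), 0),
            ((0.5, 0.2), 0), ((2.0, 1.5), 0)]

    def concordant_pairs(weights):
        """Count positive/negative pairs that the linear score orders correctly."""
        scores = [(sum(w * x for w, x in zip(weights, feats)), label) for feats, label in data]
        return sum(1 for s_pos, l_pos in scores if l_pos == 1
                     for s_neg, l_neg in scores if l_neg == 0 and s_pos > s_neg)

    # Exhaustive search over the grid stands in for exact optimization.
    grid = [-1.0, 0.0, 1.0, 2.0]
    best = max(itertools.product(grid, repeat=2), key=concordant_pairs)
    print("best weights:", best, "concordant pairs:", concordant_pairs(best))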
|
0.97 |
2018 — 2023 |
Rudin, Cynthia Brinson, L. |
Collaborative Research: Framework: Data: HDR: Nanocomposites to Metamaterials: A Knowledge Graph Framework
A team of experts from four universities (Duke, RPI, Caltech and Northwestern) creates an open source data resource for the polymer nanocomposites and metamaterials communities. A broad spectrum of users will be able to query the system, identify materials that may have certain characteristics, and automatically produce information about these materials. The new capability (MetaMine) is based on previous work by the research team in nanomaterials (NanoMine). The effort focuses upon two significant domain problems: discovery of factors controlling the dissipation peak in nanocomposites, and tailored mechanical response in metamaterials motivated by an application to personalize running shoes. The project will significantly improve the representation of data and the robustness with which user communities can identify promising materials applications. By expanding interaction of the nanocomposite and metamaterials communities with curated data resources, the project enables new collaborations in materials discovery and design. Strong connections with the National Institute of Standards and Technology (NIST), the Air Force Research Laboratory (AFRL), and Lockheed Martin facilitate industry and government use of the resulting knowledge base.
The project develops an open source Materials Knowledge Graph (MKG) framework. The framework for materials includes extensible semantic infrastructure, customizable user templates, semi-automatic curation tools, ontology-enabled design tools and custom user dashboards. The work generalizes a prototype data resource (NanoMine) previously developed by the researchers, and demonstrates the extensibility of this framework to metamaterials. NanoMine enables annotation, organization and data storage on a wide variety of nanocomposite samples, including information on composition, processing, microstructure and properties. The extensibility will be demonstrated through creation of a MetaMine module for metamaterials, parallel to the NanoMine module for nanocomposites. The frameworks will allow for curation of data sets and end-user discovery of processing-structure-property relationships. The work supports the Materials Genome Initiative by creating an extensible data ecosystem to share and re-use materials data, enabling faster development of materials via robust testing of models and application of analysis tools. The capability will be compatible with the NIST Material Data Curator System, and the team also engages both AFRL and Lockheed Martin to facilitate industry and government use of the resulting knowledge base.
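To make the knowledge-graph idea concrete, here is a minimal hypothetical sketch (the node names, relations, and schema are invented for illustration and are not NanoMine's or MetaMine's actual data model) showing how processing-structure-property facts about nanocomposite samples could be stored as labeled triples and queried:

    # Hypothetical sketch of the knowledge-graph idea (names and schema are invented here,
    # not taken from NanoMine/MetaMine): store samples, constituents, and measured
    # properties as labeled triples, then answer questions with simple graph traversals.
    import networkx as nx

    kg = nx.MultiDiGraph()
    triples = [
        ("sample_001", "has_matrix", "epoxy"),
        ("sample_001", "has_filler", "silica_nanoparticles"),
        ("sample_001", "processed_by", "melt_mixing"),
        ("sample_001", "has_property", "tan_delta_peak_0.85"),
        ("sample_002", "has_matrix", "epoxy"),
        ("sample_002", "has_filler", "carbon_nanotubes"),
        ("sample_002", "has_property", "tan_delta_peak_0.42"),
    ]
    for subject, relation, obj in triples:
        kg.add_edge(subject, obj, relation=relation)

    # Query: which samples use an epoxy matrix, and what properties were measured for them?
    for sample in [s for s, o, d in kg.edges(data=True) if d["relation"] == "has_matrix" and o == "epoxy"]:
        props = [o for _, o, d in kg.out_edges(sample, data=True) if d["relation"] == "has_property"]
        print(sample, "->", props)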
This award by the Office of Advanced Cyberinfrastructure is jointly supported by the Division of Materials Research within the NSF Directorate for Mathematical and Physical Sciences.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
1 |
2019 — 2022 |
Mukherjee, Sayan [⬀] Rudin, Cynthia Calderbank, Arthur Lu, Jianfeng (co-PI) [⬀] Ge, Rong (co-PI) [⬀] |
HDR TRIPODS: Innovations in Data Science: Integrating Stochastic Modeling, Data Representations, and Algorithms
This award supports TRIPODS@Duke Phase I, a project that will develop the foundations of data science both at Duke University and in the broader NC Research Triangle and surrounding region. A total of 25 faculty at Duke representing the disciplines of Computer Science, Electrical Engineering, Mathematics, and Statistical Science will be involved in Phase I. Activities include five semesters of workshops, with 3-4 one-week workshops each semester. These workshops will involve local and national participants and will bring experts on data science to the area. The project will support graduate students and postdoctoral trainees, both through education in the foundations of data science and through professional development. Educational activities include the development and teaching of data science across curricula in Computer Science, Electrical and Computer Engineering, Mathematics, and Statistical Science, both at the undergraduate and graduate levels. The project will also leverage existing data science programs, including the Rhodes Information Initiative at Duke, a center for "big data" computational research and expanding opportunities for student engagement in data science; and the Statistical and Applied Mathematical Sciences Institute (SAMSI), one of the NSF/DMS-funded Mathematical Sciences Research Institutes (MSRIs), which is a partnership among Duke University, North Carolina State University (NCSU), and the University of North Carolina at Chapel Hill (UNC).
The topics of the signature workshops supported by the TRIPODS@Duke Phase I project are (1) scalable inference with uncertainty, (2) causal inference, (3) neural networks, (4) complex and dynamic image and signal processing, and (5) interpretable models. These five topics all fall under three research themes that require transdisciplinary collaborations among computer scientists, electrical engineers, mathematicians, and statisticians: Theme I: Scalable algorithms with uncertainty for data science; Theme II: Data science at the human-machine interface; and Theme III: Fundamental limits of data science. The potential research innovations that will be developed or advanced under the three themes include: for Theme I, scalable Bayesian and generalized Bayesian inference, robust optimization for uncertain inputs, and algorithm and architecture design for neural networks; for Theme II, interpretable models and algorithms, causal inference with high-dimensional complex observational data, and image and signal processing for screening and monitoring; and for Theme III, robust optimization for uncertain inputs, statistical and approximation power of deep neural network architectures, and fundamental limits of causal inference in observational studies.
This project is part of the National Science Foundation's Harnessing the Data Revolution (HDR) Big Idea activity.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
1 |
2020 — 2025 |
Banks, David Brinson, L. Rudin, Cynthia Curtarolo, Stefano (co-PI) [⬀] Guilleminot, Johann |
NRT-HDR: Harnessing AI for Understanding & Designing Materials (aiM)
Over the last decade, there has been a shift in materials science research from slow individual experiments and computation to the beginnings of accelerated data-driven artificial intelligence (AI) approaches. Yet to achieve the promise of rapid discovery, design, and application of new materials, the development of a new-generation workforce trained at the nexus of AI and materials is essential. This National Science Foundation Research Traineeship (NRT) award to Duke University, AI for understanding and designing Materials (aiM), will provide integrated training for both materials and computer scientists to advance the research and training frontiers of this new convergent field. Students will develop expertise in AI and materials science through a new curriculum bridging disciplines, linked with convergent research, professional skills, and external internships. This NRT will fill a critical gap in the advanced manufacturing workforce, facilitating future on-demand materials development for vital societal applications in flexible electronics, biomedical implants, infrastructure development, and many other areas. A total of 50 PhD students will be trained in the aiM program, 25 of whom will be NRT-funded, drawn from degree programs in computer science, data science, statistical science, and all materials disciplines, including materials science, physics, chemistry, and all engineering fields. The program aims to broaden participation of women and underrepresented minorities by recruiting a diverse group of undergraduates and promoting retention through culturally aligned mentoring and an inclusive climate.
The aiM program will deliver core elements designed to equip trainees with competitive 21st century professional and technical workplace skills. These core elements include: (1) newly developed transdisciplinary courses fusing data and materials science with problem- and project-based learning; (2) experiential learning through real-world application in internships with national lab or industry partners; and (3) professional development through boot camps, workshops, mentoring, outreach opportunities, and industry networking events. Students from both materials and computer-science domains will gain critical in-depth cross-training that integrates knowledge and methods across disciplines and enables development of new frameworks for discovery and innovation. New research frontiers will incorporate computational methods for different material classes, growing materials data warehouses for simulated and experimental data, and development and improvement of AI methods for scientific discovery. This NRT will impact students far beyond Duke through development of parallel open online course modules based on the fundamentals and applications of the "AI for materials" coursework and an annual aiM Challenge in which teams across the world can compete on a common materials data problem.
The NSF Research Traineeship (NRT) Program is designed to encourage the development and implementation of bold, new potentially transformative models for STEM graduate education training. The program is dedicated to effective training of STEM graduate students in high priority interdisciplinary or convergent research areas through comprehensive traineeship models that are innovative, evidence-based, and aligned with changing workforce and research needs.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
1 |
2021 |
Rudin, Cynthia |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
A Machine Learning Framework for Understanding Impacts on the HIV Latent Reservoir Size, Including Drugs of Abuse
The major obstacle to curing HIV infection is a durable and persistent latent reservoir of infected cells. The latent HIV reservoir is not eliminated by antiretroviral therapy (ART), and ART interruption results in uncontrolled virus rebound within weeks. Despite the importance of this reservoir, little is known about the biological parameters that influence it, or the effects of recreational drug use on it. While the size of the HIV reservoir is fairly stable within individuals, it varies greatly (up to 1000-fold) between individuals, suggesting that host factors influence its size. These factors likely include a complex set of genes, transcriptional pathways, immune cell populations, and environmental influences, including drugs of abuse. Cannabinoid (CB) use, in particular, is prevalent amongst persons with HIV (PWH), with up to 49% of PWH reporting regular use. However, the impact of CBs on the HIV reservoir has not been fully investigated. CBs have immuno-modulatory and anti-inflammatory activities through activation of the CB2 receptor, which is widely expressed in immune cells, including the CD4 T cells that harbor most of the HIV reservoir. Our hypothesis is that CB interacts with host pathways and factors that impact the size of the HIV reservoir. Due to the complex nature of the interaction of CB with the host immune system, new computational tools are required to achieve a deep understanding of how CB impacts both the host immune system and the HIV reservoir. Our goal is to develop a novel framework for heterogeneous data integration, including new tools for dimension reduction and interpretable machine learning, and apply it to data from three HIV cohort studies (US-UNC, Switzerland, and US-Duke). This approach will reveal relationships between host characteristics and HIV reservoir size, both in the presence and absence of CB use. In Aim 1, we develop dimension reduction (DR) tools, with application to heterogeneous data from the US-UNC PWH cohort. In Aim 2, we develop a powerful new interpretable machine learning technique, alternating decision trees (adtrees), and apply it to data from a large Swiss PWH cohort study to reveal factors that determine HIV reservoir size. In Aim 3, both tools will be applied to data from a unique cohort of CB-using PWH at Duke University, to explain the effects of cannabis on the immune system of PWH and on the latent HIV reservoir.
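The adtrees method and the cohort data described above are not public; as a generic, hypothetical stand-in, the sketch below fits a shallow decision tree to synthetic data to show the kind of explicit, human-readable rules an interpretable model of reservoir size would produce (the feature names and labels are invented for illustration only):

    # Generic stand-in (not the grant's adtrees method, and the data here are synthetic):
    # a shallow decision tree illustrates the kind of human-readable rule list an
    # interpretable model of reservoir size might produce from host/immune features.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, export_text

    rng = np.random.default_rng(0)
    n = 200
    # Hypothetical features: CD4 count, an inflammation marker, and a cannabinoid-use flag.
    X = np.column_stack([rng.normal(600, 150, n), rng.normal(2.0, 0.8, n), rng.integers(0, 2, n)])
    # Synthetic "large reservoir" flag, loosely tied to the features for illustration only.
    y = ((X[:, 1] > 2.2) & (X[:, 0] < 650)).astype(int)

    tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
    print(export_text(tree, feature_names=["cd4_count", "inflammation", "cb_use"]))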
|
1 |
2021 — 2023 |
Rudin, Cynthia |
EAGER: Creating an Unsupervised Interpretable Representation of the World Through Concept Disentanglement
Humans are able to break down a large entity into smaller and simpler concepts, just from having seen many objects and their relationships. Reproducing this type of behavior in a machine learning model has several benefits. In particular, it could lead to computational ways of representing the world that are interpretable yet powerful. These new representations could be used within machine learning algorithms, allowing the algorithms to be more robust and more likely to generalize when the underlying situations change. For instance, if an algorithm has found the collection of parts that an object is typically composed of, then it can use those parts to identify this type of object even when it is in an unusual setting, or when the object itself is unusual. This new way of representing the world will allow more robust and generalizable machine learning models. This will be particularly helpful for difficult challenges in computer vision, including problems related to vision systems in automated vehicles, analysis of medical time-series, and materials science problems related to the understanding of material properties and discovery of new materials.
Specifically, the main goal of this project is learning with interpretable learned concepts using a disentangled neural network. The approach breaks the problem into three steps, each of which should be manageable and can be checked and improved independently of the others. The steps are to decompose each observation into local parts, identify possible concepts by looking at common relationships between the local parts, and align the proposed concepts, based on their semantic meaning, within a disentangled neural network. The discovered concepts will be interpretable and can be used as features for many downstream tasks. The disentangled neural networks built from these concepts could potentially generalize more easily to new situations than other approaches.
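As a loose, hypothetical illustration of the first two steps only (this is a generic patch-clustering sketch, not the project's disentanglement method), the code below cuts synthetic images into local parts, clusters recurring parts into candidate concepts, and summarizes each image by its concept usage:

    # Loose illustration of the first two steps only (decompose observations into local
    # parts, then group recurring parts into candidate concepts); this is a generic
    # patch-clustering sketch, not the project's disentanglement method.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(1)
    images = rng.random((10, 32, 32))          # synthetic grayscale "images"

    def extract_patches(img, size=8, stride=8):
        """Cut an image into non-overlapping local parts."""
        return [img[i:i + size, j:j + size].ravel()
                for i in range(0, img.shape[0] - size + 1, stride)
                for j in range(0, img.shape[1] - size + 1, stride)]

    parts = np.array([p for img in images for p in extract_patches(img)])
    concepts = KMeans(n_clusters=5, n_init=10, random_state=0).fit(parts)

    # Each cluster center is a candidate "concept"; each image is then summarized by
    # how often each concept appears among its parts.
    histograms = np.array([np.bincount(concepts.predict(np.array(extract_patches(img))), minlength=5)
                           for img in images])
    print(histograms)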
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
1 |
2021 — 2022 |
Rudin, Cynthia |
NSF Workshop on Seamless/Seamful Human-Technology Interaction
This project will hold a workshop and publish a report addressing seams, that is, significant junctures among people and technologies. Seams involve the ways that different modalities, mechanisms, and situations of interaction are connected. Researchers across fields will gather and collaborate to address and formulate emerging research questions. Where do seams arise? How can they be stitched together? How can consideration of seams enable developing new relationships among people and technologies? When do we want our interactions with technology to be seamless, that is, for the seams to be invisible to human participants? When might we want seamful design, which makes seams visible, reminding the humans of gaps? The workshop report will be published on the workshop website, and in prominent archival venues, such as through the ACM and IEEE.
This workshop and report authoring will involve researchers across fields, including human-computer interaction, machine learning and data science, haptics, internet of things, wearables, computer graphics and computer vision, cognitive science, design, and computational creativity to consider seamless and seamful design addressing and combining: high-dimensional data analysis, computational creativity, interactive art, interaction design, exploration of machine learning model spaces, communication awareness, buildings and urban spaces, computational companions, and user modeling. Research contributions are expected to take forms such as ideas for new research initiatives, implications for design, frameworks, and theories.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
1 |
2022 — 2026 |
Lo, Joseph (co-PI) [⬀] Rudin, Cynthia |
FW-HTF-R: Interpretable Machine Learning for Human-Machine Collaboration in High-Stakes Decisions in Mammography
The specific objectives of the Future of Work at the Human-Technology Frontier program are (1) to facilitate convergent research that employs the joint perspectives, methods, and knowledge of computer science, engineering, learning sciences, research on education and workforce training, and social, behavioral, and economic sciences; (2) to encourage the development of a research community dedicated to designing intelligent technologies and work organization and modes inspired by their positive impact on individual workers, the work at hand, the way people learn and adapt to technological change, creative and supportive workplaces (including remote locations, homes, classrooms, or virtual spaces), and benefits for social, economic, and environmental systems at different scales; (3) to promote deeper basic understanding of the interdependent human-technology partnership to advance societal needs by advancing design of intelligent work technologies that operate in harmony with human workers, including consideration of how adults learn the new skills needed to interact with these technologies in the workplace, and by enabling broad workforce participation, including improving accessibility for those challenged by physical or cognitive impairment; and (4) to understand, anticipate, and explore ways of mitigating potential risks arising from future work at the human-technology frontier.
Breast cancer is one of the most common causes of illness and death in the US and worldwide. Breast cancer screening programs using annual mammography have been highly successful in lowering the overall burden of advanced cancers. In response to increasing caseloads, artificial intelligence is being widely adopted in the field of radiology. So far, these artificial intelligence systems have been opaque in the way they work, and when they make mistakes, radiologists find it difficult to understand what went wrong. This project seeks to design an artificial intelligence system that can explain its reasoning process for deciding whether a woman’s mammograms contain a breast lesion that is suspicious. This system can improve human-machine interactions by helping radiologists to make better decisions of whether to recommend that the woman undergo a biopsy. It can also help to educate medical students and other trainees. Ultimately, this system can lead to better patient care, impacting both academic and community-based clinical practice.
This project does not aim to replace radiologists with black box models: its models are decision aids, rather than decision makers, following along the reasoning process that radiologists must use when deciding whether to recommend a biopsy. The approach includes the design of novel deep learning architectures that perform case-based reasoning with tailored definitions of interpretability. These models do not lose accuracy when compared to their black box counterparts. Separate models are proposed for each of the mammographic tasks of classifying mass margin, mass shape, and mass density. An important aspect of the project includes building user-interface tools for radiologists to provide fine annotation, which mitigates the harmful effects of confounding. The models' innate interpretability will allow for better troubleshooting and easier analysis, which will be transformative for not only computer-aided diagnosis in medical imaging but also computer vision in general. Wide implementation of interpretable artificial intelligence in the medical field will be a game changer for human-machine interaction and can improve efficiency in the healthcare sector, helping not only to manage workloads for physicians but also to improve the quality of patient care.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
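The project's architectures are not specified in this abstract; the toy sketch below (with invented prototypes and features) conveys only the general flavor of case-based reasoning, in which a new finding is explained by its similarity to previously seen prototype cases:

    # Hedged sketch of the case-based-reasoning idea (invented prototypes and features,
    # not the project's architecture): a query case is scored by its similarity to a
    # small set of labeled prototype cases, and the explanation is the nearest prototype.
    import numpy as np

    # Hypothetical learned feature vectors for prototype cases, with human-readable tags.
    prototypes = {
        "spiculated_margin_malignant":  np.array([0.9, 0.1, 0.8]),
        "circumscribed_margin_benign":  np.array([0.1, 0.9, 0.2]),
        "indistinct_margin_suspicious": np.array([0.6, 0.4, 0.7]),
    }

    def explain(query_features):
        """Rank prototypes by cosine similarity to the query's feature vector."""
        sims = {name: float(np.dot(query_features, p) /
                            (np.linalg.norm(query_features) * np.linalg.norm(p)))
                for name, p in prototypes.items()}
        return sorted(sims.items(), key=lambda kv: kv[1], reverse=True)

    query = np.array([0.85, 0.15, 0.75])       # features of a new mammographic finding
    for name, similarity in explain(query):
        print(f"{similarity:.2f}  looks like: {name}")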
|
1 |