2021 — 2022
Dovrolis, Constantinos; Dyer, Eva
EAGER: Using Network Analysis and Representational Geometry to Learn Structure-Function Relationship in Neural Networks @ Georgia Tech Research Corporation
Over the past few years, neural networks have revolutionized the fields of computer vision and natural language processing and are now becoming commonplace in many scientific domains. Despite their successes, understanding how to design or build a neural network solution remains challenging and often devolves into a game of guess-and-check. This process is inefficient and, in the end, provides no insight into why a model is good or bad. Thus, new approaches are needed to characterize the relationship between structure (how the network is constructed) and function (how the network performs on a task) in neural networks, and to use this information to design learning systems that are more efficient and stable. The overarching goal of this project is to develop tools to model the relationship between the structure and function of deep neural networks. This project will generate a rich toolkit for extracting low-dimensional features from neural networks and will produce new insights that can be used to drive progress in the future design of systems capable of modifying their own architecture to adapt to new data streams.
Given the dimensionality of the problem, the discovery of compact (low-dimensional) representations and metrics that can adequately capture signatures of "learning" will be critical. When learning is unsuccessful, these metrics will be used to diagnose problems inherent to the network structure, such as its depth, width, and density of connections. The first part of the project will use tools from network science to discover how concepts such as network sparsity or path diversity between inputs and outputs affect the network's learning performance and efficiency (e.g., the number of examples required to learn a modular task, or whether the network can learn continually without catastrophic forgetting). The second part of the project will develop tools to study how the geometry of representations formed within networks can be used to predict learning outcomes.
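To make these measurements concrete, the following is a minimal sketch (in Python, using only NumPy) of the kinds of structural and geometric metrics the two thrusts describe: weight sparsity and input-to-output path counts on a toy sparse feedforward network, and a pairwise-distance matrix over hidden representations. The toy network and the function names (weight_sparsity, input_output_path_count, representation_geometry) are illustrative assumptions, not the project's actual tooling.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 3-layer feedforward network with random ~30%-dense connectivity.
W1 = rng.normal(size=(16, 8)) * (rng.random((16, 8)) < 0.3)  # input -> hidden1
W2 = rng.normal(size=(8, 8)) * (rng.random((8, 8)) < 0.3)    # hidden1 -> hidden2
W3 = rng.normal(size=(8, 4)) * (rng.random((8, 4)) < 0.3)    # hidden2 -> output

def weight_sparsity(weights):
    """Fraction of absent (zero) connections: a simple structural metric."""
    total = sum(W.size for W in weights)
    zeros = sum(int((W == 0).sum()) for W in weights)
    return zeros / total

def input_output_path_count(weights):
    """Count directed paths from each input unit to each output unit.
    Multiplying the binary adjacency matrices layer by layer counts
    paths, a crude proxy for path diversity between inputs and outputs."""
    A = (weights[0] != 0).astype(int)
    for W in weights[1:]:
        A = A @ (W != 0).astype(int)
    return A  # A[i, j] = number of paths from input i to output j

def representation_geometry(X, weights):
    """Pairwise-distance matrix of hidden representations: the kind of
    geometric signature the second thrust would relate to learning."""
    H = np.maximum(X @ weights[0], 0)        # ReLU hidden layer 1
    H = np.maximum(H @ weights[1], 0)        # ReLU hidden layer 2
    diffs = H[:, None, :] - H[None, :, :]
    return np.linalg.norm(diffs, axis=-1)    # (n_samples, n_samples)

X = rng.normal(size=(32, 16))                # toy input batch
Ws = [W1, W2, W3]
print("sparsity:", round(weight_sparsity(Ws), 3))
print("mean input->output path count:", input_output_path_count(Ws).mean())
print("distance matrix shape:", representation_geometry(X, Ws).shape)
```

In this framing, a metric like the path count summarizes structure while the distance matrix summarizes function-relevant geometry; the project's premise is that low-dimensional summaries of this kind can diagnose and predict learning outcomes.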
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
2022 — 2025
Davenport, Mark; Goldstein, Thomas (co-PI); Dyer, Eva; Muthukumar, Vidya
CIF: RI: Medium: Design Principles and Theory for Data Augmentation @ Georgia Tech Research Corporation
Generalization, or the ability to transfer knowledge from one context to the next, is a hallmark of human intelligence. In artificial intelligence (AI), however, models trained in one setting often fail when tested in a new setting, even if the shift is minor or imperceptible. To build more generalizable AI, most modern methods employ some form of data augmentation (DA), which applies transformations to the data to create virtual samples that are then added to the dataset. Synthesizing new examples in this way appears to endow AI models with helpful properties such as invariance (resistance to change under certain natural transformations) and robustness to new tasks as well as to noise in existing tasks. Despite the promise and performance of DA procedures, they are mostly applied in an ad hoc manner and must be designed and tested on a dataset-by-dataset basis; a set of fundamental principles and a theory for understanding DA and its impact on model training and testing is lacking. To address this outstanding challenge, the investigators will provide a precise understanding of the impact of DA on generalization and leverage this understanding to design novel augmentations that can be used across multiple applications and domains.

In this project, the investigators propose a principled mathematical framework to 1) understand when DA helps and when it could potentially hurt learning, 2) understand the structure induced by DA and characterize what makes high-quality augmentations, and 3) provide novel, systematic, and scalable design principles for augmenting data in new domains where prior knowledge is lacking. These design principles will significantly broaden the applicability and promise of DA from computer vision to new domains (e.g., neural data, graphs, and tabular data) where principled augmentations are still not known. Of special focus in this project will be applications of DA to neural activity, where augmentations have shown promise in building a more generalizable link between the brain and behavior. This research will also yield prescriptions for the role of DA in advancing fairness, accountability, and transparency in modern machine learning.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
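As a concrete illustration of the DA mechanism the abstract describes, here is a minimal sketch assuming two generic label-preserving transforms (Gaussian jitter and random feature dropout) on tabular-style data. The helpers augment and augment_dataset are hypothetical names; the principled augmentations this project targets for neural, graph, and tabular data would be designed from theory rather than chosen ad hoc.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(x, noise_std=0.1, dropout_p=0.1):
    """Create one virtual sample from x via two label-preserving transforms."""
    x_jit = x + rng.normal(scale=noise_std, size=x.shape)  # Gaussian jitter
    mask = rng.random(x.shape) >= dropout_p                # random feature dropout
    return x_jit * mask

def augment_dataset(X, y, n_copies=2, **kwargs):
    """Append n_copies augmented versions of every sample, reusing its label."""
    X_parts = [X] + [np.stack([augment(x, **kwargs) for x in X])
                     for _ in range(n_copies)]
    y_parts = [y] * (n_copies + 1)
    return np.concatenate(X_parts), np.concatenate(y_parts)

X = rng.normal(size=(100, 20))      # toy dataset: 100 samples, 20 features
y = rng.integers(0, 2, size=100)    # binary labels
X_big, y_big = augment_dataset(X, y, n_copies=2)
print(X_big.shape, y_big.shape)     # (300, 20) (300,)
```

The project's open questions map directly onto this sketch: when do transforms like these help or hurt generalization, what makes one setting of noise_std or dropout_p a higher-quality augmentation than another, and how should the transforms themselves be chosen in domains where visual intuition offers no guidance.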