2015 — 2018
Hazan, Elad
RI: Small: Efficient Projection-Free Algorithms for Optimization and Online Machine Learning
The advent of the Internet has given rise to exponential growth in data collection, availability, and complexity, and with it a growing need for more efficient data-analysis algorithms. Over super-scale datasets, the only feasible data-analysis techniques are iterative, linear-time, first-order optimization methods.
The computational bottleneck in applying these state-of-the-art iterative methods to machine learning and data analysis is often the so-called "projection step". This project addresses the need to design projection-free optimization algorithms that replace projections by more efficient linear optimization steps. A key contribution of the project is the continual dissemination and transfer of this technology. The open-source software releases will continue to enable large-scale machine learning applications in science and engineering. The broader impact goals of the project, beyond theory and algorithms, include the development of a textbook on efficient optimization techniques in machine learning, as well as the development of a new curriculum focused on preparing students for the scientific and engineering needs in this field.
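The abstract does not name a specific algorithm, but the classical projection-free method matching its description, replacing the projection with a linear-optimization step over the feasible set, is the Frank-Wolfe (conditional gradient) method. A minimal sketch over the probability simplex (the toy objective and all numbers here are illustrative, not taken from the project):

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, num_iters=2000):
    """Frank-Wolfe (conditional gradient) over the probability simplex.

    Instead of projecting onto the feasible set, each iteration solves a
    *linear* problem over it: min over s in the simplex of <grad(x), s>.
    Over the simplex the minimizer is simply the vertex e_i with
    i = argmin_i grad(x)_i, so the step is a single argmin.
    """
    x = x0.copy()
    for t in range(num_iters):
        g = grad(x)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0              # linear-optimization step: best vertex
        gamma = 2.0 / (t + 2.0)            # classical step-size schedule
        x = (1.0 - gamma) * x + gamma * s  # convex combination: stays feasible, no projection
    return x

# Toy problem: minimize ||x - c||^2 over the simplex, where c lies inside it,
# so the method should recover c.
c = np.array([0.2, 0.5, 0.3])
x = frank_wolfe_simplex(lambda x: 2.0 * (x - c), np.array([1.0, 0.0, 0.0]))
```

Because every iterate is a convex combination of simplex vertices, feasibility is maintained for free; this is what makes the method attractive when projections are expensive (e.g. onto spectrahedra or polytopes with combinatorial structure).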
2017 — 2022
Singer, Yoram (co-PI); Hazan, Elad; Arora, Sanjeev
AF: Large: Collaborative Research: Nonconvex Methods and Models for Learning: Toward Algorithms With Provable and Interpretable Guarantees
Artificial intelligence and machine learning are perhaps the most dominant research themes of our time, with far-reaching implications for society and our way of life. While the possibilities are many, there are also doubts about how far these methods can go, and about what new theoretical foundations may be required to take them to the next level and overcome possible hurdles. Recently, machine learning has undergone a paradigm shift, with increasing reliance on stochastic optimization to train highly non-convex models, including but not limited to deep nets. Theoretical understanding has lagged behind, primarily because most of the problems in question are provably intractable on worst-case instances. Furthermore, traditional machine learning theory is mostly concerned with classification, whereas much practical success is driven by unsupervised learning and representation learning. Most past theory of representation learning focused on simple models such as k-means clustering and PCA, whereas practical work uses vastly more complicated models such as autoencoders, restricted Boltzmann machines, and deep generative models. The proposal presents an ambitious agenda for extending theory to embrace and support these practical trends, with the hope of influencing practice. Theoretical foundations will be provided for the next generation of machine learning methods and optimization algorithms.
The project may end up having significant impact on practical machine learning, and may even cause a cultural change in the field, in theory as well as practice, with long-term ramifications. Given the ubiquity and the economic and scientific implications of machine learning today, such impact will extend into other disciplines, especially through (ongoing) collaborations with researchers in neuroscience. The project will train a new generation of machine learning researchers through an active teaching and mentoring plan at all levels, from undergraduate to postdoc. This new generation will be at ease combining cutting-edge theory and applications; there is a pressing need for such researchers today, and the senior PIs have played a role in training and mentoring several of the existing ones. Technical contributions will include new theoretical models of knowledge representation and semantics, as well as frameworks for proving convergence of nonconvex optimization routines. Theory will be developed to explain and exploit the interplay between representation learning and supervised learning that has proved so empirically successful in deep learning, and that seems to underlie new learning paradigms such as domain adaptation, transfer learning, and interactive learning. Attempts will be made to replace neural models with models having more "interpretable" attributes and performance curves. All PIs have a track record of combining theory with practice. They are also devoted to a heterodox research approach that borrows from all past phases of machine learning: interpretable representations from the earlier phases (which relied on logical representations or probabilistic models), provable guarantees from the middle phase (convex optimization, kernels, etc.), and an embrace of nonconvex methods from the latest deep-net phase. Such eclecticism is uncommon in machine learning, and may give rise to new paradigms and new kinds of science.
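The stochastic, nonconvex setting the abstract refers to can be made concrete with a toy example (entirely illustrative; the objective and parameters below are hypothetical, not from the project): plain SGD on a one-dimensional nonconvex loss with two global minima, using a noisy gradient oracle that mimics minibatch stochasticity.

```python
import numpy as np

rng = np.random.default_rng(0)

def sgd(grad, w0, lr=0.01, num_steps=5000, noise=0.1):
    """Plain SGD: the gradient oracle is perturbed by zero-mean
    Gaussian noise, standing in for minibatch sampling noise."""
    w = w0
    for _ in range(num_steps):
        w -= lr * (grad(w) + noise * rng.standard_normal())
    return w

# Nonconvex toy objective f(w) = (w^2 - 1)^2, with global minima at w = +1
# and w = -1; its gradient is f'(w) = 4 w (w^2 - 1).
w = sgd(lambda w: 4.0 * w * (w * w - 1.0), w0=0.5)
```

Despite the nonconvexity, SGD from this start settles near one of the two minima; worst-case intractability, as the abstract notes, does not preclude success on benign instances, and explaining that gap is precisely the kind of question the project targets.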
2021 — 2024
Hazan, Elad
Collaborative Research: Foundations of Deep Learning: Theory, Robustness, and the Brain
A truly comprehensive theory of machine learning has the potential to inform science and engineering in the same profound way Maxwell's equations did. It was the development of that theory by Maxwell that truly unleashed the potential of electricity, leading to radio, radar, computers, and the Internet. By analogy, deep learning (DL) has found many applications over the past decade, so far without a comprehensive theory. An eventual theory of learning that explains why and how deep networks work, and what their limitations are, may thus enable the development of even more powerful learning approaches, especially if the goal of reconnecting DL to brain research bears fruit. In the long term, the ability to develop and build better intelligent machines will be essential to any technology-based economy. After all, even in its current, still highly imperfect, state, DL is impacting, or is about to impact, just about every aspect of our society and life. The investigators also plan to complement their theoretical research with the educational goal of training a diverse population of young researchers, drawn from mathematics, computer science, statistics, electrical engineering, and computational neuroscience, in the field of machine learning and its theoretical underpinnings.
The investigators propose to join forces in a multi-pronged, collaborative assault on the profound mysteries of DL, informed by the sum of their experience, expertise, ideas, and insight. The research goals are threefold: to develop a sound foundational and mathematical understanding of DL; in doing so, to advance the foundational understanding of learning more generally; and to advance the practice of DL by addressing its above-mentioned weaknesses. Of six foundational thrusts, the first two focus on the standard decomposition of the prediction error into approximation error and sample (or estimation) error; their goal is to extend classical results in approximation theory and the theory of learnability to DL. These two are supported by a research project specific to deep learning: analysis of the dynamics of gradient descent in training a network. The fourth theme concerns robustness against adversaries and distribution shifts, a powerful test for theories that is also important for the practical deployment of learning systems. The fifth thrust is about developing the theory of control through DL, as well as exploring dynamical-systems aspects of deep reinforcement learning. The final topic connects research on DL to its origins, and possibly its future: networks of neurons in the brain. The proposed research also promises to advance the foundations of learning theory. Success in this project will yield sharper mathematical techniques for machine learning and comprehensive foundations of machine-learning robustness, broadly construed. It will also ultimately enable the development of learning algorithms that transcend deep learning, guide the way toward creating more intelligent machines, and shed new light on our own intelligence.
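The gradient-dynamics thrust mentioned above can be illustrated on the simplest nonconvex model of this kind (a purely hypothetical toy, not the project's actual subject): a depth-2 linear "network" f(x) = w2 * w1 * x, whose loss surface is nonconvex in the two weights jointly, yet whose gradient-descent trajectory from a generic start drives the loss to zero.

```python
def train_deep_linear(target=3.0, lr=0.05, num_steps=500):
    """Gradient descent on a depth-2 linear network f(x) = w2 * w1 * x.

    The loss L(w1, w2) = (w2*w1 - target)^2 is nonconvex in (w1, w2)
    jointly (it has a saddle at the origin), but from a generic
    initialization the gradient-descent trajectory converges to a
    global minimum, the kind of dynamics the thrust studies.
    """
    w1, w2 = 1.0, 0.5
    losses = []
    for _ in range(num_steps):
        e = w2 * w1 - target
        losses.append(e * e)
        g1 = 2.0 * e * w2   # chain rule: dL/dw1
        g2 = 2.0 * e * w1   # chain rule: dL/dw2
        w1, w2 = w1 - lr * g1, w2 - lr * g2
    return w1, w2, losses

w1, w2, losses = train_deep_linear()
```

One telling feature of these dynamics, visible even in this toy, is the conserved quantity w1^2 - w2^2 of the gradient flow, which determines how "balanced" the layers stay during training; such invariants are a standard tool in analyses of deep linear networks.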
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.