2003 — 2006
Jebara, Tony
ITR: Representation Learning: Transformations and Kernels for Collections of Tuples
Statistical machine learning tools permit scientists and engineers to automatically estimate computational models directly from real-world data for making predictions, classifications and inferences. However, before learning can take place, the practitioner needs to know how to properly represent data in a consistent, invariant and well-behaved numerical form for processing by the various techniques. This proposal reduces this burden and facilitates applications of machine learning via novel algorithms that not only model data but also automatically handle invariances and discover appropriate representations of the data.
This proposal formalizes a variety of potential transformations and embeds them within a principled learning framework, making it possible to handle real scenarios where data is transformed, translated, or changed nonlinearly in ways that defeat traditional tools. The set of transformations considered also includes permutations. Handling permutations allows algorithms to learn even when each data point is a collection of tuples whose ordering is arbitrary; for example, a digital color image can be represented as an unordered bag of pixels. This project then develops the algorithms necessary for invariance to permutations, re-orderings, translations, and many other transformations affecting data in practice.
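As a toy illustration of the bag-of-pixels idea (a minimal sketch, not the proposal's algorithm; the tuple format and kernel choice are assumptions), the following Python snippet compares two images represented as unordered sets of pixel tuples using a mean-map set kernel, which is invariant to any reordering of the tuples:

import numpy as np

def rbf(a, b, gamma=0.1):
    # Pairwise RBF similarities between rows of a and rows of b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def bag_kernel(A, B, gamma=0.1):
    # Mean-map set kernel: average similarity over all cross-pairs,
    # hence invariant to any permutation of the rows of A or B.
    return rbf(A, B, gamma).mean()

rng = np.random.default_rng(0)
bag1 = rng.random((50, 5))      # 50 pixels, each a hypothetical (x, y, r, g, b) tuple
bag2 = rng.permutation(bag1)    # the same pixels in an arbitrary new order
print(np.isclose(bag_kernel(bag1, bag1), bag_kernel(bag1, bag2)))  # True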
The proposal makes use of and extends state-of-the-art techniques in machine learning, including Bayesian networks, kernel methods and convex programming. Primary applications include face recognition and surveillance. To recognize facial identity in video, the proposed algorithms compensate for the transformations faces undergo as they translate, deform and rotate in real images. Standardized and challenging face recognition datasets are used for evaluation. The representation learning methods also facilitate machine learning in general, with potential impact in other applied fields where data undergoes transformations, including computational vision, speech, time series analysis and bioinformatics.
2004 — 2010
Jebara, Tony
CAREER: Discriminative and Generative Machine Learning With Applications in Tracking and Gesture Recognition
This project aims to develop a tighter integration of generative (e.g., Bayesian networks) and discriminative (e.g., support vector machines) machine learning. These two types of learning are typically not integrated in current practice and are often seen as competing approaches; however, such integration is crucial in complex multi-disciplinary domains, such as vision, speech, and computational biology, where scientists have real expertise and exploitable knowledge about elaborate systems yet also need machine learning to achieve optimal performance for specific tasks. This project's integrated generative-discriminative framework will allow practitioners to flexibly design and structure a given learning problem using generative tools like Bayesian networks and then to maximize the performance of these models using discriminative methods like maximum entropy and probabilistic kernels. This framework will be used in computer vision tracking applications and in classification of gestures. In a laparoscopic robotic surgery platform, the methods will be used for classifying surgical drill movements and predicting surgeon dexterity level. The project's unified approach will be used to create a more comprehensive machine learning course experience for students, complete with online class materials, visual demonstrations and software toolkits.
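One concrete way to couple the two paradigms, in the spirit of the probabilistic kernels mentioned above (a hedged sketch; the Gaussian model and the expected-likelihood form are illustrative assumptions, not the project's specific method), is to fit a small generative model to each example and then compare the fitted models with a closed-form kernel that a discriminative classifier such as an SVM can consume:

import numpy as np
from scipy.stats import multivariate_normal

def fit_gaussian(X):
    # Tiny generative model per example: sample mean and regularized covariance.
    return X.mean(axis=0), np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])

def expected_likelihood_kernel(p, q):
    # k(p, q) = integral of p(x) q(x) dx; for two Gaussians this equals the
    # density N(mu_p; mu_q, S_p + S_q), evaluated here in closed form.
    (mu_p, S_p), (mu_q, S_q) = p, q
    return multivariate_normal.pdf(mu_p, mean=mu_q, cov=S_p + S_q)

rng = np.random.default_rng(0)
seq_a = rng.normal(size=(30, 2))         # e.g., one tracked gesture trajectory
seq_b = rng.normal(size=(40, 2)) + 1.0   # another, shifted
print(expected_likelihood_kernel(fit_gaussian(seq_a), fit_gaussian(seq_b)))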
2011 — 2014
Chudnovsky, Maria (co-PI); Jebara, Tony
RI: Small: Learning and Inference With Perfect Graphs
This proposal investigates new computational and combinatorial tools for learning from data and performing inference about data; as such, it serves as a bridge between the machine learning and combinatorics communities. The latter field develops efficient algorithms or characterizes when an efficient solution to a particular problem exists. The project will support graduate students at the intersection of these two fields to combine recent theoretical results with practical data problems. The team will produce a website of tutorials, data sets, downloadable code, interactive visual examples and course materials to allow better integration across the two fields. The combinatorics community has identified a family of inputs, called perfect graphs, on which otherwise hard problems are efficiently solvable. This proposal will investigate how to bring this powerful theoretical result into machine learning and formally characterize which learning and inference problems are efficiently solvable.
More specifically, the team will investigate data learning problems (such as clustering a data set into self-similar subgroups or finding anomalies in a database) and statistical inference problems (computing the most likely or most typical outcome from observed measurements). These problems will be compiled, or represented, as graphs and networks. The resulting graphs can then be formally diagnosed using perfect graph theory to determine whether a problem instance is easy or hard, and efficient solutions can be provided via exact algorithms on perfect graphs, implemented using message passing or convex programming. A toy instance of this compile-to-graph idea appears below.
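The sketch below (illustrative only; the conflict graph and its weights are invented for the example, and it assumes networkx >= 2.5, whose max_weight_clique routine requires integer node weights) casts a small MAP-style selection problem as a maximum-weight independent set, solved by finding a maximum-weight clique in the complement graph. On perfect graphs, such as this bipartite path, the problem is solvable in polynomial time:

import networkx as nx

G = nx.Graph()                                   # conflict graph over hypotheses
G.add_nodes_from([(0, {"weight": 3}), (1, {"weight": 2}),
                  (2, {"weight": 4}), (3, {"weight": 1})])
G.add_edges_from([(0, 1), (1, 2), (2, 3)])       # edges = mutual exclusion

H = nx.complement(G)                             # complement drops node data,
for n in H:                                      # so copy the weights over
    H.nodes[n]["weight"] = G.nodes[n]["weight"]

# Maximum-weight independent set in G = maximum-weight clique in complement(G).
best_set, total = nx.max_weight_clique(H, weight="weight")
print(best_set, total)                           # [0, 2] with weight 7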
2013 — 2016
Kaiser, Gail (co-PI); Jebara, Tony; Sethumadhavan, L.
SHF: Medium: Overcoming the Intuition Wall: Automatic Graphical Analysis of Programs to Discover and Program New Computer Architectures
Workload characterization is central to the development of new computer architectures. The rise of the mobile-cloud paradigm has increased the diversity of applications and the rate at which they are created, challenging computer architects' ability to build systems optimized for them. In the past, when releases were few and far between, architects could examine software codes of interest (often through slow, laborious manual inspection) to derive the intuition necessary for architectural and microarchitectural discoveries. But this method does not scale to emerging applications that are produced by the hundreds each day. Further, new languages and platforms behave quite differently from legacy codes, and there is an urgent need for intuition about these applications. Without new methods to characterize emerging workloads, computer architects risk running into an intuition wall. This risk could prove calamitous if unmitigated, given the added reliance on (micro)architects to deliver more energy-efficient designs to compensate for the losses due to the slowdown of Dennard scaling.
Advances in machine learning provide an opportunity to overcome the intuition wall. In the last decade there have been major advances in machine learning on graphs, motivated by the benefits of mining behaviors in social networks and enabled by cheap commodity computing. In this project, the PIs plan to leverage these advances to discover and program new computer architectures. By viewing program execution as a graph, clustering these graphs, and mining them for similarities, the PIs plan to discover new behaviors that architects and microarchitects can use to develop new on-chip acceleration structures. The PIs also plan to study how legacy code can be semi-automatically converted to execute on architectures with the new accelerators.
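As a hedged sketch of the execution-as-graph view (the trace format and similarity measure are assumptions for illustration, not the project's pipeline), each execution trace below becomes a bag of directed transition edges between basic blocks, and traces are compared with cosine similarity, a crude graph similarity that a clustering step could then consume:

from collections import Counter
import math

def edge_histogram(trace):
    # Count directed edges between consecutive basic blocks in the trace.
    return Counter(zip(trace, trace[1:]))

def cosine(h1, h2):
    # Cosine similarity between two sparse edge-count histograms.
    dot = sum(h1[e] * h2[e] for e in h1.keys() & h2.keys())
    norm = math.sqrt(sum(v * v for v in h1.values())) * \
           math.sqrt(sum(v * v for v in h2.values()))
    return dot / norm if norm else 0.0

trace_a = ["entry", "loop", "loop", "loop", "exit"]   # hypothetical traces
trace_b = ["entry", "loop", "loop", "exit"]
print(cosine(edge_histogram(trace_a), edge_histogram(trace_b)))  # ~0.94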
2014 — 2016
Jebara, Tony
EAGER: New Optimization Methods for Machine Learning
This proposal explores the optimization of the complicated nonlinear objectives that underlie machine learning problems by reducing them to simpler, easy-to-solve update rules. The learning problems include classification, regression, unsupervised learning and more. Through a method known as majorization, complicated optimization problems are handled by iteratively solving simpler subproblems such as least squares and traditional linear algebra operations. The proposal focuses on how to parallelize this method so that it can efficiently leverage many CPUs/GPUs simultaneously and in a distributed manner. Furthermore, by making the method stochastic, faster convergence on large or streaming data sets becomes possible. Other variations are explored, such as sparse learning, where the recovered solution is forced to be compact, which leads to further efficiency.
The vast majority of machine learning problems in the literature are currently optimized using generic first- and second-order methods. The approach in this proposal is designed specifically for machine learning optimization problems and uses majorization and bounding to guarantee monotonic convergence. In preliminary work, majorization has produced faster convergence in practice as well as novel theoretical guarantees. To make the method truly viable in practice, this proposal puts forward distributed, parallel, stochastic and sparse extensions. Since such extensions may violate the monotonic convergence guarantees, the proposal explores additional algorithmic and theoretical efforts to preserve guarantees while also obtaining fast algorithms. In particular, parallel and distributed computation are performed by wrapping current state-of-the-art least-squares solvers with bound majorization steps. Stochastic computation is explored using singleton, small-batch and variable-sized batch methods. Sparsity is achieved by iterating current large-scale sparse solvers such as FISTA and QUIC within the bound majorization technique. In terms of broader impact, one graduate student will be supported and will help produce downloadable tools for machine learning experts as well as practitioners. Modules will be developed to add to the PI's existing courses in machine learning. The PI will organize a one-day workshop on majorization methods. The project also provides a public website with access to research publications, software/data downloads and schedules of upcoming events.
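A minimal sketch of the majorization idea (a textbook bound for logistic regression, not the proposal's full algorithm): because the logistic Hessian X^T D X satisfies D <= I/4, the fixed matrix X^T X / 4 majorizes the curvature, so each iteration minimizes a quadratic upper bound on the loss by solving a least-squares-style linear system, with a guaranteed monotonic decrease:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_mm(X, y, iters=100):
    n, d = X.shape
    w = np.zeros(d)
    H = X.T @ X / 4.0 + 1e-8 * np.eye(d)      # fixed majorizing curvature
    for _ in range(iters):
        grad = X.T @ (sigmoid(X @ w) - y)     # gradient of the logistic loss
        w -= np.linalg.solve(H, grad)         # closed-form bound minimizer
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X @ np.array([1.5, -2.0, 0.5]) + rng.normal(size=200) > 0).astype(float)
print(fit_logistic_mm(X, y))                  # recovers roughly proportional weights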
2015 — 2018
Jebara, Tony
III: Small: Collaborative Research: Approximate Learning and Inference in Graphical Models
This project is designing efficient methods for approximate learning and prediction in graphical models. In a typical setting, the parameters of an unknown graphical model are to be estimated from data observations; once learned, the parameters are often used to make predictions about unseen data. The learning problem can be solved by estimating the model parameters under which the observed data has the highest probability (a process known as maximum likelihood estimation), and the prediction task is typically performed by a statistical inference method. As exact learning and prediction are computationally intractable, in practice we seek to replace the maximum likelihood estimation and prediction tasks with more tractable surrogates. This project is developing surrogates that (a) can be leveraged for learning in large, real-world graphical models with hidden variables, (b) are orders of magnitude faster than current state-of-the-art methods, and (c) provide a rigorous alternative to the more heuristic methods often employed at scale.
Under certain conditions, surrogates can outperform exact maximum likelihood estimation combined with an approximate inference algorithm for prediction. However, many of the typical approaches are much too slow or too limited in power to be used to learn the kinds of large-scale graphical models with many hidden variables that arise in practice. This project studies the design of fast, distributed approximate learning and inference procedures based on the Bethe approximation, a surrogate that is known to perform well in practice. The core observation is that approximate maximum likelihood estimation using the Bethe surrogate can be reduced to solving a series of approximate inference problems via the Frank-Wolfe algorithm. The benefit of this approach is that many fast, combinatorial algorithms already exist for approximate inference in graphical models. In addition to provable bounds and convergence rates, the methods are practically evaluated using several publicly available datasets, primarily social network and image data. Baselines will be application-specific methods known to work well on those datasets, with the developed code made publicly available.
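The Frank-Wolfe reduction can be pictured with a schematic loop (a toy objective over the probability simplex stands in for the Bethe surrogate here; the linear minimization oracle plays the role of the combinatorial inference call, and all names below are illustrative):

import numpy as np

def frank_wolfe(grad_f, lmo, x0, iters=200):
    x = x0
    for t in range(iters):
        s = lmo(grad_f(x))            # oracle: cheapest vertex under the gradient
        gamma = 2.0 / (t + 2.0)       # standard step size, O(1/t) convergence
        x = (1 - gamma) * x + gamma * s
    return x

# Toy instance: minimize f(x) = ||x - b||^2 over the probability simplex,
# whose linear minimization oracle just picks the best vertex.
b = np.array([0.8, 0.5, -0.2])
grad_f = lambda x: 2.0 * (x - b)
lmo = lambda g: np.eye(len(g))[np.argmin(g)]   # vertex minimizing <g, s>
x0 = np.ones(3) / 3.0
print(frank_wolfe(grad_f, lmo, x0))            # approx. (0.65, 0.35, 0.0)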