2000 — 2005 |
Krishnamurthy, Arvind |
N/A |
Career: Compiler Aided Resource Management For Clusters
Both researchers in high performance parallel computing and vendors of scalable server systems are actively studying the performance of hierarchical computer systems. There is now a consensus that the structure of the various software and hardware layers of a system (e.g. machine architecture, operating system, compilers, and user programs) has a great impact on the overall performance. Traditionally, systems have supported resource management at low levels of the hierarchy to allow better use of machine-specific features. More recently, higher-level libraries and application-specific operating systems have taken more of this responsibility so that they can use program behavior to determine resource use. Both approaches have disadvantages: the former is too application-independent, while the latter requires too many application-specific modules. This project avoids these problems by using compile-time information, program transformations, and smart runtime systems to efficiently use system resources. To do this, the compiler will rely on developments in Java mobile code, which will avoid pitfalls that have trapped previous research projects in this area.
The compiler that this project produces will analyze program-level behavior, generate program-specific system management code, and choose appropriate management policies automatically. It will make three important contributions: new compiler algorithms to analyze and optimize a program's utilization of system resources, precise models of network and storage devices to aid in resource management, and "wide" interfaces to system layers to enable management policies to be chosen by the compiler or runtime system. The project will integrate this research understanding with education by incorporating a wide variety of state-of-the-art tools into coursework. These tools will enable the students to better understand the tradeoffs between static and dynamic approaches to computing.
|
1 |
2002 — 2004 |
Krishnamurthy, Arvind Nilsson, Henrik Scassellati, Brian (co-PI) |
N/A |
Composing Data-Rich Embedded Systems the Easy Way
Arvind Krishnamurthy and Henrik Nilsson Department of Computer Science, Yale University
Embedded computing increasingly takes place in a sensor-rich environment where the acquisition of raw information is much easier than its interpretation. In addition to building components to process this data, embedded systems programmers must also arrange the communication among potentially hundreds of components, distribute computation to the processing elements so as to minimize communication costs and maximize responsiveness, and schedule the processing elements to adapt to changing priorities and communication patterns. These challenges must be addressed at the system level rather than the processor level.
This project develops a framework for composing distributed, data-rich embedded systems that automates many of the low-level process allocation and scheduling tasks. It takes place against a backdrop of a "next generation" humanoid robot currently being developed at Yale. The robot contains a significant number of processors connected in a heterogeneous fashion. The project addresses two fundamental research issues. The first is the use of modern programming language techniques to address critical embedded system concerns such as composability and dynamic configuration change. The second is improving overall system performance by exploiting high level system knowledge in the run-time system. The end result will be a design methodology that will enable rapid and reliable construction of complex data-rich interactive systems.
|
1 |
2002 — 2005 |
Feigenbaum, Joan Shenker, Scott (co-PI) Krishnamurthy, Arvind Yang, Yang (co-PI) |
N/A |
Incentive-Compatible Designs For Distributed Systems
This project pursues theoretical and practical results on mechanisms that are incentive-compatible, scalable, and distributed. Specifically, distributed algorithmic mechanism design with insights from game theory is proposed for three related problems in networking: interdomain routing, web caching, and peer-to-peer file sharing. The research program on interdomain routing will develop a fundamentally new approach in which many of the routing-related incentive issues are handled by incentive-compatible protocols rather than bilateral contracts; such protocols can more effectively address the system-wide issues of efficient routing and conflicting policy requirements. The project will also apply recently developed techniques for digital-goods auctions to the peer-to-peer file sharing problem and to the design of incentive-compatible caching mechanisms. This project will help to better understand the behavior of large-scale, distributed information systems formed by autonomous components, such as the Internet, and to develop incentive-compatible algorithms for these systems accordingly.
|
1 |
2006 — 2011 |
Lazowska, Edward (co-PI) Anderson, Thomas Krishnamurthy, Arvind |
N/A |
Mri Development: Enabling Lightweight Planetary Scale Services @ University of Washington
This project, developing a virtual PlanetLab (an emulated PlanetLab), targets verifying whether an experiment attains the same performance and failure behavior as when run directly on the PlanetLab/GENI system. The work involves network emulation, building a toolkit to ease development and programming on PlanetLab, and investigating control and management plane services. At present, only with extreme effort on the part of the researcher is this verification theoretically possible with Emulab. In fifteen years the Internet has gone from an obscure research network known to the academic community to a critical piece of national infrastructure; but, because its architecture is unable to quickly adapt to meet emerging challenges, it is becoming the victim of its own success. Vulnerabilities are being increasingly exploited, limiting assimilation of new technologies and support of new applications. To foster the development of a new generation of distributed systems and network protocols, researchers around the world have created a testbed called PlanetLab, enabling research into truly planetary scale services where each researcher may utilize a virtualized slice across a widely distributed set of nodes. PlanetLab currently spans three hundred separate locations worldwide hosting over four hundred active research projects. Consequently, to support network and distributed systems research, NSF CISE has recently proposed constructing GENI (Global Environment for Network Innovations). This work aims to
- Dramatically improve the cost-effectiveness of planetary scale testbeds such as PlanetLab and GENI, and
- Reduce the startup time for new PlanetLab/GENI researchers to develop and deploy an experiment.
A transactional service manager that can adapt to and survive the myriad types of node failures encountered in practice, simple job and pipe control, and a distributed debugger are expected to facilitate the use of PlanetLab. The software toolkit enables research in
- Robust content distribution, real-time multimedia delivery, security, routing,
- Network embedded storage and file sharing, distributed information management,
- Internet measurement, distributed resource allocation, and network layer modifications.
|
1 |
2007 — 2011 |
Bransford, John (co-PI) Anderson, Richard Anderson, Thomas (co-PI) Krishnamurthy, Arvind |
N/A |
Technologies For Cooperative Learning in Rural India @ University of Washington
There is a great deal of interest in using new technologies to help developing nations improve their citizens' quality of life. Yet without a careful analysis of existing social and cultural norms and infrastructures, there is a great danger that technology-based "solutions" will fail. This grant will fund a collaboration between technologists, educators, and learning scientists in India and the US to design, implement, and evaluate distance learning systems that help resource-starved primary schools in rural villages benefit from the better human and content resources available in urban environments. The basis of the project is a combination of YouTube, NetFlix, and file sharing: excellent teachers are videotaped in the classroom demonstrating good pedagogy while teaching the same subject matter as is taught in village schools. These video clips will be automatically assembled and distributed using DVDs and the postal system to village schools, where the videos are used for on-site teacher training and mediated instruction by the local teacher.
Over a billion people on the planet are illiterate, in large part due to the lack of trained teachers in rural villages where most children in poverty live. Finding trained teachers for primary school education is nearly impossible in rural areas, despite these skills being essential for upward mobility in today's economy. Of course, distance education is nowhere near as good as a qualified teacher at the primary school level, at least with current technologies, but many students don't have that luxury. The goal of this project is to improve the existing educational baseline in a cost-effective way, using solutions that can be scaled to match the scope of the problem.
|
1 |
2007 — 2011 |
Anderson, Thomas Krishnamurthy, Arvind |
N/A |
Csr-Pdos: Scalable Peer-to-Peer Data Dissemination @ University of Washington
When the Internet protocols were standardized in the late 80s, most distributed applications were point-to-point, such as email, telnet, and ftp. Over the last few years, the nature of Internet traffic has changed dramatically. Driven by the production and distribution of vast amounts of user-created content, current Internet traffic is overwhelmingly many-to-many, leading to a mismatch between application demand and the underlying protocols.
This project is to develop a peer-to-peer solution, called OneSwarm, to serve as a universal communication layer for multi-point communication. By relying solely on software running on end-hosts, OneSwarm will avoid the high operational cost of infrastructure solutions and the deployability issues of clean-slate redesigns to the network architecture. Among the open questions that need answers: can swarming techniques work and meet desired levels of service quality when content files are no longer large, when there are strict deadlines on delivery, when the communication is not all-to-all, or when different nodes have different needs, all without working at cross-purposes to ISP objectives? This proposal will also investigate incentives to encourage end hosts to contribute resources to support demanding applications, by allowing users to trade resources across time, across swarms, and across applications.
|
1 |
2008 — 2013 |
Gribble, Steven Krishnamurthy, Arvind |
N/A |
Ct-M: a Real-Time Botnet Monitoring Infrastructure @ University of Washington
Large-scale botnets have become a blight on the Internet. Botnets engage in a variety of harmful activities, including initiating DDoS attacks, committing click fraud, propagating adware, and sending enormous volumes of spam. Though there is an increasing awareness of botnets, there are gaping holes in our understanding of botnets, both in terms of macroscopic properties as well as the ability to track and thwart specific attacks.
As part of this project, we develop a response to the botnet threat by building a monitoring system that gathers and distributes objective data on the problem. Our work offers three novel contributions. First, we solve many of the challenges involved in building a real-time botnet monitoring platform. For example, our system executes live botnet nodes, and as such, it must prevent these nodes from causing harm to other hosts on the Internet. Second, we implement several prototype defensive tools that take advantage of the real-time information provided by the platform. Third, our work exposes the rich texture of the botnet ecosystem by analyzing botnets from multiple perspectives and by correlating the attack vectors with observations of real bots executed in laboratory settings.
Our botnet monitoring platform advances our understanding of botnets and enables promising anti-botnet defense tactics. It thus serves as a crucial step in the development of a trustworthy network that can support a much wider diversity of uses than can be found on today's Internet.
|
1 |
2010 — 2016 |
Anderson, Thomas Krishnamurthy, Arvind |
N/A |
Csr: Medium: Very Large Scale Consistent Dhts @ University of Washington
Modern distributed systems increasingly rely on distributed storage and lookup services. Existing storage and lookup solutions, however, provide only a subset of high availability, high scalability, and strong consistency semantics. In this project, we develop Harmony, a distributed hash table (DHT) aimed at providing very large scale applications with highly scalable, consistent and available storage and lookup. We address many significant challenges in delivering on this goal. First, the Internet has unpredictable node and communication failures, and preserving consistency in this context is a difficult task. To address this issue, we provide the abstraction of a collection of self-managing groups that coordinate to ensure atomic updates to distributed state. Second, consistency often comes at the cost of reduced availability. In our system, consistency is an inviolable safety property; availability is provided through replication. Third, coordination mechanisms for consistent replication and atomic updates often result in performance penalties. A key insight in our work is that one can improve performance with coordination mechanisms for delegation and autonomous execution. Finally, by isolating most of the communication necessary to preserve consistency within a group, we both simplify our implementation and improve its scalability. Further, adaptation to changing workloads and resource constraints is easier because it can take place within a framework for the consistent update of distributed state. If successful, the resulting storage abstraction should greatly simplify the development of complex distributed applications.
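The idea of partitioning a DHT's key space among self-managing groups can be illustrated with a minimal sketch. The abstract does not describe Harmony's actual lookup or coordination protocol, so everything here is an assumption for illustration: the `GroupRing` class, the group names, and the use of SHA-256 are hypothetical. The sketch shows only how a key might be routed to the one group responsible for it, and leaves out replication, consensus, and atomic cross-group updates.

```python
import bisect
import hashlib


class GroupRing:
    """Hypothetical key-to-group routing on a hash ring (not Harmony's real design)."""

    def __init__(self, group_names):
        # Place each group at a point on the ring derived from its name.
        self._points = sorted((self._hash(name), name) for name in group_names)
        self._hashes = [h for h, _ in self._points]

    @staticmethod
    def _hash(value: str) -> int:
        # Use the first 8 bytes of SHA-256 as a 64-bit ring position.
        return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

    def group_for(self, key: str) -> str:
        # The responsible group is the first one clockwise from the key's position;
        # the modulo wraps around to the first group at the end of the ring.
        idx = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._points)
        return self._points[idx][1]


ring = GroupRing(["group-a", "group-b", "group-c"])
owner = ring.group_for("user:42")
```

Because every key maps to exactly one group, the communication needed to keep that key consistent can stay inside the group, which is the property the abstract credits for simplifying the implementation.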
|
1 |
2010 — 2014 |
Anderson, Thomas Krishnamurthy, Arvind |
N/A |
Fia: Collaborative Research: Nebula: a Future Internet That Supports Trustworthy Cloud Computing @ University of Washington
Cloud computing provides economic advantages from shared resources, but security is a major risk for remote operations and a major barrier to the approach, with challenges for both hosts and the network. NEBULA is a potential future Internet architecture providing trustworthy networking for the emerging cloud computing model of always-available network services. NEBULA addresses many network security issues, including data availability with a new core architecture (NCore) based on redundant connections to and between NEBULA core routers, accountability and trust with a new policy-driven data plane (NDP), and extensibility with a new control plane (NVENT) that supports network virtualization, enabling results from other future Internet architectures to be incorporated in NEBULA. NEBULA's data plane uses cryptographic tokens as demonstrable proofs that a path was both authorized and followed. The NEBULA control plane provides one or more authorized paths to NEBULA edge nodes; multiple paths provide reliability and load-balancing. The NEBULA core uses redundant high-speed paths between data centers and core routers, as well as fault-tolerant router software, for always-on core networking. The NEBULA architecture removes network (in)security as a prohibitive factor that would otherwise prevent the realization of many cloud computing applications, such as electronic health records and data from medical sensors. NEBULA will produce a working system that is deployable on core routers and is viable from both an economic and a regulatory perspective.
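A rough sketch can make the notion of a token that proves a path was "both authorized and followed" more concrete. The abstract does not specify NEBULA's token format or key management, so the scheme below is purely an assumption: each router on the path holds a secret key, folds the packet and the previous token into an HMAC, and a verifier that knows the keys of the authorized path recomputes the chain. The function names and keys are hypothetical.

```python
import hashlib
import hmac


def make_token(key: bytes, packet: bytes, prev_token: bytes) -> bytes:
    # Each hop binds the packet to the token produced by the previous hop.
    return hmac.new(key, prev_token + packet, hashlib.sha256).digest()


def traverse(path_keys, packet: bytes) -> bytes:
    # Simulate a packet crossing the routers in order, accumulating the token chain.
    token = b""
    for key in path_keys:
        token = make_token(key, packet, token)
    return token


def verify(authorized_keys, packet: bytes, token: bytes) -> bool:
    # Assumption: the verifier shares a key with every router on the authorized path.
    expected = traverse(authorized_keys, packet)
    return hmac.compare_digest(expected, token)


authorized = [b"router-1-key", b"router-2-key", b"router-3-key"]
good_token = traverse(authorized, b"payload")
# A packet that skipped router 2 yields a different token.
bad_token = traverse([b"router-1-key", b"router-3-key"], b"payload")
```

In this toy scheme a skipped or reordered hop changes the final token, so the check fails unless the packet took exactly the authorized route.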
|
1 |
2010 — 2013 |
Krishnamurthy, Arvind |
N/A |
Eager: Collaborative: Aster*X: Load-Balancing Web Traffic Over Wide-Area Networks @ University of Washington
This project proposes a comprehensive load-balancing solution to minimize client response time and reduce system costs for services hosted in wide-area networks. The system, called Aster*x, uses the global state of server load and network congestion, and dynamically routes the requests over appropriate (server and path) pairs, calculated using the load-balancing algorithms developed by project staff.
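The selection of a (server, path) pair from global load and congestion state can be sketched as follows. The abstract does not give the project's load-balancing algorithms, so the cost model here is an assumption made only for illustration: response time is estimated as a queueing-style service term that grows as the server nears saturation, plus the current path delay. The `Candidate` type, its fields, and `base_service_ms` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    server: str
    path: str
    server_load: float    # fraction of server capacity in use, 0.0 to 1.0 (assumed metric)
    path_delay_ms: float  # current estimated delay on this path (assumed metric)


def estimated_response_ms(c: Candidate, base_service_ms: float = 10.0) -> float:
    # Hypothetical cost model: service time inflates as utilization approaches 1.
    utilization = min(c.server_load, 0.99)
    return base_service_ms / (1.0 - utilization) + c.path_delay_ms


def pick_pair(candidates):
    # Route the request over the pair with the lowest estimated response time.
    return min(candidates, key=estimated_response_ms)


candidates = [
    Candidate("server-1", "path-a", server_load=0.9, path_delay_ms=5.0),
    Candidate("server-2", "path-b", server_load=0.5, path_delay_ms=20.0),
]
best = pick_pair(candidates)
```

Here the nearby but heavily loaded server loses to a more distant, lightly loaded one, which is the kind of trade-off a joint view of server load and network congestion makes possible.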
The GENI network infrastructure will be used for extensive deployment, evaluation, and demonstration of Aster*x. Aster*x exploits OpenFlow's logically centralized controller to obtain the global network state and route flows of various granularities. It will use the PlanetLab and ProtoGENI-based computation substrate to host the replicated web service and to generate client requests from multiple locations. The project will provide an opportunity for students across four universities to collaborate and build a relatively large experimental system on GENI.
|
1 |
2012 — 2014 |
Trosvig, Kelli Krishnamurthy, Arvind |
N/A |
Eager: University of Washington (Uw) Openflow Research @ University of Washington
This project will deploy a pilot OpenFlow network (a form of Software-Defined Network [SDN]) for the Computer Science and the Physics and Astronomy buildings at the University of Washington. This capability will enable high-performance layer-2 connections to the desktop for both GENI experimentation and for high performance computing applications. The project will provide an opportunity to develop an operational model for the innovations promised by OpenFlow and advance the understanding of how to integrate OpenFlow into campus networking infrastructures. Travel budget is included for sharing and presentations of findings to the broad GENI community and non-technical user communities.
Although OpenFlow is currently being widely discussed and is a key element in the GENI architecture, there is little operational or campus-level architectural experience with using it. In particular, the integration of OpenFlow and other SDN technology into science DMZs and other issues of campus security and policy are not fully understood in operational contexts. The project outcomes include reporting of results and lessons to other campus network operators and to SDN researchers and industry.
Broader Impact: OpenFlow and other software defined networking approaches have the potential to transform high-capacity data transfer and networking in our emerging world of data-driven discovery. The UW's proposed OpenFlow deployment, along with the commitment of a full-time engineer devoted to managing the facility in an operational network setting, will advance our understanding of these methodologies and serve to inform our broader campus-wide network planning and deployment efforts prospectively. The University of Washington will share data from these pilot deployments with both the University of Washington research community and more broadly with the Research and Education Network Community, including at one NSF sponsored conference for such purposes.
|
1 |
2013 — 2017 |
Choffnes, David Krishnamurthy, Arvind |
N/A |
Nets: Small: Automated Diagnosis and Root Cause Analysis of Internet Problems @ University of Washington
Reliable Internet performance and availability are essential for many existing and future network applications. While the Internet works well enough most of the time for most people, nearly everyone has experienced outages and service degradation that make the network unusable, and we are far from five nines of reliability that critical services require. Improving Internet connectivity requires action against all sources of unavailability and poor performance.
The research community has made substantial progress toward understanding and developing technologies to address short-term outages due to BGP (border gateway protocol) routing convergence. However, much less progress has been made at reducing the impact of long-term outages and route misconfiguration. Despite being rare, these events have a large impact on overall network availability because repairs happen on a human timescale. Additionally, many users suffer from the use of sub-optimal (high latency or lossy) paths to network services due to misconfigurations and ineffective route selection. Operators at an affected ISP or service often encounter stumbling blocks at each step: identifying that a problem exists, localizing the root cause of the problem, and effecting a repair.
The researchers on this project will develop a system to transform this largely manual troubleshooting process into a fully automated one. The goal of the research is that persistent outages and performance problems can be identified in real time, rather than in a matter of hours as they are today. While automated diagnosis and identification of root cause is fundamentally hard, the project will benefit from dramatic recent progress in Internet measurement technologies, specifically reverse path measurement that provides a much more complete picture of the Internet topology than ever before.
Intellectual Merit: The goal of the research project is to change the paradigm of network diagnosis on the Internet -- from blind to informed. The state of the art in network troubleshooting is to use ad hoc techniques. For instance, it is a common occurrence on the NANOG (North American Network Operators' Group) mailing list for operators to post requests asking other operators to manually issue traceroutes and report them in order to identify network anomalies. The network could thus benefit from a continuously operated service that can not only detect network problems in real time but also identify misbehaving network elements at the granularity of routers. There are also a number of challenges to deploying a functional diagnosis system, and the researchers will address them using the following key components. First, the project will produce a scalable measurement system that will synthesize measurements from different techniques to provide snapshots of routing behavior in real time. Second, the research will focus on developing a general theory of Internet path changes that will help model the propagation of routing events and identify the candidate set of responsible ASes (autonomous systems). Third, the researchers will develop inference techniques that will operate on measured data and identify the origin of failures and path changes in the wide area even when the measurement data is incomplete or subject to transient dynamics.
Broader Impact: Our society is increasingly relying on the Internet for critical telecommunications services, such as home health monitoring, e-911, smart grids, and so forth. It is no longer simply an inconvenience when the Internet is unavailable or inefficient. If this project is successful, it will help operators address the major sources of unavailability and misconfigurations in the Internet, benefiting all of its users. In addition, because of a lack of automated tools, operators currently spend huge amounts of time chasing down individual outages and performance misconfigurations; this raises the barrier to entry for small ISPs, ultimately raising the costs of Internet service for everyone.
|
1 |
2014 — 2015 |
Krishnamurthy, Arvind |
N/A |
Student Travel Support For the Eleventh Symposium On Networked Systems Design and Implementation (Nsdi 2014) @ University of Washington
This project funds travel support for students to attend the Eleventh Symposium on Networked Systems Design and Implementation (NSDI 2014) that will be held in Seattle, WA, on April 2-4, 2014. The conference focuses on the design principles of large-scale networks and distributed systems and, in particular, challenges that are shared across systems as diverse as internet routing, peer-to-peer file sharing, sensor nets, scalable web services, and distributed network measurement. The goal of the conference is to bring together researchers from across the networking and systems community -- including computer networking, distributed systems, and operating systems -- to foster cross-disciplinary approaches and to address shared research challenges. This cross-disciplinary emphasis makes it well-suited as a target for student participation. The awards permit a more qualified and diverse group of student attendees than would otherwise have been possible. Students attending the conference benefit from both the technical content of the conference and the professional interaction with senior researchers from other universities and with industry.
Participation in NSDI 2014 is a valuable and important part of the graduate school experience. It provides students with the opportunity to interact with more senior researchers in the field, and exposes students to leading edge work in the field. This project will enable the participation of students who would otherwise be unable to attend NSDI. NSDI 2014 will offer many opportunities for students to meet, interact, and exchange ideas in an informal atmosphere, including with members of the program committee, speakers, and organizers who are on site, as well as a poster session and small evening birds-of-a-feather sessions.
|
1 |
2014 — 2017 |
Krishnamurthy, Arvind |
N/A |
Csr: Small: Towards Blocking Resistant Network Services @ University of Washington
Unfettered digital communication, as provided by the Internet, has fundamentally changed the world in countless ways. Businesses, organizations, and citizens have benefited from the Internet's global reach. However, the Internet was not designed to be resilient to censorship, and governments have restricted communication to advance their social and economic agendas. Worse, network equipment providers have shown a willingness to commoditize and profit from censorship by selling interception and filtering devices. The desire for uncensored access to the Internet has motivated the development of techniques such as one-hop proxies and anonymizing networks, but existing techniques have multiple limitations, ranging from low performance and intermittent availability to reduced levels of security and application flexibility. This project will develop and deploy a censorship resistant peer-to-peer system that overcomes the limitations of existing systems and provides unrestricted access to popular web sites and Internet services.
The advent of social networks provides researchers with an opportunity to take a fresh look at peer-to-peer systems for censorship resistance. The focus of this project is to use social trust relations as a foundation for achieving highly secure and blocking resistant overlays for supporting network services, while also approaching the ease of use, performance, and reliability typically associated with direct communications. Specifically, the system will use social overlays wherein participants form a peer-to-peer overlay network bootstrapped based on real-world trust relationships. Participants of the social overlay would be able to directly communicate and share content, without having to expose their actions and without being blocked. A key design choice is also to develop the system to operate completely within a web browser, leveraging newly developed technologies that provide support for peer-to-peer communications inside the browser. Browser-based end-user solutions are easily deployable without requiring significant infrastructural costs. There are several key, interrelated challenges that the system will overcome. Principal among these are: performance, flexibility, trustworthiness, and robustness.
|
1 |
2015 — 2018 |
Wang, Xi (co-PI) Anderson, Thomas Krishnamurthy, Arvind |
N/A |
Csr: Cc: Large: a High-Performance Data Center Operating System @ University of Washington
Today, many popular cloud data center applications spend a huge fraction of their time running operating system code. Examples include the backend services implementing Facebook, Google, Amazon, and other popular websites. These applications spend much of their time moving, processing, and storing data. In a traditional operating system, however, the operating system kernel mediates all network and storage access, as a means to provide application, data, and network security within the data center. Thus, the code path for a network or storage request to traverse from application code through the kernel to the device hardware (and back again) is many times longer than the minimum required.
This work streamlines the performance of these applications without compromising security by changing the roles of the operating system kernel, application runtime library, and device hardware. Specifically, the traditional role of the kernel is split in two. Applications have direct access to virtualized I/O devices, allowing most I/O operations to skip the kernel entirely. The kernel operates primarily in the control plane, establishing and limiting data plane connections in accordance with the operating system security policy.
The work has the potential for dramatic improvements in application and server performance, as well as data center energy consumption and protocol flexibility. Network and storage intensive data center applications are used by literally billions of people around the globe on a daily basis. Reducing the network and storage overhead of these applications reduces the hardware needed to support existing services, making it cheaper to develop new services.
|
1 |
2015 — 2016 |
Krishnamurthy, Arvind |
N/A |
Student Travel Support For the Twelfth Symposium On Networked Systems Design and Implementation (Nsdi 2015) @ University of Washington
This project funds travel support for students to attend the Twelfth Symposium on Networked Systems Design and Implementation (NSDI 2015) that will be held in Oakland, CA, on May 4-6, 2015. The conference focuses on the design principles of large-scale networks and distributed systems and, in particular, challenges that are shared across systems as diverse as internet routing, peer-to-peer file sharing, sensor nets, scalable web services, and distributed network measurement. The goal of the conference is to bring together researchers from across the networking and systems community -- including computer networking, distributed systems, and operating systems -- to foster cross-disciplinary approaches and to address shared research challenges. This cross-disciplinary emphasis makes it well-suited as a target for student participation. The travel support permits a more qualified and diverse group of student attendees than would otherwise have been possible. Students attending the conference benefit from both the technical content of the conference and the professional interaction with senior researchers from other universities and with industry.
Participation in NSDI 2015 is a valuable and important part of the graduate school experience. It provides students with the opportunity to interact with more senior researchers in the field, and exposes students to leading-edge work in the field. This project will enable the participation of students who would otherwise be unable to attend NSDI. NSDI 2015 will offer many opportunities for students to meet, interact, and exchange ideas in an informal atmosphere with members of the program committee, speakers, and organizers who are on site, as well as a poster session.
|
1 |
2016 — 2019 |
Krishnamurthy, Arvind |
N/A |
Csr: Small: Enabling Deep Neural Networks For Mobile-Cloud Applications @ University of Washington
Over the past three years, Deep Neural Networks (DNNs) have become the dominant approach to solving a variety of important problems in computing. These include problems in speech recognition, machine translation, handwriting recognition, and many computer vision problems such as face, object, and scene recognition. Although they are renowned for their excellent recognition performance, DNNs are also known to be computationally intensive: networks commonly used for speech, visual and language understanding tasks routinely consume hundreds of MB of memory and Gflops of computing power, typically the province of server-class computers. However, the relevance of the above applications to the mobile setting and the potential for developing new applications provide a strong case for executing DNNs on mobile devices.
This project will build an execution framework for deep neural networks on mobile-cloud platforms so as to enable a broad class of emerging applications such as continuous mobile vision. In particular, this work will look at enabling a large suite of DNN-based face, scene and object processing algorithms based on applying DNNs to video streams from wearable devices. This framework, given an arbitrary DNN, will compile it down to a resource-efficient variant at modest loss in accuracy. The project's plans include developing novel techniques to specialize DNNs to contexts and to share resources across multiple simultaneously executing DNNs. Finally, it will create a run-time system for managing the optimized models generated. Using the challenging continuous mobile vision domain as a case study, the plan is to demonstrate that these techniques yield very significant reductions in DNN resource usage, including orders of magnitude reduction in memory use and instructions executed, in common mobile settings.
|
1 |
2016 — 2017 |
Krishnamurthy, Arvind |
N/A |
Student Travel Support For the Thirteenth Symposium On Networked Systems Design and Implementation (NSDI 2016) @ University of Washington
This award is to assist approximately 25 US-based graduate students to attend the Thirteenth Symposium on Networked Systems Design and Implementation (NSDI 2016) in Santa Clara, CA, on March 16-18, 2016. The conference focuses on the design principles of large-scale networks and distributed systems and the common challenges across diverse systems. This conference brings together researchers from across the networking and systems communities including computer networking, distributed systems, and operating systems to foster cross-disciplinary approaches and to address shared research challenges.
The purpose of this trip is to provide students with the opportunity to interact with more senior researchers in the field and to expose students to leading-edge work in the field. The support requested in this proposal will enable the participation of students who would otherwise be unable to attend NSDI.
|
1 |
2016 — 2019 |
Wang, Xi (co-PI) [⬀] Krishnamurthy, Arvind |
N/A |
Nets: Small: Software-Defined Data Plane For Datacenters @ University of Washington
Many popular cloud applications that power our daily lives, such as Facebook, Google, Amazon, and Twitter, spend a large fraction of their time performing network operations. Even supposedly computationally intensive applications such as parallel machine learning are often limited by communication performance. Datacenter network designers have been racing to keep up with this increased reliance on network usage, but they have had only limited success in achieving desirable system properties as they are often constrained by the lack of switch support for deploying new protocols.
Fortunately, innovation in switch and network interface card (NIC) design is now focused on building not just faster but also more flexible packet processing devices. Whereas traditional switches and NICs typically provide functionality to route and forward packets, many upcoming switches have support for transforming the packet as well as performing computations on the packet before routing it towards its destination. These devices have the potential to revolutionize the use of networking devices within datacenters as they provide the ability to reconfigure packet processing and deploy new protocols.
This project will investigate flexible packet processing functionality on NICs and switches and their potential for optimizing high performance networked systems inside the datacenter. This work provides new abstractions and building blocks for the use of flexible packet processing pipelines, while respecting the hardware constraints that will be associated with these technologies. The researchers will examine how the resulting data plane functionality can be used to implement resource allocation mechanisms inside the datacenter so as to enable congestion control, performance isolation, adaptive routing, and efficient load balancing. They will also examine implementing on the NICs and switches some of the packet processing traditionally done in end-host software. The goal is to show how, and by how much, flexible packet processing can benefit widely used datacenter applications, and also to provide guidance on what features future iterations of the hardware should provide in order to significantly improve application performance.
Broader Impact: Network-intensive datacenter applications are used by literally billions of people around the globe on a daily basis. By improving the efficiency of network operations, results from this project can dramatically reduce the cost of provisioning existing public services, like Wikipedia, as well as make it much cheaper for new public services to be developed. By enabling the deployment of more effective resource allocation protocols on flexible switches, this work also has the potential to provide substantial improvements to the networking performance within datacenters. The researchers will publicly release the developed software and enable a rich set of network protocols and high-performance datacenter applications.
|
1 |
2017 — 2018 |
Krishnamurthy, Arvind |
N/A |
Student Travel Support For the Fourteenth Symposium On Networked Systems Design and Implementation (NSDI 2017) @ University of Washington
This award is to support student travel to the Fourteenth Symposium on Networked Systems Design and Implementation (NSDI 2017), which will be held in Boston, MA, on March 27-29, 2017. The conference focuses on the design principles of large-scale networks and distributed systems and, in particular, challenges that are shared across systems as diverse as internet routing, peer-to-peer file sharing, sensor nets, scalable web services, and distributed network measurement.
NSDI 2017 will offer many opportunities for students to meet, interact, and exchange ideas in an informal atmosphere, including a welcome reception with members of the program committee, speakers, and organizers who are on site; small evening birds-of-a-feather sessions (including WiAC, the Women in Advanced Computing meetup); and more. USENIX also enlists the help of student volunteers to serve as scribes to summarize the talks and Q&A at all conference sessions. These reports are published in an issue of ;login:, giving additional value to the students via a deeper understanding of the material, writing experience, and interactions with speakers and other students collaborating on these reports.
|
1 |
2017 — 2020 |
Krishnamurthy, Arvind |
N/A |
Csr: Small: Enabling in-Network Computation For Datacenter Applications @ University of Washington
The emergence of programmable network devices, such as reconfigurable switches and customizable network accelerators, along with the increasing traffic in data centers, motivates the use of in-network computation. Today's latest reconfigurable switches support configurable per-packet processing, including customizable packet headers, customizable packet processing, and the ability to maintain state inside the switch. Given this hardware trend, this project seeks to offload computing operations onto intermediate networking devices for a broad range of application services ranging from distributed storage to big data analytics and distributed machine learning, thus optimizing the operations of data center applications.
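To make the idea of offloading onto a stateful switch more concrete, here is a minimal Python sketch of one possible offload: a switch that keeps a small key-value cache and answers repeated reads itself. This is an illustrative assumption, not the project's design; `Packet`, `Server`, `Switch`, and the cache policy are all hypothetical names and choices.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    op: str                      # custom header field: "GET" or "PUT"
    key: str
    value: Optional[str] = None


class Server:
    """Stand-in for the application server behind the switch."""

    def __init__(self):
        self.store = {}
        self.requests_seen = 0

    def handle(self, pkt: Packet) -> Optional[str]:
        self.requests_seen += 1
        if pkt.op == "PUT":
            self.store[pkt.key] = pkt.value
            return pkt.value
        return self.store.get(pkt.key)


class Switch:
    """Hypothetical switch with per-packet processing and local state."""

    def __init__(self, server: Server, capacity: int = 2):
        self.server = server
        self.capacity = capacity
        self.cache = {}          # state kept inside the switch

    def process(self, pkt: Packet) -> Optional[str]:
        if pkt.op == "GET" and pkt.key in self.cache:
            return self.cache[pkt.key]       # answered in the network
        reply = self.server.handle(pkt)      # otherwise forward
        if pkt.op == "PUT":
            self.cache.pop(pkt.key, None)    # drop stale cached copy
        elif reply is not None and len(self.cache) < self.capacity:
            self.cache[pkt.key] = reply
        return reply


server = Server()
switch = Switch(server)
switch.process(Packet("PUT", "k", "v1"))
first = switch.process(Packet("GET", "k"))   # miss: forwarded, then cached
second = switch.process(Packet("GET", "k"))  # hit: served by the switch
```

The second read never reaches the server, which is the kind of saving in-network computation aims for. Invalidating the cached entry on every write is the simplest way to keep the switch's state in line with the server in this toy setting.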
The project's primary goal is to build a programming framework to enable in-network computing using programmable networking hardware. In designing and implementing this framework, the project addresses the following research questions: First, the project tackles how to best integrate reconfigurable switches and network accelerators into a data center network, as they both have limitations in terms of the operations that they can execute. Second, a key project goal is to identify a simple yet powerful application programming interface (API) for these programmable devices to support a broad class of applications. Third, another key challenge addressed is how to keep state and computations on these devices consistent with that of application servers, and how to co-design data center applications to take advantage of the performance benefits enabled by this paradigm.
This project seeks to improve the efficiency of network-intensive data center applications that are used by literally billions of people around the globe on a daily basis. By improving their efficiency, one can dramatically reduce the cost of data center services as well as make it much cheaper for new public services to be developed. Collaborators at various switch vendors are equal partners in this effort, providing access to new technologies as well as assisting in technology transfer to the industry. The project will integrate undergraduate students as researchers, and material from the project will be incorporated into both undergraduate and graduate courses.
A key project goal is to publicly release developed software and enable a rich set of high-performance data center applications. All software will be made public as soon as it is developed, hosted via GitHub at https://github.com/arvindkrish/incbricks and accessible from the project website at the University of Washington. A mirrored version of this repository will be maintained at the University for at least five years.
|
1 |
2020 — 2023 |
Krishnamurthy, Arvind |
N/A |
Cns Core: Small: Optimizing Distributed Transactions On Emerging Hardware @ University of Washington
Replicated transactional storage simplifies programming and reasoning about distributed systems by providing a simple and powerful abstraction: atomic and durable execution of transactions that can survive node failures. As a consequence, many datacenter systems use transactional storage as a critical building block. This project revisits this research topic to address the opportunities and challenges posed by multiple hardware trends, e.g., large core counts on processors, programmable network interfaces, and low-latency storage. The project seeks to build distributed transactional storage systems using principles that reduce coordination and maximize network and storage performance.
In particular, the project seeks to optimize distributed transactional storage by taking advantage of modern hardware features ranging from core-heavy servers to programmable network devices and tiered and highly-parallel durable storage. A key goal is to design distributed protocols that minimize cross-core coordination traffic to provide multi-core scalable solutions. Further, the project will distill remote access communication primitives, enabled by a programmable network interface, in order to optimize transaction processing. The project will also examine how to support durable data structures on heterogeneous storage that includes low-latency non-volatile memory.
Datacenter applications that require support for transactional storage are used by literally billions of people around the globe daily. By improving the efficiency of distributed transactions and reducing latency/overheads, the project seeks to dramatically reduce the cost of existing datacenter-based services as well as make it much cheaper for new public services to be developed. Collaborators at hardware vendors are equal partners in this effort, providing access to new hardware technologies and assisting in project execution and technology transfer to the industry. The developed software will also enrich the distributed systems curriculum at all levels.
For the broader community of users and the society at large, the project will periodically release the developed software to enable a rich set of distributed systems and high-performance datacenter applications. A public repository will be maintained until at least December 2025 at https://github.com/arvindkrish/dist-transactions/tree/master.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
1 |
2020 — 2021 |
Krishnamurthy, Arvind |
N/A |
Collaborative Research: Pposs: Planning: Making Smart Use of SmartNICs @ University of Washington
Computing paradigms occasionally undergo rather dramatic shifts as underlying technologies change, significantly modifying the dominant use cases. Some of these revolutions are seen far in advance and are heralded by great fanfare, with the hype long preceding the actual payoffs. Others are more opportunistic in nature, leveraging a technology initially developed for another purpose, and the adoption of this technology starts altering practice without much notice from the broader community. Computing is now on the verge of such a "quiet revolution" having to do with inserting computation on the devices that connect computers to the network. This trend towards what are called SmartNICs (for computationally enhanced network interface cards) shows great promise in both making applications faster and in keeping data more secure. This project will focus on how to best leverage SmartNICs in order to improve application performance and security.
SmartNICs were originally designed to offload packet processing from the host CPU, which is necessary in certain settings to perform encryption and other compute-intensive tasks on the data path. SmartNICs combine this packet-processing power with three other characteristics: (i) isolation from the host CPU, (ii) direct access to memory, and (iii) general programmability. It turns out that this combination gives SmartNICs the potential to play a powerful and unique role in the overall computational ecosystem. In particular, by sitting on the boundary between the network and hosts, they can change the interfaces being exposed to both, allowing SmartNICs to substantially improve application performance while also providing greater security and privacy. However, realizing these gains requires making progress on three separate issues. First, the hardware design of SmartNICs must combine several different units (a specialized packet-handling unit, a remote direct memory access unit, and a general computation unit), and provide fast interconnections between them and with the host memory. The design space is vast, and there is little agreement on what designs represent the best trade-offs. Second, these SmartNICs must offer applications a set of primitives that can improve their performance and security. These primitives must be chosen wisely to be feasible for SmartNICs to support while being easy for applications to leverage for better performance and security. Third, verification tools are needed to ensure that the programs on the SmartNIC are correctly executed and that the overall system -- running on multiple hosts and their SmartNICs -- is correct. This will require extensions to current verification techniques.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
1 |
2021 — 2022 |
Krishnamurthy, Arvind |
N/A |
Eager: Collaborative Research: Towards An Extensible Internet @ University of Washington
The Internet’s basic design has remained largely unchanged since it first became commercial in the 1990s. Now that the Internet is the centerpiece of our global communications infrastructure, it is essentially impossible to alter its design in any significant way. Unfortunately, such changes are needed to improve the Internet’s performance and security. This project aims to resolve this paradox with a design called the Extensible Internet. This is a collaborative project which brings together investigators from Mount Holyoke College, New York University, the University of Washington, and International Computer Science Institute at the University of California at Berkeley.
In today’s Internet, layer 3 has two basic functions: (i) connect all layer 2 networks and (ii) provide the packet delivery services on which host applications are built. The key aspect of the Extensible Internet is that it splits layer 3 into two layers. The first, which remains layer 3 and can use the current Internet Protocol (IP), handles the first function of connecting layer 2 networks. The second requires a new layer (called layer 3.5) that supports an extensible set of packet delivery services, and thus handles the second function of providing the services on which host applications are built. In this way, the Extensible Internet design leaves the current Internet unchanged but is able to provide an extensible set of new packet delivery services that will improve the Internet’s performance and security.
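The layering described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class and function names, the `uppercase` service, and the fallback to best-effort delivery for an unknown service are all hypothetical and are not part of the Extensible Internet design as described here.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class L35Header:
    service: str          # which packet delivery service to apply


@dataclass
class L3Packet:
    src: str              # layer 3 addresses, e.g. ordinary IP
    dst: str
    l35: L35Header        # carried inside the layer 3 payload
    data: bytes


# Layer 3.5: an extensible registry of packet delivery services.
SERVICES: Dict[str, Callable[[bytes], bytes]] = {
    "best-effort": lambda data: data,
}


def register_service(name: str, handler: Callable[[bytes], bytes]) -> None:
    """Add a new delivery service without touching layer 3."""
    SERVICES[name] = handler


def l3_forward(pkt: L3Packet) -> str:
    """Layer 3: only connects networks; ignores the service field."""
    return pkt.dst


def l35_deliver(pkt: L3Packet) -> bytes:
    """Layer 3.5: apply the requested service (assumed fallback: best-effort)."""
    handler = SERVICES.get(pkt.l35.service, SERVICES["best-effort"])
    return handler(pkt.data)


# A made-up service used only for illustration.
register_service("uppercase", lambda data: data.upper())

pkt = L3Packet("10.0.0.1", "10.0.0.2", L35Header("uppercase"), b"hello")
next_hop = l3_forward(pkt)
delivered = l35_deliver(pkt)
```

The point of the sketch is that `register_service` extends what the network offers while `l3_forward` stays untouched, mirroring how the design adds services at layer 3.5 and leaves layer 3 as it is.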
The Extensible Internet design is incrementally deployable (i.e., no unchanged applications or domains would lose connectivity), compatible with economic incentives, and can continue to evolve as new requirements arise. As such, it provides a practical way for the Internet to evolve far beyond its current design. If the Extensible Internet design is adopted, it would have a significant impact on the nature of the Internet. In particular, transitioning to the Extensible Internet is not just a one-time change in functionality, but transforms the Internet from a single and unchanging service model (best-effort packet delivery) to an evolving and expanding set of network-provided services. In addition, in pursuing this agenda, the investigators will work to increase the diversity of the STEM workforce through ongoing efforts in their own research groups and the outreach programs in their respective departments. The investigators will also incorporate their results into their courses and make the material freely available.
All of the code from this project will be available on the project’s website at ExtensibleInternet.org.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
1 |
2022 — 2026 |
Krishnamurthy, Arvind Peter, Simon |
N/A |
Collaborative Research: Cns Core: Medium: Programmable Disaggregated Storage @ University of Washington
Disaggregated storage has become an increasingly popular storage infrastructure, gaining significant attention in public clouds and in the enterprise. Today’s storage disaggregation lacks multi-tenancy support and dynamic adaptation to consolidated workloads. Further, it requires costly coordination and provides no application-specific control. This leads to severe interference, suboptimal performance, and cost inefficiencies. This project’s goal is to make disaggregated storage programmable. Programmable disaggregated storage (PDS) can provide end-to-end control over disaggregated storage access. It leverages recent storage and network innovations to make the entire disaggregated storage data path programmable. This allows PDS to adapt to the workload requirements and backend infrastructure conditions. PDS aims to provide predictably high performance, enable flexible storage access, and offer efficient isolation and fairness guarantees. Several research questions are addressed to make PDS practical, such as how to build an elastic and backward compatible client-side virtual disk; how to design control-plane and data-plane programming frameworks to express application intents and define flexible computations; how to enforce control-plane policies and perform fair resource allocations across heterogeneous I/O substrates; how to schedule data-plane operators across different devices; and how to achieve predictable storage access latencies.
Storage infrastructure is used by billions of people around the globe on a daily basis. By tightly integrating disaggregated storage with emerging programmable entities, this project has the potential to substantially improve the performance, cost efficiency, scalability, and multi-tenancy of storage applications. There are ample opportunities for involving undergraduate students and bringing research into the classroom. For the broader community of users and the society at large, this project endeavors to publicly release hardware and software prototypes, enabling a rich set of storage applications.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
1 |
2022 — 2026 |
Anderson, Thomas (co-PI) [⬀] Krishnamurthy, Arvind |
N/A |
Collaborative Research: Cns Core: Large: Runtime Programmable Networks @ University of Washington
Programmability is fuel for network innovation. In today’s programmable networks, new features can be easily developed without having to rely on vendor support. However, deploying new features still requires fleet-wide maintenance to avoid disruption because device reprogramming incurs downtime. This severely constrains the speed of change, as maintenance operations require meticulous planning well ahead of time. This project proposes runtime programmable networks, where the end-to-end network infrastructure, vertically from the host kernels down to the network interface cards, and horizontally extending across switches to the other end of the network, can be reprogrammed on-the-fly without packet drops and with strong consistency guarantees. This represents a major leap from today’s programmable networks, which are reconfigurable at compile time but become fixed functions at runtime after deployment.
According to this project's vision, FlexNet, the network infrastructure provides a collection of basic utilities and, on demand, extensions are partially reconfigured into the infrastructure by injecting, removing, or overriding specific functions. This accelerates the speed of delivering new features to end users, increases the manageability of large networks by lowering the barrier for change, and creates new possibilities unavailable in today’s programmable networks, such as powerful, dynamic security defenses. With FlexNet, this project can summon security defenses into the network precisely when needed. Defenses can migrate to the attack location or replicate across the network to maximize their effectiveness. They can even shapeshift in real time to mitigate changing attacks. When attacks subside, these defenses can soon be removed from the network to reduce overhead. This project aims to elevate network programming from a “one-shot” endeavor at compile time to “continuous” activities throughout the lifecycle of the network.
To realize this vision, the project needs to innovate across the stack. Concretely, this project proposes a four-pronged approach to programming, compiling, verifying, and managing runtime programmable networks end-to-end. First, runtime network programming requires controlling disparate datapaths and their real-time changes as a whole, while ensuring runtime portability across devices; thus, this project will develop a new programming system. Second, compiling a whole-network program to a heterogeneous substrate, while continuously reoptimizing for runtime changes, requires a new compiler design. Third, to ensure the safety of network changes, this project must simultaneously innovate on runtime verification and validation. Finally, FlexNet programs have dynamic footprints in the network (migrating, expanding, and shrinking across devices), so this project needs a new management system to control such unprecedented dynamics. This project will produce an integrated platform upon which the FlexNet techniques will be evaluated comprehensively at various scales and with diverse workloads. To achieve a wider community engagement, this project will release software and hardware prototypes and educational materials in open source, and by collaborating with industry partners, this project will transition the FlexNet technologies into practice.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
1 |