2005 — 2011
Jha, Somesh
Career: Combating Malicious Behavior in Commodity Software @ University of Wisconsin-Madison
Infrastructures in critical domains such as medicine, power, telecommunications, and finance depend on information systems. Therefore, attacks can have devastating effects on these critical infrastructures. Moreover, an increasing number of organizations are relying on commodity software, a trend that can be attributed to increasing reliance on outsourcing and commercial off-the-shelf (COTS) components. However, deploying commodity software poses significant risks because it can contain exploitable vulnerabilities and hidden malicious behavior. Combating malicious behavior in commodity software is especially challenging because its user only has access to the software's executable. This proposal addresses the problem of combating malicious behavior in commodity software. The proposed tasks are applicable in the context of model-based intrusion detection systems (MIDS), a type of host-based intrusion detection system (HIDS) that monitors program execution using a model. There are three major areas in MIDS: model construction, enforcement, and model analysis. This proposal addresses model construction and model analysis. In the context of MIDS, the proposed research will improve the precision of existing model-construction algorithms, tackle privacy violations, and develop techniques for analyzing and refining models.
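As a toy illustration of model-based monitoring (not the project's actual algorithms), the sketch below checks an observed system-call trace against a finite-state model of permitted behavior; the model, its states, and the function name are invented for this example.

```python
# Illustrative sketch: a model-based intrusion detector that checks an
# observed system-call trace against a finite-state model of the program.

# Transition table: state -> {allowed syscall: next state}.
# This tiny model permits open -> (read | write)* -> close.
MODEL = {
    "start":  {"open": "opened"},
    "opened": {"read": "opened", "write": "opened", "close": "start"},
}

def trace_conforms(trace, model=MODEL, state="start"):
    """Return True iff every syscall in the trace is allowed by the model."""
    for syscall in trace:
        nxt = model.get(state, {}).get(syscall)
        if nxt is None:          # syscall not permitted in this state
            return False
        state = nxt
    return True

print(trace_conforms(["open", "read", "write", "close"]))  # True: conforming
print(trace_conforms(["open", "close", "read"]))           # False: read after close
```

A more precise model (e.g., one tracking the call stack) reduces the "mimicry" traces that slip past such a monitor, which is what improving model-construction precision is about.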
2005 — 2009
Reps, Thomas; Jha, Somesh
Ct-Isg: Advanced Methods For Checking Information-Security Properties @ University of Wisconsin-Madison
Abstract 0524051, Reps, Thomas, U of Wisconsin-Madison.
The goal of the proposed project is to create techniques that (i) provide better predictions of the behavior of computer systems, and (ii) make computer systems less vulnerable to attack. Specifically, we propose to develop improved security-analysis technology to be applied in two areas: access control of shared computing resources, and finding security vulnerabilities in programs. Intellectual Merit: Solutions to the problems addressed in this proposal would provide 1. Better methods for access control of shared computing resources. Issues that will be addressed include: enabling collaboration across separate administrative domains while preserving privacy, and creating better methods for identifying access-control vulnerabilities and defects in access-control policies. 2. Better tools for identifying security vulnerabilities in programs. While these two topics might, at first blush, seem unrelated, they turn out to be closely related at the technical level: problems in both areas can be formulated using the same machinery, an automata-theoretic formalism called weighted pushdown systems (WPDSs). WPDS solvers represent a unified technology for the key algorithms required in both areas. Consequently, the study of WPDSs and related formalisms provides intellectual leverage for making advances in both areas. Moreover, studying them together is likely to bring added benefits: past history has shown that improvements motivated by the needs in one area have had unanticipated benefits in the other area. Broader Impact: As the Internet has become pervasive, security and reliability issues have become enormously important to society. New security exploits are announced daily, power-grid failures are caused by bugs in software, and multi-hundred-million-dollar space projects are interrupted by software glitches. Better tools for identifying vulnerabilities in programs will lead to software systems with enhanced security and reliability.
The growth of the Internet also offers the promise of an improved platform for cross-organization interaction and collaboration. However, the decentralized nature of the Internet presents an obstacle: currently, organizations maintain their own namespaces and impose their own access-control policies. Cross-domain interactions can be hindered by the need to set up access-control mechanisms that incorporate (in whole or part) those of the individual organizations, as well as by conflicts in the structure and contents of existing namespaces and access-control policies. Better methods for access control of shared computing resources would provide improved flexibility for supporting cross-domain interactions via the Internet. A related objective is to provide better methods for predicting the behavior and consequences of an access-control policy that crosses organizational and trust boundaries. The proposed project aims to make fundamental advances in science and engineering that address these issues, all of which are relevant to the goals of NSF's Cybertrust program. Our tools and implementations will be made available for other researchers to download over the web and use in their own security-analysis work.
2007 — 2012
Jha, Somesh; Estan, Cristian (co-PI)
Ct-Isg: Alternate Representation of Nids/Nips Signatures For Fast Matching @ University of Wisconsin-Madison
Proposal 0715358, Cristian Estan, University of Wisconsin. CT-ISG: Alternate Representation of NIDS/NIPS Signatures for Fast Matching.
Network intrusion prevention systems (IPSes) play an important role in protecting computers against attacks originating from the network. Signature matching is a performance-critical operation that each IPS must perform: after storing a reassembled TCP-level byte stream or a field of a higher level protocol in a buffer, the IPS needs to decide whether it matches any of the signatures that describe known attacks. This project investigates methods for representing signatures that allow fast matching, require little memory, and can support complex signatures expressed as regular expressions.
Currently used representations, such as deterministic finite-state automata (DFAs) and non-deterministic finite-state automata (NFAs), have severe drawbacks. In general, DFAs enable fast matching but are space inefficient and NFAs are concise but are slow to match against. Solutions based on multiple DFAs have intermediary matching speeds and memory requirements. One of the core reasons why such solutions provide unfavorable speed versus memory tradeoffs is the state-space explosion problem.
This project focuses on a novel signature representation that neutralizes state-space explosion: extended finite automata (XFAs). XFAs extend DFAs with a few bytes of "scratch memory" used to store bits that record auxiliary information during matching, or counters that record progress. When an accepting state is reached, the scratch memory is checked and a match is declared only if it holds suitable values. Preliminary results on signature sets from Snort and Cisco IPS show that, compared to solutions using multiple DFAs, XFAs can be 10 times smaller and at the same time 5 times faster in software.
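The scratch-memory idea can be illustrated with a hedged sketch: the signature below requires "ab" followed later by "cd", and a single bit set when "ab" is seen stands in for an XFA's scratch memory. This simplified matcher is illustrative only, not the actual XFA construction.

```python
# Illustrative sketch of the XFA idea: a DFA augmented with a bit of
# "scratch memory".  Combining many ".*X.*Y"-style signatures into one
# product DFA blows up its state count; an XFA instead attaches a tiny
# update ("set bit") to one state and a check ("bit set?") to another.

def xfa_match(data: bytes) -> bool:
    seen_ab = False                  # one bit of scratch memory
    prev = None
    for b in data:
        if prev == ord("a") and b == ord("b"):
            seen_ab = True           # update attached to the "ab" state
        if prev == ord("c") and b == ord("d") and seen_ab:
            return True              # accept only if the bit is set
        prev = b
    return False

print(xfa_match(b"xxabyycdzz"))   # True: "ab" occurs before "cd"
print(xfa_match(b"xxcdyyabzz"))   # False: "cd" precedes "ab"
```

Because each signature contributes one bit rather than multiplying the state space, the combined automaton stays small.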
2009 — 2013
Jha, Somesh
Tc:Medium:Collaborative Research:Techniques to Retrofit Legacy Code With Security @ University of Wisconsin-Madison
This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5).
Though perhaps unfortunate, as a practical matter software is often built with functionality as a primary goal, and security features are only added later, often after vulnerabilities have been identified. To reduce the cost of, and increase assurance in, the process of security retrofitting, the aim is to develop a methodology involving automated and semi-automated tools and techniques to add authorization-policy enforcement functionality to legacy software systems.
The main insight is that major portions of the tasks involved in retrofitting code can be or already have been automated, so the design process focuses on enabling further automation and aggregating these tasks into a single, coherent approach.
More specifically, techniques and tools are being developed to: (1) identify and label security-relevant objects and I/O channels by analyzing and instrumenting annotated application source code; (2) insert code to mediate access to labeled entities; (3) abstract the inserted checks into policy-relevant, security-sensitive operations that are authorized (or denied) by the application's security policy; (4) integrate the retrofitted legacy code with the site's specific policy at deployment time to ensure, through advanced policy analysis, that the application enforces that site's policy correctly, and (5) verify correct enforcement of OS policy delegation by the retrofitted application.
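Step (2) above, inserting code to mediate access to labeled entities, can be sketched roughly as follows; the policy table, `check_policy`, and the `@mediated` decorator are hypothetical names invented for this example, not the project's actual tooling.

```python
# Hypothetical sketch of mediation: every call on a security-relevant
# operation is routed through an authorization check before the legacy
# code runs.

POLICY = {("alice", "read"), ("alice", "write"), ("bob", "read")}

def check_policy(subject, operation):
    """Stand-in for the site-specific policy consulted at deployment time."""
    return (subject, operation) in POLICY

def mediated(operation):
    """Wrap a function so every call is authorized against the policy."""
    def wrap(fn):
        def guarded(subject, *args, **kwargs):
            if not check_policy(subject, operation):
                raise PermissionError(f"{subject} may not {operation}")
            return fn(subject, *args, **kwargs)
        return guarded
    return wrap

@mediated("write")
def update_record(subject, record, value):
    record["value"] = value
    return record

print(update_record("alice", {}, 42))   # authorized: {'value': 42}
try:
    update_record("bob", {}, 7)          # bob lacks "write"
except PermissionError as e:
    print(e)
```

In the retrofitting setting, the point is that inserting such guards at every labeled access point is mechanical enough to automate, which is what the tool pipeline aggregates.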
The techniques and tools being developed are useful not only for retrofitting, but also for augmenting and verifying existing code already outfitted with security functionality; hence improving the state-of-the-art in creating more secure software.
2011 — 2016
Jha, Somesh; Swift, Michael; Banerjee, Suman (co-PI)
Tc: Medium: Collaborative Research: Building Trustworthy Applications For Mobile Devices @ University of Wisconsin-Madison
Mobile handheld devices such as smartphones, PDAs, and smart media players have outpaced the growth of wired hosts, and are emerging as the predominant vehicle for Internet access. In recent years, newer mobile phones, including various versions from Apple, Google, Nokia, and others, have promoted greater programmability, radically changing the age-old model of mobile phones being a closed platform. However, openness arrives with new challenges of trustworthiness. The goal of this project is to improve the trustworthiness of mobile phones in their daily operations, by analyzing threats that occur either due to malware or due to regular applications, designing mitigation strategies, and evaluating developed solutions through a real deployment on a smartphone platform (Google Android) and operating in a real network (Sprint-Nextel).
This project will undertake a crosscutting research, educational, and outreach plan to improve the robustness, reliability, security, privacy, and overall trustworthiness of mobile phones. The primary focus of this project will be on performance and security threats that are unique to mobile phones, including malicious applications that exfiltrate data, performance loss due to resource constraints, privacy threats from lost devices, and remote network-based attacks. Specifically, this project will investigate issues related to the following topics: (i) performance instability due to resource constraints; (ii) protection against malicious applications; (iii) privacy against lost phones; and (iv) detection and prevention of other network attacks. Techniques developed will have broad benefits to research and society. These techniques will enhance the trustworthiness of mobile phones, thereby improving the confidence of users in using these devices in their daily activities. An educational plan will introduce a new curriculum centered on the mobile phone platform and establish a new undergraduate laboratory for hands-on mobile device programming.
2012 — 2017
Jimenez, Daniel; Jha, Somesh; Sankaralingam, Karthikeyan
Shf: Medium: Title: Idempotent Processing and Architectures @ University of Wisconsin-Madison
For many decades, Moore's Law has allowed exponential growth in computing capability while simultaneously reducing the power consumed by digital devices. Due to fundamental material properties and engineering challenges, in the future the power and energy efficiency of transistors that are the building blocks of digital devices will not improve significantly. Thus to continue providing performance improvements without increasing power consumption, new techniques to design microprocessors are required. This research project looks at a new approach to build microprocessors to make them more energy efficient. The main idea in this research project is to develop techniques allowing microprocessors to efficiently predict without having to expend power-hungry resources to recover in case the prediction is wrong. The research leverages the mathematical principle of idempotence (doing something multiple times producing the same result) in a novel way. In this project, this principle is applied to microprocessor design to develop a class of processors called Idempotent Processors. The research addresses formal theoretical analysis of the technique, ways to build software compilers, and microprocessor designs spanning CPUs to GPUs to exploit this principle.
The core idea of this project is to use the property of idempotence: performing an idempotent operation many times produces the same result. The research builds upon the following insight: applications naturally decompose into a continuous series of idempotent regions; i.e., their execution can be broken down into a set of regions, where each region is idempotent and re-execution has no side effects. The research develops the idea of Idempotent Processors, whose fundamental abstraction is executing idempotent regions of code. This allows novel modifications to the microprocessor pipeline and allows many forms of speculation without the need to restore any state prior to re-execution. This design approach unifies speculation for performance, reliability, and energy-efficient execution under one principled approach. The static-analysis research formalizes the notion of idempotence and investigates mechanisms for determining idempotent regions. The compiler implementation for various ISAs (instruction set architectures), CPUs (central processing units), and GPUs (graphics processing units) evaluates the approach quantitatively.
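A minimal sketch of why idempotent regions simplify recovery, assuming a side-effect-free region and an injected transient fault (both invented for illustration; real Idempotent Processors do this in hardware, not in software retry loops):

```python
# Sketch: a region whose re-execution has no side effects can simply be
# restarted after a fault or misspeculation, with no checkpointed state
# to restore.

def run_idempotent(region, inputs, max_tries=3):
    """Re-execute a side-effect-free region until it completes."""
    for _ in range(max_tries):
        try:
            return region(*inputs)
        except RuntimeError:     # stand-in for a fault / misspeculation
            continue             # safe to retry: the region is idempotent
    raise RuntimeError("region kept failing")

fail_once = {"left": 1}          # injects exactly one transient fault

def region(x, y):
    # Reads its inputs and writes only its own locals -> idempotent.
    if fail_once["left"] > 0:
        fail_once["left"] -= 1
        raise RuntimeError("transient fault")
    return x * x + y

print(run_idempotent(region, (3, 4)))   # 13, despite one injected fault
```

The contrast is with conventional speculation, where recovery requires restoring a saved register/memory snapshot before re-execution.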
The project's end-to-end solutions across multiple synergistic directions have potential for disruptive impact. The project involves collaborative work between UW-Madison and UT-San Antonio, involves undergraduate researchers and exchange visits between the institutions, and explores integrated curriculum enhancement and outreach across UW and UTSA. The project's multi-disciplinary and multi-institution collaboration provides distributed impact.
2012 — 2017
Jha, Somesh
Twc: Medium: Collaborative: Extending Smart-Phone Application Analysis @ University of Wisconsin-Madison
This research is focused on the creation of new techniques and algorithms to support comprehensive analysis of Android applications. We have developed formally grounded techniques for extracting accurate models of smartphone applications from installation images. The recovery formalization is based on TyDe, a typed meta-representation of Dalvik bytecode (the code structure used by the Android smartphone operating system). In developing TyDe, we are formalizing TyDe type inference, managing ill-formed bytecode structures, and creating generalized Dalvik-to-Java retargeting logic based on bytecode "instruction templates".
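The "instruction template" idea can be sketched roughly as follows, with invented opcodes and templates standing in for the real Dalvik-to-Java mapping (Dalvik is register-based, the JVM is stack-based, so each Dalvik instruction expands into a small stack-code template).

```python
# Hypothetical sketch of template-based retargeting: each Dalvik-style
# opcode maps to a template of stack-based, Java-bytecode-like
# instructions, with registers rewritten into local-variable slots.

TEMPLATES = {
    "add-int": ["iload {a}", "iload {b}", "iadd", "istore {dst}"],
    "const":   ["ldc {val}", "istore {dst}"],
}

def retarget(dalvik_ins):
    """Expand one (opcode, operands) pair into its stack-code template."""
    op, operands = dalvik_ins
    return [t.format(**operands) for t in TEMPLATES[op]]

prog = [
    ("const",   {"dst": 0, "val": 2}),      # v0 = 2
    ("const",   {"dst": 1, "val": 3}),      # v1 = 3
    ("add-int", {"dst": 2, "a": 0, "b": 1}),  # v2 = v0 + v1
]
java_like = [ins for d in prog for ins in retarget(d)]
print(java_like[-1])   # 'istore 2'
```

The real retargeter must also handle typing (the same Dalvik register may hold an int or a reference at different points), which is where the TyDe type inference comes in.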
TyDe representations and the models they encode are being used to perform deep analysis of application structure to infer potential application behaviors that may harm users, their data, or the cellular or Internet infrastructure. In particular, these analyses support whole-program analysis, reflection handling, and smartphone-specific data-flow analysis. Such analyses provide a means for evaluating an application's adherence to best security practices or organizational requirements by inspecting permission structures, component interfaces, and source-code and library origins for signals of malicious behavior. The analysis techniques are being evaluated on a large corpus of real-world applications extracted from real application markets.
In the broadest view, this work is providing new avenues for researchers, industry, and consumers to assess potential dangers presented by applications retrieved from smartphone application markets, and advancing the state of the art in application program analysis.
2012 — 2017
Jha, Somesh; Sankaralingam, Karthikeyan (co-PI)
Twc: Phase: Medium: Collaborative Proposal: Understanding and Exploiting Parallelism in Deep Packet Inspection On Concurrent Architectures @ University of Wisconsin-Madison
Deep packet inspection (DPI) is a crucial tool for protecting networks from emerging and sophisticated attacks. However, it is becoming increasingly difficult to implement DPI effectively due to the rising need for more complex analysis, combined with the relentless growth in the volume of network traffic that these systems must inspect. To address this challenge, future DPI technologies must exploit the power of emerging highly concurrent multi- and many-core platforms. Unfortunately, however, current DPI systems severely limit their use of parallelism by either resorting to coarse-grained load-balancing or restricting their analysis to very simple, hard-coded detectors.
In order to fully exploit parallel hardware platforms, in this project we develop a comprehensive approach that introduces parallelism across all stages of the complex DPI pipeline. We investigate application-independent scheduling strategies that take existing DPI analyses and automatically parallelize their processing. We do so by mapping them into a domain-specific intermediate representation that abstracts from the specifics of the underlying hardware architecture while providing low-level consistency guarantees. Conceptually, the project's goal is to virtualize and abstract parallelism as a fundamental primitive, just as virtual memory abstracts away physical-memory size limitations from programmers.
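As a hedged sketch of the coarse-grained, flow-level load balancing the project aims to go beyond, the example below reassembles packets per flow and fans the flows out to worker threads so each flow is inspected in order by one worker; the flow IDs, payloads, and byte-string signature are invented.

```python
# Sketch of flow-level parallelism: hash/group packets by flow, then
# let workers inspect whole reassembled flows independently.
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

SIGNATURE = b"attack"   # stand-in for a compiled signature set

def inspect_flow(flow_id, payload):
    """Match the reassembled byte stream against the signature."""
    return flow_id, SIGNATURE in payload

packets = [("f1", b"...att"), ("f1", b"ack..."), ("f2", b"benign")]

# Reassemble per flow (note the signature spans two f1 packets).
flows = defaultdict(bytes)
for fid, data in packets:
    flows[fid] += data

with ThreadPoolExecutor(max_workers=2) as pool:
    results = dict(pool.map(lambda kv: inspect_flow(*kv), flows.items()))

print(results)   # {'f1': True, 'f2': False}
```

The limitation this illustrates is that parallelism stops at flow granularity; the project's intermediate representation is meant to expose finer-grained parallelism inside the analysis itself.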
2016 — 2020
Jha, Somesh
Twc: Medium: Collaborative: Scaling and Prioritizing Market-Sized Application Analysis @ University of Wisconsin-Madison
The emergence of smartphones, and more generally mobile platforms, as a vehicle for communication, entertainment, and commerce has led to a revolution of innovation. Markets now provide a dizzying array of applications that inform and aid every conceivable human need or desire. At the same time, application markets allow previously unknown multitudes of application developers access to user devices through fast-tracked software publishing, with well-documented consequent security concerns. The science and tools for performing security analysis of applications have vastly improved over the last decade. However, market providers have limited capability to apply those analyses at scale to the massive software markets.
This project is a crosscutting research, educational, and outreach plan to improve the scalability and accuracy of smartphone application analysis. The research effort focuses on the creation of new techniques and algorithms that enable analysis of large bodies of applications by reducing analysis cost and prioritizing identified security vulnerabilities by their expected impact. Explored within the context of Android Intent resolution analysis, the team is developing efficient algorithms and studying the computational complexity of matching application communication sources and sinks (thereby supporting phone-wide information-flow analysis), developing empirical models for estimating the likelihoods of inter-component communication, and exploring features and metrics that indicate the potential security impact of communication pathways. The approaches are being applied to commercial markets (Apple iOS), other domains (web, desktop, and server environments), and massive application data sets.
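Matching communication sources to sinks can be sketched with a much-simplified model of Intent resolution, using invented component and filter names; real Android matching also considers data URIs, MIME types, and explicit component names.

```python
# Simplified sketch of Intent resolution: pair each implicit Intent
# (source) with the components whose intent filters accept its action
# and category, yielding the inter-component communication edges that
# a phone-wide analysis would then prioritize.

def resolve(intent, components):
    """Return names of components whose filter matches the intent."""
    matches = []
    for comp in components:
        f = comp["filter"]
        if intent["action"] in f["actions"] and \
           intent.get("category", "DEFAULT") in f["categories"]:
            matches.append(comp["name"])
    return matches

COMPONENTS = [
    {"name": "Viewer", "filter": {"actions": {"VIEW"},         "categories": {"DEFAULT"}}},
    {"name": "Editor", "filter": {"actions": {"EDIT", "VIEW"}, "categories": {"DEFAULT"}}},
    {"name": "Dialer", "filter": {"actions": {"DIAL"},         "categories": {"DEFAULT"}}},
]

print(resolve({"action": "VIEW"}, COMPONENTS))   # ['Viewer', 'Editor']
```

Done naively across a whole market, every Intent must be compared against every filter, which is why the computational complexity of this matching matters at scale.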
2018 — 2022
Jha, Somesh; Zhu, Xiaojin (co-PI)
Fmitf: Collaborative Research: Formal Methods For Machine Learning System Design @ University of Wisconsin-Madison
Machine learning (ML) algorithms, fueled by massive amounts of data, are increasingly being utilized in several critical domains, including health care, finance, and transportation. Models produced by ML algorithms, for example deep neural networks, are being deployed in these domains where trustworthiness is a big concern. It has become clear that, for such domains, a high degree of assurance is required regarding the safe and correct operation of ML-based systems. This project seeks to provide a systematic framework for the design of ML systems based on formal methods. The project seeks to review and improve almost every aspect of the design flow of ML systems, including data-set design, learning algorithm selection, training of ML models, analysis and verification, and deployment. The theory and ideas generated during the project will be implemented in a new software toolkit for the design of ML systems in the context of cyber-physical systems.
The project focuses on cyber-physical systems (CPS), a rich domain in which to apply formal-methods principles; moreover, the research ideas from this project can be readily applied to other contexts. A key aspect of this research is the use of a semantic approach to the design and analysis of ML systems, where the semantics of the target application and a formal specification for the full system, comprising the ML component and other components, are cornerstones of the design methodology. The project employs a range of formal methods, including satisfiability solvers, simulation-based verification, model checking, specification analysis, and synthesis, to improve all stages of the ML design flow. Formal techniques are also used for the tuning of hyperparameters and other aspects of the training process, to aid in debugging misclassifications produced by ML models, and to monitor ML systems at run time, ensuring that outputs from ML models are used in a manner that guarantees safe operation at all times.
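The run-time monitoring idea can be sketched as a simple output guard; the speed-limit specification, the stand-in learned policy, and the clamping fallback are all invented for illustration and are not the project's actual design.

```python
# Sketch of run-time monitoring: the ML controller's output passes
# through a guard that enforces a simple safety specification before
# it reaches the actuator.
import math

SPEED_LIMIT = 30.0   # toy safety spec: |action| must not exceed this

def safe_controller(ml_policy, state):
    """Use the ML output only when it satisfies the safety spec."""
    action = ml_policy(state)
    if abs(action) > SPEED_LIMIT:           # spec violated: fall back
        return math.copysign(SPEED_LIMIT, action)
    return action

ml_policy = lambda s: 2.5 * s               # stand-in learned policy

print(safe_controller(ml_policy, 10.0))     # 25.0: within spec, passed through
print(safe_controller(ml_policy, 20.0))     # 30.0: clamped by the monitor
```

In a real CPS design the specification would be stated formally (e.g., in a temporal logic) and the fallback would itself be a verified controller, but the shape of the guard is the same.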
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
2018 — 2023
Jha, Somesh
Satc: Core: Frontier: Collaborative: End-to-End Trustworthiness of Machine-Learning Systems @ University of Wisconsin-Madison
This frontier project establishes the Center for Trustworthy Machine Learning (CTML), a large-scale, multi-institution, multi-disciplinary effort whose goal is to develop scientific understanding of the risks inherent to machine learning, and to develop the tools, metrics, and methods to manage and mitigate them. The center is led by a cross-disciplinary team developing unified theory, algorithms and empirical methods within complex and ever-evolving ML approaches, application domains, and environments. The science and arsenal of defensive techniques emerging within the center will provide the basis for building future systems in a more trustworthy and secure manner, as well as fostering a long term community of research within this essential domain of technology. The center has a number of outreach efforts, including a massive open online course (MOOC) on this topic, an annual conference, and broad-based educational initiatives. The investigators continue their ongoing efforts at broadening participation in computing via a joint summer school on trustworthy ML aimed at underrepresented groups, and by engaging in activities for high school students across the country via a sequence of webinars advertised through the She++ network and other organizations.
The center focuses on three interconnected and parallel investigative directions that represent the different classes of attacks on ML systems: inference attacks, training attacks, and abuses of ML. The first direction explores inference-time security, namely methods to defend a trained model from adversarial inputs. This effort emphasizes developing formally grounded measurements of robustness against adversarial examples (defenses), as well as understanding the limits and costs of attacks. The second research direction aims to develop rigorously grounded measures of robustness to attacks that corrupt the training data, and new training methods that are robust to adversarial manipulation. The final direction tackles the general security implications of sophisticated ML algorithms, including the potential abuses of generative ML models, such as models that generate (fake) content, as well as data mechanisms to prevent the theft of a machine-learning model by an adversary who interacts with it.
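As a toy illustration of the inference-time threat model (not any defense developed by the center), the sketch below crafts an FGSM-style adversarial perturbation against a hand-built linear classifier; the weights, input, and epsilon are invented for this example.

```python
# Toy adversarial example: perturb each feature by eps in the sign of
# the loss gradient (for a linear score w.x with label y, that sign is
# -y * sign(w_i)), flipping the classifier's prediction.

def sign(v):
    return 1.0 if v > 0 else (-1.0 if v < 0 else 0.0)

def predict(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1

def fgsm(w, x, y, eps):
    """Move x by eps per coordinate in the loss-increasing direction."""
    return [xi - y * eps * sign(wi) for xi, wi in zip(x, w)]

w = [0.5, -0.3, 0.8]
x = [1.0, 1.0, 0.2]                 # w.x = 0.36 > 0

print(predict(w, x))                 # 1: correctly classified
x_adv = fgsm(w, x, y=1, eps=0.5)
print(predict(w, x_adv))             # -1: small perturbation flips it
```

For deep networks the gradient is computed by backpropagation rather than read off the weights, but the attack's structure, and the reason formally grounded robustness measures are needed, is the same.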
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
2020
Jha, Somesh
Satc: Core: Medium: Collaborative: User-Centered Deployment of Differential Privacy @ University of Wisconsin-Madison
Differential privacy (DP) has been accepted as the de facto standard for data privacy in the research community and beyond. Both companies and government agencies are trying to deploy DP technologies. Broader deployments of DP technology, however, face challenges. This project aims to understand the needs of different stakeholders in data privacy, and to develop algorithms and software to enable broader deployment of private data sharing. The project's novelty is combining the expertise of social-science researchers with that of computer scientists who have both theoretical and systems research experience related to DP, in order to develop a hybrid approach to private data sharing that achieves a better privacy-utility tradeoff. The project's impacts are in advancing the state of the art in DP deployment in particular and privacy protection in general. More specifically, the project identifies the workflow of DP data sharing, improves understanding of DP communication, and develops new algorithms, privacy concepts, and privacy mechanisms to support deployment of DP. The project has four tasks that will advance the understanding of user-centered DP and lay a foundation for its deployment. (1) Examine individual human users' perception, comprehension, and acceptance of the concept and guarantee of DP and the effect of the privacy parameter, and investigate effective ways to communicate those concepts. (2) Apply methods from the domains of human factors and human-computer interaction to identify tasks, goals, and workflow in private data sharing. (3) Develop key algorithms and software for a hybrid approach to private data sharing. In the hybrid approach, one first publishes a private synopsis of the dataset using carefully selected low-degree marginals. From these marginals, one can either synthesize new datasets or answer queries directly using inference under the maximum-entropy principle.
The hybrid approach enhances this with interactive query answering, enabling extraction of information not covered by low-degree marginals. (4) Develop techniques to further improve the privacy-utility tradeoff in private data sharing, including a theory of differential privacy under publishable information.
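The noisy marginals in step (3) can be sketched with the standard Laplace mechanism; the helper names and tiny dataset are invented, and the sensitivity of 1 assumes add/remove neighboring datasets (each individual contributes one row).

```python
# Sketch: publish a differentially private one-way marginal (histogram)
# by adding Laplace noise with scale sensitivity/epsilon to each count.
import math
import random

def laplace(scale, rng):
    """Sample Laplace(0, scale) by inverse-CDF."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def dp_marginal(values, domain, epsilon, rng):
    """Noisy histogram of `values` over `domain`; L1 sensitivity is 1
    because adding/removing one row changes one count by 1."""
    counts = {v: 0 for v in domain}
    for v in values:
        counts[v] += 1
    return {v: c + laplace(1.0 / epsilon, rng) for v, c in counts.items()}

rng = random.Random(0)
noisy = dp_marginal(["a", "b", "a", "a"], ["a", "b", "c"], epsilon=1.0, rng=rng)
print(sorted(noisy))   # ['a', 'b', 'c']; values are noisy versions of 3, 1, 0
```

Releasing k such marginals with budget epsilon each costs k * epsilon in total under basic composition, which is exactly the privacy-utility tradeoff the hybrid approach tries to improve.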
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.