2003 — 2009 |
Buhler, Jeremy |
N/A |
CAREER: New Technologies for Biosequence Comparison
Computational comparison of biosequences is fundamental to modern biology. It is the fastest, most widely used technology for annotating the functional elements of a newly sequenced genome. Comparison algorithms must address the rapid growth of sequence databases, the need for more sensitive multiple-alignment tools, and the need to compare inferred feature models. This research focuses on enhancing the pattern-matching algorithms at the heart of fast comparison tools. The work will incorporate a research framework for rational pattern design and a technique for combining pattern matching with substitution score matrices. Pattern matching generalizes the word-matching heuristics of alignment algorithms, and the choice of pattern affects both search speed and sensitivity. The improved search patterns will be specialized to particular feature types, such as coding sequence, or to groups of organisms. Established collaborations with developers of popular search tools will help integrate the new methods. Research training involves students at many levels and will also form the basis of cross-disciplinary student exchanges.
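As a rough illustration of how a pattern can generalize word matching, the sketch below indexes a target sequence under a 0/1 pattern and reports positions where a query agrees with it at every `1` position. The function name `seed_hits`, the pattern strings, and the toy sequences are assumptions made for this example; they are not taken from the award and do not represent the project's actual algorithms.

```python
def seed_hits(query, target, pattern):
    """Return (i, j) pairs where query[i:] and target[j:] agree at every
    position marked '1' in pattern; positions marked '0' are ignored."""
    care = [k for k, c in enumerate(pattern) if c == "1"]
    span = len(pattern)

    # Index every window of the target by the characters at the '1' positions.
    index = {}
    for j in range(len(target) - span + 1):
        key = "".join(target[j + k] for k in care)
        index.setdefault(key, []).append(j)

    # Look up each query window in the index.
    hits = []
    for i in range(len(query) - span + 1):
        key = "".join(query[i + k] for k in care)
        for j in index.get(key, []):
            hits.append((i, j))
    return hits


query = "ACGTTGC"
target = "ACGATGC"  # differs from query only at position 3

print(seed_hits(query, target, "1111"))   # contiguous word of 4: []
print(seed_hits(query, target, "11011"))  # same weight, one gap: [(1, 1)]
```

Both patterns inspect four characters per window, but only the gapped one tolerates the single mismatch in this toy pair, which is one way the choice of pattern can change sensitivity at similar lookup cost.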
|
1 |
2009 — 2013 |
Franklin, Mark; Chamberlain, Roger; Buckley, James (co-PI); Buhler, Jeremy; Gruev, Viktor (co-PI) |
N/A |
CSR: Medium: Architecturally Diverse Systems for Streaming Applications
Abstract (0905368): Many important scientific computing problems, called "streaming applications," have high input data rates, with data arriving either from real-time sensors or directly from disk arrays. Real-time sensor data (e.g., telescopic astrophysical data obtained in the search for new planets) is frequently sourced from analog devices and requires filtering and other data cleaning before a host of complex computations can be performed. Large disk-based data sets (e.g., genome and protein sets used in understanding disease factors) are often passed at high data rates from disk storage. Such applications can be run on many kinds of computing devices (e.g., general-purpose processors, chip multiprocessors, graphics processors, and field-programmable gate arrays). While each device is individually well matched to certain types of computation, more effective solutions are often found by integrating multiple device types into a single system. The central research issue is determining how to effectively integrate diverse computing resources to solve complex streaming applications. The research includes further development of the Auto-Pipe design environment. Auto-Pipe provides tools for algorithm specification and for the design, simulation, and deployment of diverse integrated computing architectures. Techniques for including analog devices in mixed analog-digital systems are being developed so that Auto-Pipe can handle mixed-signal (analog and digital) algorithmic functions and resource components in a single system. The research activity is driven by two important applications taken from the astrophysics and computational biology domains. Graduate and undergraduate students will be heavily involved in the research.
|
1 |
2012 — 2014 |
Franklin, Mark; Chamberlain, Roger; Buhler, Jeremy |
N/A |
CI-P: A Flexible Platform for Accelerating Biological Sequence Analysis
Recent declines in the cost of DNA sequencing have enabled biologists to conduct experiments that produce very large DNA and protein sequence data sets. Understanding this data requires computational analyses to recognize known sequences and group new ones by similarity. As data sets grow, these analyses become a serious bottleneck to progress. Computer scientists have therefore tried to accelerate sequence analysis using hybrid computing architectures that combine multicore CPUs with accelerators, such as field-programmable gate arrays and graphics engines, whose performance equals that of tens or hundreds of CPU cores. To more effectively accelerate biosequence analysis tasks, new infrastructure is needed to facilitate both development of accelerated analytical tools and their deployment to biologists.
This project is a planning effort to create development and deployment infrastructure for accelerated biosequence analysis applications. The PIs are developing design criteria for a preferred hardware platform and set of software tools to speed the creation, validation, and deployment of biosequence accelerators. Key activities include qualifying hardware platforms, developing prototype software and firmware, and consulting developer and user communities for accelerated sequence analysis tools to guide the planning effort. In particular, the PIs are organizing a special track at a major accelerator design conference to solicit input on proposed infrastructure.
Developing the proposed infrastructure will stimulate creation of biosequence analysis accelerators with low cost, rapid deployment, and a large supporting developer and user community. More agile development will boost adoption of accelerators by biologists, empowering labs to analyze massive biosequence data sets and speeding discovery.
|
1 |
2015 — 2017 |
Buhler, Jeremy; Chamberlain, Roger |
N/A |
EAGER: Rapid, Efficient Implementation of Irregular Applications on SIMD Many-Core Platforms
General-purpose computation with efficient Single Instruction, Multiple Data (SIMD) oriented many-core devices, such as graphics processing units (GPUs), can deliver high performance in a variety of application domains. Data-parallel applications that perform well on SIMD many-cores typically exhibit regularity in patterns of computation and data movement, acting identically on each of an ensemble of many equal-sized inputs. However, many important applications exhibit irregular behavior, which makes them difficult to implement efficiently on these platforms. Thus, efficient SIMD implementation of applications with irregular behavior is an important ongoing research problem.
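To make the cost of irregularity concrete, here is a minimal sketch of a toy lockstep model: items are processed in groups of `width` lanes, and each group runs for as many steps as its slowest item. The function `simd_utilization` and this cost model are assumptions made for illustration; they are not taken from the project and ignore memory behavior entirely.

```python
def simd_utilization(work, width):
    """Fraction of issued lane-steps that do useful work when items are
    processed in lockstep groups of `width` lanes (toy model)."""
    useful = sum(work)
    issued = 0
    for start in range(0, len(work), width):
        group = work[start:start + width]
        # Every lane in the group is occupied until the slowest item finishes.
        issued += max(group) * width
    return useful / issued if issued else 1.0


print(simd_utilization([4, 4, 4, 4], 4))   # equal-sized inputs: 1.0
print(simd_utilization([1, 1, 1, 13], 4))  # one long item: 16/52, about 0.31
```

Under this model the same 16 units of work occupy the lanes fully when inputs are equal-sized, but leave most lane-steps idle when one item dominates.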
This project investigates and validates novel interface designs, algorithmic techniques, and implementation strategies that address the problem of efficient SIMD implementation uniformly for applications from a variety of domains. The work includes generating alternate module designs to support efficient developer-driven searches over large design spaces to tune performance. Another key area of the research will validate these technologies on biosequence analysis applications, resulting in innovative, efficient new GPU designs for computational tasks.
More broadly and with a particular focus on new high-performance application designs for data-intensive computations critical to bioinformatics, the project will enable faster development of more efficient, more maintainable GPU software, even for applications with SIMD-unfriendly irregular behaviors.
|
1 |