Michael Schatz - US grants
Affiliations: | 2010 | Computer Science | University of Maryland, College Park, College Park, MD |
We are testing a new system for linking grants to scientists.
The funding information displayed below comes from the NIH Research Portfolio Online Reporting Tools and the NSF Award Database. The grant data on this page is limited to grants awarded in the United States and is thus partial. It can nonetheless be used to understand how funding patterns influence mentorship networks and vice versa, which has deep implications for how research is done.
You can help! If you notice any inaccuracies, please sign in and mark grants as correct or incorrect matches.
High-probability grants
According to our matching algorithm, Michael Schatz is the likely recipient of the following grants.

Years | Recipients | Code | Title / Keywords | Matching score |
---|---|---|---|---|
1999 — 2004 | Schatz, Michael | N/A |
Experiments On Dynamics and Control of Spatiotemporal Chaos in Thermal Convection @ Georgia Tech Research Corporation The research will explore the fundamentals of controlling fluid flow in convective systems. Experiments will be carried out on surface tension driven (Marangoni) convection where the driving forces, thermally induced surface tension gradients confined to an interface, can be probed and altered for both laminar and chaotic convection. The driving will be determined via infrared imaging of the interfacial temperature field and altered directly via nearly simultaneous, multipoint heating by an infrared laser scanner. Dynamics in three areas will be investigated: 1) wavenumber selection mechanism of spatially periodic patterns; 2) defect dynamics in disordered convective flow; 3) the role of unstable periodic orbits in spatiotemporal chaos. |
0.903 |
2002 — 2006 | Neitzel, G. Paul (co-PI) [⬀] Smith, Marc Schatz, Michael |
N/A |
Sger: Opto-Microfluidics: Containerless, Optically Controlled Microscale Fluid Management @ Georgia Tech Research Corporation CTS-0201610 |
0.903 |
2004 — 2009 | Schatz, Michael | N/A |
@ Georgia Tech Research Corporation The goals of this project are to develop techniques for the detection and prediction of low-dimensional features in high-dimensional, spatially extended systems, to investigate how coherent structures within the system affect predictability, to test control strategies for high-dimensional, spatially extended systems, and to improve understanding of uncertainty as a function of spatial scale in such systems. One motivation for the work is the desire to develop better state estimation techniques for complex environmental systems. The model system for this work is that of shallow thermal convection. The project combines laboratory experiment with theory and modeling. As well as being a good model system, convection is important in many environmental systems in the atmosphere, ocean and interior of the Earth, and in industrial settings. Thermal convection exhibits multi-scale spatial and temporal complexity and laboratory convection experiments provide plentiful, high quality, observational data under relatively well-controlled conditions. The laboratory experiments use carbon dioxide as the convecting medium in a shallow cell. Specific spiral-defect chaos flow patterns can be initiated using a computer-steered laser system that selectively heats parts of the fluid. The primary theoretical tool is a state estimation technique in which a local ensemble Kalman filter is applied to a numerical Navier-Stokes solver. This will be applied to a range of flow patterns of increasing complexity exhibited by the convection cell as the Rayleigh number is increased. The data to be assimilated will be taken from shadowgraph images of the convection. Later stages of the project will include experiments in control of the convection using selective heating guided by output from the state estimation system. It is anticipated that the results of this work will be applicable to other spatially-extended complex systems. |
0.903 |
2006 — 2010 | Catrambone, Richard (co-PI) [⬀] Schatz, Michael Marr, Marcus |
N/A |
Collaborative Research: Institutionalizing a Reform Curriculum in Large Universities @ Georgia Tech Research Corporation In universities with large science and engineering programs, the introductory calculus-based physics course plays a central role in the education of very large numbers of students who will become scientists and engineers. Despite repeated calls from the physics community for improvement and modernization of this introductory physics course, the content and structure of the traditional course taught at most such large institutions have changed very little in the past fifty years. Although science and engineering universities often play a lead role in setting the standards for courses taught at other institutions, the large enrollment in their introductory courses, and the involvement of a large number of research faculty and academic support staff, have made it difficult to implement substantive curricular changes. Recently three large universities (NC State, Purdue, and Georgia Tech) have begun the process of implementing the Matter & Interactions curriculum, which was initially developed at Carnegie Mellon University. |
0.903 |
2009 — 2013 | Schatz, Michael Webster, Donald |
N/A |
Laboratory Studies of Exact Coherent Structures in Wall Turbulence @ Georgia Tech Research Corporation 0853691 |
0.903 |
2009 — 2014 | Schatz, Michael Kohlmyer, Matthew |
N/A |
Transforming Homework Into Cyberlearning in An Introductory Stem Course @ Georgia Tech Research Corporation Physics (13) |
0.903 |
2011 — 2012 | Mccombie, W. Richard [⬀] Schatz, Michael Witkowski, Jan |
N/A |
@ Cold Spring Harbor Laboratory The last few years have seen a revolution in DNA sequencing instrumentation and technology. In that short time period sequencing instrument capabilities have increased more than 1000-fold and are likely to continue to increase about 5-fold each year for the next several years. As such it is now affordable and efficient to sequence to high coverage large plant genomes that had previously been prohibitively expensive and complex to attempt. However, analysis methods have not improved nearly as much during the same time period, and a variety of technical limitations of these new DNA sequencing instruments make it even more difficult to carry out whole genome sequencing of novel genomes (de novo sequencing). These limitations also make it more difficult to use the new instruments to carry out older clone-based strategies for de novo sequencing, such as BAC-by-BAC approaches. The purpose of this meeting, to be held at Cold Spring Harbor Laboratory May 18-20, 2011, is to assess the current state of next generation sequencing in terms of de novo, whole genome plant sequencing and what can be expected to develop in the near future, and then to determine which advances are needed to allow these exciting technologies to be used to carry out de novo sequencing of entire complex plant genomes. The meeting will bring together stakeholders with a broad range of expertise in high-throughput sequencing and genomics, plant biology, bioinformatics and databases drawn from the academic, private and international sectors. Meeting outcomes will be captured in the form of a report to be developed by participants that will be submitted for publication to Genome Research. |
0.916 |
2011 — 2013 | Schatz, Michael Roy, Rajarshi (co-PI) [⬀] Swinney, Harry Showalter, Kenneth (co-PI) [⬀] |
N/A |
Hands-On Research: Complex Systems Advanced Study Institute (China) @ Georgia Tech Research Corporation This award provides partial funding for an advanced study institute (ASI) on complex systems in physics to be held in Shanghai, China, in June 2012. A team of 12 senior researchers and 24 assistants from the U.S. will join colleagues at Shanghai Jiao Tong University (SJTU) to offer a two-week hands-on course demonstrating table-top experiments in complex non-linear physical systems. About 70 participants, primarily junior faculty members, will be selected from underdeveloped regions of Central and Southeast Asia. The objective is to demonstrate that interesting and productive experiments can be conducted with relatively inexpensive and available materials. This ASI is a successor to similar programs that have been held in Africa, India, and Brazil. The local expenses will be supported by SJTU, and participant expenses as well as administrative costs are sponsored by the International Center for Theoretical Physics (ICTP) in Trieste, Italy. |
0.903 |
2011 — 2017 | Schatz, Michael | N/A |
@ Georgia Tech Research Corporation Numerous complex systems in nature and in technology defy concise characterization because they exhibit strongly nonlinear behaviors that lack all symmetries and are highly non-periodic on a wide range of spatial and temporal scales. Characterization by detailed measurement (in lab experiments or direct numerical simulations) is now possible in many cases using modern measurement technologies or computational techniques. However, the resulting deluge of data often leads to little insight; in particular, there is frequently no good way to connect quantitatively experimental measurements of a particular complex system with the output from simulations/models of the same system. New, computationally-based, mathematical tools from algebraic topology have the potential to bridge the gap between measurements and models; the proposed research will explore the use of algebraic topology to link numerical simulations and laboratory experiments in situations where complexity arises because the system under study is driven out of thermodynamic equilibrium. The research focuses on an outstanding paradigm for nonequilibrium complexity: fluid flow driven by temperature gradients (thermal convection). The planned work brings three unique capabilities together in a single effort: |
0.903 |
2012 — 2015 | Ware, Doreen Lippman, Zachary (co-PI) [⬀] Schatz, Michael Churchland, Anne (co-PI) [⬀] |
N/A |
Reu Site: Cshl Nsf-Reu Bioinformatics and Computational Biology Summer Undergraduate Program @ Cold Spring Harbor Laboratory A Research Experience for Undergraduates (REU) Sites award has been made to Cold Spring Harbor Laboratory (CSHL) that will provide research training for 8 students, for 10 weeks during the summers of 2012-2014. The program trains participants on the present and growing need to integrate biological research with sophisticated computational tools and techniques. CSHL has over 40 faculty members, including members of a newly established Quantitative Biology Department, who will serve as bioinformatics and computational biology mentors in fields ranging from plant biology to machine learning for biology. Through this NSF-REU support, students are afforded the opportunity to conduct full-time research in an appropriately matched lab based on mutual interests and goals. CSHL REU participants have access to individual and shared laboratory facilities such as flow cytometry, high throughput sequencing and analysis, imaging, and proteomics facilities. Participants attend multiple seminars and workshops on topics such as the responsible conduct of research, professional communication skills, the graduate school application process, and an introduction to science careers. REU participants are also invited to attend the CSHL summer courses or meetings, which cover a range of topics such as Computational Neuroscience and Single Cell Analysis. All students are housed on campus within walking distance of their laboratories and the CSHL cafeteria, where they receive the majority of their meals. The multilayer recruitment effort consists of both traditional and digital mailings to potential students and their professors, as well as recruitment visits to universities throughout the country. Students are selected based on academic record, motivation for the proposed program of study, and potential as future researchers.
Alumni successes are monitored to determine their continued interest in their academic field of study, their career paths, and the long-term impact of their research experience. Information about the program will be assessed using faculty and student evaluations, as well as the use of an REU common assessment tool. More information is available by visiting http://www.cshl.edu/education/urp/nsf-sponsored-reu-in-bioinformatics-and-computational-biology, or by contacting the PI (Dr. Zachary Lippman at lippman@cshl.edu) or the co-PI (Dr. Doreen Ware at ware@cshl.edu). |
0.916 |
2012 — 2017 | Schatz, Michael Grigoriev, Roman |
N/A |
@ Georgia Tech Research Corporation The objective of this research program is to develop and to test experimentally a revolutionary new approach to modeling and predicting two-dimensional turbulent flows. A set of weakly unstable invariant Navier-Stokes solutions will be identified and transitions between invariant solutions will be characterized to provide a coarse global description of the nonlinear dynamics of turbulent flow. Quasi-2D flow in a shallow electrolyte layer continually driven by Lorentz forces provides the setting for theoretical, analytic and experimental development of this approach. Novel and proven techniques, such as periodic orbit theory, group representation theory, Krylov-subspace numerical methods, Newton and variational solvers will be used to develop this viewpoint, which will be tested in experiments where the flow can be measured with full spatial and temporal resolution throughout the entire flow domain. |
0.903 |
2012 — 2017 | Schatz, Michael Van Eck, Joyce Lippman, Zachary [⬀] |
N/A |
@ Cold Spring Harbor Laboratory PI: Zachary B. Lippman (Cold Spring Harbor Laboratory) |
0.916 |
2014 — 2019 | Schatz, Michael | N/A |
Career: Algorithms For Single Molecule Sequence Analysis @ Cold Spring Harbor Laboratory The Cold Spring Harbor Laboratory is awarded a CAREER grant for the PI Michael Schatz to develop new computational methods for processing DNA sequencing data from the latest high-throughput sequencing technologies. DNA sequencing costs and throughput have improved by orders of magnitude over the last three decades, although many questions remain unsolved, especially because of the short sequence lengths currently available. Emerging "third generation" sequencing technologies from Pacific Biosciences, Moleculo, Oxford Nanopore, and other companies are poised to revolutionize genomics by enabling the sequencing of long, individual molecules of DNA and RNA. The sequence lengths with these technologies can reach up to tens of thousands of nucleotides; however, few or no analysis packages are capable of dealing with these types of genetic sequence data. This project will overcome these limitations by developing several novel analysis algorithms specifically for long read single molecule sequencing and their associated complex error models. The outcomes will help answer biological questions of profound significance to all of society, such as: What were the genetic implications of the domestication of rice? What genes and regulatory elements give rise to the incredible regenerative properties of the flatworm? Or, what can be understood from assembling reference genomes of sugarcane and pineapple towards breeding more robust plant crops and biofuels? |
0.939 |
2016 — 2018 | Schatz, Michael | N/A |
Graduate Teaching Assistant Professional Development (Gta-Pd) Workshop @ Georgia Tech Research Corporation The chief importance of this project is its impact on increasing the number of Science, Technology, Engineering and Mathematics (STEM) degrees awarded at U.S. universities. The nation needs a sufficient number of STEM-degree holders, who possess technical skills that are crucially important for the nation's economic health and growth. Unfortunately, too many potential STEM majors are currently lost because of poor experiences in introductory university STEM courses. This project aims to improve dramatically the retention of students in STEM majors by propagating widely (by means of a national workshop) "best practices" for preparing high-quality instructors of key introductory STEM courses. |
0.903 |
2016 — 2019 | Schatz, Michael Churchland, Anne [⬀] |
N/A |
Reu Site: Cshl Nsf-Reu Bioinformatics and Computational Neuroscience Summer Undergraduate Program @ Cold Spring Harbor Laboratory This REU Site award to Cold Spring Harbor Laboratory (CSHL), located in Cold Spring Harbor, NY, will support the training of ten students for ten weeks during the summers of 2016-2018. This award is supported by the Division of Biological Infrastructure in the Directorate for Biological Sciences (BIO) and the Division for Mathematical Sciences in the Directorate for Mathematics and Physical Sciences (MPS). CSHL's REU in Bioinformatics and Computational Neuroscience (BCN) provides participants with an exceptional research experience, integrating genomics and neuroscience through shared analysis tools. Spanning genomes, cells, organisms and the brain, the program trains students to approach complex biological systems quantitatively. Students conduct full-time, independent research under the mentorship of one of CSHL's approximately 50 faculty members working in genomics, quantitative biology, and neuroscience. Participants have access to state-of-the-art technologies, such as high-throughput sequencing and two-photon imaging, and attend lab meetings and research seminars. The REU curriculum includes workshops on quantitative techniques, responsible conduct of research, scientific communication, and scientific careers. The REU culminates with a symposium in which participants present their work to CSHL's scientific community. Students are housed on CSHL's 110-acre campus, within walking distance of laboratories and dining halls. Participants receive room and board and a summer stipend and have access to campus amenities. Students apply online, supplying a personal statement, two letters of recommendation, and academic records. REU participants are selected based on academics, motivation, and demonstrated potential. |
0.916 |
2016 — 2019 | Mccombie, W. Richard (co-PI) [⬀] Birnbaum, Kenneth Jackson, David (co-PI) [⬀] Schatz, Michael Gingeras, Thomas |
N/A |
Maizecode - An Initial Analysis of Functional Elements in the Maize Genome @ Cold Spring Harbor Laboratory PI: Thomas Gingeras (Cold Spring Harbor Laboratory) |
0.916 |
2016 — 2019 | Schatz, Michael | N/A |
@ Georgia Tech Research Corporation The weather we experience is driven by convection: sunlight warms the earth, which heats the atmosphere, which in turn is cooled by the cold temperatures of outer space. Most people are not interested in microscopic behavior, for example the behavior of the individual molecules in the air, nor in macroscopic behavior, such as worldwide average temperature. What is of interest are mesoscopic patterns, for example weather fronts, which result in local changes in temperature. This interest in mesoscopic, as opposed to micro- or macroscopic, features of large scale systems occurs in a wide variety of complex large scale physical phenomena such as combustion in engines, dynamics of biomass in the oceans, ventricular fibrillation in a human heart, etc. These mesoscopic patterns take on many different shapes and sizes and change with time, sometimes slowly and sometimes rapidly. The form of these patterns and how they evolve in time often depend strongly on parameters. New technologies are greatly increasing our abilities to measure and simulate these physical phenomena, resulting in enormous data sets, but our ability to extract and quantify this information in a way that leads to understanding, predictability, and control of these systems is not keeping pace. We will explore the use of new mathematical tools to address this problem. |
0.903 |
2016 — 2019 | Dida, Mathews Odeny, Damaris Devos, Katrien Schatz, Michael Khang, Chang Hyun (co-PI) [⬀] |
N/A |
Bread Abrdc: Development of Essential Genetic and Genomic Resources For Finger Millet @ University of Georgia Research Foundation Inc Finger millet is a grain crop of strategic importance to food security in Eastern Africa. The grain has high nutritional value, can grow in arid environments and thus is important to the livelihood of smallholder farmers. A major agricultural goal in the region is to develop higher yielding varieties of finger millet through reducing or eliminating diseases that impact growth of the plant. Blast fungus is a pathogen that reduces yield up to 80% and is one of the main diseases affecting finger millet. To understand how to control disease outbreaks, this project uses genomic sequencing as a powerful approach to identify precise strains of the fungus and to study how the fungus causes disease symptoms in the plant. Sequence analyses of blast strains collected in Kenya, Tanzania, Uganda and Ethiopia will provide information on the genetic diversity of the pathogen in Eastern Africa, and provide a resource to identify the factors that are responsible for infection of finger millet. The knowledge from this approach is essential to develop efficient disease management strategies. Furthermore, sequence analyses of the finger millet host will clarify why some cultivars are more resistant to blast than others. The generated resources will also be used as a vehicle to train undergraduate and graduate students in Eastern Africa in bioinformatics, an expertise that is essential to translate the information to improve breeding strategies. |
0.936 |
2017 — 2020 | Schatz, Michael Grigoriev, Roman |
N/A |
Geometry and Topology of Fluid Turbulence: Theory and Experiment @ Georgia Tech Research Corporation This research project explores and experimentally tests a radically new mathematical framework for understanding and predicting complicated behaviors in numerous fundamental and practical problems in science, engineering, and medicine (e.g., weather forecasting, characterization of cardiac arrhythmias, etc.). Complex behaviors in many such problems are often governed by patterns that appear fleetingly but repeatedly. The research develops general, powerful techniques for identifying and quantifying key patterns, including the temporal sequences in which the patterns may appear; knowledge of the patterns and sequences can then be harnessed to construct "road maps" for predicting future behaviors. This study will focus on demonstrating "proof of principle" by constructing road maps of complex behavior observed in turbulent fluid flow in laboratory experiments. If successful, the results of this study will lead directly to the development of faster and more accurate ways to make predictions of complicated behavior in large real world problems. For example, the ability to identify and quantify important patterns and sequences in atmospheric turbulence should enable weather forecasts that are better and more rapid than those currently possible today. All software and useful solution data produced by the research activities will be made publicly available. The research program tightly integrates with teaching and learning at the undergraduate and graduate levels and includes activities to increase participation of underrepresented groups. |
0.903 |
2017 — 2021 | Van Der Knaap, Esther Van Eck, Joyce Lippman, Zachary [⬀] Schatz, Michael |
N/A |
@ Cold Spring Harbor Laboratory Genome DNA sequences for many crops have been determined in the last two decades, providing the blueprints to discover genes that underlie key agricultural traits. However, a great challenge is identifying the differences in DNA between related varieties of the same crop, which are responsible for the subtle trait variation that plant breeders exploit to improve productivity. A major contributor to this trait variation is 'genome structural variation' where pieces of DNA are deleted, inserted, or rearranged resulting in changes in gene expression. This project will focus on how structural variation contributed to domestication and breeding of tomatoes. A related goal is to expand and develop new molecular tools to create structural variation for crop improvement. This project will improve US agriculture by providing new knowledge and tools to efficiently and predictably enhance crop productivity. A major part of the project will also include training of young scientists in fundamental principles of plant genome research that can be applied to agriculture. This knowledge will also be shared through outreach programs in inner city New York schools that do not have access to research opportunities. Project personnel will develop hands-on teaching activities that will highlight the importance of plant genomics and new genome editing technologies to improve crops and meet the agricultural needs of the 21st century. |
0.916 |
2020 | Blobel, Gerd A (co-PI) [⬀] Bodine, David M. (co-PI) [⬀] Hardison, Ross C [⬀] Schatz, Michael Weiss, Mitchell J (co-PI) [⬀] Zhang, Yu |
R24 |
Vision: Validated Systematic Integration of Epigenomic Data @ Pennsylvania State University-Univ Park Project Summary VISION: ValIdated Systematic IntegratiON of hematopoietic epigenomes Technological advances enabling the production of large numbers of rich, genome-wide, sequence-based datasets have transformed biology. However, the volume of data is overwhelming for most investigators. Also, we do not know the mechanisms by which the vast majority of epigenetic features regulate normal differentiation or lead to aberrant function in disease. We have formed an interdisciplinary, collaborative team of investigators to address the problem of how to effectively utilize the enormous amount of epigenetic data both for basic research and precision medicine. At this point, acquisition of data is no longer the major barrier to understanding mechanisms of gene regulation during normal and pathological tissue development. The chief challenges are how to: (i) integrate epigenetic data in terms that are accessible and understandable to a broad community of researchers, (ii) build validated quantitative models explaining how the dynamics of gene expression relates to epigenetic features, and (iii) translate information effectively from mouse models to potential applications in human health. These needs are addressed by the proposed ValIdated Systematic IntegratiON (VISION) of epigenetic data to analyze mouse and human hematopoiesis, a tractable system with clear clinical significance and importance to NIDDK. By pursuing the following Specific Aims, the interdisciplinary collaboration will deliver comprehensive catalogs of cis regulatory modules (CRMs), extensive chromatin interaction maps and deduced regulatory domains, validated quantitative models for gene regulation, and a guide for investigators to translate insights from mouse models to human clinical studies. 
These deliverables will be provided to the community in readily accessible, web-based platforms including customized genome browsers, databases with facile query interfaces, and data-driven on-line tools. Specifically, the proposed work in Aim 1 will build comprehensive, integrative catalogs of hematopoietic CRMs and transcriptomes by compiling and determining informative epigenetic features and transcript levels in hematopoietic stem and progenitor cells and in mature cells. CRMs will be predicted using the novel IDEAS (Integrative and Discriminative Epigenome Annotation System) method. Work proposed in Aim 2 will build and validate quantitative models for gene regulation informed by chromatin interaction maps and epigenetic data. Compiling and determining chromosome interaction frequencies will predict likely target genes for CRMs. Gene regulatory models will be built that predict the contributions of CRMs and specific proteins to regulated expression; these models will be validated by extensive testing using genome-editing in ten reference loci. Finally, work in Aim 3 will produce a guide for investigators to translate insights from mouse models to human clinical studies. This effort will include categorizing orthologous mouse and human genes by conservation versus divergence of expression patterns, assigning CRMs to informative categories of epigenomic evolution, and testing the interspecies functional maps experimentally by genome-editing. |
0.94 |
2020 — 2021 | Goecks, Jeremy Morgan, Martin T Schatz, Michael |
U24 (Activity Code Description: To support research projects contributing to improvement of the capability of resources to serve biomedical research.) |
Implementing the Genomic Data Science Analysis, Visualization, and Informatics Lab-Space (Anvil) @ Johns Hopkins University Project Summary The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL) powers the next generation of computational genomics research using cloud-scale data and compute resources. The platform is built on a set of established components, including the Terra computing platform and Dockstore for standards-based sharing of containerized tools and workflows. It also provides multiple entry points for data access and analysis, including batch workflows with Terra, notebook environments including Jupyter and RStudio, Bioconductor packages for building analysis on top of AnVIL APIs and services, and will soon offer Galaxy instances for interactive analysis. By providing a unified environment for data management and compute, AnVIL eliminates the need for data movement, allows for controlled access to sensitive data and monitoring, and provides elastic, shared computing resources that can be acquired by researchers as needed. NIH-sponsored biomedical research is increasingly moving to cloud-based data storage and analysis systems, with major cloud portals established for GTEx, Kids First, TOPMed, TCGA and several other major initiatives. However, using these systems together is a challenge. The individual data portals enable researchers to browse and query their own data but have limited functionality to share data or user registrations across portals or with cloud based workspaces, like Terra and Galaxy. The recently established NIH Cloud Platform Interoperability (NCPI) effort aims to address these issues by implementing key interoperability technologies across multiple NIH institutes. 
Under this project, we will work with the NCPI working groups to define the use cases and standards for interoperability as well as implement three major technologies recommended by the NCPI within the Galaxy and R/Bioconductor components of AnVIL. First, we will implement the NIH Researcher Auth Service (RAS) to provide a common mechanism for researchers to establish their identity and access data they are authorized to use across Terra and Galaxy. Second, we will implement the Global Alliance for Genomics and Health (GA4GH) Data Repository Service (DRS) so that data consumers, including workflow systems, can access data objects in a single, standard way regardless of where they are stored and how they are managed. Finally, we will develop initial support in AnVIL for the Fast Healthcare Interoperability Resources (FHIR) standard. This standard describes data formats, elements, and an API for exchanging electronic health records (EHR), especially to ensure these records are available, discoverable, and understandable as patients move around the healthcare ecosystem. FHIR support in AnVIL will facilitate access to eMERGE and related projects by users once the data are ingested in AnVIL. |
0.939 |
2020 — 2021 | Goecks, Jeremy [⬀] Schatz, Michael |
U24Activity Code Description: To support research projects contributing to improvement of the capability of resources to serve biomedical research. |
A Federated Galaxy For User-Friendly Large-Scale Cancer Genomics Research @ Oregon Health & Science University Project Summary Cancer research is now a data-driven discipline, but only a minority of cancer researchers are data scientists. This severely restricts our ability to effectively study and cure the disease. The far reaching significance of our project is in federating disparate data and computational resources in order to provide a unifying analysis platform for computational cancer research. We will extend the popular scientific workbench Galaxy (https://galaxyproject.org) so that it can integrate with distributed data and compute resources used and needed by cancer researchers, including those resources in the NCI Cancer Research Data Commons (NCR DC). Our Federated Galaxy system will allow users to seamlessly access NCR DC data across multiple resources. It will support multiple analysis scenarios tuned to skills and computational requirements of individual researchers. The aims of this project are: Aim 1. Extend Galaxy for working with distributed cancer genomics and phenotypic data. This will enable Galaxy users to access both public and private cancer data regardless of their actual physical location. Best-practice approaches will be used for accessing restricted datasets. Aim 2. Enhance Galaxy for context-aware, distributed cancer genomics analyses using shared workflow representations. This will enable Galaxy users to run genomics analyses on different clouds, ultimately reducing the time, cost, and data transfer associated with analyses. Aim 3. Apply Federated Galaxy to precision oncology research. Workflows developed in this aim will leverage the technologies in Aims 1 and 2 to benchmark machine learning algorithms for predicting tumor phenotype and drug response. Interactive reports will summarize benchmarking results and utilize ITCR visualizations for deep dives into results. 
Our system will provide a single access point to distributed cancer datasets and will enable these data to be analyzed within a single portal in a way that satisfies multiple analysis scenarios and utilizes diverse computational resources. Finally, a cloud-centric Galaxy built for the NCR DC will substantially grow the community of users working with the GDC and the NCR DC. This is because Galaxy brings with it a vibrant worldwide community of users and developers, which numbers tens of thousands of scientists. These individuals will help to tune the GDC and other resources within the NCR DC to the needs of real-life analysis scenarios and will enrich the set of tools accessible to cancer researchers. |
0.927 |
2020 — 2021 | Nekrutenko, Anton [⬀] Pond, Sergei L Kosakovsky Schatz, Michael |
R01Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
Tuning Big Data Analysis Infrastructure For Hiv Research @ Pennsylvania State University-Univ Park Summary The COVID-19/SARS-CoV-2 pandemic is a once-in-a-generation, “all-hands-on-deck” event for the scientific community. This pandemic is also the first in which real time genomic data are available, e.g. via GISAID [1], where genomic sequences are deposited daily. Vital insights about the virus and the epidemic depend on rapid and reliable genomic analysis of diverse viral sample sequences by multiple laboratories. Yet we repeatedly encounter the same avoidable shortcomings early in viral investigations, including COVID-19: lack of reproducibility, rigor, and data/analytic sharing. Only about 10% of the published genomes have quality metrics, primary data (read files), or any level of details on analytics, making these data irreproducible and unverifiable; over 40% of GISAID submissions to date provide no information about how the sequences were generated. Essential questions about the extent of intra-host genomic variability (indicative of adaptation or multiple infection), viral evolution (selection, recombination), transmission (phylogenetic and phylogeographic) cannot be answered reliably if researchers cannot trust/replicate the source data and analytical approaches. One of the key goals/deliverables of this supplement will be the open analytic workflows that can be used to curate and standardize genomic data, and high quality annotated variation data. |
0.94 |
2021 | Schatz, Michael | U01Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
Integrative Genomic and Epigenomic Analysis of Cancer Using Long Read Sequencing @ Johns Hopkins University PROJECT SUMMARY The last twenty years have experienced extensive growth in the sequencing of cancer genomes, leading to a dramatically increased understanding of the role of genetic and epigenetic mutations in cancer. This has largely been enabled by developments in high-throughput “second-generation” sequencing technology and analysis that characterize cancer genomes using short-reads. Recently, a new generation of high-throughput long-read sequencing instruments, primarily from Pacific Biosciences and Oxford Nanopore, have become available that are poised to displace short-read sequencing for many applications. We and others have used these technologies to discover tens of thousands of variants per cancer genome that are not detectable using short-reads, including structural variants and differentially methylated regions in known oncogenes and cancer risk genes. These technologies carry the potential to address many open questions in cancer biology; however, the analysis of long-read sequencing data is computationally demanding and needs specialized algorithms that are either too inefficient to use at scale or do not yet exist. In this proposal, we will address several gaps in the application of long-read technology for basic research and clinical use in cancer genomics. First, we will develop improved methods for finding structural variants and complex repeat expansions from long-reads, both of which are major diagnostic and prognostic indicators of disease, yet are not accurately identified using existing methods. Leveraging the improved phasing capabilities of long reads, this work will include the detection of mosaic variants, revealing tumor heterogeneity and variants in precancerous tissues. Next, we will apply machine learning and systems level advances to accelerate and improve the comparison of variants across large patient cohorts. 
Critically, this will compensate for the error prone nature of single molecule long-read sequencing to make these comparisons more accurate when comparing tumor-normal samples or pedigrees of related patients so that recurrent driving mutations can be accurately identified. Finally, we will develop integrative methods for the joint analysis of genome, transcriptome, and epigenetic profiling of cancer genomes. These advances will improve the identification of fusion genes, and allow for entirely new forms of epigenetic analysis, such as the allele-specific analysis of methylation across transposable elements and other repetitive elements. Synthesizing the many thousands of novel variants we will detect using our methods, we will then develop algorithms that will identify and evaluate recurrent genetic or epigenetic variations as putative driving mutations. All methods will be released open-source and will empower us, our ITCR collaborators, and the cancer genomics community at large to study genetic and epigenetic variants with near perfect accuracy and thereby unlock many new associations to treatment and disease. |
0.939 |
2021 | Nekrutenko, Anton [⬀] Schatz, Michael |
U24Activity Code Description: To support research projects contributing to improvement of the capability of resources to serve biomedical research. |
Democratization of Data Analysis in Life Sciences Through Galaxy @ Pennsylvania State University-Univ Park Project Summary For over a decade, the Galaxy Project (https://galaxyproject.org/) has worked to solve key issues plaguing modern data intensive biology -- the ability of researchers to access cutting-edge analysis methods, to share analysis results transparently, and to precisely reproduce complex computational analyses. Galaxy has become one of the largest and most widely used open source platforms for biological data science. Promoting openness and collaboration in all facets of the project, from technical decisions to training and leadership, has enabled us to build a vibrant community of users, developers, system engineers, and educators who continuously contribute new software features, add the latest tools, adapt to the most modern infrastructure, author training materials, and lead research and training workshops. Genomics research is continuously evolving, and current challenges include the rapid growth in size and complexity of new datasets, the increasing availability of controlled-access datasets with human genomic components, and the continuing expansion in the breadth of research areas capable of generating high throughput data. The core Galaxy development team submitting this proposal will respond to these challenges by focusing on the following key priorities: - Rearchitect Galaxy for scalability and security using software container technologies; - Design new user interface (UI) for working with thousands of tools, workflows, and samples; - Enable interactive exploratory data analysis in Galaxy; - Facilitate community growth and support; - Enable effective training and outreach. Concentrating on these broad priorities will allow us to achieve the ultimate goal of the Galaxy Project: developing a data analysis medium connecting biomedical experts across the full spectrum of skill sets, scientific domains, and research practices. 
For biomedical researchers it will provide a powerful analysis platform populated with the latest tools and data. For tool developers it will provide a community-supported mechanism for deploying tools before a wide audience of users. For system administrators and engineers it will provide a framework they will feel comfortable deploying on any infrastructure. For educators it will provide a comprehensive collection of materials covering most data analysis needs and an infrastructure for delivering interactive, hands-on training workshops for audiences of different sizes. |
0.94 |
2022 — 2024 | Schatz, Michael | N/AActivity Code Description: No activity code was retrieved: click on the grant title for more information |
Collaborative Research: Eager: Unraveling the Nature and Onset of Instabilities in Suspension Flows @ Georgia Tech Research Corporation Flows containing particles (suspension flows) are found in countless settings in nature and in technology; examples range from silt-laden water streaming in a river to blood coursing through a cell-counting analyzer. Pure Newtonian fluids are well-known to undergo instabilities that lead to significant changes in the flow behavior. Suspension flows also experience instabilities; however, the mechanisms that drive suspension flow instabilities are not yet understood. In this project, proven techniques for characterizing instabilities in pure Newtonian fluids will be applied to suspension flow instabilities. This approach should reveal how such instabilities can be probed and manipulated in service of developing better ways to predict how the particles move and are distributed in practical applications. The proposed project is also expected to have significant educational impacts, including providing training on complex flow problem-solving for the next generation of scientists and engineers, attracting and training new graduate and undergraduate students from underrepresented groups and communicating the main ideas in a non-technical form to students at all levels of the educational system and the general public.
The primary goal of this project is to demonstrate that the vast fundamental and applied knowledge of instabilities in pure (Newtonian) flows can be harnessed to achieve breakthrough understanding of instabilities in suspension flows. Specifically, this project will test the main Newtonian insight that structuring the flow geometry can unfold the transition process to reveal well-separated, non-turbulent transitions arising from instabilities that can be manipulated by imposing suitably designed perturbations. 
The project employs new laboratory experiments and existing theory to explore suspension flows in structured channels. First, the laminar steady state will be characterized as a function of Reynolds number for a specified particle size and selected average particle volume fractions. The research then examines instabilities in both pure Newtonian fluid and suspension flows. The outcomes of this project should lay the foundations for future studies to investigate new and heretofore uncharted fundamental fluid physics that arises when inertial particles are added to the flow. The results of our work should set the stage for the discovery of new methods to manipulate flow and particles.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria. |
0.903 |
2022 — 2027 | Gillis, Jesse Frary, Amy Schatz, Michael Van Eck, Joyce Lippman, Zachary [⬀] |
N/AActivity Code Description: No activity code was retrieved: click on the grant title for more information |
@ Cold Spring Harbor Laboratory The growing population and climate extremes are threatening food security. Agriculture is largely based on a few major crops, and revolutionary technologies in genome sequencing and CRISPR genome engineering are accelerating their improvement. These technologies can also improve “orphan” crops, which are not widely cultivated or studied but have the potential to increase the diversity and resilience of food production. Orphan crops are related to major crops, allowing translation of knowledge between them. However, orphan crops lack research tools, and an even greater challenge is determining whether specific genetic mutations that benefitted major crops can be engineered to improve traits similarly in orphan crops. This is because gene sequence and function change as species evolve, especially among genes that become duplicated, which is common in plants. This project will take advantage of the nightshade family – a source of many major and orphan crops, such as eggplant, pepino, and tomato – to study how duplicated genes evolve and affect agricultural traits in related species. Combining genome sequencing and CRISPR will reveal sequence diversity among thousands of duplicated genes and enable improved predictability in engineering genes and traits across species. This project will train young scientists with a focus on diversity and inclusion, as well as promote public understanding of genome engineering in plant biology through a community science program on orphan crops. 
Finally, new curricula and research opportunities for undergraduate students at a small liberal arts college will broaden participation and training of underrepresented groups in the plant sciences.
This project will exploit advances in large-scale reference genome sequencing, gene co-expression analyses, and CRISPR genome editing to dissect how paralog diversification impacts species-specific phenotypes in a genus of both fundamental and applied importance. Fifty Solanum species, including 16 orphan crops, will be sequenced to establish a Solanum Pan-Genome with telomere-to-telomere reference assemblies, providing a foundation for genus-wide comparative genomics and functional genetics. Computational approaches based on genomics data will be developed for precise assembly and comparison of complex genomes, and identification and classification of paralogs and their relationships based on their variants and expression patterns. Simultaneously, transformation protocols and genome editing will be developed and deployed for an array of Solanum to test how paralogs impact genotype-to-phenotype relationships within and between species. By focusing on major domestication gene families and the adaptation and productivity traits they control, this synergistic work will provide both a new understanding of paralog diversification in evolution and a more robust translation of agriculturally relevant genotype-to-phenotype relationships to orphan crops. 
Beyond a valuable community resource of Solanum reference genomes, expression data, and CRISPR lines for plant researchers and breeders, this multidisciplinary project will result in new tools, resources, and principles that will enable the study and engineering of other taxa and traits of significance to both plant biology and crop improvement.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria. |
0.916 |