2005 — 2006 |
Dorrestein, Pieter C |
F32 Activity Code Description: To provide postdoctoral research training to individuals to broaden their scientific background and extend their potential for research in specified health-related areas. |
Enzymology of NRPS and PKS Proteins by Mass Spectrometry @ University of Illinois Urbana-Champaign
DESCRIPTION (provided by applicant): Electrospray ionization Fourier transform mass spectrometry (ESI-FTMS) provides a powerful tool for analysis of covalent modifications of proteins. This proposal applies ESI-FTMS to the detection of transient covalent modifications of nonribosomal peptide synthetases (NRPS) and polyketide synthases (PKS). Initially, a general protocol will be developed for the rapid identification of the acyl domains. Subsequently, this method will be extended to the identification of the in vivo or in vitro "preferred" substrates for acyl domains. This method will then be applied to the identification of the substrate(s) for the calE8 gene product involved in the biosynthesis of the antitumor agent calicheamicin, a PKS whose substrate is unknown. Finally, ESI-FTMS will be used for the kinetic characterization of the NRPS found in the biosynthesis of the aminocoumarin antibiotics coumermycin and clorobiocin while the substrates are still connected to their acyl domains.
|
0.949 |
2008 — 2010 |
Dorrestein, Pieter C; Pevzner, Pavel A (co-PI) |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
New Approaches to Sequencing of Complex Peptides. @ University of California San Diego
DESCRIPTION (provided by applicant): Nonribosomal peptides (NRPs) such as penicillin, vancomycin and related molecules isolated from microbial sources have been a staple for drug discovery for many decades. We propose to employ multi-stage mass spectrometry (MSn) for de novo sequencing of NRPs, including cyclic NRPs. Analysis of MSn spectra of a cyclic peptide results in the difficult combinatorial problem of interpreting multiple linear peptides from the same spectrum. This proposal develops new combinatorial algorithms for solving this issue. Since MSn-based mass spectrometry analysis of NRPs is fast and inexpensive and requires minimal amounts of material (<1 μg), this approach opens the possibility of high-throughput sequencing of many unknown NRPs accumulated in large bioactivity marine cyanobacterial screening programs. In parallel to the automation of the NRP sequencing efforts, we will harvest a set of orphan gene clusters from marine actinomycetes to generate a library of cyclic peptides. The algorithms developed in this proposal will be used to fully characterize this cyclic imine library. This work not only sets the stage for the automated characterization of NRPs but will also be applicable to the characterization of other peptidic natural products such as peptaibols, peptide-derived toxins or lantibiotics. PUBLIC HEALTH RELEVANCE: This project describes the development and application of a novel mass spectrometry based method and corresponding algorithms that allow the de novo sequencing of complex therapeutic agents that are non-ribosomally derived.
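The combinatorial difficulty mentioned above can be illustrated with a small sketch. A cyclic peptide can open at any bond, so its fragments are the contiguous runs of residues starting at every ring position. The Python below computes this idealized "theoretical spectrum" using nominal integer residue masses. It is a textbook simplification offered as an illustration, not the algorithm developed in the proposal; the function name `cyclic_spectrum` and the reduced mass table are assumptions made for the example.

```python
from typing import Dict, List

# Nominal (integer) residue masses for a few amino acids; illustrative subset only.
NOMINAL_MASS: Dict[str, int] = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101,
    "L": 113, "N": 114, "D": 115, "Q": 128, "E": 129,
}


def cyclic_spectrum(peptide: str) -> List[int]:
    """Return the sorted masses of all contiguous fragments of a cyclic peptide.

    Includes 0 (empty fragment) and the mass of the intact ring.
    """
    masses = [NOMINAL_MASS[residue] for residue in peptide]
    n = len(masses)
    total = sum(masses)

    # prefix[i] is the summed mass of the first i residues.
    prefix = [0]
    for m in masses:
        prefix.append(prefix[-1] + m)

    spectrum = [0, total]
    for start in range(n):
        for length in range(1, n):
            end = start + length
            if end <= n:
                # Fragment does not cross the ring-closing bond.
                spectrum.append(prefix[end] - prefix[start])
            else:
                # Fragment wraps around: total minus the complementary run.
                spectrum.append(total - (prefix[start] - prefix[end - n]))
    return sorted(spectrum)


if __name__ == "__main__":
    print(cyclic_spectrum("NQEL"))
```

A ring of n residues yields n(n-1) non-trivial fragments, and fragments from different ring openings overlap in mass (242 appears twice for NQEL), which is why one spectrum looks like a mixture of several linear peptides.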
|
1 |
2010 — 2013 |
Dorrestein, Pieter C |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
Real-Time Imaging of Metabolic Communication @ University of California San Diego
DESCRIPTION (provided by applicant): Metabolic exchange is a universal phenomenon. It is essential to every organism, from those as simple as bacteria to complex higher eukaryotes such as humans. While metabolic exchange enables cooperation and coordination between the ~70 trillion cells in an average human being, even unicellular organisms rely on metabolic exchange to adapt to environmental stress and form biofilms. Cellular communication allows stem cells to differentiate, cancer cells to proliferate, neurons to fire, bacteria to sense a quorum and pathogens to survive in human hosts. The chemical diversity of the molecules used for communication is extraordinary, and includes small ions such as calcium, small molecules such as secondary metabolites, fatty acids and peptides, as well as carbohydrates, proteins and nucleic acids. Despite the universal nature of metabolic exchange, there are few methods that can characterize the communication between cells in a systematic and sensitive fashion, let alone in real time. In this proposal, our focus will be on the application and adaptation of desorption electrospray mass spectrometry to enable the real-time live cell detection, characterization and visualization of metabolic exchange in important biological processes. We aim to accomplish this in both a spatial and a temporal fashion. These tools will improve our understanding of secreted biomarkers, of microbiome-human cell interactions, and of the complexities of infectious disease that derive from the cooperation between different types of cells (e.g. Bacilli with macrophages, neutrophils or T-cells) and from interkingdom communication. Ultimately it may drive the development of new therapeutic strategies or interventions based on paradigms involving inter-cellular metabolic communication in a system-wide fashion.
PUBLIC HEALTH RELEVANCE: This proposal aims to develop real-time monitoring of molecular entities involved in metabolic exchange of pathogen-immunological cell populations. Our ability to visualize metabolic exchange between different cell populations could lead to new therapeutic paradigms.
|
1 |
2011 |
Dorrestein, Pieter C |
S10 Activity Code Description: To make available to institutions with a high concentration of NIH extramural research awards, research instruments which will be used on a shared basis. |
Synapt Ion Mobility Mass Spectrometer @ University of California San Diego
DESCRIPTION (provided by applicant): This proposal is for a Synapt ion mobility mass spectrometer that will be used for the characterization of therapeutics. The instrument will also be used to characterize the effect of therapeutics on cellular processes and cell-to-cell communication. This equipment will be used to train the future generation of Doctors of Pharmacy and PhD students engaged in drug development. Finally, the Synapt ion mobility mass spectrometer enables the discovery of next-generation therapeutics and therapeutic targets. PUBLIC HEALTH RELEVANCE: This proposal is requesting funds for a mass spectrometry instrument that is to be used to uncover targets of disease and to discover and characterize novel therapeutics.
|
1 |
2011 — 2014 |
Dorrestein, Pieter C; Pogliano, Kit J |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
The Chemical and Genetic Basis of Interspecies Interactions @ University of California San Diego
DESCRIPTION (provided by applicant): Bacillus subtilis produces a wide array of extracellular metabolites that can inhibit the growth of bacteria and fungi or modify their behavior to attenuate the production of antibacterial products by potentially dangerous neighbors. We here propose to use the new technique of imaging mass spectrometry and classical analytical chemistry to systematically identify the extracellular metabolome of B. subtilis, with a focus on characterizing the interactive metabolome that is induced by other bacterial species. We will investigate the role these compounds play in two distinct outcomes of the interaction of B. subtilis with other species. The first is an impasse, in which B. subtilis forms closely abutting colonies with other species that produce a variety of antibacterial compounds (such as P. aeruginosa). The second, more frequent behavior is contact-dependent predation, in which B. subtilis moves towards, invades and destroys neighboring colonies, leading to death of the prey species and expanding the territory of the B. subtilis colony. These reproducible behaviors are conserved in different undomesticated B. subtilis strains. We will determine if these behaviors depend on the interactive metabolome and investigate the effects individual compounds have on target cell viability and behavior. We will further investigate the genetic requirements for interspecies interactions to identify stress responses, developmental and biosynthetic pathways that contribute to these distinct outcomes and we will use fluorescence microscopy to visualize the cellular consequences of interspecies interactions. These studies will illuminate the mechanistic basis for interspecies interactions and identify secondary metabolites that affect viability or behavior of other species that represent potential new antibacterial drugs. 
PUBLIC HEALTH RELEVANCE: Bacteria produce many extracellular metabolites that mediate their interaction with other species, many of which have antibacterial and antifungal activities. We will here elucidate the chemical, genetic and cellular mechanisms by which these molecules allow Bacillus subtilis to interact with other bacterial species, producing outcomes ranging from coexistence to the invasion and destruction of neighboring colonies. Interspecies interactions are critical in medicine and the metabolites that facilitate destruction of other species represent promising new pharmaceutical leads.
|
1 |
2012 — 2015 |
Dorrestein, Pieter C; Moore, Bradley S (co-PI) |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
Experiment Based Genome Mining of Ribosomal Natural Products @ University of California San Diego
DESCRIPTION (provided by applicant): Ribosomally encoded natural products (RNPs) were once thought to be of limited structural diversity and uncommon amongst microbes. Over the past few years, however, this viewpoint has changed due to the increased discovery rate of RNPs possessing newly described structural motifs previously ascribed to their nonribosomal counterparts. Nearly every sequenced genome, including those of invertebrates, contains the genetic capacity to biosynthesize ribosomally encoded, post-translationally modified natural products such as lantibiotics, bacteriocins, microcins, cyanobactins, thiopeptides, and lasso peptides, thereby making this class of underappreciated natural products perhaps the most dominant in all of nature. What is lacking, however, is a systematic approach to harvest this ubiquitous class of natural products and assess their unique biosynthetic capacity. The difficulty associated with characterizing RNPs in a systematic fashion can be attributed to their falling outside the scope of not only most therapeutic screening programs but also metabolomic or proteomic approaches due to their larger size, structural diversity and extraordinary number of post-translational modifications. This proposal outlines the developmental strategies to create a set of tools for harnessing the biosynthetic potential of ribosomally encoded natural products through mass spectrometry based genome mining. The techniques and methodologies created as a result of the proposed work will not only be important for the detection of therapeutic lead compounds, but also for the efficient characterization of ribosomally encoded toxins secreted by pathogenic bacteria such as Staphylococcus aureus, Bacillus cereus and Clostridium difficile as well as defensins produced by higher eukaryotes such as marine snails, primates and humans.
|
1 |
2013 — 2016 |
Dorrestein, Pieter C; Gerwick, Lena; Gerwick, William Henry |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
Mapping the Secondary Metabolomes of Marine Cyanobacteria @ University of California San Diego
DESCRIPTION (provided by applicant): Key to the process of pharmaceutical lead compound discovery from natural sources is the effective access and characterization of highly diverse molecular structures. In this regard, exploration of the marine environment for bioactive natural products is revealing new vistas in natural products chemical diversity. In this application we propose the development of innovative technologies and knowledge, principally based on LC-MS/MS data and 'molecular mapping', which will improve the effectiveness of natural products drug discovery efforts. This will enable a much improved capacity to discover new molecular diversity or analogs in desired structure classes. We will develop an understanding of the degree of expression of natural product pathways in cultured strains, and will develop novel methods by which to upregulate low or non-expressing biosynthetic gene clusters. As a result of these studies, new marine cyanobacterial natural products will be discovered and their biomedical properties will be characterized. To accomplish these goals we have the following four specific aims: 1) To use LC-MS/MS profiling of cyanobacterial extracts and pure compounds, followed by molecular mapping, to create a representation of the chemical universe of our samples. 2) To use QPCR and genome sequencing technologies to evaluate the degree of expression of natural product pathways in our cultured marine cyanobacteria, and to connect Natural Product Super-producing strains of cyanobacteria with their genotypes. This latter information can be used to find genetic markers that can be rapidly deployed to locate this phenotype in new cyanobacterial cultures and collections. 
3) To use a suite of imaginative methods to transcriptionally activate cryptic natural product biosynthetic gene clusters in strains determined in Aim 2 to possess unexpressed natural products capacity, and to analyze the resulting elicited secondary metabolomes by mass spectrometry and molecular mapping. 4) To isolate members of new families of compounds detected in Aim 1, as well as newly expressed natural products from Aim 3, and rigorously establish molecular structures using advanced analytical methods. Through the course of these four specific aims, this collaborative group will explore a number of innovative methods and approaches in the natural products sciences, including MS/MS molecular mapping, genomic analysis of natural products expression, elicitation of new natural products expression, connection of natural product-rich phenotypes to their corresponding genotypes, imaging mass spectrometry of complex consortiums of species wherein natural product pathways are activated, and novel automated MS approaches to natural products characterization. All of these methods are focused on improving the detection and characterization of the molecular diversity present in microorganisms, in this case, marine cyanobacteria. This molecular diversity continues to be an important source of inspirational molecules for biomedical research and drug discovery.
|
1 |
2016 |
Dorrestein, Pieter C |
R03 Activity Code Description: To provide research support specifically limited in time and amount for studies in categorical program areas. Small grants provide flexibility for initiating studies which are generally for preliminary short-term projects and are non-renewable. |
Reuse of Public Metabolomics Data @ University of California San Diego
Project summary: Many initiatives have been launched to ensure that metabolomics data becomes publicly accessible. Despite the growing availability, the data is not being reused. One of the main limitations of metabolomics data reuse and cross-comparison is the lack of a unifying format and of methods that enable comparison of multiple data sets, even those collected on different instruments and with different methods, as is done with UniFrac for microbial sequencing. UniFrac is a distance metric that takes into account phylogenetic relationships. Our goal with this project is threefold: 1) convert all public data into a unifying format; 2) subject all data with MS/MS information to living data in GNPS (http://gnps.ucsd.edu), where knowledge about the chemistry associated with the data is automatically updated and relayed to subscribers to the data; 3) create ChemiFrac, the UniFrac equivalent for metabolomics. Here we will use molecular networking as our phylogenetic relationship measure, thus enabling global comparisons of data sets that we expect will work even when different extractions and instruments are used.
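To make the UniFrac analogy concrete, the following is a minimal Python sketch of unweighted UniFrac on a toy tree: the distance between two samples is the fraction of observed branch length that leads to members of only one of them. The summary does not say how ChemiFrac would turn molecular networks into such a hierarchy, so the tree, the node names and the function names here are purely hypothetical.

```python
from typing import Dict, Set, Tuple

# Tree encoded as child -> (parent, branch length); the root has no entry.
Tree = Dict[str, Tuple[str, float]]


def branches_above(leaf: str, tree: Tree) -> Set[str]:
    """Return the branches (named by their child node) on the path from leaf to root."""
    covered = set()
    node = leaf
    while node in tree:
        covered.add(node)
        node = tree[node][0]
    return covered


def unweighted_unifrac(tree: Tree, sample_a: Set[str], sample_b: Set[str]) -> float:
    """Unique branch length divided by total observed branch length."""
    cov_a = set().union(*(branches_above(leaf, tree) for leaf in sample_a))
    cov_b = set().union(*(branches_above(leaf, tree) for leaf in sample_b))
    observed = cov_a | cov_b
    unique = cov_a ^ cov_b  # branches leading to only one of the two samples
    total = sum(tree[b][1] for b in observed)
    if total == 0:
        return 0.0
    return sum(tree[b][1] for b in unique) / total


# Hypothetical hierarchy over four molecular features in two families.
TOY_TREE: Tree = {
    "family1": ("root", 1.0),
    "family2": ("root", 1.0),
    "m1": ("family1", 0.5),
    "m2": ("family1", 0.5),
    "m3": ("family2", 0.5),
    "m4": ("family2", 0.5),
}

if __name__ == "__main__":
    print(unweighted_unifrac(TOY_TREE, {"m1"}, {"m2"}))              # same family
    print(unweighted_unifrac(TOY_TREE, {"m1", "m2"}, {"m3", "m4"}))  # disjoint families
```

Two samples that contain different molecules from the same family come out closer than two samples drawn from unrelated families, which is the behavior a relationship-aware metric is meant to provide and a plain presence/absence comparison cannot.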
|
1 |
2017 — 2020 |
Dorrestein, Pieter |
N/A Activity Code Description: No activity code was retrieved |
Collaborative Research: Mapping Interactions Mediated by Secondary Metabolites in a Fungus-Growing Ant Microbiome @ University of California-San Diego
Most microbes produce small molecule secondary metabolites as signals and/or antibiotics that control interactions between species. Despite their immense medical importance, the natural diversity and function of these molecules remain obscure. This prevents a full understanding of how microbial communities maintain critical ecosystem functions such as healthy host-microbe relationships. To further understand these systems, this project will generate networks of microbial interactions and their associated secondary metabolites based on the co-occurrence of particular microbes and secondary metabolites in environmental samples. Work will be conducted on the fungus-growing ant Trachymyrmex septentrionalis as an experimental model for these studies, profiling host-microbe interactions that are mediated by secondary metabolites within ant colonies collected throughout the Eastern USA. The T. septentrionalis symbiosis will also be utilized to broadly demonstrate the basics of animal-microbe interactions via a public display that will be linked to newly designed web resources. In addition, postdoctoral, graduate, and undergraduate researchers will be trained in microbiology and chemical ecology, including members of underrepresented groups. This project will therefore broadly advance understanding of the diversity and natural function of secondary metabolites and their impacts on host-microbe interactions.
Nearly all microbes produce secondary metabolites. These molecules are particularly common in symbioses, where they mediate host-microbe interactions. Such interactions are typically studied using low-throughput approaches that remove secondary metabolite-producing organisms from their natural communities. These experiments are therefore insufficient to determine in situ secondary metabolite diversity, and cannot unambiguously link specific molecules to the ecological interactions that they naturally mediate. To overcome these limitations, this project will develop an approach to comprehensively identify interspecific interactions mediated by secondary metabolites by identifying patterns of co-variation between interacting taxa and the secondary metabolites that mediate these interactions, using the fungus-growing ant Trachymyrmex septentrionalis as an experimental model. In this approach, taxa that interact mutualistically will co-occur with both each other and the metabolite that mediates this mutualistic interaction. Reciprocally, taxa that interact antagonistically will rarely co-occur, and the metabolite that mediates this antagonistic interaction will co-occur with the producing taxon but not the target taxon. In situ interactions will be recapitulated using laboratory ant colonies and microbial cultures. Researchers at various levels will be trained in these molecular and chemical ecology methods, and museum and online resources to disseminate results to the broader public will be developed. Together, this project will provide a methodological approach to map interactions mediated by secondary metabolites in microbial communities and identify the in situ functions of such molecules.
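The co-occurrence logic described above can be sketched in a few lines of Python. The example represents each taxon or metabolite by the set of samples in which it was detected and uses the Jaccard index as the co-occurrence score. The Jaccard index, the 0.5 threshold, the labels and all sample names are assumptions chosen for illustration; the abstract does not specify the project's actual statistics.

```python
from typing import Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of samples containing both items among samples containing either."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def interaction_pattern(taxon_a: Set[str], taxon_b: Set[str],
                        metabolite: Set[str], threshold: float = 0.5) -> str:
    """Label a taxon pair and a metabolite by the co-occurrence pattern they show."""
    taxa = jaccard(taxon_a, taxon_b)
    met_a = jaccard(taxon_a, metabolite) >= threshold
    met_b = jaccard(taxon_b, metabolite) >= threshold
    if taxa >= threshold and met_a and met_b:
        # Both taxa and the metabolite are found together.
        return "mutualism-like"
    if taxa < threshold and met_a != met_b:
        # Taxa avoid each other; the metabolite tracks only one of them.
        return "antagonism-like"
    return "unresolved"


# Hypothetical presence data: each set lists the colonies where the item was detected.
host_fungus = {"c1", "c2", "c3", "c4"}
symbiont = {"c1", "c2", "c3"}
shared_metabolite = {"c1", "c2", "c3"}

producer = {"c1", "c2", "c3"}
target = {"c4", "c5"}
antibiotic = {"c1", "c2"}

if __name__ == "__main__":
    print(interaction_pattern(host_fungus, symbiont, shared_metabolite))
    print(interaction_pattern(producer, target, antibiotic))
```

A real analysis would need to correct for sampling depth, compositional effects and shared environmental drivers before reading any interaction into such patterns.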
|
0.915 |
2017 — 2021 |
Cottrell, Garrison W (co-PI); Dorrestein, Pieter C; Gerwick, Lena; Gerwick, William Henry |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
Tools For Rapid and Accurate Structure Elucidation of Natural Products @ University of California, San Diego
Summary from the Application 5 R01 GM107550-08 "Tools for rapid and accurate structure elucidation of natural products": Bacteria are extraordinarily prolific sources of structurally unique and biologically active natural products that derive from a diversity of fascinating biochemical pathways. However, the complete structure elucidation of natural products is often the most time consuming and costly endeavor in natural product drug discovery programs. Compounding this, advancements in genome sequencing have accelerated the identification of unique modular biosynthetic gene clusters in prokaryotes and revealed a wealth of new compounds yet to be isolated and biologically and chemically characterized. As a result, there is an urgent and continuing need in this field to connect biosynthetic gene clusters to their respective MS fragmentation signatures in the MS2 molecular networks. The capacity to make such connections will accelerate new compound discovery, create associations between gene cluster and biosynthetic pathway, and aid in fast and accurate structure elucidations. Combined with this informatics approach, this proposed continuation project explores innovative methods by which to solve complex molecular structures by enhanced MS and NMR experiments, as well as the development of new algorithms by which to accelerate their analysis. Thus, the overarching goal of this grant is to develop efficient methods that facilitate automated structural classification, structural feature discovery and ultimately efficient structure elucidation of natural products (or any small molecule) and to build an infrastructure that interacts with data input from the community. We will achieve this with the following four specific aims: Aim 1. Integration of MS2 molecular networking with gene cluster networking to rapidly and efficiently locate natural products that have unique molecular architectures; Aim 2.
To develop a suite of high sensitivity pulse sequences for natural product structure elucidation; Aim 3. To develop NMR based molecular networking strategies using Deep Convolutional Neural Networks (DCNNs) to facilitate the categorization and structure elucidation of organic compounds; Aim 4. To integrate NMR molecular networking and MS2-based molecular networking as an efficient structure characterization and elucidation strategy. By achieving these aims we will develop an innovative workflow for finding new compounds and for determining their structures, both quickly and accurately. The connection between gene cluster and molecule will shed light on stereochemistry and potential halogenations and methylations. This information can then be used in combination with more efficient NMR and MS methods to accurately determine structures. These tools will be widely shared, such as through the Global Natural Products Social (GNPS) Molecular Network, to enhance the overall capacity of the natural products and organic chemistry communities to solve complex molecular structures.
|
1 |
2020 — 2023 |
Aluwihare, Lihini; Dorrestein, Pieter |
N/A Activity Code Description: No activity code was retrieved |
Collaborative Research: Characterizing Microbial Transformation of Marine DOM at the Molecular Level Using Untargeted Metabolomic @ University of California-San Diego Scripps Inst of Oceanography
Collaborative Research: Characterizing microbial transformation of marine dissolved organic matter at the molecular level using untargeted metabolomics
Dissolved organic matter is an important component of the global carbon cycle. Dissolved organic matter provides food and energy for microbes living in the ocean and influences microbial diversity. Microbes convert some dissolved organic matter to CO2 (respiration) whereas other forms of dissolved organic matter are altered by microbial processes and persist in the ocean. Thus, it is important to understand how microbes change dissolved organic matter composition and reactivity. This project will examine the chemical structure of dissolved organic matter to identify: 1) molecules that fulfill carbon demand (biomass produced minus losses from respiration) and 2) transformation processes that result from microbial activity. The project will combine lab experiments and field studies at the Moorea Coral Reef Long Term Ecological Research site. The project will support training for three graduate students in marine biogeochemistry. Undergraduate training is aimed at sustained mentoring of underrepresented minority (URM) students. Undergraduates will be recruited from existing programs at Minority Serving Institutions at San Diego State University and the University of Hawaiʻi at Mānoa. Undergraduates will participate in the Scripps Institution of Oceanography SURF Research Experiences for Undergraduates program, where they will conduct research in marine chemistry. The goal is to provide a mentoring approach that can successfully overcome roadblocks to URM engagement in STEM and increase retention of these students in marine science.
This work will combine field and lab studies using advanced molecular-level chemical characterization tools to explore how bacteria alter the composition and bioreactivity of organic compounds dissolved in seawater. Additionally, this project will develop informatics-based tools to identify a larger proportion of chemical structures in marine dissolved organic matter (DOM) than is currently possible using traditional approaches. The project will use tandem mass spectrometry and networking techniques to comprehensively classify organic compounds into molecular families and determine common chemical transformations. Then, using a well-developed field-based experimental ecosystem to produce diverse labile DOM pools, the research team will track microbial transformation using expression of hydrolytic enzymes and measure selection for particular microbial taxa and metabolisms. This approach defines the reactivity of individual molecules and broader compound classes participating in carbon fluxes that underpin DOM-microbe interactions. Field surveys conducted within the Moorea Coral Reef Long Term Ecological Research program will explore methods to track transformation of specific molecules in the environment and validate experimental observations of compound classes that appear to accumulate as semi-labile DOM. By integrating laboratory and field experiments and oceanographic surveys with the refinement of analytical tools for untargeted metabolomics, this project will characterize the fate of reactive DOM in the ocean.
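The idea of grouping compounds into molecular families by networking their tandem mass spectra can be sketched as follows. Each spectrum is reduced to binned fragment m/z values with intensities, pairs are scored by cosine similarity, and families are the connected components of the graph formed by pairs above a threshold. This is a simplified illustration: production tools such as GNPS use a modified cosine that also accounts for precursor mass shifts, and the integer binning, the 0.7 threshold and the spectrum names below are assumptions.

```python
import math
from typing import Dict, List, Set

# A spectrum maps a binned fragment m/z (integer bin) to its intensity.
Spectrum = Dict[int, float]


def cosine(s1: Spectrum, s2: Spectrum) -> float:
    """Plain cosine similarity between two binned spectra."""
    dot = sum(intensity * s2.get(mz, 0.0) for mz, intensity in s1.items())
    norm1 = math.sqrt(sum(i * i for i in s1.values()))
    norm2 = math.sqrt(sum(i * i for i in s2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0


def molecular_families(spectra: Dict[str, Spectrum],
                       threshold: float = 0.7) -> List[Set[str]]:
    """Group spectra into connected components of the similarity network."""
    parent = {name: name for name in spectra}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    names = sorted(spectra)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(spectra[a], spectra[b]) >= threshold:
                parent[find(a)] = find(b)  # merge the two families

    groups: Dict[str, Set[str]] = {}
    for name in names:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical spectra: "a" and "b" share fragments, "c" does not.
TOY_SPECTRA: Dict[str, Spectrum] = {
    "a": {100: 1.0, 200: 1.0},
    "b": {100: 1.0, 200: 0.9},
    "c": {300: 1.0},
}

if __name__ == "__main__":
    print(molecular_families(TOY_SPECTRA))
```

Because families are connected components, two spectra can land in the same family through a chain of intermediates even when they are not directly similar, which is how related transformation products of one parent compound can be linked.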
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
0.915 |
2020 — 2022 |
Allen, Eric; Dorrestein, Pieter |
N/A Activity Code Description: No activity code was retrieved |
MTM 1: Experimental Framework For Marine Fish Microbiomes Through Synthetic Communities and Multi-Omic Approaches @ University of California-San Diego Scripps Inst of Oceanography
Fish are recognized as a major global food resource and as key ecological components of aquatic ecosystems, and have been valuable laboratory model organisms to study vertebrate biology. Like all animals, fish harbor rich communities of symbiotic microorganisms both upon their external body surfaces, such as the skin, and internally, such as the gut. These microbial populations play important roles in the health and fitness of their hosts by providing protection against invading pathogens, regulating immune functions, and breaking down environmental and dietary compounds. This project investigates the microbiota associated with two common marine fish species with contrasting feeding ecologies, the carnivorous Pacific chub mackerel and the herbivorous opaleye. Robust cultivation systems will be developed to propagate fish-associated microbes in the laboratory, and the cultures will be analyzed using high-throughput DNA sequencing and small molecule detection methods. The data will be incorporated into a computational model to analyze and predict microbiome interactions. Thus, this research will provide an experimental framework to understand microbial contributions to fish biology and reveal the intricate interactions between microbes within the fish gut microbial ecosystem. This work has broad significance for understanding microbial activities that promote the diversity and ecology of marine fishes and can inform sustainable aquaculture practices by manipulating the fish microbiome to improve fish health and aquaculture yield. Additional benefits to society include the training of undergraduate and graduate students and the development of STEM education modules that explore the ocean microbiome.
The microbiota associated with marine fish represent an uncharted source of novel catabolic and biosynthetic pathways that impact the physiology and ecology of their hosts. This project will pioneer the development of robust in vitro bioreactor cultivation systems to replicate and propagate marine fish gut microbiota in the laboratory to assess and experimentally manipulate metabolic outputs of the fish microbiome. Classical and high-throughput cultivation methods will also be used to obtain isolates representing different fish gut microbiome functional guilds from natural samples and in vitro bioreactors. These isolates will be assembled into reproducible and experimentally tractable synthetic communities to study emergent properties of community organization and microbe-microbe interactions. Multi-omic data sets (metagenomics, metatranscriptomics, and untargeted metabolomics) will be integrated using computational modeling approaches (neural network statistical methods) to discover linkages between key microbial taxa, their enzyme diversity, metabolite production, and biotransformations within the high-fidelity bioreactor systems. The project will deliver new perspectives on microbiome dynamics in aquatic animals, setting the foundation for future microbiome-host studies and novel resources for microbial discovery.
This project is funded by the Understanding the Rules of Life: Microbiome Theory and Mechanisms Program, administered as part of NSF's Ten Big Ideas through the Division of Emerging Frontiers in the Directorate for Biological Sciences. Co-funding is provided by the Systems and Synthetic Biology Program, Division of Molecular and Cellular Biosciences, Directorate for Biological Sciences.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
|
0.915 |
2021 |
Cottrell, Garrison W (co-PI); Dorrestein, Pieter C; Gerwick, Lena; Gerwick, William Henry |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
Unified Computation Tools For Natural Products Research @ University of California, San Diego
Summary: The overarching goal for this proposed renewal application will be to further advance tools that are in development and to effectively integrate several types of analytical data with biological assay data and genomic information. This will create a powerful set of tools for faster and even more accurate identification of new molecules, for dereplication of known ones, and for directly inferring biological activities from spectroscopic information. In the current period of support, we have made substantial progress in developing highly useful tools for automatic annotations and identifications of organic molecules, specifically focused on natural products. The Global Natural Products Social (GNPS) Molecular Networking analysis and knowledge dissemination ecosystem has processed almost 160,000 jobs in nearly 160 countries worldwide, has 4,000 to 6,000 new job submissions per month and is accessed over 200,000 times a month (the majority of accessions are for reference library access, inspection of public data and previous jobs that the community shares as hyperlinks in papers), and has become a mainstream tool for the annotation of organic molecules deriving from diverse sources, especially in metabolomics workflows. The public website for Small Molecule Accurate Recognition Technology (SMART), a deep learning model for providing candidate structures based on 1H-13C HSQC NMR data, went live in December 2019 and already has over 3000 jobs in 50 countries. All tools developed in this proposal will become part of this analysis ecosystem. The four laboratories contributing to this proposed research activity have created an open and integrated team that is continuing to creatively innovate new informatic tools to enhance small molecule structure annotations and inference of their chemical and biological properties.
We have four specific aims: 1) To complete the development and evaluation of a set of new and innovative tools for natural products analysis, and deploy these as freely available resources for the worldwide community. 2) To refine the structural characterization of molecules through leveraging repository scale mass spectral information along with NMR data and genomic inputs. 3) To create a new SMART-based tool that integrates mass spectrometry and HSQC NMR data as the input for a new deep learning system with the goal of achieving more accurate predictions of structure. 4) To use deep learning to enhance SMART with bioactivity data so as to enable SMART to predict activities of molecules based on spectroscopic features. The data will also augment the GNPS database with biological assay binding data. An additional consequence of these goals will be the further digitization of natural products analytical data so that they can be used in the computational tools planned herein, as well as other tools in the future. Completion of these four specific aims will create new integrated tools for the precise identification of new natural product structures, and enable inference of their structural relatedness to other classes of organic molecules and their biological properties. Thus, these new informatic tools will have the potential to greatly enhance the small molecule drug discovery process.
|
1 |