2008 — 2012 |
Salzman, Julia |
N/A - Activity Code Description: No activity code was retrieved. |
Statistical Methods From Spectral Analysis With Markov Chains
Modern science, especially biochemistry, has become dependent on numerical analysis of the large amounts of data generated in almost every experiment. Scientific advancement in biology and in understanding disease pathogenesis will likely depend on the analysis of the huge corpus of biomolecular data (e.g., microarray, RNA and DNA sequence data). This advancement is linked to the field's ability to continue developing statistical methodologies capable of identifying a robust "signal" that can be reproducibly identified across multiple experiments, all of which generate noisy data. The PI has shown how the theoretical framework of spectral analysis with Markov chains unifies several statistical methods for identifying structure in data that is observed with noise: discrete Fourier analysis, correspondence analysis, principal component analysis, and spectral clustering. This unifying framework also provides insight into, and generalization of, the more traditional methods listed above. Therefore, the PI's proposed research has two major directions. In one direction, it will continue basic methodological development of exploratory data analysis with a focus on methods capable of identifying biological signals observed in noisy experimental conditions. In another, it will focus on rigorous statistical analysis of this methodology, which is in wide use in statistics, computer science and bioinformatics.
Statistical methods developed here will be particularly aimed at the study of cellular regulation of gene and protein expression. These cellular mechanisms have wide-ranging importance in understanding human disease, including cancer and infectious disease. The data analytic methods developed under this grant will be implemented and made publicly available through Bioconductor, an open-source repository of R packages. The broad goal of this proposal is to work towards a methodological unification of methods from statistics, biology and computer science as applied to biomolecular data. Thus, it falls roughly into the field of bioinformatics.
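As a rough illustration of the general idea named in this abstract (not the PI's actual method or software), the sketch below builds a Markov transition matrix from pairwise similarities between noisy observations and uses its leading non-trivial eigenvector to separate two groups, which is the basic step of spectral clustering. The function name `markov_spectral_embedding`, the Gaussian similarity, and the `bandwidth` parameter are assumptions chosen for the example.

```python
import numpy as np


def markov_spectral_embedding(points, bandwidth=1.0, n_components=1):
    """Embed points with the leading non-trivial eigenvectors of a random walk.

    Hypothetical helper for illustration only: Gaussian similarities W are
    row-normalized into a Markov transition matrix P = D^{-1} W.
    """
    points = np.asarray(points, dtype=float)

    # Pairwise squared Euclidean distances between observations.
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)

    # Symmetric, strictly positive similarity matrix W.
    affinity = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    degree = affinity.sum(axis=1)

    # P = D^{-1} W has the same eigenvalues as the symmetric matrix
    # S = D^{-1/2} W D^{-1/2}, which can be decomposed stably with eigh.
    inv_sqrt_deg = 1.0 / np.sqrt(degree)
    symmetric = affinity * inv_sqrt_deg[:, None] * inv_sqrt_deg[None, :]
    eigenvalues, eigenvectors = np.linalg.eigh(symmetric)

    # Sort from largest to smallest eigenvalue.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]

    # Right eigenvectors of P are D^{-1/2} times the eigenvectors of S.
    right_vectors = eigenvectors[:, order] * inv_sqrt_deg[:, None]

    # Skip the first (constant) eigenvector, which carries no structure.
    return eigenvalues[1:n_components + 1], right_vectors[:, 1:n_components + 1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two noisy groups of observations (synthetic data, for illustration).
    data = np.vstack([
        rng.normal(0.0, 0.3, size=(20, 2)),
        rng.normal(3.0, 0.3, size=(20, 2)),
    ])
    values, embedding = markov_spectral_embedding(data)
    # The sign of the first non-trivial eigenvector splits the two groups.
    print(embedding[:, 0] > 0)
```

Working with the symmetric matrix rather than P directly is a design choice made here so that a symmetric eigensolver can be used; the eigenvalues are identical and the eigenvectors differ only by the degree rescaling shown in the code.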
|
1 |
2012 — 2016 |
Salzman, Julia |
K99 - Activity Code Description: To support the initial phase of a Career/Research Transition award program that provides 1-2 years of mentored support for highly motivated, advanced postdoctoral research scientists. R00 - Activity Code Description: To support the second phase of a Career/Research Transition award program that provides 1-3 years of independent research support (R00) contingent on securing an independent research position. Award recipients will be expected to compete successfully for independent R01 support from the NIH during the R00 research transition award period. |
Discovering Genomic Rearrangements Under Selection in Serous Ovarian Cancer
Recurrent gene fusions and internal tandem duplications are among the most tumor-specific molecular markers known and can provide potential therapeutic targets. With a few notable exceptions, however, relatively common recurrent gene fusions have not been identified in commonly occurring carcinomas, which often have multiple, complex chromosomal rearrangements that are difficult to analyze by traditional cytogenetic approaches. Complex tumor karyotypes make it difficult to identify gene fusions using cytogenetics, but suggest the possibility that recurrent rearrangements producing fusions or internal tandem duplications (ITDs) may be prevalent. This proposal aims to use deep sequencing and the novel analytic techniques described to study aspects of the serous ovarian cancer genome and transcriptome that have remained hidden due to limitations in technology or analytical methods, and to test intra-individual and inter-individual selective pressures on tumors. The aims of this proposal are as follows: 1) to further investigate the extent of gene rearrangements in ovarian cancer, focusing on discovering local rearrangements transcribed into RNA; 2) to determine the composition of a group of novel circular transcripts that I have recently found to be expressed at relatively high levels in normal and pathogenic human cells; 3) to characterize double minutes in ovarian cancer, combining bioinformatics to determine rearrangements in their sequence composition and statistical analysis to determine evolutionary pressures on their composition exerted by the tumors. The applicant has a track record of success in discovering novel gene fusions with ultra-high throughput sequencing (the ESRRA-C11orf20 fusion), as well as in designing original, rigorous statistical and bioinformatic methods for ultra-high throughput data. Under the mentorship of Dr. Patrick O. Brown, a pioneer in high throughput genomic technologies and statistical methods for analyzing them, the applicant will continue career development and training. The first aim of this project will be performed during the mentoring phase, and experiments for aims 2 and 3 will be piloted. The K99/R00 award will support the applicant in her development into an independent investigator.
|
1 |
2015 — 2019 |
Salzman, Julia |
R01 - Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
Unbiased Discovery of Mechanisms Regulating circRNA
DESCRIPTION (provided by applicant) RNA is an ancient carrier of biological information, whose many functions became necessary for the evolution of cellular life, and transcriptional and post-transcriptional regulation of RNA is central to health and the progression of human disease. Remarkably, we have discovered that thousands of human genes produce circular RNA (circRNA) isoforms and that, in hundreds of genes, the circRNA is more abundant than the linear isoform. Our recent work has shown that circRNA expression is particularly regulated during human development. We have also demonstrated that circular RNAs are produced in organisms separated by billions of years of evolution, which suggests that the machinery, and by implication the function, of circRNA is central to eukaryotic gene expression programs: it is either conserved over billions of years or a feature that has re-evolved multiple times, and both possibilities imply a functional role for circRNAs in the cell. Together, our work suggests a fundamental hypothesis that alternative splicing has functional consequences apart from protein production, including the production of circRNA isoforms. Yet the field lacks a predictive mechanistic model of the cis sequences and trans-acting factors that specifically regulate circRNA, meaning a) the biochemical signaling pathways used by the cell to produce circRNA are unknown; b) we lack molecular tools to specifically express circRNA without background transcription of off-target RNAs. Such tools are required for discovery and rigorous experimental tests of circRNA function. This proposal aims to discover the mechanisms controlling circRNA production and regulation and promises to reveal novel biology regarding how biochemical signals are transduced into alternatively spliced RNA molecules and provide crucial tools for discovering the function of circRNA.
Specifically, we aim to 1) engineer statistical algorithms for detecting and quantifying circRNA variants, and statistical methods for integrating expression across datasets; 2) discover trans-acting factors regulating circRNA production, export and decay; 3) systematically discover cis sequence control of circRNA abundance. The work will build on our discoveries of regulated expression of circular RNAs in human development to delineate their regulation under normal circumstances and how dysregulation may contribute to diseases such as neurodegeneration and cardiomyopathy.
|
1 |
2016 — 2021 |
Salzman, Julia |
N/A - Activity Code Description: No activity code was retrieved. |
CAREER: Dissecting the Biogenesis and Function of Circular RNA in Simple Eukaryotes
This project will study the function of circular RNA, a type of RNA recently discovered to be common to cells across the tree of life. The basic DNA instructions for life are shared among plants, animals and even simpler organisms, as is the basic way DNA is processed into RNA, the functional code of instructions for all cells. Only very recently, a new feature of this process was discovered: copies of DNA can be processed into circular RNA molecules, which are sometimes remarkably abundant in cells, and have features that suggest they have important functions. Research into circular RNA will benefit the public by increasing scientific knowledge of basic mechanisms of genome evolution and function. The research will be disseminated to the public through seminars, publications and by directly integrating the research projects with educational programs aimed at high school, undergraduate and graduate students with a focus on groups that have traditionally been underrepresented in science. In addition, the integration of experimental, statistical and computational approaches in this research will serve as a template for development of a core class aimed at graduate students to teach the principles of statistics to biologists and the principles of biology to statistically trained students.
Circular RNA is a novel class of RNA found in different organisms. It is likely to be generated in a pathway linked to RNA splicing, suggesting that splicing may have functions apart from excising introns to generate mature mRNAs that code for protein. Compared to other classes of expressed RNA, circular RNA has been the subject of little attention and study. This project will directly investigate circular RNA in unicellular eukaryotes, where RNA processing is known to be complex and to play important functional roles. Using the opportunities provided by recent technological advances in sequencing, the project will entail developing novel statistical and computational approaches coupled with new experimental designs to increase the understanding of circular RNA biogenesis and function. The statistical, computational and experimental methods developed in this project will be generally applicable to studying circular RNA across a variety of unicellular and multicellular organisms.
This award is co-funded by the Genetic Mechanisms Program in the Division of Molecular and Cellular Biosciences and the Division of Emerging Frontiers in the Biological Sciences Directorate, and by the Statistics Program in the Division of Mathematical Sciences in the Mathematical and Physical Sciences Directorate.
|
1 |
2020 |
Fordyce, Polly Morrell (co-PI); Rohatgi, Rajat (co-PI); Salzman, Julia |
R56 - Activity Code Description: To provide limited interim research support based on the merit of a pending R01 application while the applicant gathers additional data to revise a new or competing renewal application. This grant will underwrite highly meritorious applications that, if given the opportunity to revise their application, could meet IC recommended standards and would be missed opportunities if not funded. Interim funding ends when the applicant succeeds in obtaining an R01 or other competing award built on the R56 grant. These awards are not renewable. |
Orthocoding For Spatial Sequencing
Project Summary The 3D spatial context of a cell determines which genes and RNA isoforms it expresses, enabling specialized cell functions fundamental to multicellular life. In typical single-cell RNA-seq (scRNA-seq), the first step of cell dissociation erases the spatial context of the cell. This flaw creates an urgent need for a technology that has the same throughput as scRNA-seq but also encodes the cells' spatial context. Although a new wave of spatial transcriptomic technologies based on sequencing has emerged recently, all suffer from severe limitations: low efficiency (~1-2% of the Drop-Seq efficiency), 2D resolution only, failure to discriminate cell boundaries, and a requirement for specialized or expensive equipment. These limitations are intrinsic and result from their shared reliance on cDNA synthesis in situ from a solid support. Imaging-based technologies have higher spatial resolution but require more equipment and more time for protocol execution, have limited gene measurement throughput, and cannot profile RNA isoforms or other sequence variants. To overcome these limitations in state-of-the-art spatial transcriptomic methods, we propose to develop Orthocode, an innovative paradigm for statistically driven spatial transcriptomics, grounded in proof-of-principle molecular experiments and cutting-edge statistical theory. Orthocode achieves 50x or higher sensitivity compared to current approaches by encoding and recovering spatial information from simple, inexpensive and efficient molecular biology protocols. The experimental Orthocode protocol has two steps: 1) a pool of two types of "location-encoding oligos" is coupled to cells: (a) barcoded emitter oligos, which produce copies of themselves that diffuse locally, and (b) "receptors," which record the barcodes of nearby emitters; 2) cells coupled to location-encoding oligos, which together record the spatial position of the cell, are isolated, input into scRNA-seq workflows (e.g., Drop-seq), and sequenced.
Orthocode then employs a rigorous statistical analysis of the barcode profiles of location-encoding oligos to triangulate the location of each sequenced cell. This rigorously reasoned experimental design and prototype development builds Orthocode from the simplest test systems to prototypes that will allow unprecedented spatial transcriptomic resolution in tissues to address a critical unmet need in biomedicine. The Orthocode paradigm can be generalized beyond RNA profiling to spatial measurements of proteins, DNA and epigenetic modifications and is a potential breakthrough innovation in deep-sequencing-based spatial 'omics.
|
1 |
2021 |
Salzman, Julia |
R35 - Activity Code Description: To provide long term support to an experienced investigator with an outstanding record of research productivity. This support is intended to encourage investigators to embark on long-term projects of unusual potential. |
AI/ML Ready Approaches for Integrative RNA Processing, Splicing and Spatial Genomics
Project Summary/Abstract From parent grant: Cells and organisms, from simple to complex, carry the same genetic DNA sequence organized into genes. Multicellular eukaryotes transcribe and process genes into RNA isoforms through a process called alternative splicing. Alternative splicing is developmentally and cell-type specifically regulated. It is foundational to how higher organisms' genomes are decoded. Yet critical and fundamental questions regarding its regulation and the function of its output remain unanswered. For example, that circRNA is a ubiquitous product of alternative splicing was only discovered in 2012, and its regulation and function remain enigmatic. The discovery of circRNAs revealed a larger critical knowledge gap in the field regarding "what, how and why" genes are alternatively spliced. What RNA splice variants are expressed, how is splicing regulated, and which spliced RNAs have essential functions? Answering these questions is critical for predicting which of myriad genetic variants cause disease and why they do so. Answers will also enable a new generation of digital nucleic acid biomarkers and diagnostics for disease, drug targets for correcting dysregulated splicing and identification of pathogenic protein- or non-coding products (respectively), as well as fundamental basic scientific insight into the evolution and function of eukaryotic genomes. Despite the great promise for discovering how splicing is regulated in massive single cell RNA-seq experiments, the field still lacks unbiased, precise methods to address the statistical and computational challenges of splicing analysis in scRNA-seq. State-of-the-art, reproducible statistical algorithms for making precise splice variant calls and detecting how they are regulated across cell types and subcellularly lag far behind the rate at which single cell RNA-seq (scRNA-seq) data is generated, limiting ML/AI readiness.
Here, we will open the possibility of analyzing novel RNA regulatory biology through ML/AI-ready software and processed data to a huge community of biomedical researchers enabling new basic and translational discoveries.
|
1 |
2021 |
Salzman, Julia |
R35 - Activity Code Description: To provide long term support to an experienced investigator with an outstanding record of research productivity. This support is intended to encourage investigators to embark on long-term projects of unusual potential. |
Computational- and Experimental-Driven Discovery of Splicing Regulation and circRNA Function
Project Summary/Abstract Cells and organisms, from simple to complex, carry the same genetic DNA sequence organized into genes. Multicellular eukaryotes transcribe and process genes into RNA isoforms through a process called alternative splicing. Alternative splicing is developmentally and cell-type specifically regulated. It is foundational to how higher organisms' genomes are decoded. Yet critical and fundamental questions regarding its regulation and the function of its output remain unanswered. For example, that circRNA is a ubiquitous product of alternative splicing was only discovered in 2012, and its regulation and function remain enigmatic. The discovery of circRNAs revealed a larger critical knowledge gap in the field regarding "what, how and why" genes are alternatively spliced. What RNA splice variants are expressed, how is splicing regulated, and which spliced RNAs have essential functions? Answering these questions is critical for predicting which of myriad genetic variants cause disease and why they do so. Answers will also enable a new generation of digital nucleic acid biomarkers and diagnostics for disease, drug targets for correcting dysregulated splicing and identification of pathogenic protein- or non-coding products (respectively), as well as fundamental basic scientific insight into the evolution and function of eukaryotic genomes. The proposed research will develop novel statistical analyses of -omics data, taking an unbiased approach and including biological features that are understudied or un-annotated. Predictions will be coupled with incisive experimental validation to reveal new principles of how RNAs, including circRNAs, are spliced and how they function. This research potentiates significant new discoveries in why alternative splicing exists and how this understanding can be used for precision medicine.
|
1 |