2015 — 2019
Mesgarani, Nima
R01 — Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies.
Neurophysiology of Robust Speech Perception in Human Superior Temporal Gyrus @ Columbia Univ New York Morningside
DESCRIPTION (provided by applicant): Perceiving and following an individual speaker in a crowded, noisy environment is a commonplace task for listeners with normal hearing. The underlying neurophysiology, however, is complex, and the task remains a struggle for people with disorders of the peripheral and central auditory pathways. The lack of a detailed neurobiological model of the mechanisms and functions underlying robust speech perception has hindered our understanding of how these processes become impaired in affected populations. In our innovative approach, we will record from high-density micro- and macro-electrode arrays surgically implanted on the superior temporal gyrus (STG) of epilepsy patients as part of their clinical evaluation. This method offers an exceptionally detailed view of cortical population activity. We will build on two recent complementary findings: a highly selective, spatially distributed neural representation of phonetic features (Mesgarani et al., Science, 2014) that is at the same time highly dynamic and can change rapidly to reflect the perceptual bias of the listener (Mesgarani & Chang, Nature, 2012). While significant, these studies revealed several gaps in our understanding of this process, which we intend to address in this proposal. Specifically, we will resolve the following unanswered questions: 1) What is the neural mechanism for joint encoding of phonetic and speaker features? 2) How does attention modulate the phonetic and speaker feature selectivity of neural responses? 3) What computational mechanisms can account for the dynamic feature selectivity of responses in STG? Answering these questions will significantly advance our understanding of a remarkable human ability and will be of great interest to researchers in many areas, including neurology and sensory and cognitive neuroscience.
2016 — 2021
Mesgarani, Nima
N/A
CAREER: Biologically Inspired Neural Network Models For Robust Speech Processing
The recent parallel breakthroughs in deep neural network models and neuroimaging techniques have significantly advanced the state of artificial and biological computing. However, there has been little interaction between these two disciplines, resulting in simplistic models of neural systems with limited prediction, learning, and generalization abilities. The goal of this project is to create a coherent theoretical and mathematical framework for understanding the computational role of the distinctive features of biological neural networks and their contribution to the formation of robust signal representations, and for modeling and integrating those features into current artificial neural networks. These new bio-inspired models and algorithms will have adaptive and cognitive abilities, will better predict experimental observations, and will advance our knowledge of how the brain processes speech. In addition, the performance of these models should approach human abilities on tasks mimicking cognitive functions and will motivate new experiments that can impose further realistic constraints on the models.
This interdisciplinary project lies at the intersection of neurolinguistics, speech engineering, and machine learning, uniting the historically separate disciplines of neuroscience and engineering. The proposed innovative approach integrates methods and expertise across disciplines, including system identification, signal processing, neurophysiology, and systems neuroscience. The aim of this proposal is to analyze and transform artificial neural network models to accurately reflect the computational and organizational principles of biological systems through three specific objectives: I) to create analytic methods that provide insight into the transformations that occur in artificial neural network models by examining their representational properties and feature encoding, II) to model and implement the local, bottom-up, adaptive neural mechanisms that appear ubiquitously in biological systems, and III) to model the top-down, knowledge-driven abilities of cognitive systems to implement new computations in response to task requirements. Accurate computational models of these neural transformations will have an overarching impact on many disciplines, including artificial intelligence, neurolinguistics, and systems neuroscience. More realistic neural network models will not only yield human-like pattern recognition technologies and a better understanding of how the brain solves speech perception, but can also help explain how these processes are impaired in people with speech and language disorders. The proposed project will therefore advance the state of the art in multiple disciplines.
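One common analytic method for examining the representational properties of a network's layers is representational similarity analysis, sketched here on a toy random ReLU network. The RSA approach is a stand-in of my choosing, not a method named in the abstract, and all variable and function names are illustrative:

```python
import numpy as np

def rdm(acts):
    """Representational dissimilarity matrix: 1 - correlation between
    the activation patterns evoked by each pair of stimuli."""
    return 1.0 - np.corrcoef(acts)

def compare_rdms(a, b):
    """Similarity of two RDMs: correlation of their upper triangles."""
    iu = np.triu_indices_from(a, k=1)
    return np.corrcoef(a[iu], b[iu])[0, 1]

rng = np.random.default_rng(0)
stimuli = rng.standard_normal((20, 32))   # 20 stimuli, 32 input features
W1, W2 = rng.standard_normal((64, 32)), rng.standard_normal((16, 64))
h1 = np.maximum(stimuli @ W1.T, 0)        # layer-1 activations (20 x 64)
h2 = np.maximum(h1 @ W2.T, 0)             # layer-2 activations (20 x 16)
score = compare_rdms(rdm(h1), rdm(h2))    # how similar the two layers'
                                          # representational geometry is
```

Comparing RDMs layer by layer (or layer against neural data) quantifies how the representation transforms through the network without requiring the layers to have the same dimensionality.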
2019
David, Stephen V; Mesgarani, Nima
R01
Tools For Modeling State-Dependent Sensory Encoding by Neural Populations Across Spatial and Temporal Scales @ Oregon Health & Science University
Project Summary
Throughout life, humans and other animals learn statistical regularities in the natural acoustic environment. They adapt their hearing to emphasize the features of sound that are important for making behavioral decisions. Normal-hearing humans are able to perceive important sounds in crowded, noisy scenes and to understand the speech of individuals the first time they meet. However, patients with peripheral hearing loss or central processing disorders often have problems hearing in these challenging settings, even when sound is amplified above perceptual threshold. A better understanding of the function of the healthy and impaired auditory system will support new treatments for these deficits. This project will develop computational tools to study central auditory processing. A software library will support fitting and evaluating a large number of encoding models that describe the functional relationship between a time-varying natural auditory stimulus and the corresponding neural response. Many such models have been proposed, but relatively few direct comparisons have been made between them. This project will enable their comparison, allowing identification of the key features that contribute positively to their performance. The system will have a modular design so that useful elements from different models can be combined into comprehensive models with even greater explanatory power. The software will be open source and will support data from multiple recording modalities, including small-scale single-unit electrophysiological and calcium-imaging data as well as large-scale local field potential and magnetoencephalography data. In addition to building on existing hypotheses about neural coding, the system will support machine learning methods for fitting artificial neural network models to the same datasets. These large, data-driven models have proven valuable for wide-ranging signal processing problems, but their value for, and relation to, existing models of neural sensory processing remain to be explored. Sensory processing involves coherent activity of large neural populations. To study coding at the population level, the system will support models that characterize the simultaneous activity of multiple neural signals and identify latent subspaces of population activity related to sound encoding. Sensory coding is also influenced by behavioral context, reflecting changes in behavioral demands and the broader environment. The system will incorporate behavioral state variables into models whose encoding properties can be modulated by changes in behavioral context.
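A minimal sketch of the simplest kind of encoding model such a library would fit: a linear spectro-temporal receptive field estimated by ridge regression from time-lagged stimulus features. The function names, lag count, and ridge penalty are illustrative assumptions, not part of the proposed software:

```python
import numpy as np

def lagged_design(stimulus, n_lags):
    """Stack time-lagged copies of the stimulus (time x freq) into a
    design matrix (time x (freq * n_lags))."""
    T, F = stimulus.shape
    X = np.zeros((T, F * n_lags))
    for lag in range(n_lags):
        X[lag:, lag * F:(lag + 1) * F] = stimulus[:T - lag]
    return X

def fit_strf(stimulus, response, n_lags=10, alpha=1.0):
    """Ridge-regression estimate of a linear spectro-temporal receptive
    field mapping the stimulus to a single neural response."""
    X = lagged_design(stimulus, n_lags)
    w = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]),
                        X.T @ response)
    return w.reshape(n_lags, stimulus.shape[1])

# Sanity check: recover a known STRF from simulated, noiseless data.
rng = np.random.default_rng(1)
stim = rng.standard_normal((5000, 16))          # time x frequency
true_strf = rng.standard_normal((10, 16))       # lags x frequency
resp = lagged_design(stim, 10) @ true_strf.ravel()
est = fit_strf(stim, resp, n_lags=10, alpha=1e-3)
```

Richer models (nonlinearities, population latents, behavioral state variables) would extend this same stimulus-to-response fitting interface.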
2020 — 2021
Mesgarani, Nima
R01
Functional and Computational Characterization of the Human Auditory Cortex @ Columbia Univ New York Morningside
According to NIDCD, 6 to 8 million people in the United States have some form of speech or communication disorder. Speech perception requires a listener to map variable acoustic signals onto a finite set of phonological categories known as phonemes, and to integrate those categories over time into larger linguistic units such as syllables and words. It remains unclear where these different speech features are encoded and what cortical computations derive them from the acoustic signal. A better understanding of which neural circuits are involved, how they are organized, and what computations they perform to support speech comprehension is critical for developing a detailed neurobiological model of speech perception. The major aim of this proposal is to use a joint framework to study the encoding of acoustic and linguistic features and the computational underpinnings of natural speech processing, using invasive surface and depth electrodes implanted in human neurosurgical patients. To study the cortical organization of acoustic features, we will characterize their encoding and anatomical organization across auditory cortical regions. To study the cortical organization of linguistic features, we will measure the encoding of phonetic, phonotactic, and semantic information using multivariate linear regression. To understand the underlying computational mechanisms, we will train convolutional neural network models to predict the neural responses to speech and use a novel method to express their computation as a set of linear transforms. By interpreting these models, we will uncover nonlinear computations used in different auditory areas and relate them to the encoding of acoustic and linguistic features. These complementary analyses will extend our knowledge of speech processing in the human auditory cortex and lead to new hypotheses about the mechanisms of various speech and language disorders.
Together, the proposed research will greatly improve the current models of cortical speech processing, which are of great interest in many disciplines including neurolinguistics, speech pathology, speech prostheses, and speech technologies.
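The idea of expressing a trained network's computation as input-dependent linear transforms can be illustrated with a toy bias-free ReLU network, whose output at any input exactly equals an input-specific matrix applied to that input. This NumPy sketch uses fully connected layers as a stand-in for the convolutional models in the proposal, and all names are illustrative:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward(x, weights):
    """Forward pass of a small bias-free ReLU network."""
    h = x
    for W in weights[:-1]:
        h = relu(W @ h)
    return weights[-1] @ h

def local_linear_transform(x, weights):
    """Effective linear transform the network applies at input x.
    For a bias-free ReLU net, forward(x) == local_linear_transform(x) @ x,
    because each ReLU acts as a fixed 0/1 gate once the input is given."""
    h, L = x, np.eye(len(x))
    for W in weights[:-1]:
        pre = W @ h
        mask = (pre > 0).astype(float)   # ReLU gating pattern at x
        L = np.diag(mask) @ W @ L        # fold the gated layer into L
        h = relu(pre)
    return weights[-1] @ L

rng = np.random.default_rng(0)
weights = [rng.standard_normal((8, 16)), rng.standard_normal((8, 8)),
           rng.standard_normal((4, 8))]
x = rng.standard_normal(16)
lin = local_linear_transform(x, weights)
assert np.allclose(forward(x, weights), lin @ x)  # exact local linearity
```

Inspecting how `lin` changes across stimuli is one way to read out the nonlinear computation a piecewise-linear network performs.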
2021
Flinker, Adeen; Mesgarani, Nima
R01
Characterizing the Temporal Processing of Speech in the Human Auditory Cortex @ New York University School of Medicine
Project Summary
Time is the fundamental dimension of sound, and temporal integration is thus fundamental to speech perception. To recognize a complex structure such as a word in fluent speech, the brain must integrate across many different timescales spanning tens to hundreds of milliseconds. These timescales are considerably longer than the duration of responses at the auditory nerve; the auditory cortex must therefore integrate acoustic information over long and varied timescales to encode linguistic units. At the same time, the nature of the intermediate units of representation between sound and meaning remains debated. Focal brain injuries have shown selective impairment at all levels of linguistic processing (phonemic, phonotactic, and semantic), but current models of spoken word recognition disagree on the existence and type of these representational levels. The neural basis of temporal and linguistic processing remains speculative partly because noninvasive human neuroimaging techniques lack the spatiotemporal resolution needed to study the encoding of fluent speech. Our multi-PI proposal overcomes these challenges by assembling a team of researchers and clinicians with complementary expertise at NYU and Columbia University. We propose to record invasively from a large number of neurosurgical patients, which provides a rare and unique opportunity to collect direct cortical recordings across several auditory regions. We propose novel experimental paradigms and analysis methods to investigate where, when, and how acoustic features of speech are integrated over time to encode linguistic units. Our experimental paradigms will determine the functional and anatomical organization of stimulus integration periods in primary and nonprimary auditory cortical regions and relate temporal processing in these regions to the emergence of phonemic-, phonotactic-, and semantic-level representations. Finally, we will determine the nonlinear computational mechanisms that enable the auditory cortex to integrate fast features over long durations, which is essential for speech recognition. Understanding the temporal processing of speech in primary and nonprimary auditory cortex is critical for developing complete models of speech perception in the human brain, which in turn is essential to understanding how these processes break down in speech and communication disorders.