2001 |
Lotto, Andrew James |
R55 Activity Code Description: Undocumented code. |
Auditory Enhancement in Female Speech @ Washington State University
DESCRIPTION: Females produce speech with a higher fundamental frequency (f0, voice pitch) than males. This higher f0 results in relative undersampling of the spectral envelope, which should, in turn, lead to lower intelligibility. However, excerpts from natural speech produced by females are as intelligible as, or more intelligible than, male speech. This application proposes a series of complementary speech perception and production experiments designed to uncover gender-specific acoustic attributes that lead to this enhanced intelligibility. In particular, females have been characterized as speaking more slowly, with a more "breathy" voice, and with greater pitch excursions or "swoopiness." While these descriptions have in the past been considered derogatory, they may, in fact, be signs of adaptive production by females to compensate for the negative effects of a high voice pitch. To test this possibility, a large database of speech samples from oral reading will be analyzed for gender differences in breathiness, dynamic f0 range, and speaking rate. A large number of acoustic measures theoretically related to breathy phonation will be computed, and a composite acoustic measure of breathiness will be developed using Principal Components Analysis. Breathiness will then be analyzed by gender and by vowel. These measures can serve as much-needed gender-specific normative data on production. These data may be important for more sensitive diagnoses of vocal pathology, for synthesis of realistic-sounding female speech, and for the creation of gender-specific templates for automatic speech recognition. In addition, any gender differences uncovered by these analyses will form the basis of perception experiments designed to test their consequences for intelligibility. It is predicted that females produce high tense vowels as breathier than similar lax vowels and that this pattern increases the size of the resultant vowel space. 
To test these predictions, synthesized vowels will be created to mimic various acoustic aspects of male- and female-produced speech. Listeners will attempt to identify these stimuli when presented in babble noise. If the female pattern of breathiness is adaptive, then the resulting vowels should be easier to identify than vowels that do not vary in breathiness. Similar predictions are made for "swoopiness." If the perception tasks reveal that gender-specific acoustics can enhance intelligibility, then these data may be important for development of signal processing techniques for enhancing communication. This series of perception-production studies will be interpreted within a framework that proposes that speakers faced with individual intelligibility challenges vary their speech production in ways that enhance the auditory cues to phonetic identification, thereby aiding the listener.
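A composite measure of the kind proposed above could be derived along these lines; this is a minimal sketch, not the project's actual analysis, and the measure names in the comments (H1-H2, cepstral peak prominence, harmonics-to-noise ratio) are illustrative assumptions about which breathiness correlates might enter the PCA:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 200 vowel tokens x 3 acoustic measures thought to track
# breathy phonation (e.g. H1-H2 amplitude difference, cepstral peak
# prominence, harmonics-to-noise ratio) -- stand-in values, not real speech.
X = rng.normal(size=(200, 3))

# Standardize each measure so no single measure dominates, then take the
# projection onto the first principal component as a single composite
# breathiness score per token (which can then be compared by gender/vowel).
Z = (X - X.mean(axis=0)) / X.std(axis=0)
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
composite = Z @ Vt[0]  # first-PC score for each of the 200 tokens

print(composite.shape)  # → (200,)
```

The first component is, by construction, the weighted combination of the input measures that captures the most shared variance, which is why it is a natural candidate for a single "breathiness" index.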
|
0.948 |
2008 — 2012 |
Lotto, Andrew |
N/A Activity Code Description: No activity code was retrieved. |
Collaborative Research: Learning Complex Auditory Categories
The growth in globalization across traditional language boundaries suggests a need for efficient second language (L2) acquisition training regimens. One of the most significant challenges for adult language learners is learning to hear fine distinctions among speech sounds not used in the native language; such learning may require decades of experience with the second language. A classic example is the difficulty native Japanese speakers have learning English /r/ and /l/, a sound contrast not present in Japanese. With prior NSF support, Drs. Holt and Lotto have uncovered principles of auditory learning using controlled experiments with non-speech sounds and have used these principles to design optimal training regimens. This project uncovered how characteristics of training, feedback, and presentation mode affected auditory learning.
The present project will apply these findings to adult learning of non-native speech sounds, with the aim of producing more efficient L2 learning. One series of studies will investigate the benefits of video-game-based training (found to foster non-speech category learning) in learning non-native speech sounds. Another series of experiments will test whether manipulation of the variability of sound cues, found to be important in non-speech auditory learning in prior research, is effective in shifting the attention listeners give to these cues in second-language learning. Such shifts appear to be important for many cases of L2 learning, such as native Japanese speakers learning English /r/ and /l/. Beyond practical application in adult second language learning, the project has important theoretical implications for understanding human auditory perception and language processing. Such understanding is a prerequisite to developing rehabilitative techniques for disorders such as autism, dyslexia, central auditory processing disorder and specific language impairment.
|
0.915 |
2010 — 2013 |
Liss, Julie M.; Lotto, Andrew J |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
Perception of Dysarthric Speech @ Arizona State University-Tempe Campus
DESCRIPTION (provided by applicant): Reduced intelligibility is at the heart of the communication disorder associated with the dysarthrias and other speech production deficits, undermining quality of life. This research program aims to develop a comprehensive model of intelligibility deficits that offers an explanation for communication failure and success, and thereby identifies targets for remediation, as well as dependent variables that will serve as outcome measures. We have shown that when listeners encounter speech that is difficult to understand, they turn their attention to prosody to help them decide where words begin and end. However, this strategy for lexical segmentation becomes challenged when the prosodic information itself is degraded, as in the dysarthrias. Further, the nature of the prosodic degradation predicts the ways in which word boundary identification is impaired. The differences in perceptual error patterns resulting from speech produced by two equally intelligible speakers are predictable and provide information both about the underlying motor deficit and the perceptual representations and strategies of the listener. The present proposal defines this relationship through the development of sensitive dependent variables that predict listener performance patterns and production characteristics. Specifically, we will refine a set of acoustic measures and establish their predictive relationship to perceptual performance (intelligibility and error patterns), using speakers with dysarthria and healthy controls. These automated acoustic measures include measures based on the low-frequency modulations of the amplitude envelope and measures of fundamental frequency and average spectral variability. 
This set of acoustic measures will be used to classify speakers by traditional dysarthric subtypes as well as by groupings based on a perceptual-outcome clustering that will be developed using the error patterns obtained from listeners' transcription of each speaker's samples. The model will be tested and refined on a new, more diverse group of speakers with intelligibility deficits. The causality of the relationships between acoustics and perception uncovered by these analyses will be tested through perceptual experiments using speech samples that are digitally manipulated to match the prosodic patterns that are associated with particular error types. The proposed project holds promise for immediate clinical impact by providing both sensitive and meaningful outcome measures and an overarching theoretical framework in which to interpret them. PUBLIC HEALTH RELEVANCE: The overall goal of the current project is to develop a theoretically derived model of intelligibility deficits that has immediate clinical impact by identifying targets for remediation and offering dependent variables that may be used to predict perceptual outcome and track changes in speech due to intervention or disease progression. By defining a set of objective measures that map to meaningful aspects of speech understanding, these dependent variables can be applied to any communication disorder for which intelligibility is reduced.
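The low-frequency envelope-modulation measures mentioned above can be illustrated with a minimal sketch. This is an assumption-laden toy, not the project's actual metric: the envelope here is a crude full-wave rectification, the 10 Hz cutoff is illustrative, and the input is a synthetic carrier amplitude-modulated at a roughly syllabic 4 Hz rate:

```python
import numpy as np

def envelope_modulation_spectrum(signal, fs, max_mod_hz=10.0):
    """Spectrum of the slow modulations of the amplitude envelope.
    A sketch of the kind of automated rhythm measure described above;
    the envelope estimate and cutoff are illustrative choices."""
    env = np.abs(signal)                       # crude amplitude envelope
    env = env - env.mean()                     # drop the DC component
    spec = np.abs(np.fft.rfft(env)) / len(env)
    freqs = np.fft.rfftfreq(len(env), d=1.0 / fs)
    keep = freqs <= max_mod_hz                 # keep only slow (syllable-rate) modulations
    return freqs[keep], spec[keep]

# Toy "utterance": a 500 Hz carrier amplitude-modulated at 4 Hz
fs = 8000
t = np.arange(0, 2.0, 1.0 / fs)
sig = (1 + 0.8 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 500 * t)

freqs, spec = envelope_modulation_spectrum(sig, fs)
print(freqs[np.argmax(spec[1:]) + 1])  # peak modulation frequency → 4.0
```

For dysarthric speech, the idea would be that a reduced or shifted peak in this region reflects degraded prosodic rhythm; here the analysis simply recovers the 4 Hz modulation that was built into the toy signal.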
|
0.988 |
2011 — 2015 |
Holt, Lori L; Lotto, Andrew J |
R01 Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
Formation and Tuning of Complex Auditory Categories @ Carnegie-Mellon University
DESCRIPTION (provided by applicant): This collaborative research program investigates processes underlying the formation and tuning of complex sound categories. The overall goal is to provide a model of auditory categorization that can be readily applied to challenges of speech perception and communication disorders. Language learners form (phonetic) auditory categories of native-language sounds from the distributions of experienced speech sounds produced by many talkers. However, these averaged categories may not be appropriate for the speech produced by a specific talker. For example, non-native speech may not adhere to the patterns typical of native speakers. The aim of the current project is to develop and test a theoretical and practical model of how listeners use context to normalize, or tune, speech perception to the characteristics of a particular listening situation. The proposed experiments will move the model beyond mere demonstrations of normalization to make quantitative predictions of performance as a function of the content and temporal extent of the context. Such a practical model can be used to develop signal processing strategies for hearing aids and implants as well as to predict intelligibility of disordered speech. Building on the empirical outcomes of the previous project, the present research tests predictions arising from the hypothesis that a general auditory mechanism sensitive to the spectral interactions that occur between context and target sounds can account quantitatively for patterns of speech perception that appear to require extraction of vocal-tract-specific talker information. Another set of experiments will test the influence of perceptual learning of talker-specific patterns of speech in supporting this mechanism. A final series of experiments will bridge the gap that often exists between tests of speech perception phenomena and understanding real-world speech intelligibility and comprehension. 
Such a linkage is critical for deriving theory- and evidence-based clinical approaches in treatment of communication disorders.
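One toy instantiation of a context-sensitive spectral mechanism of the general kind hypothesized above: evaluating a target sound's spectrum relative to the long-term average spectrum (LTAS) of the preceding context, so that frequency regions the context emphasized are perceptually attenuated. This subtraction scheme is my simplification for illustration, not the proposal's quantitative model:

```python
import numpy as np

def contrast_normalized_spectrum(target_spec, context_specs):
    """Express a target spectrum relative to the long-term average
    spectrum of its context (a simplified stand-in for a general
    spectral-contrast mechanism; channel values are arbitrary units)."""
    ltas = np.mean(context_specs, axis=0)  # long-term average spectrum of the context
    return target_spec - ltas              # positive = enhanced relative to context

# Toy example: a context with extra energy in the low channel makes the
# same target look relatively weaker there after normalization.
context = np.array([[3.0, 1.0, 1.0],
                    [3.0, 1.0, 1.0]])      # two context frames x 3 channels
target = np.array([2.0, 2.0, 1.0])

print(contrast_normalized_spectrum(target, context))  # → [-1.  1.  0.]
```

The attraction of such a mechanism is that it needs no explicit talker model: "normalization" falls out of comparing the target against whatever spectral pattern the context established, which is consistent with the abstract's claim that a general auditory mechanism might mimic vocal-tract-specific tuning.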
|
0.952 |