2013 — 2015 |
Stepp, Cara E. |
R03 | Activity Code Description: To provide research support specifically limited in time and amount for studies in categorical program areas. Small grants provide flexibility for initiating studies which are generally for preliminary short-term projects and are non-renewable. |
Automation of Relative Fundamental Frequency Estimation @ Boston University (Charles River Campus)
DESCRIPTION (provided by applicant): Vocal hyperfunction (VH), which is characterized by excessive laryngeal tension, accounts for nearly half of the cases referred to multidisciplinary voice clinics. It can respond to behavioral intervention, but successful treatment depends on proper assessment. Current assessment is hampered by the lack of objective measures for detecting its presence or severity. Relative fundamental frequency (RFF, the change in fundamental frequency in vowels preceding and following unvoiced consonants, normalized by fundamental frequency in the more steady-state portions of the vowels) can objectively characterize VH. However, RFF estimates are currently performed manually by trained technicians and clinicians from running speech, such that the potential of RFF in diagnosis and assessment is limited by the time-consuming manual nature of RFF estimation. This study will determine the optimal speech stimuli and signal processing for development of automated RFF estimation. We will determine the differences in RFF estimates from running speech versus non-linguistic speech utterances, the effect of linguistic context on the relationship between RFF and vocal tension, and the impact of dysphonia severity on the automated RFF measure. Vocal tension will be estimated in healthy and disordered voices using listener perception of vocal strain, and cross-validated in healthy speakers using objective measurements of the ratio of sound pressure level to subglottal pressure (dB SPL / cm H2O). Measures of RFF estimated using non-linguistic speech utterances have the potential to be automated more reliably. Based on our empirical research, we will develop recommendations for clinical methods of RFF collection to optimize automatic RFF estimation: full running speech or non-linguistic speech utterances. We will further develop open-source algorithms and software for automated RFF estimation.
Automated RFF estimation would enable comprehensive clinical collection, facilitating future validation of this promising measure.
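As an illustrative sketch only (not drawn from the proposal), the RFF computation described above reduces to expressing each vocal cycle's fundamental frequency in semitones relative to a steady-state reference; the cycle f0 values below are hypothetical:

```python
import math

def rff_semitones(cycle_f0s, ref_f0):
    """Convert per-cycle fundamental frequencies (Hz) to RFF values:
    each cycle's f0 expressed in semitones relative to a steady-state
    reference f0, i.e. 12 * log2(f0 / ref_f0)."""
    return [12.0 * math.log2(f0 / ref_f0) for f0 in cycle_f0s]

# Hypothetical voicing offset: f0 drifts downward in the cycles
# approaching an unvoiced consonant.
offset_f0 = [100.0, 99.0, 97.0, 94.0]  # Hz
rff = rff_semitones(offset_f0, ref_f0=offset_f0[0])
# The steady-state reference cycle is 0 ST by construction; the
# cycles nearest the consonant take negative RFF values.
```

The automated algorithms proposed here would additionally need to locate the vowel-consonant boundaries and extract the per-cycle f0 values, which this sketch takes as given.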
|
1 |
2015 — 2020 |
Stepp, Cara |
N/A | Activity Code Description: No activity code was retrieved. |
CAREER: Enabling Enhanced Communication Through Human-Machine-Interfaces @ Trustees of Boston University
The problem of low information transfer rates (ITR) is a critical one for people with severe speech and motor impairments, who must rely on augmentative and alternative communication (AAC) to interact with other people. The PI's goal in this research is to develop new technology that will enable severely paralyzed individuals to communicate in a manner that is as fast and reliable as human speech, which would transform their lives by enabling greater independence. To this end, she will bridge the fields of AAC and human-machine interfaces (HMI) in order to develop tools that are fast and clinically viable in a two-pronged approach. She will explore surface electromyography (sEMG) as a control methodology to improve communication fluency for AAC device users, by developing algorithms to automatically determine the best placement for facial sensors to detect residual muscle movement in the head and neck. And she will overcome low ITR by improving HMI control through other modalities (e.g., gaze) in "dynamic interfaces" that automatically optimize for a given user's capabilities, and which also incorporate phoneme-based and text-based inputs.
Project outcomes will enhance our fundamental understanding of head and neck sEMG control, and will improve the functionality of communication via HMIs regardless of input modality. Specific research goals include: Development and testing of algorithms to automatically determine optimal user-specific neck and face sEMG sources as an input modality for HMI control; Determining the usability and relative performance of phonemic (speech) and orthographic (text) AAC interfaces; Development and testing of methods for dynamic interfaces that speed HMI control based on predictive speech production models and additional input modalities. The project will also include a novel educational component that will enable cross-pollination between the fields of communication sciences and engineering. The PI will sponsor an organization for undergraduates in communication sciences and biomedical engineering at Boston University, where students work in teams to develop custom solutions for individuals with communication impairments to fill the gap between state-of-the-art HMI research and commercially available AAC devices. And she will develop novel educational design content, based on case studies, for a new course for first-year undergraduates and for outreach to K-12 students.
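The ITR bottleneck motivating this project is commonly quantified with the Wolpaw bits-per-selection formula; as a hedged illustration (the target count, accuracy, and selection rate below are hypothetical, not figures from the award), it can be computed as:

```python
import math

def itr_bits_per_min(n_targets, accuracy, selections_per_min):
    """Wolpaw information transfer rate: bits per selection, scaled by
    the selection rate. Assumes equiprobable targets and uniformly
    distributed errors."""
    n, p = n_targets, accuracy
    if p >= 1.0:
        bits = math.log2(n)  # error term vanishes at perfect accuracy
    else:
        bits = (math.log2(n) + p * math.log2(p)
                + (1 - p) * math.log2((1 - p) / (n - 1)))
    return bits * selections_per_min

# e.g., a 26-letter interface at 90% accuracy, 10 selections/min
rate = itr_bits_per_min(26, 0.90, 10)  # roughly 38 bits/min
```

Fluent speech conveys information at an order of magnitude or more above what current AAC selection rates allow, which is the gap the dynamic interfaces above aim to close.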
|
0.915 |
2015 — 2018 |
Stepp, Cara |
N/A | Activity Code Description: No activity code was retrieved. |
Uns: Collaborative Research: Prosodic Control of Speech Synthesis For Assistive Communication in Severe Paralysis @ Trustees of Boston University
1510563 (Stepp) & 1509791 (Koch Fager)
This work will develop and evaluate a system to allow individuals with unintelligible speech due to severe paralysis to control a speech synthesizer that includes prosody (changes in the pitch, loudness, and duration in speech that convey meaning). This advancement to the synthetic speech and the ease of its control by users will facilitate improved functionality of clinical communication systems, thus improving the quality of life of users. Natural and intelligible speech production in these individuals will increase their ability to participate actively in society and empower them to self-advocate for their own medical management.
The research objective of this proposal is to test the hypothesis that providing users of augmentative and alternative communication (AAC) with a method for prosodic control will result in speech synthesis that is more natural to listeners and provides greater function to users. Up to 1.2% of the population is unable to meet daily communication needs using typical speech due to stroke or other neurological injury, and must rely on AAC. Their quality of life is strongly dependent on access to this communication, both for social interaction as well as to relay information about urgent medical needs. The most advanced AAC devices incorporate speech synthesis, allowing the users to communicate orally with others. However, the resulting synthetic speech is both unnatural and difficult for others to understand, and is often described as "robotic". Specifically, synthetic speech does not vary in pitch, loudness, or rhythm, the prosodic features utilized in typical speech to relay emotional state, utterance form (statement vs. question), irony, and emphasis. Asking AAC users to control each of these dimensions individually would result in an intractably slow and complex system, an unacceptable burden for individuals who already have considerably reduced communication rates. Instead, this project will leverage the fact that typical speech predictably uses these prosodic markers (pitch, loudness, rhythm) in concert. A novel AAC interface will be developed to allow users to modify the overall "stress" of synthetic speech output as a single dimension, in order to provide easily controlled, natural, and intelligible speech synthesis. The co-PIs will use their combined expertise in speech technology, clinical application of AAC, and real-time control of human-machine-interfaces to enable essential advancements in AAC technology to achieve three goals.
In Research Goal 1, a multi-stress speech bank for concatenative speech synthesis will be created via a novel interactive procedure in which speech productions of healthy speakers are "misunderstood", thus prompting speakers to naturally emphasize specific target sounds in their repeated responses. This will result in a bank of triphones (sounds with a specific left and right context, based on surrounding sounds) with all potential combinations of sounds and stresses. Research Goal 2 is to develop an AAC interface that allows users to select phonemes (individual sounds of speech) using two-dimensional cursor control (e.g., head-tracking, eye-tracking) in which the stress of individual phonemes will be based on cursor dwell time. In Research Goal 3, the functionality of the AAC interface will be evaluated by testing its effect on the naturalness of communicative interactions.
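One way to picture the dwell-time mechanism in Research Goal 2 is as a mapping from cursor dwell to a single stress dimension, which then selects among pre-recorded stress levels in the triphone bank from Research Goal 1. The following sketch is purely illustrative; the dwell thresholds, number of stress levels, and bank layout are assumptions, not details of the funded design:

```python
def dwell_to_stress(dwell_s, min_dwell=0.2, max_dwell=1.0):
    """Map cursor dwell time (seconds) on a phoneme target to a
    normalized stress value in [0, 1]: dwells at or below min_dwell
    give neutral stress, dwells at or above max_dwell give full
    emphasis, linear in between."""
    if dwell_s <= min_dwell:
        return 0.0
    if dwell_s >= max_dwell:
        return 1.0
    return (dwell_s - min_dwell) / (max_dwell - min_dwell)

def pick_triphone(bank, left, phone, right, stress, n_levels=3):
    """Select a unit from a triphone bank keyed by
    (left context, phone, right context, stress level), quantizing the
    continuous stress value into n_levels discrete recorded levels."""
    level = min(int(stress * n_levels), n_levels - 1)
    return bank[(left, phone, right, level)]
```

Collapsing pitch, loudness, and duration into this one controllable dimension is what keeps the interface tractable for users with reduced communication rates.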
|
0.915 |
2016 — 2020 |
Stepp, Cara E. |
R01 | Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
An Acoustic Estimate of Laryngeal Tension For Clinical Assessment of Voice Disorders @ Boston University (Charles River Campus)
Project Summary: Voice disorders affect 3–9% of the U.S. population and have devastating effects on communication, limiting occupational and social participation. Elevated laryngeal tension is a significant factor in a range of functional and neurological voice disorders and a crucial target of therapeutic intervention. Overall, disorders associated with increased laryngeal tension are highly prevalent, accounting for over 65% of referrals to multidisciplinary voice clinics. Current clinical assessment is primarily based on unreliable auditory impressions and manual palpation, since standard acoustic measures are not specific to laryngeal tension. To address this gap, we have proposed an acoustic estimate of laryngeal tension, relative fundamental frequency (RFF), which has shown promise across a range of voice disorders associated with laryngeal tension. Previously, the time required to manually estimate RFF has prevented its clinical application, but our newly developed open-source algorithms for automated RFF estimation now allow for the large-scale, fine-grained studies required to endorse clinical use of RFF as an objective measure of laryngeal tension. Our collaborative team of clinicians, scientists, and engineers will utilize our new automated algorithms to systematically validate RFF as a measure of laryngeal tension in two voice disorder populations that span age and etiology (functional vs. neurological): vocal hyperfunction and Parkinson's disease. Aim 1 will supply concurrent validity by comparing RFF to two objective estimates of laryngeal tension: 1) kinematic stiffness and 2) an acoustic-accelerometric ratio based on aerodynamic efficiency. Aim 2 will assess the ability of RFF to capture clinically meaningful changes in laryngeal tension, following expected changes in function that are both improving (post-therapy vocal hyperfunction) and worsening (disease progression in PD).
The ability of RFF to capture these changes will be compared with that of the objective measures from Aim 1 as well as currently clinically viable measures (standard acoustic measures and auditory-perceptual judgments). These data are essential to support the use of RFF as a clinical outcome measure in future assessments of treatment efficacy. Finally, Aim 3 will determine the diagnostic sensitivity and specificity of RFF in a large cohort of speakers with and without tension-related voice disorders, allowing clinicians to meaningfully interpret RFF values of individual patients with respect to age, sex, and diagnosis. Overall, this project will provide the first-ever examination of the clinical utility of an acoustic measure of voice with this degree of comprehensive detail, power, and scope, feasible only within a framework of partnership between basic and clinical researchers. Successful completion will result in a fully validated, objective, and automated estimate of laryngeal tension. This non-invasive and inexpensive assessment can be translated immediately to clinical practice, facilitating evidence-based practice, targeted voice therapy, and the tracking of disease status across a range of voice disorders.
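The sensitivity and specificity targeted in Aim 3 are the standard true-positive and true-negative rates of a diagnostic classifier. As a minimal sketch (the labels and threshold decisions below are toy data, not study results):

```python
def sensitivity_specificity(labels, predictions):
    """Compute diagnostic sensitivity (true-positive rate) and
    specificity (true-negative rate) from binary ground-truth labels
    (1 = tension-related voice disorder, 0 = typical voice) and binary
    classifier predictions."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Toy example: 4 disordered and 4 typical speakers classified by a
# hypothetical RFF cutoff (values are illustrative only).
labels      = [1, 1, 1, 1, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(labels, predictions)
```

Reporting these rates stratified by age, sex, and diagnosis, as Aim 3 proposes, is what lets clinicians interpret an individual patient's RFF value against the appropriate reference group.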
|
1 |
2018 — 2021 |
Guenther, Frank H (co-PI) [⬀] Stepp, Cara E. |
R01 | Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
Voice and Speech Sensorimotor Control in Parkinson's Disease @ Boston University (Charles River Campus)
Project Summary: Over 90% of individuals with Parkinson's disease (PD) suffer from speech problems characterized by impairments of voice and articulation, collectively termed "hypokinetic dysarthria". These symptoms degrade speakers' functional communication through decreases in both naturalness and intelligibility. However, little is known about the relationship between these functional communication outcomes and their underlying neural sensorimotor bases. While previous work has evaluated disparate aspects of speech motor control in modest cohorts, the result thus far is a patchwork of seemingly conflicting information. To address this gap, this project will comprehensively examine the sensorimotor control of speech in PD in a single cohort, utilizing the DIVA model [34] as a theoretical framework to guide hypothesis development and allow for mechanistic interpretations of experimental findings. Feedback and feedforward mechanisms of speech motor control affecting both voice (larynx) and articulation (vocal tract) will be evaluated in 40 individuals with PD and 40 matched control speakers using behavioral and neural responses to perturbations in somatosensory and auditory feedback. Our primary hypotheses are that PD involves weaker-than-normal feedforward commands, leading to increased reliance on feedback control, as well as an impaired ability to update feedforward commands based on discrepancies between desired and actual movement outcomes. Comprehensive sensorimotor control parameters from each participant will be compared with their intelligibility and naturalness, determined through rigorous auditory-perceptual experiments. Identification of the specific sensorimotor bases of speech symptoms in PD is essential to guide the development of new therapeutic targets to improve communication. For instance, although speech therapy is the only current treatment, only 13% of patients with PD choose to pursue it, likely due to its low long-term effectiveness.
Given the relatively slow progression of PD and the increased incidence of speech symptoms with disease progression, developing effective speech treatments is imperative for maintaining quality of life. This project will result in specific physiological markers that are linked to functional communication outcomes in PD and can act as critical targets for behavioral and surgical interventions. This will lead to new treatments that are specific, effective, and tied to functional communication outcomes.
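The hypothesized impairment in updating feedforward commands can be caricatured with a one-parameter adaptation loop. This is a conceptual sketch only, not the DIVA model or the study's analysis; the learning rates and perturbation size are invented for illustration:

```python
def simulate_adaptation(target, perturb, n_trials, learn_rate):
    """Toy sketch of feedforward adaptation to a sustained auditory
    perturbation: on each trial the produced value (e.g., f0 in
    semitones) is the current feedforward command, the perceived value
    is shifted by `perturb`, and the command is updated against the
    perceived error. A reduced `learn_rate` models the hypothesized
    impaired updating of feedforward commands."""
    command = target
    produced = []
    for _ in range(n_trials):
        produced.append(command)
        perceived = command + perturb   # e.g., +1 semitone shift heard
        error = target - perceived      # desired minus perceived
        command += learn_rate * error   # feedforward update
    return produced

typical = simulate_adaptation(target=0.0, perturb=1.0,
                              n_trials=50, learn_rate=0.2)
reduced = simulate_adaptation(target=0.0, perturb=1.0,
                              n_trials=50, learn_rate=0.02)
# The higher learning rate converges near -1 (fully opposing the
# shift); the reduced rate compensates only partially in the same
# number of trials.
```

Comparing measured adaptation curves against intelligibility and naturalness, as the project proposes, is what would link such control parameters to functional communication outcomes.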
|
1 |
2019 — 2021 |
Stepp, Cara E. |
R13 | Activity Code Description: To support recipient sponsored and directed international, national or regional meetings, conferences and workshops. |
Boston Speech Motor Control Symposium @ Boston University (Charles River Campus)
Project Summary: Although speech production is arguably the most complex, yet routine, motor skill that humans perform, there are only two major conferences in the area of speech motor control: the International Conference on Speech Motor Control and the Madonna Conference on Motor Speech. Both occur relatively infrequently. As such, there is a need for more options to share current research in speech motor control. The goal of this R13 application is to support a new, intentionally accessible and inclusive regional conference in speech motor control to augment the current meetings. The Boston Speech Motor Control Symposium (BSMCS) draws on the high concentration of individuals in the Boston metropolitan area and areas surrounding Boston that are reachable by car and train. In order to increase the pipeline of promising under-represented minority (URM) researchers in this area, BSMCS is designed to reduce barriers to attendance for these individuals. BSMCS will be low-cost ($40 registration; $10 for students) and short (one full conference day with an optional tutorial the evening prior for trainees). BSMCS will provide travel awards to students and post-doctoral researchers, with preference to URMs. It will incorporate best-practices to allow for inclusion of working parents, such as free on-site childcare and access to lactation rooms [1]. Finally, BSMCS will offer continuing education units (CEUs) to attendees who are speech-language pathologists (SLPs). Attracting SLP attendees will 1) encourage the dissemination of cutting-edge research that can facilitate clinical translation, and 2) further bolster the pipeline of new researchers in this area by attracting SLPs to doctoral study. 
The primary purpose of this R13 conference grant proposal is to request support from NIDCD for trainee travel awards, travel expenses of invited speakers from outside the Boston area, family care provider expenses, conference supplies, and poster board rentals during BSMCS 2019, 2021, and 2023. Boston University (BU) will provide an equal match to the NIDCD contribution to each of these items. The conference registration fees and additional funds from BU will cover all additional costs for the conference. Each year's meeting will include a one-day program that consists of 5 invited talks (one keynote lecture and four other expert seminars), with the remainder of the program devoted to contributed talks and a contributed poster session. Additionally, each year there will be a trainee tutorial held the evening prior to the full day of the symposium, which will be free of charge for all students and post-doctoral researchers. NIDCD funding will allow BSMCS to meet the goals of accessibility and inclusivity, fostering research and education in speech motor control across diverse groups of students, researchers, and clinicians.
|
1 |
2021 |
Moore, Christopher A [⬀] Stepp, Cara E. |
T32 | Activity Code Description: To enable institutions to make National Research Service Awards to individuals selected by them for predoctoral and postdoctoral research training in specified shortage areas. |
Advanced Research Training in Communication Sciences and Disorders @ Boston University (Charles River Campus)
PROJECT SUMMARY: This proposed renewal of Boston University's training program, Advanced Research Training in Communication Disorders and Sciences, builds on the successful implementation of a multidisciplinary, multi-institutional training effort. The current five-year cycle infuses a broad clinical perspective in trainees, incorporating didactic and research experiences across the full continuum of effective training in human health (i.e., from investigations of basic and disrupted mechanisms of communication processes, to treatment research, translation to the clinic, and broad implementation across health systems). Participating pre- and postdoctoral trainees have come from programs in Biomedical Engineering, Neuroscience, Psychological & Brain Sciences, and Speech, Language, & Hearing Sciences. The program is highly competitive at both the pre- and postdoctoral levels. The impact of this T32 is doubled at the predoctoral level by the commitment of each participating department to provide identical resources and opportunities for a "match" Communication Sciences & Disorders trainee for each T32-supported trainee. The next five-year cycle continues all of these successful components, and adds formal development for each trainee of a data sciences toolkit. New Key Personnel from Boston University's recently established Faculty of Computing and Data Sciences (see Biosketches for Azer Bestavros, Associate Provost for Computing and Data Sciences, and Director of the BU Data Science Initiative, as well as Eric Kolaczyk, Director of the Hariri Institute for Computing and Computational Science & Engineering) will provide curricular and research guidance for this aspect of the program. Required training in data sciences includes demonstrated facility by each trainee in math, statistics, data wrangling, data mechanics, and machine learning. Coursework may include data manipulation using Python, data science using R, and image analysis.
The success of our trainees reflects the deep research and training resources of the interdisciplinary and cross-institution partnerships comprising this program. Well-established laboratories, guided by experienced and well-funded preceptors and with access to large, diverse patient populations, are supported by experienced leaders and institutional commitment. This preparation and acculturation of talented trainees contributes substantially to our scientific and clinical capacity to understand, prevent, and remediate communication disorders.
|
1 |
2021 |
Stepp, Cara E. |
R01 | Activity Code Description: To support a discrete, specified, circumscribed project to be performed by the named investigator(s) in an area representing his or her specific interest and competencies. |
Objective Measures For Clinical Assessment of Voice Disorders @ Boston University (Charles River Campus)
Project Summary: Adductor laryngeal dystonia (AdLD) is a neurological voice disorder characterized by laryngeal spasms. A secondary symptom is increased vocal effort, likely related to compensatory increases in laryngeal tension. Voice symptoms significantly impact psychosocial well-being and quality of life. Gold-standard management requires repeated injections of botulinum toxin (BTX) into laryngeal muscles, each of which provides temporary reduction in symptoms. New approaches for long-lasting treatment are under development, but can only be translated to clinical practice if they are evaluated using robust outcomes of vocal function. Unfortunately, there is a dearth of outcomes that are sufficiently sensitive and specific to the voice symptoms of AdLD. In fact, clinicians often have difficulty differentiating AdLD from muscle tension dysphonia (MTD), a functional voice disorder in which there is increased global laryngeal tension without laryngeal spasms, since the two voice disorders can have shared auditory-perceptual characteristics. To address this gap, objective measures reflective of both the primary and secondary voice symptoms of AdLD are needed. In our previous grant cycle, we validated two automated estimates of laryngeal tension (a secondary symptom of AdLD): the kinematic measure, kinematic stiffness (KS), and the acoustic measure, relative fundamental frequency (RFF). We also developed a new, automated spectral acoustic measure designed to capture laryngeal spasms in AdLD (a primary symptom) via detection of pitch breaks (high-passed spectral power of the pitch contour; HSPC). Combining HSPC and RFF measures, we were able to predict the overall severity of dysphonia in AdLD with R2=85%. While these results are promising, they must be validated in a larger sample. Thus, we propose an observational study to construct and validate automated kinematic and acoustic measures of the primary and secondary symptoms in AdLD. 
We will assess the physiological and discriminant validity, sensitivity to change (pre/post treatment), test-retest reliability, and ability of these measures to predict voice severity in AdLD. In Aim 1 we will compare acoustic estimates of the primary and secondary symptoms of AdLD with kinematic measures in individuals with AdLD and individuals with MTD. Results will determine the sensitivity and specificity of these measures and provide the physiological validation of the acoustic measures. In Aim 2 we will assess the sensitivity to change and test-retest reliability of the acoustic and kinematic measures, validating their use as clinical outcome measures in future assessments of treatment efficacy. Finally, in Aim 3 we will construct and evaluate a statistical acoustic model of overall severity of dysphonia in a large cohort of speakers with AdLD. Completion of these aims will result in validated, objective, and automated measures of vocal function that are specific to AdLD, with the potential to be translated immediately to clinical practice.
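The intuition behind the HSPC measure above (high-passed spectral power of the pitch contour) is that laryngeal spasms produce abrupt pitch breaks, which concentrate energy at high modulation frequencies of the f0 contour. The following is a deliberately crude stand-in using a first-difference high-pass, not the proposal's actual algorithm; the contours are synthetic:

```python
def hspc_sketch(f0_contour):
    """Crude stand-in for the HSPC idea: high-pass the pitch contour
    with a first difference and return the mean squared high-frequency
    power. Abrupt pitch breaks (as in AdLD spasms) inflate this value
    relative to a smooth contour."""
    diffs = [b - a for a, b in zip(f0_contour, f0_contour[1:])]
    return sum(d * d for d in diffs) / len(diffs)

smooth = [200 + 0.5 * i for i in range(20)]   # gradual f0 drift (Hz)
breaks = smooth[:10] + [150] + smooth[10:]    # abrupt pitch break
# The contour with a pitch break yields far higher high-passed power.
```

Because the measure targets the primary symptom (spasms) while RFF and KS target the secondary symptom (tension), combining them is what allows the proposed discrimination of AdLD from MTD.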
|
1 |