Using Machine Learning to Improve Speech Perception in Noise
eScholarship
Open Access Publications from the University of California

UC Irvine

UC Irvine Electronic Theses and Dissertations

No data is associated with this publication.
Creative Commons 'BY-NC' version 4.0 license
Abstract

Hearing loss affects 1.5 billion individuals worldwide. A direct effect of hearing loss is decreased speech perception, especially in noisy environments. It is well established that speech perception can be enhanced by suppressing background noise. Speech enhancement has been studied for decades, and recent advances in artificial intelligence have greatly improved speech enhancement performance.

Besides reducing background noise, another potential way to improve speech perception in noisy environments is to modify the speech signal itself. Humans naturally adapt their speaking style, articulating more clearly when talking in a noisy environment or to people with hearing loss. This clear speech style can improve intelligibility in noisy environments. Although the intelligibility benefit of clear speech has been intensively studied, computational methods for converting conversational speech to clear speech have had limited success. My Ph.D. study attempts to bridge this gap. I developed Syllable-Rate Adjusted Modulation (SRAM), an objective metric for predicting the speech intelligibility benefit of clear speech. Based on SRAM, I designed SRAMGAN, which uses deep neural networks to convert conversational speech to clear speech and thereby improve speech perception in noise. I also tested the ability of SRAM to predict the intelligibility of synthesized speech from different online text-to-speech platforms.

Main Content

This item is under embargo until January 31, 2027.