Tang, Y and Cooke, M 2016, Glimpse-based metrics for predicting speech intelligibility in additive noise conditions, in: 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016, 08/09/2016-12/09/2016, San Francisco, USA.
The glimpsing model of speech perception in noise operates by recognising those speech-dominant spectro-temporal regions, or glimpses, that survive energetic masking; hence, a speech recognition component is an integral part of the model. The current study evaluates whether a simpler family of metrics, based solely on quantifying the amount of supra-threshold target speech available after energetic masking, can account for subjective intelligibility. The predictive power of glimpse-based metrics is compared for natural, processed and synthetic speech in the presence of stationary and fluctuating maskers. These metrics are raw glimpse proportion, extended glimpse proportion, and two further refinements: one, FMGP, incorporates a component simulating the effect of forward masking; the other, HEGP, selects speech-dominant spectro-temporal regions with above-average energy in the noisy speech. The metrics are compared with a state-of-the-art non-glimpsing metric, using three large datasets of listener scores. Both FMGP and HEGP equal or improve upon the predictive power of the raw and extended metrics, with across-masker correlations ranging from 0.81 to 0.92; both metrics equal or exceed the state-of-the-art metric in all conditions. These outcomes suggest that easily computed measures of unmasked, supra-threshold speech can serve as robust proxies for intelligibility across a range of speech styles and additive masking conditions.
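As a rough illustration of the simplest metric named in the abstract, raw glimpse proportion can be sketched as the fraction of spectro-temporal cells in which the speech level exceeds the masker level by some local criterion. This record does not give the implementation details, so everything below is an assumption made for illustration: the function name `glimpse_proportion`, the 3 dB local SNR criterion, and the representation of speech and masker as time-frequency arrays of levels in dB. It does not cover the extended, FMGP or HEGP variants.

```python
import numpy as np

def glimpse_proportion(speech_db, noise_db, local_snr_db=3.0):
    """Fraction of time-frequency cells where speech dominates the masker.

    speech_db, noise_db: arrays of identical shape holding levels in dB
    (hypothetical layout: frequency channels x time frames).
    local_snr_db: assumed local SNR criterion, not taken from the paper.
    """
    speech_db = np.asarray(speech_db, dtype=float)
    noise_db = np.asarray(noise_db, dtype=float)
    if speech_db.shape != noise_db.shape:
        raise ValueError("speech and noise representations must have the same shape")
    # A cell counts as a glimpse when speech exceeds noise by the criterion.
    glimpses = speech_db > noise_db + local_snr_db
    return float(glimpses.mean())

# Toy example: two channels by two frames, masker at a constant 50 dB.
speech = [[60.0, 40.0], [55.0, 50.0]]
noise = [[50.0, 50.0], [50.0, 50.0]]
gp = glimpse_proportion(speech, noise)  # 2 of 4 cells qualify, so gp is 0.5
```

In this toy case only the cells at 60 dB and 55 dB clear the 53 dB bar, so half of the cells count as glimpses. A metric of this kind needs separate access to the speech and masker signals before mixing.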
|Item Type:||Conference or Workshop Item (Paper)|
|Schools:||Schools > School of Computing, Science and Engineering|
|Journal or Publication Title:||Proceedings of the Annual Conference of the International Speech Communication Association|
|Publisher:||International Speech Communication Association|
|Depositing User:||Y Tang|
|Date Deposited:||09 Sep 2016 07:56|
|Last Modified:||02 Nov 2016 13:24|