A DNN Based Speech Enhancement Approach to Noise Robust Acoustic-to-Articulatory Inversion
Chapter
Accepted version
Permanent link: https://hdl.handle.net/11250/3025915
Publication date: 2021
Original version (DOI): 10.1109/ISCAS51556.2021.9401290

Abstract
In this work, we investigate the problem of speaker-independent acoustic-to-articulatory inversion (AAI) in noisy conditions within the deep neural network (DNN) framework. We claim that DNN vector-to-vector regression for speech enhancement (DNN-SE) can play a key role in AAI when used in a front-end stage to enhance speech features before AAI back-end processing. Our claim contrasts with recent literature reporting a drop in AAI accuracy on minimum mean-square error (MMSE) enhanced data, and thereby sheds some light on the opportunities offered by DNN-SE in robust speech applications. We have also tested single- and multi-task training strategies for the DNN-SE block and experimentally found the latter to be beneficial to AAI. Moreover, DNN-SE coupled with an AAI deep system tested on enhanced speech can outperform a multi-condition AAI deep system tested on noisy speech. We assess our approach on the Haskins corpus using Pearson's correlation coefficient (PCC). A 15% relative PCC improvement is observed over a multi-condition AAI system at 0 dB signal-to-noise ratio (SNR). Our approach also compares favorably against a conventional DSP approach, namely MMSE with improved minima controlled recursive averaging (IMCRA), in the front-end stage.
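The evaluation metric above, Pearson's correlation coefficient (PCC), measures the linear agreement between a predicted articulatory trajectory and its ground truth, independent of offset and scale. A minimal sketch of how it is typically computed per trajectory (the function name and toy data are illustrative, not from the paper):

```python
import numpy as np

def pearson_cc(pred, target):
    """Pearson's correlation coefficient between a predicted and a
    ground-truth articulatory trajectory (1-D arrays of equal length)."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    # Centre both trajectories, then normalise the inner product.
    pc = pred - pred.mean()
    tc = target - target.mean()
    return float((pc @ tc) / (np.linalg.norm(pc) * np.linalg.norm(tc)))

# A trajectory that is an affine transform of the target is still
# perfectly correlated: PCC is invariant to offset and scale.
t = np.linspace(0.0, 1.0, 100)
print(round(pearson_cc(2.0 * t + 3.0, t), 6))  # → 1.0
```

In AAI evaluation, a PCC is usually computed per articulator channel and per utterance, then averaged; "15% relative improvement" refers to the relative change in this averaged score.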