Show simple item record

dc.contributor.author: Sabzi Shahrebabaki, Abdolreza
dc.contributor.author: Olfati, Negar
dc.contributor.author: Imran, Ali Shariq
dc.contributor.author: Johnsen, Magne Hallstein
dc.contributor.author: Siniscalchi, Sabato Marco
dc.contributor.author: Svendsen, Torbjørn Karl
dc.date.accessioned: 2022-10-11T11:30:44Z
dc.date.available: 2022-10-11T11:30:44Z
dc.date.created: 2022-03-11T13:09:55Z
dc.date.issued: 2021
dc.identifier.citation: 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (en_US)
dc.identifier.isbn: 978-1-7281-7606-2
dc.identifier.uri: https://hdl.handle.net/11250/3025349
dc.description.abstract: This paper proposes a two-stage deep feed-forward neural network (DNN) to tackle the acoustic-to-articulatory inversion (AAI) problem. DNNs are a viable solution for the AAI task, but the temporal continuity of the estimated articulatory values has not been exploited properly when a DNN is employed. In this work, we propose to address the lack of any temporal constraints while enforcing a parameter-parsimonious solution by deploying a two-stage solution based only on DNNs: (i) Articulatory trajectories are estimated in a first stage using DNN, and (ii) a temporal window of the estimated trajectories is used in a follow-up DNN stage as a refinement. The first stage estimation could be thought of as an auxiliary additional information that poses some constraints on the inversion process. Experimental evidence demonstrates an average error reduction of 7.51% in terms of RMSE compared to the baseline, and an improvement of 2.39% with respect to Pearson correlation is also attained. Finally, we should point out that AAI is still a highly challenging problem, mainly due to the non-linearity of the acoustic-to-articulatory and one-to-many mapping. It is thus promising that a significant improvement was attained with our simple yet elegant solution. (en_US)
dc.language.iso: eng (en_US)
dc.publisher: IEEE (en_US)
dc.relation.ispartof: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
dc.title: A Two-Stage Deep Modeling Approach to Articulatory Inversion (en_US)
dc.type: Chapter (en_US)
dc.description.version: acceptedVersion (en_US)
dc.rights.holder: © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. (en_US)
dc.identifier.doi: 10.1109/ICASSP39728.2021.9413742
dc.identifier.cristin: 2009124
cristin.ispublished: true
cristin.fulltext: postprint
cristin.qualitycode: 1
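
The two-stage idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `context` width, the edge padding, and the `stage1`/`stage2` callables (stand-ins for the two trained DNNs) are all assumptions made for illustration. The abstract also does not say whether the second stage receives the acoustic features as well; this sketch feeds it only the windowed first-stage estimates. The two metrics named in the abstract, RMSE and Pearson correlation, are computed per articulatory channel.

```python
import numpy as np

def stack_window(traj, context):
    """Stack each frame with `context` neighbours on both sides.

    traj: array of shape (T, D), one row per frame.
    Returns shape (T, (2 * context + 1) * D). Edges are padded by
    repeating the first/last frame (an assumption of this sketch).
    """
    T, _ = traj.shape
    padded = np.pad(traj, ((context, context), (0, 0)), mode="edge")
    shifted = [padded[i:i + T] for i in range(2 * context + 1)]
    return np.concatenate(shifted, axis=1)

def two_stage_predict(acoustic, stage1, stage2, context=5):
    """Run a hypothetical two-stage inversion.

    stage1: callable mapping acoustic features (T, F) to a first
            articulatory estimate (T, D).
    stage2: callable mapping the windowed estimate to a refined (T, D).
    """
    first_estimate = stage1(acoustic)
    return stage2(stack_window(first_estimate, context))

def rmse(pred, target):
    """Root-mean-square error per articulatory channel."""
    return np.sqrt(np.mean((pred - target) ** 2, axis=0))

def pearson(pred, target):
    """Pearson correlation per articulatory channel."""
    p = pred - pred.mean(axis=0)
    t = target - target.mean(axis=0)
    return (p * t).sum(axis=0) / np.sqrt((p ** 2).sum(axis=0) * (t ** 2).sum(axis=0))
```

In practice `stage1` and `stage2` would be trained networks; any callables with the shapes described in the docstrings fit this sketch.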

