
dc.contributor.author: Sabzi Shahrebabaki, Abdolreza
dc.contributor.author: Olfati, Negar
dc.contributor.author: Siniscalchi, Sabato Marco
dc.contributor.author: Salvi, Giampiero
dc.contributor.author: Svendsen, Torbjørn Karl
dc.date.accessioned: 2021-04-29T08:34:17Z
dc.date.available: 2021-04-29T08:34:17Z
dc.date.created: 2020-10-26T10:33:35Z
dc.date.issued: 2020
dc.identifier.citation: INTERSPEECH 2020 [en_US]
dc.identifier.issn: 2308-457X
dc.identifier.uri: https://hdl.handle.net/11250/2740290
dc.description.abstract: Articulatory information has been argued to be useful for several speech tasks. However, in most practical scenarios this information is not readily available. We propose a novel transfer learning framework to obtain reliable articulatory information in such cases. We demonstrate its reliability both in terms of estimating parameters of speech production and in its ability to enhance the accuracy of an end-to-end phone recognizer. Articulatory information is estimated from speaker-independent phonemic features, using a small speech corpus with electromagnetic articulography (EMA) measurements. Next, we employ a teacher-student model to learn estimation of articulatory features from acoustic features for the targeted phone recognition task. Phone recognition experiments demonstrate that the proposed transfer learning approach outperforms the baseline transfer learning system acquired directly from an acoustic-to-articulatory inversion (AAI) model. The articulatory features estimated by the proposed method, in conjunction with acoustic features, improved the phone error rate (PER) by 6.7% and 6% on the TIMIT core test and development sets, respectively, compared to standalone static acoustic features. Interestingly, this improvement is slightly higher than what is obtained by static+dynamic acoustic features, but with a significantly less. Adding articulatory features on top of static+dynamic acoustic features yields a small but positive PER improvement. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: ISCA [en_US]
dc.title: Transfer learning of articulatory information through phone information. [en_US]
dc.type: Journal article [en_US]
dc.type: Peer reviewed [en_US]
dc.description.version: publishedVersion [en_US]
dc.source.journal: Interspeech [en_US]
dc.identifier.doi: 10.21437/Interspeech.2020-1139
dc.identifier.cristin: 1842188
cristin.ispublished: false
cristin.fulltext: original
cristin.fulltext: original
cristin.qualitycode: 1

