Lexical Robustness for Automatic Speech Recognition

Mertens, Timo Pascal

Doctoral thesis

Open

588242_FULLTEXT01.pdf (Locked)

Permanent link

http://hdl.handle.net/11250/2370673

Publication date

2012

Collections

Institutt for elektroniske systemer

Abstract

The lexicon plays a crucial role in a speech recognition system: it defines the mapping between the words that the system can recognize and the different ways these words can be pronounced. In this thesis we address various shortcomings of the lexicon with the aim of increasing the lexical robustness of the speech recognizer. We focus on three aspects of lexical robustness. First, we address how words that are not in the lexicon can be recognized, which is known as the out-of-vocabulary problem. Second, we investigate how pronunciation variation, especially that of non-native speakers, can be handled in the lexicon. Finally, we develop approaches that learn lexical entries from data in a semi-supervised fashion. Like most machine learning techniques, many of our proposed approaches depend on training data to work well. Because of data sparsity, we exploit appealing properties inherent to subword modeling to adapt the lexicon in various setups, or use subwords directly as the recognition unit when decoding the speech signal. We evaluate our novel methods in the context of transcription as well as Spoken Term Detection, since both tasks rely significantly on the robustness of the lexicon.

Publisher

Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for elektronikk og telekommunikasjon

Series

Doktoravhandlinger ved NTNU, 1503-8181; 2012:214