
dc.contributor.advisor: Svendsen, Torbjørn [nb_NO]
dc.contributor.author: Alcaraz Meseguer, Noelia [nb_NO]
dc.date.accessioned: 2014-12-19T13:43:49Z
dc.date.accessioned: 2015-12-22T11:41:31Z
dc.date.available: 2014-12-19T13:43:49Z
dc.date.available: 2015-12-22T11:41:31Z
dc.date.created: 2010-09-03 [nb_NO]
dc.date.issued: 2009 [nb_NO]
dc.identifier: 347957 [nb_NO]
dc.identifier.uri: http://hdl.handle.net/11250/2369233
dc.description.abstract: The classical front-end analysis in speech recognition is a spectral analysis that parametrizes the speech signal into feature vectors; the most popular such feature set is the Mel Frequency Cepstral Coefficients (MFCC). They are based on a standard power spectrum estimate, which is first subjected to a log-based transform of the frequency axis (the mel-frequency scale) and then decorrelated using a modified discrete cosine transform. Following a focused introduction to speech production, perception and analysis, this paper studies the implementation of a speech generative model in which speech is synthesized and recovered from its MFCC representation. The work was carried out in two steps: first, the computation of the MFCC vectors from the source speech files using the HTK software; and second, the implementation of the generative model itself, which represents the conversion chain from HTK-generated MFCC vectors to reconstructed speech. To assess how well the feature vectors encode the speech and to evaluate the generative model, the spectral distance between the original speech signal and the one produced from the MFCC vectors was computed, using spectral models based on Linear Prediction Coding (LPC) analysis. The implementation of the generative model yielded results on the reconstruction of the spectral representation and on the quality of the synthesized speech. [nb_NO]
dc.language: eng [nb_NO]
dc.publisher: Institutt for elektronikk og telekommunikasjon [nb_NO]
dc.subject: ntnudaim [no_NO]
dc.title: Speech Analysis for Automatic Speech Recognition [nb_NO]
dc.type: Master thesis [nb_NO]
dc.source.pagenumber: 87 [nb_NO]
dc.contributor.department: Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for elektronikk og telekommunikasjon [nb_NO]

