Machine Learning of Sub-Phonemic Units for Speech Recognition

Olfati, Negar

dc.contributor.advisor	Svendsen, Torbjørn
dc.contributor.author	Olfati, Negar
dc.date.accessioned	2015-12-28T10:04:52Z
dc.date.available	2015-12-28T10:04:52Z
dc.date.created	2015-02-22
dc.date.issued	2015
dc.identifier	ntnudaim:6646
dc.identifier.uri	http://hdl.handle.net/11250/2371401
dc.description.abstract	This work explores the performance of a new set of acoustic model units for speech recognition. The acoustic models were built and evaluated from scratch in several steps: feature extraction; acoustic detection and merging; acoustic segmentation of the TIMIT corpus; clustering of the segment representatives; assigning a label to each cluster and labelling the segments with the cluster labels; and finally acoustic modeling. In the acoustic modeling phase, two experiments were carried out using standard HMM structures and the HTK toolkit. In the first experiment, the models were trained and evaluated using the version of the TIMIT training data annotated with cluster labels. In the second experiment, the time-aligned version of the transcriptions was used to train the acoustic models. Both experiments were run on four systems with 128, 256, 512 and 1024 units. Both single-component and mixture probability estimators were tested. In both experiments, the best results were achieved using three-component GMMs for the 128-unit system.
dc.language	eng
dc.publisher	NTNU
dc.subject	Electronics (2-year), Acoustics
dc.title	Machine Learning of Sub-Phonemic Units for Speech Recognition
dc.type	Master thesis
dc.source.pagenumber	120

Associated file(s)

Filename: 6646_FULLTEXT.pdf
Size: 22.17Mb
Format: PDF

Filename: 6646_COVER.pdf
Size: 184.2Kb
Format: PDF

This item appears in the following collection(s)

Institutt for elektroniske systemer [2308]
