dc.contributor.advisor: Olstad, Bjørn [nb_NO]
dc.contributor.advisor: Arentz, Will Archer [nb_NO]
dc.contributor.author: Liavaag, Harald [nb_NO]
dc.date.accessioned: 2014-12-19T13:30:39Z
dc.date.available: 2014-12-19T13:30:39Z
dc.date.created: 2010-09-02 [nb_NO]
dc.date.issued: 2006 [nb_NO]
dc.identifier: 346675 [nb_NO]
dc.identifier: ntnudaim:1159 [nb_NO]
dc.identifier.uri: http://hdl.handle.net/11250/250053
dc.description.abstract: Archives of digital audio and video keep growing, and people need to find specific information within them, so a highly efficient method of searching recorded media is required. The metadata that currently tag audio information, such as title, date of recording, subject or person, are not sufficient for accurate and rapid retrieval of specifically requested information. The field of media retrieval long received relatively little attention, but interest has recently increased. New techniques to support content-based access to archives of digital audio and video information are therefore evolving and receive much attention from the research community. Recently, a novel technique for speech retrieval was presented. The technique consists of a method to represent speech as a sequence of framewise phoneme probabilities and a new method to search speech. The suggested search method uses the framewise phoneme probabilities to determine the segment of speech that most closely matches a spoken query. This thesis first looks at methods to improve the retrieval performance of the proposed dynamic programming algorithm. On our test set of 1,132 speech files, the proposed dynamic programming algorithm finds 65% of the wanted hits among the top 10 results. The thesis then deals with ways of increasing the speed of the search. The proposed method gives somewhat promising results, reducing the response time by 11% without affecting the retrieval effectiveness. [nb_NO]
dc.language: eng [nb_NO]
dc.publisher: Institutt for datateknikk og informasjonsvitenskap [nb_NO]
dc.subject: ntnudaim [no_NO]
dc.subject: SIF2 datateknikk [no_NO]
dc.subject: Komplekse datasystemer [no_NO]
dc.title: Coarse-to-Fine Speech Retrieval Using Framewise Phoneme Probabilities [nb_NO]
dc.type: Master thesis [nb_NO]
dc.source.pagenumber: 54 [nb_NO]
dc.contributor.department: Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap [nb_NO]


