Coarse-to-Fine Speech Retrieval Using Framewise Phoneme Probabilities

Liavaag, Harald

dc.contributor.advisor	Olstad, Bjørn	nb_NO
dc.contributor.advisor	Arentz, Will Archer	nb_NO
dc.contributor.author	Liavaag, Harald	nb_NO
dc.date.accessioned	2014-12-19T13:30:39Z
dc.date.available	2014-12-19T13:30:39Z
dc.date.created	2010-09-02	nb_NO
dc.date.issued	2006	nb_NO
dc.identifier	346675	nb_NO
dc.identifier	ntnudaim:1159	nb_NO
dc.identifier.uri	http://hdl.handle.net/11250/250053
dc.description.abstract	As archives of digital audio and video grow, people need to find specific information within them, so an efficient method of searching recorded media is required. The metadata that currently tags audio, such as title, date of recording, subject, or person, is not sufficient for accurate and rapid retrieval of specifically requested information. Media retrieval has long received relatively little attention, but interest has increased lately, and new techniques for content-based access to archives of digital audio and video are evolving within the research community. Recently, a novel technique for speech retrieval was presented. It consists of a method for representing speech as a sequence of framewise phoneme probabilities and a new method for searching speech. The search method uses the framewise phoneme probabilities to determine the segment of speech that most closely matches a spoken query. This thesis first looks at methods to improve the retrieval performance of the proposed dynamic programming algorithm. On our test set of 1,132 speech files, the proposed dynamic programming algorithm finds 65% of the wanted hits among the top 10 results. The thesis then deals with ways of increasing the speed of the search. The proposed method gives somewhat promising results, reducing the response time by 11% without affecting retrieval effectiveness.	nb_NO
dc.language	eng	nb_NO
dc.publisher	Institutt for datateknikk og informasjonsvitenskap	nb_NO
dc.subject	ntnudaim	no_NO
dc.subject	SIF2 datateknikk	no_NO
dc.subject	Komplekse datasystemer	no_NO
dc.title	Coarse-to-Fine Speech Retrieval Using Framewise Phoneme Probabilities	nb_NO
dc.type	Master thesis	nb_NO
dc.source.pagenumber	54	nb_NO
dc.contributor.department	Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap	nb_NO

Associated file(s)

Filename: 346675_FULLTEXT01.pdf
Size: 505.4Kb
Format: PDF

Locked

Filename: 346675_ATTACHMENT01.zip
Size: 153.5Kb
Format: Unknown

Locked

Filename: 346675_COVER01.pdf
Size: 47.54Kb
Format: PDF

Locked

This item appears in the following collection(s)

Institutt for datateknologi og informatikk [6808]