Audiovisual Contents Segmentation

Sundøy, Kristoffer Johan

dc.contributor.advisor	Johnsen, Magne Hallstein	nb_NO
dc.contributor.advisor	Svendsen, Torbjørn
dc.contributor.advisor	Dugelay, Jean-Luc
dc.contributor.advisor	Essid, Slim
dc.contributor.author	Sundøy, Kristoffer Johan	nb_NO
dc.date.accessioned	2014-12-19T13:46:01Z
dc.date.accessioned	2015-12-22T11:44:32Z
dc.date.available	2014-12-19T13:46:01Z
dc.date.available	2015-12-22T11:44:32Z
dc.date.created	2010-11-02	nb_NO
dc.date.issued	2010	nb_NO
dc.identifier	360267	nb_NO
dc.identifier	ntnudaim:5746
dc.identifier.uri	http://hdl.handle.net/11250/2370071
dc.description.abstract	The objective of this thesis is to detect high-level semantic concepts that help impose a structure on television talk shows. Indexing TV shows is a subject that, to our knowledge, is rarely discussed in the scientific community. There is no common understanding of what this imposed structure should look like. We can say that the purpose is to organise the audiovisual content into sections that each convey specific information. It thus encompasses issues as diverse as scene segmentation, speech noise detection, speaker identification, etc. The basic problem of structuring is the gap between the information extracted from the visual data stream and the interpretation made by the human user of these data. Numerous studies have examined the organisation of highly structured video content; the state of the art therefore includes many studies on sports and newscast broadcasts. Our goal is to detect key audiovisual events using a variety of descriptors and generic classifiers. We propose a generic approach that is able to address all TV-show indexing problems. This enables an operator to use a single tool to infer a logical structure. Our approach can be considered "semi-automatic" in the sense that the training data is collected on the fly by the operator, who is asked to arbitrarily select one video excerpt for each concept involved. We have assessed a wide selection of audio and video features, used MKL as a feature selection algorithm, and then built various content detectors and segmentors useful for imposing broad semantic classes on television data. This master's thesis was proposed by TELECOM ParisTech and was begun there on March 1, 2010. This final report was submitted to TELECOM ParisTech, NTNU and Institute EURECOM on August 29, 2010.	nb_NO
dc.language	eng	nb_NO
dc.publisher	Institutt for elektronikk og telekommunikasjon	nb_NO
dc.subject	ntnudaim:5746	no_NO
dc.subject	SIE7 communication technology
dc.subject	Audio and image processing
dc.title	Audiovisual Contents Segmentation	nb_NO
dc.type	Master thesis	nb_NO
dc.source.pagenumber	85	nb_NO
dc.contributor.department	Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for elektronikk og telekommunikasjon	nb_NO

Associated file(s)

Filename: 360267_COVER01.pdf
Size: 48.07Kb
Format: PDF

Filename: 360267_FULLTEXT01.pdf
Size: 3.601Mb
Format: PDF

This item appears in the following collection(s)

Institutt for elektroniske systemer [2285]