
dc.contributor.advisor: Johnsen, Magne Hallstein [nb_NO]
dc.contributor.advisor: Svendsen, Torbjørn
dc.contributor.advisor: Dugelay, Jean-Luc
dc.contributor.advisor: Essid, Slim
dc.contributor.author: Sundøy, Kristoffer Johan [nb_NO]
dc.date.accessioned: 2014-12-19T13:46:01Z
dc.date.accessioned: 2015-12-22T11:44:32Z
dc.date.available: 2014-12-19T13:46:01Z
dc.date.available: 2015-12-22T11:44:32Z
dc.date.created: 2010-11-02 [nb_NO]
dc.date.issued: 2010 [nb_NO]
dc.identifier: 360267 [nb_NO]
dc.identifier: ntnudaim:5746
dc.identifier.uri: http://hdl.handle.net/11250/2370071
dc.description.abstract: The objective of this thesis is to detect high-level semantic concepts that help impose a structure on television talk shows. Indexing TV shows is a subject that, to our knowledge, is rarely discussed in the scientific community. There is no common understanding of what this imposed structure should look like. We can say that the purpose is to organise the audiovisual content into sections that each convey specific information. It thus encompasses issues as diverse as scene segmentation, speech noise detection, speaker identification, etc. The basic problem of structuring is the gap between the information extracted from the visual data flow and the human interpretation made by the user of these data. Numerous studies have examined the organisation of highly structured video content; the state of the art thus includes many studies on sports and newscast broadcasts. Our goal is to detect key audiovisual events using a variety of descriptors and generic classifiers. We propose a generic approach that is able to address all TV-show indexing problems. This enables an operator to use a single tool to infer a logical structure. Our approach can be considered "semi-automatic" in the sense that the training data is collected on the fly by the operator, who is asked to arbitrarily select one video excerpt for each concept involved. We have assessed a wide selection of audio and video features, used multiple kernel learning (MKL) as a feature selection algorithm, and then built various content detectors and segmentors useful for imposing broad semantic classes on television data. This master's thesis was set forth by TELECOM ParisTech and was begun there on March 1, 2010. This final report was submitted to TELECOM ParisTech, NTNU and Institute EURECOM on August 29, 2010. [nb_NO]
dc.language: eng [nb_NO]
dc.publisher: Institutt for elektronikk og telekommunikasjon [nb_NO]
dc.subject: ntnudaim:5746 [no_NO]
dc.subject: SIE7 communication technology
dc.subject: Audio and image processing
dc.title: Audiovisual Contents Segmentation [nb_NO]
dc.type: Master thesis [nb_NO]
dc.source.pagenumber: 85 [nb_NO]
dc.contributor.department: Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for elektronikk og telekommunikasjon [nb_NO]

