Structured data extraction: separating content from noise on news websites

dc.contributor.advisor	Öztürk, Pinar	nb_NO
dc.contributor.author	Arizaleta, Mikel	nb_NO
dc.date.accessioned	2014-12-19T13:34:01Z
dc.date.available	2014-12-19T13:34:01Z
dc.date.created	2010-09-04	nb_NO
dc.date.issued	2009	nb_NO
dc.identifier	348825	nb_NO
dc.identifier	ntnudaim:4769	nb_NO
dc.identifier.uri	http://hdl.handle.net/11250/251379
dc.description.abstract	This thesis addresses the problem of separating content from noise on news websites. We approach the problem using TiMBL, a memory-based learning software package. We study how the similarity of the training data and the size of the data set affect extraction performance.	nb_NO
dc.language	eng	nb_NO
dc.publisher	Institutt for datateknikk og informasjonsvitenskap	nb_NO
dc.subject	ntnudaim	no_NO
dc.subject	SIF2 datateknikk	no_NO
dc.subject	Intelligente systemer	no_NO
dc.title	Structured data extraction: separating content from noise on news websites	nb_NO
dc.type	Master thesis	nb_NO
dc.source.pagenumber	65	nb_NO
dc.contributor.department	Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap	nb_NO