dc.contributor.advisor | Öztürk, Pinar | nb_NO |
dc.contributor.author | Arizaleta, Mikel | nb_NO |
dc.date.accessioned | 2014-12-19T13:34:01Z | |
dc.date.available | 2014-12-19T13:34:01Z | |
dc.date.created | 2010-09-04 | nb_NO |
dc.date.issued | 2009 | nb_NO |
dc.identifier | 348825 | nb_NO |
dc.identifier | ntnudaim:4769 | nb_NO |
dc.identifier.uri | http://hdl.handle.net/11250/251379 | |
dc.description.abstract | In this thesis, we address the problem of separating content from noise on news websites. We approach this problem using TiMBL, a memory-based learning software package. We study the relevance of similarity in the training data and the effect of data size on the performance of the extractions. | nb_NO |
dc.language | eng | nb_NO |
dc.publisher | Institutt for datateknikk og informasjonsvitenskap | nb_NO |
dc.subject | ntnudaim | no_NO |
dc.subject | SIF2 datateknikk | no_NO |
dc.subject | Intelligente systemer | no_NO |
dc.title | Structured data extraction: separating content from noise on news websites | nb_NO |
dc.type | Master thesis | nb_NO |
dc.source.pagenumber | 65 | nb_NO |
dc.contributor.department | Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap | nb_NO |