Classifying OCR Errors for Use in Retrieval Methods for Norwegian Text

Fosse, Eirik

Master thesis

Permanent link

http://hdl.handle.net/11250/2615887

Publication date

2018

Collections

Institutt for datateknologi og informatikk [6819]

Abstract

This thesis analyses Norwegian documents processed with Optical Character Recognition (OCR). Using the information gathered about the errors, a retrieval system is tested to see whether it can improve information retrieval despite OCR errors in the text. The corpus was gathered from the National Library of Norway.
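
As an illustration only, since this record does not describe the thesis's actual method, here is a minimal sketch of one way OCR errors could be classified: align a reference transcription with the OCR output and tally the character spans that differ. The function name `classify_ocr_errors` and the sample strings are hypothetical, and the alignment uses Python's standard `difflib`, which is an assumption rather than the tool used in the thesis.

```python
from collections import Counter
from difflib import SequenceMatcher


def classify_ocr_errors(truth: str, ocr: str) -> Counter:
    """Tally the differences between a reference string and OCR output.

    Keys are (kind, expected, seen) tuples, where kind is one of the
    difflib opcode tags "replace", "delete" or "insert".
    """
    errors = Counter()
    # autojunk=False keeps difflib from treating frequent characters as junk.
    matcher = SequenceMatcher(None, truth, ocr, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # identical span, nothing to record
        errors[(tag, truth[i1:i2], ocr[j1:j2])] += 1
    return errors


# Hypothetical sample: Norwegian letters read as their unaccented look-alikes.
sample_errors = classify_ocr_errors("bjørn og språk", "bjorn og sprak")
for (kind, expected, seen), count in sample_errors.items():
    print(kind, repr(expected), repr(seen), count)
```

Counts gathered this way over a corpus could, for example, suggest which character confusions a retrieval system should tolerate when matching query terms against OCR text.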

Publisher

NTNU