System for Distributed Text Mining
Master thesis
Permanent link
http://hdl.handle.net/11250/252624
Publication date
2011
Abstract
Text mining presents us with new possibilities for the use of collections of documents. A large amount of hidden, implicit information exists inside these collections, which text mining techniques may help us to uncover. Unfortunately, these techniques generally require large amounts of computational power. This is addressed by the introduction of distributed systems and methods for distributed processing, such as Hadoop and MapReduce. This thesis aims to describe, design, implement and evaluate a prototypical system for distributed text mining in a MapReduce/Hadoop environment, called TextMiner.