Temporal Text Mining: The TTM Testbench

Fivelstad, Ole Kristian
Master thesis
Files
347501_FULLTEXT01.pdf (986.1Kb)
347501_COVER01.pdf (47.53Kb)
347501_ATTACHMENT01.zip (33.02Mb)
URI
http://hdl.handle.net/11250/250496
Date
2007
Collections
  • Institutt for datateknologi og informatikk [7462]
Abstract
This master thesis presents the Temporal Text Mining (TTM) Testbench, an application for discovering association rules in temporal document collections. It continues work done in two earlier projects, one in the fall of 2005 and one in the fall of 2006, which laid the foundation for this thesis. The work focuses on identifying and extracting meaningful terms from textual documents in order to make the mined association rules more meaningful. Considerable effort has gone into compiling the theoretical foundation of the project, which has been used to assess different approaches for finding meaningful and descriptive terms. The old TTM Testbench has been extended to use WordNet and to include operations for finding collocations, performing word sense disambiguation, and extracting higher-level concepts and categories from the individual documents. A method for rating association rules based on the semantic similarity of the terms present in the rules has also been implemented, in an attempt to narrow down the result set and filter out rules that are unlikely to be interesting. Experiments performed with the improved application show that the use of WordNet and the new operations can help increase the meaningfulness of the rules. One major factor is that synonyms of words are added to make the terms more understandable. However, the experiments also showed that it was difficult to decide whether a rule was interesting, which made it impossible to draw any conclusions about the suitability of semantic similarity for finding interesting rules. All work on the TTM Testbench so far has focused on finding association rules in web newspapers. It may, however, be useful to perform experiments in a more limited domain, for example medicine, where the interestingness of a rule may be more easily decided.
Publisher
Institutt for datateknikk og informasjonsvitenskap

DSpace software copyright © 2002-2019  DuraSpace

Service from  Unit
 

 
