Show simple item record

dc.contributor.advisor: Bratsberg, Svein Erik (nb_NO)
dc.contributor.author: Fostvedt, Fredrik Persen (nb_NO)
dc.contributor.author: Eriksen, Stephan Nordnes (nb_NO)
dc.date.accessioned: 2014-12-19T13:41:26Z
dc.date.available: 2014-12-19T13:41:26Z
dc.date.created: 2014-09-30 (nb_NO)
dc.date.issued: 2014 (nb_NO)
dc.identifier: 751071 (nb_NO)
dc.identifier: ntnudaim:11372 (nb_NO)
dc.identifier.uri: http://hdl.handle.net/11250/253739
dc.description.abstract: Files and the unique terms they contain can be modeled as a graph where the vertices are files and terms and the edges describe containment. Can a graph database be used for search and retrieval of local files? What problems arise and which optimizations can be done? How does such a method compare to today's file retrieval methods? The problem is approached in this project as a potentially commercializable software application. The intent is to create an environment where graph-based file retrieval algorithms can easily be created, explored, tested and put into production. A highly modifiable Ruby-based client-server file retrieval application using Titan Aurelius Graph Database and RexPro is created. The server side consists of a Ruby on Rails back end with a RexPro connection to the graph database. The server can manage connections from several clients. The client side allows users to index their files in the graph database on the server and run search queries for strings. Algorithms in Groovy for Titan Aurelius can easily be implemented and tested on the server. Though the application is well suited for testing graph database file retrieval algorithms, only one was designed, implemented and tested, owing to the time constraints on the project. The algorithm was run on the indexed files of one of the project members with a handful of subjectively chosen search terms. It was a relatively simple algorithm that did not benefit from the full potential of a graph-based file retrieval solution. The test was done to get an initial feel for the precision and recall of the algorithm and compare it to OS X Spotlight, which is the most highly developed local file retrieval service. The framework has proved simple enough to run and test algorithms.
Because there was little test-driven development involved, some uncertainty remains about what results the tested algorithms actually produced. The one algorithm that was designed and tested was pitted against OS X Spotlight. It showed significantly lower performance than OS X Spotlight in terms of average precision and recall. Many reasons for this were identifiable. For instance, file types that were very unlikely to be a match were not filtered out. In a few cases, the application performed better than OS X Spotlight. It is too soon to determine for certain whether a graph-based file retrieval solution can compete with today's solutions. It does, however, achieve some precision and recall and has the potential to be significantly improved from its current state. (nb_NO)
dc.language: eng (nb_NO)
dc.publisher: Institutt for datateknikk og informasjonsvitenskap (nb_NO)
dc.title: The use of graph databases in file retrieval (nb_NO)
dc.type: Master thesis (nb_NO)
dc.source.pagenumber: 135 (nb_NO)
dc.contributor.department: Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap (nb_NO)


Files in this item


This item appears in the following collection(s)
