dc.contributor.advisor | Ramampiaro, Herindrasana | nb_NO |
dc.contributor.author | Blixhavn, Øystein Hoel | nb_NO |
dc.date.accessioned | 2014-12-19T13:42:19Z | |
dc.date.available | 2014-12-19T13:42:19Z | |
dc.date.created | 2014-12-07 | nb_NO |
dc.date.issued | 2014 | nb_NO |
dc.identifier | 769314 | nb_NO |
dc.identifier | ntnudaim:12121 | nb_NO |
dc.identifier.uri | http://hdl.handle.net/11250/254014 | |
dc.description.abstract | This master thesis looks at how clustering techniques can be applied to a collection of scientific documents. Approximately one year of server logs from the CERN Document Server (CDS) are analyzed and preprocessed. Based on the findings of this analysis, and a review of the current state of the art, three different clustering methods are selected for further work: Simple k-Means, Hierarchical Agglomerative Clustering (HAC) and Graph Partitioning. In addition, a custom, agglomerative clustering algorithm is made in an attempt to tackle some of the problems encountered during the experiments with k-Means and HAC. The results from k-Means and HAC are poor, but the graph partitioning method yields some promising results. The main conclusion of this thesis is that the inherent clusters within the user-record relationship of a scientific collection are nebulous, but existing. Furthermore, the most common clustering algorithms are not suitable for this type of clustering. | nb_NO |
dc.language | eng | nb_NO |
dc.publisher | Institutt for datateknikk og informasjonsvitenskap | nb_NO |
dc.subject | ntnudaim:12121 | no_NO |
dc.subject | MTDT Datateknologi | no_NO |
dc.subject | Data- og informasjonsforvaltning | no_NO |
dc.title | Clustering User Behavior in Scientific Collections | nb_NO |
dc.type | Master thesis | nb_NO |
dc.source.pagenumber | 114 | nb_NO |
dc.contributor.department | Norges teknisk-naturvitenskapelige universitet, Fakultet for informasjonsteknologi, matematikk og elektroteknikk, Institutt for datateknikk og informasjonsvitenskap | nb_NO |