Similarity Search in Large Databases using Metric Indexing and Standard Database Access Methods
Master thesis
Permanent link
http://hdl.handle.net/11250/250780
Publication date
2009
Abstract
Several methods exist for performing similarity searches quickly using metric indexing. However, most of these methods are based on main-memory indexing or require specialized disk access methods. We have described and implemented a method that combines LAESA (Linear Approximating Eliminating Search Algorithm) with standard database access methods and relational operators to perform both range and K nearest neighbour (KNN) queries. We have studied and tested various existing implementations of R-trees, and implemented the R*-tree. We also found that some of the optimizations in R*-trees were detrimental to the response time at very high dimensionality. This is mostly because the increased CPU time cancels out any benefit from reducing the number of disk accesses. Further, we have performed comprehensive experiments using different access methods, join operators, pivot counts and range limits for both range and nearest neighbour queries. We will also implement and experiment with a multi-threaded execution environment running on several processors.
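To illustrate the pivot-based filtering that LAESA-style methods rely on, the following is a minimal sketch of a range query and not the implementation from the thesis. It assumes Euclidean distance and an in-memory list of objects, and the function names (`build_pivot_table`, `range_query`) are hypothetical. It shows only the elimination step: by the triangle inequality, |d(q, p) - d(o, p)| is a lower bound on d(q, o) for any pivot p, so an object can be discarded without computing its real distance whenever that bound exceeds the query radius.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_pivot_table(objects, pivots, dist=euclidean):
    """Precompute d(object, pivot) for every object and every pivot."""
    return [[dist(o, p) for p in pivots] for o in objects]


def range_query(query, radius, objects, pivots, table, dist=euclidean):
    """Return (matches, number of real distance computations on objects).

    Assumes at least one pivot. An object is pruned when the lower bound
    max_i |d(q, p_i) - d(o, p_i)| is larger than the radius.
    """
    q_to_pivots = [dist(query, p) for p in pivots]
    matches = []
    computed = 0
    for obj, row in zip(objects, table):
        lower_bound = max(abs(qp - op) for qp, op in zip(q_to_pivots, row))
        if lower_bound > radius:
            continue  # eliminated using precomputed distances only
        computed += 1
        if dist(query, obj) <= radius:
            matches.append(obj)
    return matches, computed


objects = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (10.0, 10.0)]
pivots = [(0.0, 0.0), (10.0, 0.0)]
table = build_pivot_table(objects, pivots)
matches, computed = range_query((0.5, 0.0), 1.0, objects, pivots, table)
```

The pruning condition is equivalent to requiring each precomputed distance d(o, p_i) to lie in the interval [d(q, p_i) - r, d(q, p_i) + r]. That is a plain range predicate per pivot column, which is the kind of condition a B-tree or R-tree index and ordinary relational operators can evaluate.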