Managing Index Repartitioning
Master thesis
Permanent link
http://hdl.handle.net/11250/252444
Publication date
2011
Abstract
Creating a highly available and scalable search system requires careful architectural decisions, which in turn require an in-depth analysis and understanding of the architecture and context of each deployment. Because different deployments place different requirements on the system, different solutions give the best result case by case, and benchmarks are therefore an invaluable source of information.

This thesis provides an overview of the common components and important aspects of a distributed search system. It then gives an overview of different partitioning techniques before going into the details of repartitioning and rebalancing in a document-partitioned full-text search system.

A processing framework that draws inspiration from the flow-based programming literature is introduced and is shown to be a valuable tool for creating custom-tailored search solutions. The implementation is used to benchmark different repartitioning and rebalancing strategies.

In conclusion, the techniques described in the thesis show great promise for creating custom, maintainable, and flexible partitions. The processing framework enables each deployment to easily compare different partitioning schemes, along with their associated manageability and maintenance costs, to determine the best fit for a given situation.