High Availability and Partitioning in Key-Value Stores
Maintaining availability in modern distributed key-value stores at scale requires a deep understanding of clustering internals and cluster topology. Detecting and fixing problems is often a complicated, error-prone task that system administrators perform manually. This thesis consists of two parts. The first part describes the availability and partitioning methods used in modern key-value stores. The collected information is then used to identify features missing from the Redis Cluster system and serves as the basis for the second part: the implementation of a cluster manager for Redis Cluster. The cluster manager should be able to detect and fix problems within the cluster without manual help from system administrators; such a tool reduces the operational burden of running Redis Cluster deployments at scale. We have implemented an early version of a cluster manager for Redis Cluster, designed to run alongside a new or existing deployment. The implementation consists of two components, a manager and an agent. The manager is responsible for watching the cluster and planning operations. The agent runs alongside each Redis instance and is responsible for executing the commands planned by the manager. We ran a set of test cases to verify the implementation; each test monitors the cluster state while cluster failures are introduced. The results show that the cluster manager can detect and fix problems without manual intervention from system administrators. This work is a step toward an autonomous cluster solution for Redis.