Highly available database clusters: Repair with large segments
The goal of this master's thesis is to find trends and behaviors in a highly available database cluster that uses InfiniBand and RDMA. These findings will be used to optimize the configuration of the segment size in such systems. To find these trends and behaviors, a mockup model has been developed. The model consists of a simple DBMS that stores data only in main memory and uses a checkpoint method to repair nodes after a node failure. During a repair, the model simulates the use of InfiniBand and RDMA for checkpointing. To simulate clients connecting to and using the database, the model includes write operations on the database and measures how many write operations it can process per second.

During a repair in a database cluster, one node flushes all of its data to a new node. This is done in small batches, just as in a checkpoint. The model simulates this with a checkpoint module that continuously flushes data from one node to another. While the checkpoint is flushing a small part of the database fragment, the model uses Copy on Write so that transactions are not locked out.

When a node fails, a second node takes over as quickly as possible and repairs the system. The second node must process transactions while transferring all of its data to a spare node. To achieve a short repair time, the system should transfer segments that are as large as possible. The problem with large segments is that performing Copy on Write on them takes a long time. The mockup model measures the repair time, the number of write operations performed per second, and the CPU usage as functions of the segment size and the number of clients using the database. This gives a good indication of which segment size is preferable.

The results from the thesis show that there are large advantages to using state-of-the-art technology such as InfiniBand and RDMA for repair in highly available database clusters.
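The interaction between segment size and Copy on Write described above can be illustrated with a minimal, single-threaded Python sketch. It is not the thesis's model: the class `Fragment`, its methods, and the `bytes_copied` counter are hypothetical names introduced here, and the sketch assumes that a write hitting the segment currently being flushed first copies that whole segment, so the copy cost grows with the segment size. Forwarding writes that arrive after a segment has been flushed is not modeled.

```python
class Fragment:
    """Hypothetical in-memory database fragment split into fixed-size segments."""

    def __init__(self, size: int, segment_size: int):
        if size % segment_size != 0:
            raise ValueError("size must be a multiple of segment_size")
        self.segment_size = segment_size
        self.segments = [bytearray(segment_size)
                         for _ in range(size // segment_size)]
        self.flushing = None    # index of the segment being flushed, if any
        self.shadow = None      # private copy that takes writes during a flush
        self.bytes_copied = 0   # accumulated Copy on Write cost in bytes

    def begin_flush(self, index: int) -> None:
        """Freeze one segment so it can be transferred unchanged."""
        self.flushing = index
        self.shadow = None

    def write(self, offset: int, value: int) -> None:
        """Write one byte; copy the segment first if it is being flushed."""
        index, pos = divmod(offset, self.segment_size)
        if index == self.flushing:
            if self.shadow is None:
                # Copy on Write: the whole segment is duplicated once.
                self.shadow = bytearray(self.segments[index])
                self.bytes_copied += self.segment_size
            self.shadow[pos] = value
        else:
            self.segments[index][pos] = value

    def read(self, offset: int) -> int:
        """Read one byte, preferring the shadow copy if one exists."""
        index, pos = divmod(offset, self.segment_size)
        if index == self.flushing and self.shadow is not None:
            return self.shadow[pos]
        return self.segments[index][pos]

    def end_flush(self) -> bytes:
        """Return the frozen image that was transferred, then install the shadow."""
        index = self.flushing
        image = bytes(self.segments[index])
        if self.shadow is not None:
            self.segments[index] = self.shadow
        self.flushing = None
        self.shadow = None
        return image
```

In this sketch, a larger `segment_size` means fewer flush rounds per repair, but every first write to a frozen segment pays for copying the entire segment, which is the trade-off the abstract describes.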
This technology, combined with an optimal configuration, improves availability, and the thesis indicates that the segment size should not exceed 1% of the database size. Using InfiniBand and RDMA with this configuration and physical repair, the availability reaches class 9.
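To make the term "class 9" concrete, the following Python sketch computes an availability class. It assumes the common "number of nines" definition, class = floor(log10(1 / unavailability)), which the abstract does not spell out. Under that assumption, class 9 means at least 99.9999999% availability, or roughly 30 milliseconds of downtime per year. The function name `availability_class` and the MTTF and MTTR figures in the usage lines are illustrative only and are not results from the thesis.

```python
import math


def availability_class(mttf_hours: float, mttr_hours: float) -> int:
    """Return the number of leading nines in the availability.

    Assumes availability = MTTF / (MTTF + MTTR) and the common
    definition class = floor(log10(1 / unavailability)).
    """
    if mttf_hours <= 0 or mttr_hours <= 0:
        raise ValueError("MTTF and MTTR must be positive")
    unavailability = mttr_hours / (mttf_hours + mttr_hours)
    return math.floor(math.log10(1.0 / unavailability))


# Illustrative inputs only: a failure every 10,000 hours on average.
one_second = 1 / 3600        # repair time of 1 s, in hours
ten_millis = 0.01 / 3600     # repair time of 10 ms, in hours
slow_repair = availability_class(10_000, one_second)
fast_repair = availability_class(10_000, ten_millis)
```

With these illustrative inputs, shortening the repair time from one second to ten milliseconds raises the class by two, which shows why repair time dominates the availability class when failures are rare.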