dc.contributor.advisor: Bratsberg, Svein Erik
dc.contributor.author: Peter, Christian
dc.date.accessioned: 2015-10-09T14:01:31Z
dc.date.available: 2015-10-09T14:01:31Z
dc.date.created: 2015-05-27
dc.date.issued: 2015
dc.identifier: ntnudaim:10688
dc.identifier.uri: http://hdl.handle.net/11250/2353589
dc.description.abstract: The join operation is one of the most valuable operations in traditional database management systems: it combines data from multiple tables. Today, most NoSQL systems do not support it. One reason is that a join is too time-consuming when the data is replicated across multiple nodes. The same result can be achieved in two other ways: denormalizing the data or joining at the application level. Denormalization produces more redundant data, and both options involve the user more in executing the join. Supporting the join operation in the query language of a NoSQL system may ease the switch of database system for users who only want to use a NoSQL system in which data can be joined. This thesis presents an implementation of the equijoin in Cassandra, since the two other options are already covered by others. Cassandra is a NoSQL system classified as an extensible record store, with a data model quite similar to the relational model used by, for example, MySQL. The implementation shows how the parsing, preparation, and execution of the query are performed. Support for writing join queries in Cassandra Query Language (CQL) is added in the parsing step. A method for finding a join order that requires only one read of each table from memory or disk is also implemented. This join order is also slightly optimized: selections in the WHERE clause are executed early in the execution step. During execution, the nested loop join is used to join the tables. The implementation of join in Cassandra shows a significantly worse execution time than MySQL. One of the problems is Cassandra's underlying architecture, which is not designed for joining data from multiple tables. However, this thesis shows that it is possible to support the join operation in Cassandra, although further work is needed for it to execute within a reasonable time.
dc.language: eng
dc.publisher: NTNU
dc.subject: Informatics, Information Management
dc.title: Supporting the Join Operation in a NoSQL System - Mastering the internals of Cassandra
dc.type: Master thesis
dc.source.pagenumber: 116
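
The abstract describes an equijoin executed as a nested loop join, with selections from the WHERE clause applied early. A minimal sketch of that idea in Python follows. It is an illustration only, not the thesis's Cassandra implementation: the function name, its parameters, and the sample tables are all hypothetical, and rows are modeled as plain dictionaries.

```python
from typing import Callable, Dict, Iterable, Iterator, Optional

Row = Dict[str, object]
Predicate = Callable[[Row], bool]


def nested_loop_equijoin(
    outer: Iterable[Row],
    inner: Iterable[Row],
    outer_key: str,
    inner_key: str,
    outer_filter: Optional[Predicate] = None,
    inner_filter: Optional[Predicate] = None,
) -> Iterator[Row]:
    """Yield merged rows where outer[outer_key] == inner[inner_key]."""
    # Apply the selections before joining, so the loops only compare rows
    # that can still appear in the result. Each input is iterated once here.
    outer_rows = [r for r in outer if outer_filter is None or outer_filter(r)]
    inner_rows = [r for r in inner if inner_filter is None or inner_filter(r)]

    for o in outer_rows:
        for i in inner_rows:
            if o[outer_key] == i[inner_key]:
                # On a column-name clash, the inner row's value wins.
                yield {**o, **i}


# Hypothetical sample tables, used only for illustration.
users = [{"user_id": 1, "name": "Ada"}, {"user_id": 2, "name": "Bob"}]
orders = [
    {"order_id": 10, "buyer_id": 1, "total": 40},
    {"order_id": 11, "buyer_id": 2, "total": 5},
    {"order_id": 12, "buyer_id": 1, "total": 7},
]

joined = list(
    nested_loop_equijoin(
        users,
        orders,
        outer_key="user_id",
        inner_key="buyer_id",
        inner_filter=lambda row: row["total"] > 6,  # selection applied early
    )
)
```

Filtering before the loops shrinks the inner input from three rows to two, so the quadratic comparison step does less work. A nested loop join still compares every remaining outer row with every remaining inner row, which is one plausible contributor to the slow execution times the abstract reports.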

