Storing and Querying RDF in Mars
Abstract
As part of the Semantic Web movement, the Resource Description Framework (RDF) is gaining momentum as a format for storing data, particularly metadata. SPARQL (the SPARQL Protocol and RDF Query Language) is a SQL-like query language recommended by the W3C for querying RDF data. FAST is exploring the possibility of supporting storage and querying of RDF data in its Mars search engine. To facilitate this, a SPARQL parser has been created for the Microsoft .NET Framework, using the MPLex and MPPG tools from Microsoft's Managed Babel package. This thesis proposes a solution for efficiently storing and retrieving RDF data in Mars, based on decomposition and B+ tree indexing. Further, a method for transforming SPARQL queries into Mars operator graphs is described. Finally, the implementation of a prototype is discussed. The prototype was developed in collaboration with FAST and required customized indexing in Mars. Some deviations from the proposed solution were made in order to create a working prototype within the available time frame. The focus has been on exploring possibilities, and performance has therefore not been a priority in either indexing or evaluation.