Scalable Region Query Processing in Spatial and Spatiotemporal Databases

Lund, Vigleik

dc.contributor.advisor	Bratsberg, Svein Erik
dc.contributor.author	Lund, Vigleik
dc.date.accessioned	2018-07-19T14:00:23Z
dc.date.available	2018-07-19T14:00:23Z
dc.date.created	2018-06-11
dc.date.issued	2018
dc.identifier	ntnudaim:18989
dc.identifier.uri	http://hdl.handle.net/11250/2506149
dc.description.abstract	In recent years, there has been an increased focus on providing support for large spatial and spatiotemporal point data by using locality-preserving linearization techniques to transform multi-dimensional data into one-dimensional data and store the transformed data in scalable key-value NoSQL stores. Spatial and spatiotemporal region queries are then executed by decomposing the query into multiple linear range scans on the linearized system and processing the range scans in a filter-and-refine manner. The process of decomposing a multi-dimensional region query into a number of smaller linear range scans and reduce the number of false hits is essential for achieving efficient query processing in these systems. However, there exists a trade-off between the overhead due to creating additional range scans and the overhead due to removing false positives. We have, in this thesis, studied various existing methods of decomposing spatial and spatiotemporal region queries into linear range scans. Based on the research, we have proposed a cost-driven approach that decomposes a query according to the data distribution, which is estimated by a histogram. The proposed method has been implemented in a simple database prototype and evaluated with real-world test data, and compared against a popular and more space-driven decomposition approach. The experiments show that our method is able to outperform the baseline approach for both spatial and spatiotemporal point data with most improvement for the latter. One of the main strengths of the proposed method over the baseline is the capability to create an accurate decomposition with a relatively few range scans when dealing with skewed data. The downside is having to maintain a histogram over the data, which can be prohibitively expensive when dealing with large datasets. However, measures such as sampling were found to reduce this cost significantly with minimal impact on performance.
dc.language	eng
dc.publisher	NTNU
dc.subject	Datateknologi, Databaser og søk
dc.title	Scalable Region Query Processing in Spatial and Spatiotemporal Databases
dc.type	Master thesis

Tilhørende fil(er)

Filnavn:: 18989_FULLTEXT.pdf
Størrelse:: 1.810Mb
Format:: PDF

Åpne

Filnavn:: 18989_ATTACHMENT.zip
Størrelse:: 33.27Mb
Format:: application/zip

Åpne

Filnavn:: 18989_COVER.pdf
Størrelse:: 1.556Mb
Format:: PDF

Åpne

Denne innførselen finnes i følgende samling(er)

Institutt for datateknologi og informatikk [6551]

Vis enkel innførsel