Ph.D. Defense: Michael Gowanlock, “In-Memory Distance Threshold Searches on Moving Object Trajectories”

Wed 2/11/2015, 3PM

POST 302


Processing moving object trajectories arises in many applications. To gaininsight into target domains, historical continuous trajectory similarity searches find those trajectories that have similar attributes in common. In this work, we focus on a trajectory similarity search, the distance threshold search, which finds all trajectories within a given distance of a query trajectory over a time interval. We investigate novel approaches for the efficient processing of these searches over in-memory databases on the CPU and on the GPU.

On the CPU, we use an in-memory R-tree index to store trajectory data and evaluate its performance on a range of trajectory datasets. The R-tree is a well-studied out-of-core indexing scheme. However, in the context of in-memory searches, we find that the traditional notion of considering good trajectory splits by minimizing the volume of MBBs so as to reduce index overlap is not well-suited to improve the performance of in-memory distance threshold searches. Another finding is that computing good trajectory splits to achieve a trade-off between the time to search the index-tree and the time to process the candidate set of trajectory segments may not be beneficial when considering large datasets.

The GPU is an attractive technology for distance threshold searches because of the inherent data parallelism involved in calculating moving distances between pairs of polylines; however, challenges arise from the SIMD programming model and limited GPU memory. We study the processing of distance threshold searches using GPU implementations that avoid the use of index-trees. We focus on two scenarios for which we propose GPU-friendly indexing schemes for trajectory data. In the first scenario the database fits in GPU memory but the entire query set does not, thereby requiring back-and-forth communication between the host and the GPU. We find that our indexing scheme yields significant speedup over a multithreaded CPU implementation. We gain insight into our GPU algorithms via a performance model that proves accurate across a range of experimental scenarios. In the second scenario, we study the case where both the query set and the database fit in GPU memory. With this relaxed memory constraint, the design space for trajectory indexing schemes is larger. We investigate indexing schemes with temporal, spatial, and spatiotemporal selectivity. Through performance analyses, we determine the salient characteristics of the trajectories that are conducive to the proposed indexes. In particular, we find that there are two complementary niches for GPU and CPU distance threshold search algorithms. Namely, using the GPU is preferable when the trajectory dataset is large and/or when the threshold distance is large. The GPU is thus the better choice for large-scale scientific trajectory dataset processing, as for instance in the motivating application for this work in the domain of Astrobiology.