- Main
Scaling Spatial Overlay Operations and Flock Pattern Discovery
- Calderon, Andres Oswaldo
- Advisor(s): Tsotras, Vassilis J
Abstract
This thesis proposes scalable solutions to two significant spatial problems: computing overlay operations and discovering flock patterns. Overlay operations are typically computed among polygon layers using spatial data structures designed for complex geometric relationships. One such structure is the Doubly Connected Edge List (DCEL), an edge-list format widely used in spatial applications for performing planar topological computations. The overlay operation, which combines the DCELs of two input layers, enables efficient spatial queries such as intersection, union, and difference between layers. However, existing sequential methods for computing overlays struggle to scale and often fail to process large datasets, such as the US Census tracts. In this thesis, we present a distributed, scalable approach for computing the overlay operation and its associated queries. We address the challenges inherent in distributing the overlay computation and introduce several optimizations that enhance performance, making these computations feasible for large-scale spatial datasets.
The second part of this thesis extends upon the above proposed approach by introducing a novel spatial partitioner based on the kd-tree spatial index. This partitioner optimizes DCEL partitioning and overlay operations by leveraging data distributions, resulting in significantly improved performance and more efficient space utilization. Additionally, we adapt the optimization techniques developed for DCEL overlay operations to address a new problem: the polygonization of dangling edges, cut edges, and polygons.
The final part of this thesis presents a scalable technique for detecting moving flock patterns in large trajectory databases. A flock pattern represents a group of entities moving closely together within a defined spatial radius over a specified time interval. Traditional sequential algorithms, though effective, struggle with high computational costs on large, dense datasets. This thesis proposes a distributed framework that leverages spatial partitioning and parallel processing to accelerate flock detection. By addressing challenges in spatial and temporal joins across large datasets, introducing partition-based parallelism, and implementing strategies to manage flock patterns spanning multiple partitions, this approach significantly reduces processing time. Experimental evaluations on synthetic datasets demonstrate substantial improvements in scalability and efficiency over conventional methods.
Main Content
Enter the password to open this PDF file:
-
-
-
-
-
-
-
-
-
-
-
-
-
-