Please note: This master’s thesis presentation will take place online.
Fadhil Abubaker, Master’s candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Khuzaima Daudjee
This thesis presents the NIR+-Tree, a disk-resident R-Tree variant that eliminates overlap among its MBRs. The NIR+-Tree is an extension of the main-memory NIR-Tree, adopting techniques for efficient storage and retrieval on disk. By employing non-intersecting polygons instead of rectangles for data partitioning, the NIR+-Tree minimizes the number of spurious disk accesses incurred due to MBR overlap. To stabilize the height of the NIR+-Tree, the dynamically-sized polygons are stored in main-memory using an efficient encoding. Experimental results show that the NIR+-Tree is efficient at point queries and selective range queries, using 2× to 5× fewer disk accesses than its closest competitors, the R+-Tree and the R*-Tree.
Additionally, this thesis investigates bulk-loading algorithms for the NIR+-Tree. Bulk-loading can be used to efficiently construct an index from a pre-defined set of data. Bulk-loading algorithms that generate MBRs with significant overlap create NIR+-Trees with undesirable, complex polygons. This thesis shows that top-down bulk-loading algorithms are better suited for the NIR+-Tree than bottom-up algorithms, due to their overlap minimizing properties. These techniques enable the NIR+-Tree to be a complete, disk-based indexing solution for spatial data.