Cost-based unbalanced R-trees (CUR-trees) are a cost-function based data structure for spatial data. CUR-trees are constructed specifically to improve the evaluation of intersection queries, the most basic selection query in an R-tree. A CUR-tree is built taking into account a given query distribution for the queries and a cost model for their execution. Depending on the expected frequency of access, objects or subtrees are stored higher up in the tree. After each insertion in the tree, local reorganizations of a node and its children have their expected query cost evaluated, and a reorganization is performed if this is beneficial. No strict balancing of the trees applies allowing the tree to unfold solely based on the result of the cost evaluation. We present our cost-based approach and describe the evaluation and reorganization operations based on the cost function. We present a cost model for in-memory access costs and we present three different query models. In our experiments, we compare the performance of the CUR-tree to the R-tree and the R-tree. The CUR-tree is able to significantly improve intersection query performance, without unacceptably increasing the cost to build the tree. The use of R-trees for in-memory data reflects the high (and growing) cost of bringing data from RAM into the CPU cache relative to the cost of other computation.
|Title of host publication||Proceedings of the International Conference on Scientific and Statistical Database Management, SSDBM|
|Number of pages||10|
|Publication status||Published - 1 Jan 2001|