TY - JOUR
T1 - Storing data once in M-trees and PM-trees
T2 - Revisiting the building principles of metric access methods
AU - Razente, Humberto
AU - Barioni, Maria Camila N.
AU - Silva, Yasin N.
N1 - Funding Information:
This work has been supported by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) – Finance Code 001, and by the Brazilian National Council for Scientific and Technological Development (CNPq).
Publisher Copyright:
© 2021 Elsevier Ltd
PY - 2022/2
Y1 - 2022/2
N2 - Since the introduction of the M-tree, a fundamental tree-based data structure for indexing multi-dimensional information, several structural enhancements have been proposed. One of the most effective ones is the use of additional global pivots that resulted in the PM-tree. These two indexing structures, however, can store the same data element in multiple nodes. In this article, we revisit both the M-tree and the PM-tree to propose a new construction algorithm that stores data elements only once in the tree hierarchies. The main challenge in accomplishing this is to properly select data elements when an inner node split is needed. To address it, we propose an approach based on the use of aggregate nearest neighbor queries. The new algorithms enable building the search result set as data elements are evaluated for pruning during traversal, allowing faster retrieval of k-nearest neighbors and range searches. We conducted an extensive set of experiments with different real datasets. The results show that our proposed algorithms have considerably superior performance when compared with the standard M-tree and PM-tree.
AB - Since the introduction of the M-tree, a fundamental tree-based data structure for indexing multi-dimensional information, several structural enhancements have been proposed. One of the most effective ones is the use of additional global pivots that resulted in the PM-tree. These two indexing structures, however, can store the same data element in multiple nodes. In this article, we revisit both the M-tree and the PM-tree to propose a new construction algorithm that stores data elements only once in the tree hierarchies. The main challenge in accomplishing this is to properly select data elements when an inner node split is needed. To address it, we propose an approach based on the use of aggregate nearest neighbor queries. The new algorithms enable building the search result set as data elements are evaluated for pruning during traversal, allowing faster retrieval of k-nearest neighbors and range searches. We conducted an extensive set of experiments with different real datasets. The results show that our proposed algorithms have considerably superior performance when compared with the standard M-tree and PM-tree.
KW - Ball-partitioning indexing
KW - M-tree
KW - Metric access methods
KW - PM-tree
KW - Range query
KW - k-nearest neighbor query
UR - http://www.scopus.com/inward/record.url?scp=85116056637&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85116056637&partnerID=8YFLogxK
U2 - 10.1016/j.is.2021.101896
DO - 10.1016/j.is.2021.101896
M3 - Article
AN - SCOPUS:85116056637
SN - 0306-4379
VL - 104
JO - Information Systems
JF - Information Systems
M1 - 101896
ER -