Storing data once in M-trees and PM-trees: Revisiting the building principles of metric access methods

Humberto Razente; Maria Camila N. Barioni; Yasin N. Silva

doi:10.1016/j.is.2021.101896

Mathematical and Natural Sciences, School of (SMNS)

Research output: Contribution to journal › Article › peer-review

Abstract

Since the introduction of the M-tree, a fundamental tree-based data structure for indexing multi-dimensional information, several structural enhancements have been proposed. One of the most effective is the use of additional global pivots, which resulted in the PM-tree. These two indexing structures, however, can store the same data element in multiple nodes. In this article, we revisit both the M-tree and the PM-tree to propose a new construction algorithm that stores data elements only once in the tree hierarchies. The main challenge in accomplishing this is to properly select data elements when an inner node split is needed. To address it, we propose an approach based on the use of aggregate nearest neighbor queries. The new algorithms enable building the search result set as data elements are evaluated for pruning during traversal, allowing faster k-nearest neighbor and range queries. We conducted an extensive set of experiments with different real datasets. The results show that our proposed algorithms have considerably superior performance when compared with the standard M-tree and PM-tree.
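
For readers unfamiliar with the query type named in the abstract, an aggregate nearest neighbor query takes a group of query points and returns the data element that minimizes an aggregate (commonly the sum or the maximum) of its distances to all of them. The abstract does not say how the authors apply this query during inner node splits, so the following is only a minimal brute-force sketch of the general definition, not the paper's algorithm. The function name, the sample points, and the choice of Euclidean distance are assumptions made for illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def aggregate_nearest_neighbor(candidates, query_group, distance=dist, aggregate=sum):
    """Return the candidate with the smallest aggregate distance to the query group.

    Hypothetical helper: a linear scan over all candidates, with no index
    and no pruning. `aggregate` is typically `sum` or `max`.
    """
    return min(
        candidates,
        key=lambda c: aggregate(distance(c, q) for q in query_group),
    )


# Illustrative data only (not taken from the paper).
points = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0), (5.0, 5.0)]
group = [(0.0, 0.0), (2.0, 0.0), (1.0, 2.0)]

best_by_sum = aggregate_nearest_neighbor(points, group)
best_by_max = aggregate_nearest_neighbor(points, group, aggregate=max)
print(best_by_sum, best_by_max)
```

Any function that satisfies the metric axioms can be passed as `distance`, which matches the metric-space setting of M-trees and PM-trees; an index-based implementation would replace the linear scan with a pruned tree traversal.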

Original language	English (US)
Article number	101896
Journal	Information Systems
Volume	104
DOIs	https://doi.org/10.1016/j.is.2021.101896
State	Published - Feb 2022

Keywords

Ball-partitioning indexing
M-tree
Metric access methods
PM-tree
Range query
k-nearest neighbor query

ASJC Scopus subject areas

Software
Information Systems
Hardware and Architecture

Cite this

@article{a341cd8089ff4dbabe7a3c973b74f2bf,
  title = "Storing data once in M-trees and PM-trees: Revisiting the building principles of metric access methods",
  abstract = "Since the introduction of the M-tree, a fundamental tree-based data structure for indexing multi-dimensional information, several structural enhancements have been proposed. One of the most effective is the use of additional global pivots, which resulted in the PM-tree. These two indexing structures, however, can store the same data element in multiple nodes. In this article, we revisit both the M-tree and the PM-tree to propose a new construction algorithm that stores data elements only once in the tree hierarchies. The main challenge in accomplishing this is to properly select data elements when an inner node split is needed. To address it, we propose an approach based on the use of aggregate nearest neighbor queries. The new algorithms enable building the search result set as data elements are evaluated for pruning during traversal, allowing faster k-nearest neighbor and range queries. We conducted an extensive set of experiments with different real datasets. The results show that our proposed algorithms have considerably superior performance when compared with the standard M-tree and PM-tree.",
  keywords = "Ball-partitioning indexing, M-tree, Metric access methods, PM-tree, Range query, k-nearest neighbor query",
  author = "Razente, Humberto and Barioni, {Maria Camila N.} and Silva, {Yasin N.}",
  note = "Funding Information: This work has been supported by the Coordena{\c c}{\~a}o de Aperfei{\c c}oamento de Pessoal de N{\'i}vel Superior - Brasil (CAPES) -- Finance Code 001, and by the Brazilian National Council for Scientific and Technological Development (CNPq). Publisher Copyright: {\textcopyright} 2021 Elsevier Ltd",
  year = "2022",
  month = feb,
  doi = "10.1016/j.is.2021.101896",
  language = "English (US)",
  volume = "104",
  journal = "Information Systems",
  issn = "0306-4379",
  publisher = "Elsevier Limited",
}

TY - JOUR
T1 - Storing data once in M-trees and PM-trees
T2 - Revisiting the building principles of metric access methods
AU - Razente, Humberto
AU - Barioni, Maria Camila N.
AU - Silva, Yasin N.
N1 - Funding Information: This work has been supported by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) – Finance Code 001, and by the Brazilian National Council for Scientific and Technological Development (CNPq). Publisher Copyright: © 2021 Elsevier Ltd
PY - 2022/2
Y1 - 2022/2
N2 - Since the introduction of the M-tree, a fundamental tree-based data structure for indexing multi-dimensional information, several structural enhancements have been proposed. One of the most effective is the use of additional global pivots, which resulted in the PM-tree. These two indexing structures, however, can store the same data element in multiple nodes. In this article, we revisit both the M-tree and the PM-tree to propose a new construction algorithm that stores data elements only once in the tree hierarchies. The main challenge in accomplishing this is to properly select data elements when an inner node split is needed. To address it, we propose an approach based on the use of aggregate nearest neighbor queries. The new algorithms enable building the search result set as data elements are evaluated for pruning during traversal, allowing faster k-nearest neighbor and range queries. We conducted an extensive set of experiments with different real datasets. The results show that our proposed algorithms have considerably superior performance when compared with the standard M-tree and PM-tree.
AB - Since the introduction of the M-tree, a fundamental tree-based data structure for indexing multi-dimensional information, several structural enhancements have been proposed. One of the most effective is the use of additional global pivots, which resulted in the PM-tree. These two indexing structures, however, can store the same data element in multiple nodes. In this article, we revisit both the M-tree and the PM-tree to propose a new construction algorithm that stores data elements only once in the tree hierarchies. The main challenge in accomplishing this is to properly select data elements when an inner node split is needed. To address it, we propose an approach based on the use of aggregate nearest neighbor queries. The new algorithms enable building the search result set as data elements are evaluated for pruning during traversal, allowing faster k-nearest neighbor and range queries. We conducted an extensive set of experiments with different real datasets. The results show that our proposed algorithms have considerably superior performance when compared with the standard M-tree and PM-tree.
KW - Ball-partitioning indexing
KW - M-tree
KW - Metric access methods
KW - PM-tree
KW - Range query
KW - k-nearest neighbor query
UR - http://www.scopus.com/inward/record.url?scp=85116056637&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85116056637&partnerID=8YFLogxK
U2 - 10.1016/j.is.2021.101896
DO - 10.1016/j.is.2021.101896
M3 - Article
AN - SCOPUS:85116056637
SN - 0306-4379
VL - 104
JO - Information Systems
JF - Information Systems
M1 - 101896
ER -
