TY - GEN
T1 - Database similarity join for metric spaces
AU - Silva, Yasin
AU - Pearson, Spencer S.
AU - Cheney, Jason A.
PY - 2013/10/30
Y1 - 2013/10/30
N2 - Similarity Joins are recognized among the most useful data processing and analysis operations. They retrieve all data pairs whose distances are smaller than a predefined threshold ε. While several standalone implementations have been proposed, very little work has addressed the implementation of Similarity Join as a physical database operator. In this paper, we focus on the study, design and implementation of a Similarity Join database operator for any dataset that lies in a metric space (DBSimJoin). We describe the changes in each query engine module to implement DBSimJoin and provide details of our implementation in PostgreSQL. The extensive performance evaluation shows that DBSimJoin significantly outperforms alternative approaches.
UR - http://www.scopus.com/inward/record.url?scp=84886384961&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84886384961&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-41062-8_27
DO - 10.1007/978-3-642-41062-8_27
M3 - Conference contribution
AN - SCOPUS:84886384961
SN - 9783642410611
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 266
EP - 279
BT - Similarity Search and Applications - 6th International Conference, SISAP 2013, Proceedings
T2 - 6th International Conference on Similarity Search and Applications, SISAP 2013
Y2 - 2 October 2013 through 4 October 2013
ER -