Presto-RDF

SPARQL querying over big RDF data

Mulugeta Mammo, Srividya Bansal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

There has been a rapid increase in the amount of Resource Description Framework (RDF) data on the web. Processing large volumes of RDF data requires an efficient storage and query-processing engine that scales well with the volume of data. In the past two and a half years, however, heavy users of big data systems, such as Facebook, have noted limitations in the query performance of these systems and have begun to develop new distributed query engines for big data that do not rely on MapReduce. Facebook’s Presto is one such example. This paper proposes an architecture based on Presto, called Presto-RDF, that can be used to process big RDF data. An evaluation of the performance of Presto in processing big RDF data against Apache Hive is also presented. The experimental results show that the Presto-RDF framework performs much better than both Apache Hive and the native RDF store 4store, and that it can be used to process big RDF data.

Original language: English (US)
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 281-293
Number of pages: 13
Volume: 9093
ISBN (Print): 9783319195476
DOIs: https://doi.org/10.1007/978-3-319-19548-3_23
State: Published - 2015
Event: 26th Australasian Database Conference, ADC 2015 - Melbourne, Australia
Duration: Jun 4 2015 → Jun 7 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9093
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 26th Australasian Database Conference, ADC 2015
Country: Australia
City: Melbourne
Period: 6/4/15 → 6/7/15

Keywords

  • Database performance
  • Evaluation
  • Querying
  • Semantic web data

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Mammo, M., & Bansal, S. (2015). Presto-RDF: SPARQL querying over big RDF data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9093, pp. 281-293). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 9093). Springer Verlag. https://doi.org/10.1007/978-3-319-19548-3_23

@inproceedings{6c76b46b411b44cca2a6cd82f979ca96,
title = "Presto-RDF: SPARQL querying over big RDF data",
keywords = "Database performance, Evaluation, Querying, Semantic web data",
author = "Mulugeta Mammo and Srividya Bansal",
year = "2015",
doi = "10.1007/978-3-319-19548-3_23",
language = "English (US)",
isbn = "9783319195476",
volume = "9093",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "281--293",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - Presto-RDF

T2 - SPARQL querying over big RDF data

AU - Mammo, Mulugeta

AU - Bansal, Srividya

PY - 2015

Y1 - 2015

KW - Database performance

KW - Evaluation

KW - Querying

KW - Semantic web data

UR - http://www.scopus.com/inward/record.url?scp=84959378108&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84959378108&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-19548-3_23

DO - 10.1007/978-3-319-19548-3_23

M3 - Conference contribution

SN - 9783319195476

VL - 9093

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 281

EP - 293

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

PB - Springer Verlag

ER -