The rapid growth of semantic data in the form of RDF triples demands scalable distributed storage and an efficient query processing engine for its management and reuse. To overcome the limitations of native RDF stores and traditional relational database management systems, and to scale with the exponential increase in the size of RDF datasets, Big Data processing infrastructure such as Hadoop with MapReduce has been used. This paper proposes using NoSQL databases such as HBase and Cassandra to store large-scale RDF data, together with in-memory data processing in Apache Spark to execute SPARQL queries as SQL queries. It presents techniques for distributed RDF data storage and querying schemes for HBase and Cassandra clusters. We also present a compiler that translates SPARQL queries into their Spark SQL equivalents for execution. Finally, we present an empirical comparison of the HBase and Cassandra systems on the Microsoft Azure cloud, using datasets and queries from the Berlin SPARQL Benchmark (BSBM) and the SPARQL Performance Benchmark (SP2Bench).
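To make the idea of running a SPARQL query as SQL concrete, the following is a minimal sketch and not the compiler described in the paper. It assumes a single three-column table named `triples(s, p, o)`, a hypothetical helper `bgp_to_sql`, and illustrative prefixed names; the storage schemes actually used for HBase and Cassandra are not given in this abstract. Each triple pattern becomes one alias of the table, constants become filters, and a variable shared between patterns becomes a join condition.

```python
from typing import Dict, List, Tuple

# A triple pattern is (subject, predicate, object); terms starting with '?'
# are variables, anything else is treated as a constant.
TriplePattern = Tuple[str, str, str]


def bgp_to_sql(select_vars: List[str],
               patterns: List[TriplePattern],
               table: str = "triples") -> str:
    """Translate a SPARQL basic graph pattern into one SQL self-join query.

    Assumption: all triples live in one table with columns s, p, o.
    """
    bindings: Dict[str, str] = {}   # variable -> first column that binds it
    conditions: List[str] = []      # constant filters and join equalities

    for i, pattern in enumerate(patterns):
        for column, term in zip(("s", "p", "o"), pattern):
            ref = f"t{i}.{column}"
            if term.startswith("?"):
                if term in bindings:
                    # Variable seen before: join on its first occurrence.
                    conditions.append(f"{ref} = {bindings[term]}")
                else:
                    bindings[term] = ref
            else:
                # Constant term: filter on it, escaping single quotes.
                escaped = term.replace("'", "''")
                conditions.append(f"{ref} = '{escaped}'")

    projection = ", ".join(f"{bindings[v]} AS {v[1:]}" for v in select_vars)
    tables = ", ".join(f"{table} t{i}" for i in range(len(patterns)))
    where = f" WHERE {' AND '.join(conditions)}" if conditions else ""
    return f"SELECT {projection} FROM {tables}{where}"


# Roughly: SELECT ?product ?label WHERE {
#   ?product rdf:type bsbm:Product . ?product rdfs:label ?label }
query = bgp_to_sql(
    ["?product", "?label"],
    [("?product", "rdf:type", "bsbm:Product"),
     ("?product", "rdfs:label", "?label")],
)
print(query)
```

In a Spark setting, a string like this could be passed to `spark.sql(...)` once a DataFrame holding the triples has been registered under the assumed name with `createOrReplaceTempView("triples")`. A real translator would also need to handle features this sketch ignores, such as `OPTIONAL`, `FILTER`, and `UNION`.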
|Original language||English (US)|
|Title of host publication||Proceedings - 2018 IEEE 19th International Conference on Information Reuse and Integration for Data Science, IRI 2018|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|Number of pages||8|
|Publication status||Published - Aug 2 2018|
|Event||19th IEEE International Conference on Information Reuse and Integration for Data Science, IRI 2018 - Salt Lake City, United States (Duration: Jul 7 2018 → Jul 9 2018)|
|City||Salt Lake City|
|Period||7/7/18 → 7/9/18|
Semantic data querying over NoSQL databases with Apache Spark. / Hassan, Mahmudul; Bansal, Srividya. Proceedings - 2018 IEEE 19th International Conference on Information Reuse and Integration for Data Science, IRI 2018. Institute of Electrical and Electronics Engineers Inc., 2018. p. 364-371 8424732.
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution