TY - GEN
T1 - PlinyCompute
T2 - 44th ACM SIGMOD International Conference on Management of Data, SIGMOD 2018
AU - Zou, Jia
AU - Barnett, R. Matthew
AU - Lorido-Botran, Tania
AU - Luo, Shangyu
AU - Monroy, Carlos
AU - Sikdar, Sourav
AU - Teymourian, Kia
AU - Yuan, Binhang
AU - Jermaine, Chris
N1 - Publisher Copyright: © 2018 Association for Computing Machinery.
PY - 2018/5/27
Y1 - 2018/5/27
AB - This paper describes PlinyCompute, a system for the development of high-performance, data-intensive, distributed computing tools and libraries. In the large, PlinyCompute presents the programmer with a very high-level, declarative interface, relying on automatic, relational-database-style optimization to figure out how to stage distributed computations. In the small, however, PlinyCompute presents the capable systems programmer with a persistent object data model and API (the "PC object model") and an associated memory management system that has been designed from the ground up for high-performance, distributed, data-intensive computing. This contrasts with most other Big Data systems, which are constructed on top of the Java Virtual Machine (JVM) and hence must at least partially cede performance-critical concerns such as memory management (including layout and allocation/deallocation) and virtual method/function dispatch to the JVM. This hybrid approach (declarative in the large, trusting the programmer's ability to use the PC object model efficiently in the small) results in a system that is ideal for the development of reusable, data-intensive tools and libraries.
KW - Distributed computing
KW - Object model
KW - Query compilation
UR - http://www.scopus.com/inward/record.url?scp=85048760137&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85048760137&partnerID=8YFLogxK
U2 - 10.1145/3183713.3196933
DO - 10.1145/3183713.3196933
M3 - Conference contribution
AN - SCOPUS:85048760137
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 1189
EP - 1204
BT - SIGMOD 2018 - Proceedings of the 2018 International Conference on Management of Data
A2 - Das, Gautam
A2 - Jermaine, Christopher
A2 - Eldawy, Ahmed
A2 - Bernstein, Philip
PB - Association for Computing Machinery
Y2 - 10 June 2018 through 15 June 2018
ER -