TY - GEN
T1 - Support for data-intensive, variable-granularity Grid applications via distributed file system virtualization - A case study of Light Scattering Spectroscopy
AU - Paladugula, Jithendar
AU - Zhao, Ming
AU - Figueiredo, Renato J.
PY - 2004/12/27
Y1 - 2004/12/27
N2 - A key challenge faced by large-scale, distributed applications in Grid environments is efficient, seamless data management. In particular, for applications that can benefit from access to data at variable granularities, data management can pose additional programming burdens to an application developer. This paper presents a case for the use of virtualized distributed file systems as a basis for data management for data-intensive, variable-granularity applications. The approach leverages on-demand transfer mechanisms of existing, de facto network file system clients and servers that support transfers of partial data sets in an application-transparent fashion, and complements them with user-level performance and functionality enhancements such as caching and encrypted communication channels. The paper uses a nascent application from the medical imaging field (Light Scattering Spectroscopy - LSS) as a motivation for the approach, and as a basis for evaluating its performance. Results from performance experiments that consider the 16-processor parallel execution of LSS analysis and database generation programs show that, in the presence of data locality, a virtualized wide-area distributed file system set up and configured by Grid middleware can achieve performance levels close to those of a local disk (13% overhead or less), and superior to those of non-virtualized distributed file systems (up to 680% speedup).
AB - A key challenge faced by large-scale, distributed applications in Grid environments is efficient, seamless data management. In particular, for applications that can benefit from access to data at variable granularities, data management can pose additional programming burdens to an application developer. This paper presents a case for the use of virtualized distributed file systems as a basis for data management for data-intensive, variable-granularity applications. The approach leverages on-demand transfer mechanisms of existing, de facto network file system clients and servers that support transfers of partial data sets in an application-transparent fashion, and complements them with user-level performance and functionality enhancements such as caching and encrypted communication channels. The paper uses a nascent application from the medical imaging field (Light Scattering Spectroscopy - LSS) as a motivation for the approach, and as a basis for evaluating its performance. Results from performance experiments that consider the 16-processor parallel execution of LSS analysis and database generation programs show that, in the presence of data locality, a virtualized wide-area distributed file system set up and configured by Grid middleware can achieve performance levels close to those of a local disk (13% overhead or less), and superior to those of non-virtualized distributed file systems (up to 680% speedup).
UR - http://www.scopus.com/inward/record.url?scp=10444282125&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=10444282125&partnerID=8YFLogxK
U2 - 10.1109/CLADE.2004.1309088
DO - 10.1109/CLADE.2004.1309088
M3 - Conference contribution
AN - SCOPUS:10444282125
SN - 0769521150
SN - 9780769521152
T3 - Proceedings of the Second International Workshop on Challenges of Large Applications in Distributed Environments
SP - 12
EP - 21
BT - Proceedings of the Second International Workshop on Challenges of Large Applications in Distributed Environments
T2 - Second International Workshop on Challenges of Large Applications in Distributed Environments
Y2 - 7 June 2004 through 7 June 2004
ER -