
Best Paper Award

How GridFTP pipelining, parallelism and concurrency work: A guide for optimizing large dataset transfers [slides]
Esma Yildirim, Fatih University, Turkey
Jangyoung Kim, University at Buffalo, USA
Tevfik Kosar, University at Buffalo, USA

Abstract: Optimizing the transfer of large files over high-bandwidth networks is a challenging task that requires the consideration of many parameters (e.g., network speed, round-trip time (RTT), and current traffic). Unfortunately, this task becomes even more complex when transferring datasets composed of many small files. In this case, the performance of large dataset transfers depends not only on the characteristics of the transfer protocol and the network, but also on the number and size distribution of the files that constitute the dataset. GridFTP is the most advanced transfer tool providing functions to overcome large dataset transfer bottlenecks. Three of its most important functions are pipelining, parallelism and concurrency. In this study, we investigate the effects of these three crucial functions, provide models for optimizing these parameters, define guidelines, and present an algorithm for their practical use in transferring large datasets of varying file sizes.
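
The three functions studied in the paper map directly onto options of the GridFTP command-line client. The following is a minimal illustrative sketch (not taken from the paper) of driving a transfer with pipelining, parallelism and concurrency via globus-url-copy; the endpoint URLs and default parameter values are placeholder assumptions.

```python
#!/usr/bin/env python3
"""Sketch: a GridFTP transfer using pipelining (-pp), parallelism (-p)
and concurrency (-cc) through globus-url-copy (Globus Toolkit).
Endpoints and default values below are illustrative assumptions."""
import subprocess


def transfer(src_url: str, dst_url: str,
             parallel_streams: int = 4,   # -p: TCP streams per file (large files)
             concurrency: int = 2,        # -cc: files transferred simultaneously
             pipelining: bool = True) -> None:  # -pp: back-to-back transfer commands
    cmd = ["globus-url-copy", "-fast"]    # reuse data channels between files
    if pipelining:
        cmd.append("-pp")                 # hides per-file round-trip latency
    cmd += ["-p", str(parallel_streams),
            "-cc", str(concurrency),
            src_url, dst_url]
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder endpoints; replace with real GridFTP servers.
    transfer("gsiftp://source.example.org/data/",
             "gsiftp://dest.example.org/data/")
```

Roughly, parallelism helps individual large files fill the available bandwidth, while concurrency and pipelining help datasets of many small files, which is the trade-off the paper's models and guidelines aim to optimize.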