Data transmission performance analysis in cloud and grid
Main Authors: | , |
---|---|
Format: | Article |
Published: | Asian Research Publishing Network, 2015 |
Online Access: | http://psasir.upm.edu.my/id/eprint/44237/ https://www.arpnjournals.com/jeas/volume_18_2015.htm |
Summary: | The Hadoop Distributed File System (HDFS) and the MapReduce programming model are used for storing and retrieving big data. Terabyte-sized files can be stored on HDFS and analyzed with MapReduce. In recent years, HDFS has become increasingly popular as a key building block of integrated grid storage solutions in scientific computing. However, because HDFS does not support asynchronous writes, a single stream per GridFTP transfer is widely regarded as the best option for sustained high-throughput WAN transfers. GridFTP, developed as part of the Globus Toolkit, is one of the most popular protocols for data transfer in grid environments. In this paper, we take on the challenge of integrating Hadoop with the grid by proposing a new framework, Grid-over-Hadoop, which retains the features of Hadoop and uses GridFTP for data transfer. |