Posted to user@hadoop.apache.org by Sidharth Kumar <si...@gmail.com> on 2017/04/19 14:38:01 UTC

Hdfs read and write operation

Hi,

Please help me understand the following.
1) The anatomy of an HDFS read in Hadoop: The Definitive Guide says the data
queue is consumed by the streamer. Is there only one streamer in a cluster,
which consumes packets from the data queue and creates a pipeline for each
packet to store it on the datanodes, or are there multiple streamers that
consume packets from the data queue and store them on the datanodes in
parallel?
2) Several blog posts claim that read and write are parallel processes (I
have pasted one such link below). Can you also help me work out whether
they are wrong?
http://stackoverflow.com/questions/30400249/hadoop-pipeline-write-and-parallel-read

Bests
Sidharth
LinkedIn: www.linkedin.com/in/sidharthkumar2792

Re: Hdfs read and write operation

Posted by Mallanagouda Patil <ma...@gmail.com>.
1. The data queue and the streamer live in the HDFS client; they have nothing
to do with the cluster. The HDFS client writes packets to a datanode and
reads packets from a datanode.
2. A datanode allows parallel read/write operations, meaning multiple
HDFS clients can read from or write to a datanode at the same time.
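
To make point 1 concrete, here is a minimal Java sketch of a toy model, not
Hadoop's actual code. All class and method names (ToyDataNode,
ToyOutputStream, run) are hypothetical. It assumes each client output stream
owns one data queue and one streamer thread, and that a datanode forwards
each packet to the next node in the pipeline. Two clients write at the same
time, which is the kind of parallelism point 2 describes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy model only: these names are hypothetical and are not Hadoop's API.
class HdfsWriteSketch {

    // Stands in for a datanode: stores a packet, then forwards it downstream.
    static class ToyDataNode {
        private final List<String> stored = new ArrayList<>();
        private final ToyDataNode next; // null for the last node in the pipeline

        ToyDataNode(ToyDataNode next) { this.next = next; }

        void receive(String packet) {
            synchronized (stored) { stored.add(packet); }
            if (next != null) next.receive(packet); // forward along the pipeline
        }

        List<String> snapshot() {
            synchronized (stored) { return new ArrayList<>(stored); }
        }
    }

    // Stands in for the client-side stream: one data queue, one streamer thread.
    static class ToyOutputStream {
        // Unique sentinel object, compared by identity, that stops the streamer.
        private static final String EOF = new String("EOF");
        private final BlockingQueue<String> dataQueue = new LinkedBlockingQueue<>();
        private final Thread streamer;

        ToyOutputStream(ToyDataNode pipelineHead) {
            streamer = new Thread(() -> {
                try {
                    while (true) {
                        String packet = dataQueue.take(); // blocks until a packet arrives
                        if (packet == EOF) break;
                        pipelineHead.receive(packet);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            streamer.start();
        }

        void write(String packet) { dataQueue.add(packet); }

        void close() {
            dataQueue.add(EOF);
            try {
                streamer.join(); // wait until the queue has been drained
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    // Two clients write concurrently; returns what the last datanode stored.
    static List<String> run() {
        ToyDataNode dn3 = new ToyDataNode(null);
        ToyDataNode dn2 = new ToyDataNode(dn3);
        ToyDataNode dn1 = new ToyDataNode(dn2);

        ToyOutputStream clientA = new ToyOutputStream(dn1);
        ToyOutputStream clientB = new ToyOutputStream(dn1);
        for (int i = 0; i < 3; i++) {
            clientA.write("A" + i);
            clientB.write("B" + i);
        }
        clientA.close();
        clientB.close();
        return dn3.snapshot();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Within one client, packets reach the last node in the order they were
written, because that client has a single streamer. Only the interleaving
between clients A and B varies from run to run.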

Regards
Mallan





Re: Hdfs read and write operation

Posted by Sidharth Kumar <si...@gmail.com>.
Hi,

Could anyone kindly help me clear up the doubts in my original message?

Thanks
