Posted to mapreduce-user@hadoop.apache.org by "bit1129@163.com" <bi...@163.com> on 2014/12/19 04:48:43 UTC

Question about the behavior of HDFS.

Hi Hadoopers,

I have a question about the behavior of HDFS.

Say there is 1 namenode and 10 data nodes.

On the namenode machine, I upload a 1 GB file to HDFS. Will this 1 GB file be distributed evenly across the data nodes, with no data stored on the namenode?
If I upload the data from a data node, will the file still be distributed evenly across all the data nodes? I think that if most of the data resides on the node I upload from, it will save network traffic, but this leads to another problem: when running MapReduce on this file, most of the time will be spent on this node because it has to process most of the data.



bit1129@163.com

Re: Re: Question about the behavior of HDFS.

Posted by "bit1129@163.com" <bi...@163.com>.
Thanks Shashwat, but I don't think the paper answers the question. :-)



bit1129@163.com
 
From: shashwat shriparv
Date: 2014-12-19 12:32
To: fireflyhoo@gmail.com
CC: user; bit1129
Subject: Re: Re: Question about the behavior of HDFS.

Its opening for me any how i am attaching the document for you... :)



On Fri, Dec 19, 2014 at 9:31 AM, fireflyhoo@gmail.com <fi...@gmail.com> wrote:
> http://hadoop.apache.org/docs/r0.18.0/hdfs_design.pdf




Warm Regards_∞_
Shashwat Shriparv

1 attachment: hdfs_design.pdf (91K)

Re: Re: Question about the behavior of HDFS.

Posted by shashwat shriparv <dw...@gmail.com>.
It's opening for me. Anyhow, I am attaching the document for you... :)



On Fri, Dec 19, 2014 at 9:31 AM, fireflyhoo@gmail.com <fi...@gmail.com>
wrote:
>
> http://hadoop.apache.org/docs/r0.18.0/hdfs_design.pdf





*Warm Regards_**∞_*
* Shashwat Shriparv*

Re: Re: Question about the behavior of HDFS.

Posted by "fireflyhoo@gmail.com" <fi...@gmail.com>.
Please read this once ..

http://hadoop.apache.org/docs/r0.18.0/hdfs_design.pdf


*Warm Regards_**∞_*
* Shashwat Shriparv*


On Fri, Dec 19, 2014 at 9:18 AM, bit1129@163.com <bi...@163.com> wrote:
>
> Hi Hadoopers,
>
> I got a question about the behavior of HDFS.
>
> Say, there are 1 namenode and 10 data nodes.
>
> On the namenode machine, i upload a 1G file to HDFS. Will this 1G file be
> distributed evenly  to the data nodes, and there is no data stored on the
> namenode?
> If I upload the the data from the data node, will the file still distributed
> evenly to all the data nodes ? I think if most of the data reside on the
> node that i upload the data, it will save the network, but this leads to
> another problem, when MR this file,
> most of time will be spent on this node because it has to process most of
> the data.
>
> ------------------------------
> bit1129@163.com
>

Re: Question about the behavior of HDFS.

Posted by shashwat shriparv <dw...@gmail.com>.
Please read this once ..

http://hadoop.apache.org/docs/r0.18.0/hdfs_design.pdf


*Warm Regards_**∞_*
* Shashwat Shriparv*


On Fri, Dec 19, 2014 at 9:18 AM, bit1129@163.com <bi...@163.com> wrote:
>
> Hi Hadoopers,
>
> I got a question about the behavior of HDFS.
>
> Say, there are 1 namenode and 10 data nodes.
>
> On the namenode machine, i upload a 1G file to HDFS. Will this 1G file be
> distributed evenly  to the data nodes, and there is no data stored on the
> namenode?
> If I upload the the data from the data node, will the file still distributed
> evenly to all the data nodes ? I think if most of the data reside on the
> node that i upload the data, it will save the network, but this leads to
> another problem, when MR this file,
> most of time will be spent on this node because it has to process most of
> the data.
>
> ------------------------------
> bit1129@163.com
>

Re: RE: Question about the behavior of HDFS.

Posted by "bit1129@163.com" <bi...@163.com>.
Thanks Natarajan!



bit1129@163.com
 
From: Natarajan, Prabakaran 1. (NSN - IN/Bangalore)
Date: 2014-12-19 12:22
To: user@hadoop.apache.org
Subject: RE: Question about the behavior of HDFS.
Where ever you upload, it upload evenly to all machines.  Namenode will not have data but has only the metadata
 
From: ext bit1129@163.com [mailto:bit1129@163.com] 
Sent: Friday, December 19, 2014 9:19 AM
To: user
Subject: Question about the behavior of HDFS.
 
Hi Hadoopers,
 
I got a question about the behavior of HDFS.
 
Say, there are 1 namenode and 10 data nodes. 
 
On the namenode machine, i upload a 1G file to HDFS. Will this 1G file be distributed evenly  to the data nodes, and there is no data stored on the namenode? 
If I upload the the data from the data node, will the file still distributed evenly to all the data nodes ? I think if most of the data reside on the node that i upload the data, it will save the network, but this leads to another problem, when MR this file, 
most of time will be spent on this node because it has to process most of the data. 
 


bit1129@163.com

RE: Question about the behavior of HDFS.

Posted by "Natarajan, Prabakaran 1. (NSN - IN/Bangalore)" <pr...@nsn.com>.
Wherever you upload from, the file is distributed across the data nodes. The namenode will not hold any data, only the metadata.
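
To make this concrete, here is a rough sketch in Python of how block placement could be modeled for the 1 namenode / 10 data node case. It is not HDFS code. It assumes a 128 MB block size and a replication factor of 3 (both are configurable, via dfs.blocksize and dfs.replication), and it ignores rack awareness, which the real default placement policy takes into account. The names place_file and replicas_per_node are made up for this illustration. The one rule it models is that the first replica of each block goes to the writer's own node when the writer is a data node, and to some other data node otherwise.

```python
import math
import random
from collections import Counter

BLOCK_SIZE_MB = 128  # assumption: dfs.blocksize is configurable
REPLICATION = 3      # assumption: dfs.replication is configurable


def place_file(file_size_mb, datanodes, writer=None, rng=None):
    """Return one list of replica holders per block (hypothetical helper).

    Simplified, rack-unaware model: the first replica goes to the writer
    when the writer is itself a data node, otherwise to a random data node.
    The remaining replicas go to distinct other data nodes chosen at random.
    """
    rng = rng or random.Random()
    num_blocks = math.ceil(file_size_mb / BLOCK_SIZE_MB)
    placements = []
    for _ in range(num_blocks):
        first = writer if writer in datanodes else rng.choice(datanodes)
        candidates = [d for d in datanodes if d != first]
        others = rng.sample(candidates, REPLICATION - 1)
        placements.append([first] + others)
    return placements


def replicas_per_node(placements):
    """Count how many block replicas each node holds (hypothetical helper)."""
    return Counter(node for block in placements for node in block)


if __name__ == "__main__":
    nodes = [f"dn{i}" for i in range(1, 11)]
    rng = random.Random(0)

    # Upload from the namenode host, which runs no data node in this model.
    from_namenode = place_file(1024, nodes, writer="namenode", rng=rng)
    print("from namenode:", dict(replicas_per_node(from_namenode)))

    # Upload from a data node: dn1 receives one replica of every block.
    from_dn1 = place_file(1024, nodes, writer="dn1", rng=rng)
    print("from dn1:     ", dict(replicas_per_node(from_dn1)))
```

Under these assumptions a 1 GB file is split into 8 blocks with 24 replicas in total. An upload from the namenode host leaves no block data on it, while an upload from dn1 puts one replica of every block on dn1. The other two replicas of each block still land on other nodes, so map tasks are not necessarily tied to dn1.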

From: ext bit1129@163.com [mailto:bit1129@163.com]
Sent: Friday, December 19, 2014 9:19 AM
To: user
Subject: Question about the behavior of HDFS.

Hi Hadoopers,

I got a question about the behavior of HDFS.

Say, there are 1 namenode and 10 data nodes.

On the namenode machine, i upload a 1G file to HDFS. Will this 1G file be distributed evenly  to the data nodes, and there is no data stored on the namenode?
If I upload the the data from the data node, will the file still distributed evenly to all the data nodes ? I think if most of the data reside on the node that i upload the data, it will save the network, but this leads to another problem, when MR this file,
most of time will be spent on this node because it has to process most of the data.

________________________________
bit1129@163.com
