Posted to common-user@hadoop.apache.org by "S.L" <si...@gmail.com> on 2014/08/23 08:56:33 UTC

Hadoop YARN Cluster Setup Questions

Hi Folks,

I was not able to find a clear answer to this. I know that on the master
node we need to have a slaves file listing all the slave nodes, but do the
slave nodes need a masters file listing the single name node (I am not
using a secondary name node)? I only have the slaves file on the master
node.
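
For reference, my slaves file has this shape (the hostnames here are
placeholders, not my real ones):

```
# etc/hadoop/slaves on the master node: one worker hostname per line
slave1.example.com
slave2.example.com
slave3.example.com
```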

The reason I ask is that when I submit a Hadoop job, even though the input
is being split into 2 parts, only one data node is assigned applications;
the other two (I have three) are not being assigned any applications.

Thanks in advance!

Re: Hadoop YARN Cluster Setup Questions

Posted by rab ra <ra...@gmail.com>.
Hi,

1. Typically, we copy the slaves file to all the participating nodes,
though I do not have a concrete theory to back this up. At least, this is
what I was doing in Hadoop 1.2 and I am doing the same in Hadoop 2.x.

2. I think you should investigate the YARN GUI and see how many maps the
job has spawned. There is a high possibility that both maps are running in
parallel on the same node. Since there are two splits, there will be two
map processes, and one node is capable of handling more than one map.
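
If you prefer the command line over the GUI, the standard YARN CLI can show
the same information (output of course depends on your cluster; the `-all`
flag is available in newer 2.x releases and can be dropped on older ones):

```shell
# NodeManagers registered with the ResourceManager, in all states
yarn node -list -all

# Applications currently known to the ResourceManager
yarn application -list
```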

3. It could also be that the input file is small and has no replicas, and
is therefore stored as a single block on one node.
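
On point 3, a back-of-the-envelope sketch of the split arithmetic may help:
with default FileInputFormat behaviour (and ignoring min/max split
settings), a file yields roughly one split per HDFS block, so a file
smaller than one block gives a single split and a single map. The block
size below (64 MB, the old default) is an assumption; yours may be 128 MB:

```python
# Simplified sketch of how many input splits (and hence map tasks) a
# single file yields: one split per full or partial HDFS block.

def num_splits(file_size_bytes, block_size_bytes=64 * 1024 * 1024):
    """Approximate split count, ignoring min/max split size settings."""
    if file_size_bytes == 0:
        return 0
    return -(-file_size_bytes // block_size_bytes)  # ceiling division

print(num_splits(10 * 1024 * 1024))   # 10 MB file  -> 1
print(num_splits(100 * 1024 * 1024))  # 100 MB file -> 2
```

You can check how the blocks of your input are actually placed with
`hdfs fsck /path/to/input -files -blocks -locations`.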

These are a few hints that might help you.

Regards,
rab



On Sat, Aug 23, 2014 at 12:26 PM, S.L <si...@gmail.com> wrote:

> Hi Folks,
>
> I was not able to find a clear answer to this. I know that on the master
> node we need to have a slaves file listing all the slave nodes, but do the
> slave nodes need a masters file listing the single name node (I am not
> using a secondary name node)? I only have the slaves file on the master
> node.
>
> The reason I ask is that when I submit a Hadoop job, even though the input
> is being split into 2 parts, only one data node is assigned applications;
> the other two (I have three) are not being assigned any applications.
>
> Thanks in advance!
>
