Posted to hdfs-user@hadoop.apache.org by "Kartashov, Andy" <An...@mpac.ca> on 2012/10/26 18:40:29 UTC

cluster set-up / a few quick questions

Gents,

1.
- do you put the master node's <hostname> under fs.default.name in core-site.xml on the slave machines, or the slaves' own hostnames?
- do you need to run "sudo -u hdfs hadoop namenode -format" and create the /tmp and /var folders on HDFS for slave machines that will only be running a DN and TT, or not? Do you still need to create the hadoop/dfs/name folder on the slaves?

2.
In hdfs-site.xml, for the dfs.name.dir & dfs.data.dir properties we specify /hadoop/dfs/name and /hadoop/dfs/data, which are local Linux (NFS) directories created by running "mkdir -p /hadoop/dfs/data",
but the mapred.system.dir property is supposed to point to HDFS and not NFS, since we create it with "sudo -u hdfs hadoop fs -mkdir /tmp/mapred/system"??
If so, and since it is exactly the same format /far/boo/baz, how does hadoop know which directory is local (NFS) and which is on HDFS?


Would you please kindly confirm?

Cheers,
AK47

Re: cluster set-up / a few quick questions

Posted by Andy Isaacson <ad...@cloudera.com>.
On Fri, Oct 26, 2012 at 11:47 AM, Kartashov, Andy
<An...@mpac.ca> wrote:
> I successfully ran a job on a cluster on foo1 in pseudo-distributed mode and am now trying a fully-distributed one.
>
> a. I created another instance foo2 on EC2.

It seems like you're trying to use the start-dfs.sh style startup
scripts to manually run a cluster on EC2.  This is doable, but it's
not very easy due to the mismatch in expectations between EC2 style
deployments and start-dfs.sh.  Setting up a manually started cluster
requires a bit of up-front work, and EC2 spin-up/spin-down cycles mean
you end up redoing that work frequently.

You might consider using Whirr (http://whirr.apache.org/) as a more
automated way of deploying Hadoop clusters on EC2.

Of course, setting up a manual cluster can be a really good way to
understand how all the parts work together, and doing it on EC2 should
work just fine.

> Installed hadoop on it and copied conf/  folder from foo1 to foo2. I created  /hadoop/dfs/data folder on the local linux system on foo2.
>
> b. on foo1 I created file conf/slaves and added:
> localhost
> <hostname-of-foo2>

I'd strongly recommend being consistent with the naming: don't mix
"localhost" and DNS names. EC2 has "ec2.internal" in /etc/resolv.conf
by default, so you can "ping ip-10-42-120-3" and it should work just
fine. Then make conf/master list your first host by name, and make
conf/slaves list all your hosts by name. Note that for small clusters,
running a DN and a NN on a single host is an acceptable compromise and
works OK.

% cat conf/master
ip-10-42-120-3
% cat conf/slaves
ip-10-42-120-3
ip-10-42-115-32
%

You also should make sure that your user account can ssh to all the nodes:
% for h in $(cat conf/slaves); do ssh -oStrictHostKeyChecking=no $h hostname; done

 - answer "yes" to any "allow untrusted certificate" messages
 - if you get "permission denied" messages you'll need to set up the
authorized_keys properly (a minimal sketch follows below).
 - after this loop succeeds you should be able to run it again and get
a clean list of hostnames.
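
If you do hit "permission denied", the key setup is roughly the following. This is only a sketch: it assumes you can already reach each slave somehow (e.g. with the EC2 keypair) and that the same user account exists on every node.

% ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
% for h in $(cat conf/slaves); do ssh $h 'mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys' < ~/.ssh/id_rsa.pub; done

After that, the hostname loop above should run without prompting for a password.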

> At this point I cannot find an answer on what to do next.
>
> I started NN, DN, SNN, JT, TT on foo1. After I ran "hadoop fsck /user/bar -files -blocks -locations", it showed the # of datanodes as 1. I was expecting the DN and TT on foo2 to be started by foo1. But it didn't happen, so I started them myself and tried the command again. Still only one DN.

You don't need to start the daemons individually, and doing so is very
difficult to get right. I virtually never do so -- I use the
start-dfs.sh script to start the daemons (NN, DN, TT, etc). The
"master" and "slaves" config files are parsed by the start-*.sh
scripts, not by the daemons themselves.  And, the daemons don't start
themselves -- for a manual cluster, the start-*.sh scripts are
responsible. (In a production deployment such as CDH, there is a
/etc/init.d script which is managed by the distro packaging to start
and manage the daemons.)
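
For reference, with a tarball-style install the whole sequence run from the master is roughly the following (a sketch -- the paths depend on where Hadoop lives on your machines):

% bin/start-dfs.sh     # starts the NN locally and a DN on every host in conf/slaves (plus the SNN)
% bin/start-mapred.sh  # starts the JT locally and a TT on every host in conf/slaves
% jps                  # repeat on each node to confirm which daemons actually came up

bin/stop-dfs.sh and bin/stop-mapred.sh undo the same thing.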

-andy

Re: cluster set-up / a few quick questions

Posted by Nitin Pawar <ni...@gmail.com>.
A few questions:

1) Have you set up passwordless ssh between both hosts for the user
who owns the hadoop processes (or root)?
2) If the answer to question 1 is yes, how did you start the NN, JT, DN and TT?
3) If you started them one by one, there is no reason why running a
command on one node would execute it on the other (see the quick check below).
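
A quick way to answer 2) and 3) for yourself is to check what is actually running on each host, e.g. (assuming the JDK's jps is on the PATH everywhere):

% for h in $(cat conf/slaves); do echo "== $h =="; ssh $h jps; done

In a layout like the one Andy described, every host in conf/slaves should show a DataNode and a TaskTracker, and only the master should additionally show the NameNode, SecondaryNameNode and JobTracker.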


On Sat, Oct 27, 2012 at 12:17 AM, Kartashov, Andy
<An...@mpac.ca> wrote:
> Andy, many thanks.
>
> I am stuck here now so please put me in the right direction.
>
> I successfully ran a job on a cluster on foo1 in pseudo-distributed mode and am now trying a fully-distributed one.
>
> a. I created another instance foo2 on EC2. Installed hadoop on it and copied conf/  folder from foo1 to foo2. I created  /hadoop/dfs/data folder on the local linux system on foo2.
>
> b. on foo1 I created file conf/slaves and added:
> localhost
> <hostname-of-foo2>
>
> At this point I cannot find an answer on what to do next.
>
> I started NN, DN, SNN, JT, TT on foo1. After I ran "hadoop fsck /user/bar -files -blocks -locations", it showed the # of datanodes as 1. I was expecting the DN and TT on foo2 to be started by foo1. But it didn't happen, so I started them myself and tried the command again. Still only one DN.
> I realise that foo2 has no data at this point, but I could not find the bin/start-balancer.sh script to help me balance data over from the DN on foo1 to foo2.
>
> What do I do next?
>
> Thanks
> AK
>
> -----Original Message-----
> From: Andy Isaacson [mailto:adi@cloudera.com]
> Sent: Friday, October 26, 2012 2:21 PM
> To: user@hadoop.apache.org
> Subject: Re: cluster set-up / a few quick questions
>
> On Fri, Oct 26, 2012 at 9:40 AM, Kartashov, Andy <An...@mpac.ca> wrote:
>> Gents,
>
> We're not all male here. :)  I prefer "Hadoopers" or "hi all,".
>
>> 1.
>> - do you put Master's node <hostname> under fs.default.name in core-site.xml on the slave machines or slaves' hostnames?
>
> Master.  I have a 4-node cluster, named foo1 - foo4. My fs.default.name is hdfs://foo1.domain.com.
>
>> - do you need to run "sudo -u hdfs hadoop namenode -format" and create /tmp /var folders on the HDFS of the slave machines that will be running only DN and TT or not? Do you still need to create hadoop/dfs/name folder on the slaves?
>
> (The following is the simple answer, for non-HA non-federated HDFS.
> You'll want to get the simple example working before trying the complicated ones.)
>
> No. A cluster has one namenode, running on the machine known as the master, and the admin must "hadoop namenode -format" on that machine only.
>
> In my example, I ran "hadoop namenode -format" on foo1.
>
>> 2.
>> In hdfs-site.xml for dfs.name.dir & dfs.data.dir properties  we specify  /hadoop/dfs/name /hadoop/dfs/data  being  local linux NFS directories by running command "mkdir -p /hadoop/dfs/data"
>> but mapred.system.dir  property is to point to HDFS and not NFS  since we are running "sudo -u hdfs hadoop fs -mkdir /tmp/mapred/system"??
>> If so and since it is exactly the same format  /far/boo/baz how does hadoop know which directory is local on NFS or HDFS?
>
> This is very confusing, to be sure!  There are a few places where paths are implicitly known to be on HDFS rather than a Linux filesystem path. mapred.system.dir is one of those. This does mean that given a string that starts with "/tmp/" you can't necessarily know whether it's a Linux path or a HDFS path without looking at the larger context.
>
> In the case of mapred.system.dir, the docs are the place to check; according to cluster_setup.html, mapred.system.dir is "Path on the HDFS where the Map/Reduce framework stores system files".
>
> http://hadoop.apache.org/docs/r1.0.3/cluster_setup.html
>
> Hope this helps,
> -andy



-- 
Nitin Pawar


RE: cluster set-up / a few quick questions

Posted by "Kartashov, Andy" <An...@mpac.ca>.
Andy, many thanks.

I am stuck here now so please put me in the right direction.

I successfully ran a job on a cluster on foo1 in pseudo-distributed mode and am now trying a fully-distributed one.

a. I created another instance foo2 on EC2. Installed hadoop on it and copied conf/  folder from foo1 to foo2. I created  /hadoop/dfs/data folder on the local linux system on foo2.

b. on foo1 I created file conf/slaves and added:
localhost
<hostname-of-foo2>

At this point I cannot find an answer on what to do next.

I started NN, DN, SNN, JT, TT on foo1. After I ran "hadoop fsck /user/bar -files -blocks -locations", it showed the # of datanodes as 1. I was expecting the DN and TT on foo2 to be started by foo1. But it didn't happen, so I started them myself and tried the command again. Still only one DN.
I realise that foo2 has no data at this point, but I could not find the bin/start-balancer.sh script to help me balance data over from the DN on foo1 to foo2.

What do I do next?

Thanks
AK

-----Original Message-----
From: Andy Isaacson [mailto:adi@cloudera.com]
Sent: Friday, October 26, 2012 2:21 PM
To: user@hadoop.apache.org
Subject: Re: cluster set-up / a few quick questions

On Fri, Oct 26, 2012 at 9:40 AM, Kartashov, Andy <An...@mpac.ca> wrote:
> Gents,

We're not all male here. :)  I prefer "Hadoopers" or "hi all,".

> 1.
> - do you put Master's node <hostname> under fs.default.name in core-site.xml on the slave machines or slaves' hostnames?

Master.  I have a 4-node cluster, named foo1 - foo4. My fs.default.name is hdfs://foo1.domain.com.

> - do you need to run "sudo -u hdfs hadoop namenode -format" and create /tmp /var folders on the HDFS of the slave machines that will be running only DN and TT or not? Do you still need to create hadoop/dfs/name folder on the slaves?

(The following is the simple answer, for non-HA non-federated HDFS.
You'll want to get the simple example working before trying the complicated ones.)

No. A cluster has one namenode, running on the machine known as the master, and the admin must "hadoop namenode -format" on that machine only.

In my example, I ran "hadoop namenode -format" on foo1.

> 2.
> In hdfs-site.xml for dfs.name.dir & dfs.data.dir properties  we specify  /hadoop/dfs/name /hadoop/dfs/data  being  local linux NFS directories by running command "mkdir -p /hadoop/dfs/data"
> but mapred.system.dir  property is to point to HDFS and not NFS  since we are running "sudo -u hdfs hadoop fs -mkdir /tmp/mapred/system"??
> If so and since it is exactly the same format  /far/boo/baz how does hadoop know which directory is local on NFS or HDFS?

This is very confusing, to be sure!  There are a few places where paths are implicitly known to be on HDFS rather than a Linux filesystem path. mapred.system.dir is one of those. This does mean that given a string that starts with "/tmp/" you can't necessarily know whether it's a Linux path or a HDFS path without looking at the larger context.

In the case of mapred.system.dir, the docs are the place to check; according to cluster_setup.html, mapred.system.dir is "Path on the HDFS where the Map/Reduce framework stores system files".

http://hadoop.apache.org/docs/r1.0.3/cluster_setup.html

Hope this helps,
-andy


Re: cluster set-up / a few quick questions

Posted by Andy Isaacson <ad...@cloudera.com>.
On Fri, Oct 26, 2012 at 9:40 AM, Kartashov, Andy <An...@mpac.ca> wrote:
> Gents,

We're not all male here. :)  I prefer "Hadoopers" or "hi all,".

> 1.
> - do you put Master's node <hostname> under fs.default.name in core-site.xml on the slave machines or slaves' hostnames?

Master.  I have a 4-node cluster, named foo1 - foo4. My
fs.default.name is hdfs://foo1.domain.com.
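
For example, the relevant snippet of core-site.xml is the same on every node, master and slaves alike (foo1.domain.com here just stands in for whatever your master is called; add an explicit port such as :8020 if you prefer):

% cat conf/core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://foo1.domain.com</value>
  </property>
</configuration>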

> - do you need to run "sudo -u hdfs hadoop namenode -format" and create /tmp /var folders on the HDFS of the slave machines that will be running only DN and TT or not? Do you still need to create hadoop/dfs/name folder on the slaves?

(The following is the simple answer, for non-HA non-federated HDFS.
You'll want to get the simple example working before trying the
complicated ones.)

No. A cluster has one namenode, running on the machine known as the
master, and the admin must "hadoop namenode -format" on that machine
only.

In my example, I ran "hadoop namenode -format" on foo1.
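
Spelled out, the one-time setup looks something like this (just a sketch, using the /hadoop/dfs paths from your question and the hdfs system user the CDH packages create -- adjust to your layout):

On foo1 (the master):
% sudo mkdir -p /hadoop/dfs/name /hadoop/dfs/data
% sudo chown -R hdfs:hdfs /hadoop/dfs
% sudo -u hdfs hadoop namenode -format

On the slaves, only the data directory is needed:
% sudo mkdir -p /hadoop/dfs/data
% sudo chown -R hdfs:hdfs /hadoop/dfs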

> 2.
> In hdfs-site.xml for dfs.name.dir & dfs.data.dir properties  we specify  /hadoop/dfs/name /hadoop/dfs/data  being  local linux NFS directories by running command "mkdir -p /hadoop/dfs/data"
> but mapred.system.dir  property is to point to HDFS and not NFS  since we are running "sudo -u hdfs hadoop fs -mkdir /tmp/mapred/system"??
> If so and since it is exactly the same format  /far/boo/baz how does hadoop know which directory is local on NFS or HDFS?

This is very confusing, to be sure!  There are a few places where
paths are implicitly known to be on HDFS rather than a Linux
filesystem path. mapred.system.dir is one of those. This does mean
that given a string that starts with "/tmp/" you can't necessarily
know whether it's a Linux path or a HDFS path without looking at the
larger context.

In the case of mapred.system.dir, the docs are the place to check;
according to cluster_setup.html, mapred.system.dir is "Path on the
HDFS where the Map/Reduce framework stores system files".

http://hadoop.apache.org/docs/r1.0.3/cluster_setup.html
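
Putting it together, a sketch of the mapred.system.dir side (the path comes from your question; the mapred owner is how the CDH packages arrange it, so adjust if your JobTracker runs as a different user): in mapred-site.xml you set

<property>
  <name>mapred.system.dir</name>
  <value>/tmp/mapred/system</value>
</property>

and the directory itself is created with the HDFS shell, not mkdir:

% sudo -u hdfs hadoop fs -mkdir /tmp/mapred/system
% sudo -u hdfs hadoop fs -chown mapred /tmp/mapred/system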

Hope this helps,
-andy
