Posted to common-user@hadoop.apache.org by Sridhar Raman <sr...@gmail.com> on 2008/05/30 10:35:49 UTC

Hadoop installation folders in multiple nodes

Should the installation paths be the same on all the nodes?  Most
documentation seems to suggest that it is *recommended* to have the *same*
paths on all the nodes.  But what is the workaround if, for some reason,
one isn't able to have the same path?

That's the problem we are facing right now.  After getting Hadoop to work
perfectly in a 2-node cluster, we tried to add a 3rd machine, and realised
that this machine doesn't have an E: drive, which is where Hadoop is
installed on the other 2 nodes.  All our machines are Windows machines.
The possible solutions are:
1) Move the installations on M1 & M2 to a drive that is present on M3.  We
will keep this as the last option.
2) Map a folder on M3's D: to E:.  We used the "subst" command to do this
(the exact commands are sketched just below this list).  But when we tried
to start DFS, it wasn't able to find the Hadoop installation.  Just to
verify, we tried an ssh to localhost, and were unable to find the mapped
drive there; it's only visible as a folder on D:.  In the basic cygwin
prompt, though, we are able to see E:.
3) Partition M3's D: drive and create an E:.  This carries the risk of
data loss.
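
For reference, this is roughly what we ran on M3 (the folder name is just
an example; ours differs):

    # map drive letter E: onto an existing folder on D:
    subst E: 'D:\hadoop'

    # E: is visible in the cygwin prompt where subst was run ...
    ls /cygdrive/e

    # ... but not inside a fresh ssh session to the same machine
    ssh localhost 'ls /cygdrive/e'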

So, what should we do?  Is there any way we can tell the NameNode the
Hadoop installation path of each of the remaining nodes?  Or is there some
environment variable that can be set to make the Hadoop installation path
specific to each machine?
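
From reading bin/slaves.sh, it seems the start scripts simply run the
same command line on every node over ssh, with the master's paths baked
in, which would explain why identical paths are expected.  Here is a
paraphrased sketch of the loop (not the exact source, so we may be
misreading it):

    # run the same command verbatim on every host listed in conf/slaves
    for slave in `cat "$HADOOP_CONF_DIR/slaves"`; do
      ssh $HADOOP_SSH_OPTS $slave "$@" &
    done
    wait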

Thanks,
Sridhar

Re: Hadoop installation folders in multiple nodes

Posted by Michael Di Domenico <md...@gmail.com>.
Oops, missed the part where you already tried that.

On Mon, Jun 2, 2008 at 3:23 PM, Michael Di Domenico <md...@gmail.com>
wrote:

> Depending on your Windows version, there is a DOS command called "subst"
> which you could use to virtualize a drive letter on your third machine.

Re: Hadoop installation folders in multiple nodes

Posted by Michael Di Domenico <md...@gmail.com>.
Depending on your Windows version, there is a DOS command called "subst"
which you could use to virtualize a drive letter on your third machine.
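
Something along these lines (the folder name is just an example):

    # create a virtual drive E: backed by an existing folder on D:
    subst E: 'D:\hadoop'

    # remove the mapping later, if needed
    subst E: /D

One caveat I'm not sure about: a subst'd drive may only exist within the
logon session that created it, so services or ssh sessions started
elsewhere might not see it.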
