Posted to common-user@hadoop.apache.org by Andy Sautins <an...@returnpath.net> on 2008/12/09 00:25:47 UTC

internal/external interfaces for hadoop...

 

   I'm trying to set up what I think would be a common Hadoop
configuration.  I have 4 data nodes on an internal 10.x network.  Each
of the data nodes only has access to the 10.x network.  The name node
has both an internal 10.x network interface and an external interface.
I want the HDFS filesystem and the job tracker to be available on the
external network, but communication within the cluster to stay on the
10.x network.  Is this possible?  By changing the fs.default.name
configuration parameter I can move the filesystem listener from the
internal to the external interface; however, the data nodes then can't
communicate with the name node.  I also tried setting the
fs.default.name IP address to 0.0.0.0 to see if it would bind to all
interfaces, but that didn't seem to work.
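
   For reference, here is a sketch of the relevant hadoop-site.xml
settings (the 10.0.0.1 address and the ports below are placeholders,
not my real values):

    <configuration>
      <property>
        <name>fs.default.name</name>
        <!-- internal address: the datanodes can reach this, but
             external clients cannot -->
        <value>hdfs://10.0.0.1:9000</value>
        <!-- also tried hdfs://0.0.0.0:9000, hoping it would bind to
             all interfaces -->
      </property>
      <property>
        <name>mapred.job.tracker</name>
        <!-- the job tracker has the same internal/external problem -->
        <value>10.0.0.1:9001</value>
      </property>
    </configuration>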

 

   Is it possible to configure Hadoop so that the datanodes communicate
on an internal network, but access to HDFS and the job tracker happens
through an external interface?

 

   Any help would be much appreciated.

 

   Thank you

 

   Andy  


RE: internal/external interfaces for hadoop...

Posted by Andy Sautins <an...@returnpath.net>.
  Ah.  Thanks.  That makes what I was trying to do sound rather
ridiculous now, doesn't it.

  I appreciate the insight.

  Thanks

  Andy


Re: internal/external interfaces for hadoop...

Posted by Taeho Kang <tk...@gmail.com>.
When reading from or writing to a file on HDFS, data blocks never go
through the namenode.  They are transferred directly between your
client and the datanodes that hold the blocks.

Hence, the datanodes must be reachable by your client.  In this case,
since your client is on an external network, your datanodes must be
accessible from external networks.
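
To make the read path concrete, here is a minimal client-side sketch
(Java, using the Hadoop FileSystem API; the hostname and file path are
placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReadFromHdfs {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // The namenode address: used for metadata RPCs only.
            conf.set("fs.default.name", "hdfs://namenode.example.com:9000");
            FileSystem fs = FileSystem.get(conf);

            // open() asks the namenode which datanodes hold each block;
            // read() then streams the bytes straight from a datanode, so
            // this machine must be able to reach the datanodes' addresses.
            FSDataInputStream in = fs.open(new Path("/some/file"));
            byte[] buf = new byte[4096];
            int n = in.read(buf);
            System.out.println("read " + n + " bytes");
            in.close();
            fs.close();
        }
    }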

