Posted to common-user@hadoop.apache.org by viral shah <vi...@gmail.com> on 2010/02/17 11:59:46 UTC

Issue with Hadoop cluster on Amazon ec2

Hi,

We have deployed a Hadoop cluster on EC2, running Hadoop version 0.20.1.
We have a couple of data nodes.
We want to fetch some files from the data nodes on the Amazon EC2 instances
to our local machine using a Java application, which in turn uses
SequenceFile.Reader to read the files.
The problem is that Amazon uses private IPs for communication between hosts,
but to connect from an environment outside Amazon we have to use the public IP.
So when we try to connect to the data nodes via the name node, it reports
the data node's private IP, and using that address we cannot reach the data
node.
Is there any way to configure the name node to send the data nodes' public NAT
IP instead of the internal IP, or is there any other workaround for this problem?

Thanks
Viral.

Re: Issue with Hadoop cluster on Amazon ec2

Posted by Steve Loughran <st...@apache.org>.
viral shah wrote:
> Hi,
> 
> We have deployed a Hadoop cluster on EC2, running Hadoop version 0.20.1.
> We have a couple of data nodes.
> We want to fetch some files from the data nodes on the Amazon EC2 instances
> to our local machine using a Java application, which in turn uses
> SequenceFile.Reader to read the files.
> The problem is that Amazon uses private IPs for communication between hosts,
> but to connect from an environment outside Amazon we have to use the public IP.
> So when we try to connect to the data nodes via the name node, it reports
> the data node's private IP, and using that address we cannot reach the data
> node.

That's a feature to stop you accidentally exporting your entire HDFS 
filesystem to the rest of the world.
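
As a rough illustration of why the reported address fails from outside EC2:
the private address is typically in an RFC 1918 range, which is not routable
over the public internet. The sketch below uses only the standard
java.net.InetAddress API to classify an address. The class name, the helper
isPrivate, and both sample addresses are made up for the example; they are not
taken from the cluster discussed here.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class PrivateAddressCheck {

    // Hypothetical helper: true if the literal IP is site-local
    // (for IPv4: 10/8, 172.16/12 or 192.168/16).
    static boolean isPrivate(String ipLiteral) {
        try {
            // For a literal IP address, getByName only checks the format;
            // it does not perform a DNS lookup.
            InetAddress addr = InetAddress.getByName(ipLiteral);
            return addr.isSiteLocalAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid IP literal: " + ipLiteral, e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample addresses: one private, one from a documentation range.
        String[] samples = {"10.251.27.5", "203.0.113.10"};
        for (String ip : samples) {
            System.out.println(ip + " private? " + isPrivate(ip));
        }
    }
}
```

A client outside EC2 that is handed an address for which isPrivate returns
true has no route to it, whatever the name node's public address is.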

> Is there any way to configure the name node to send the data nodes' public NAT
> IP instead of the internal IP, or is there any other workaround for this problem?

- Push the data up to the S3 filestore first, then have the job sequence
start from S3 and finish there too.