Posted to common-user@hadoop.apache.org by Elazar Leibovich <el...@gmail.com> on 2013/07/24 21:51:33 UTC

Why Hadoop force using DNS?

Looking at the Hadoop source, you can see that Hadoop relies on each node
having a resolvable name.

For example, the Hadoop 2 namenode does a reverse lookup of each node that
connects to it. Also, there's no way to tell a datanode to advertise an
IP as its address. Setting datanode.network.interface to, say, eth1, would
cause Hadoop to reverse-lookup the IPs on eth1 and advertise the result.

Why is that? Using plain IPs is simple to set up, and I can't see a reason
not to support them.
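
For reference, the kind of reverse lookup described above can be reproduced with the JDK alone. This is a minimal sketch, not Hadoop's actual code path: the class name ReverseLookupSketch and the default address 127.0.0.1 are illustrative assumptions. InetAddress.getCanonicalHostName() asks the system resolver (DNS, /etc/hosts, or whatever the host is configured to use) and falls back to the textual IP when no name is found.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class ReverseLookupSketch {
    // Returns the name the system resolver reports for an IP literal, or the
    // IP text itself when no reverse mapping is available.
    static String reverse(String ipLiteral) {
        try {
            // For a literal IP, getByName only parses the address.
            InetAddress addr = InetAddress.getByName(ipLiteral);
            // The reverse lookup happens here.
            return addr.getCanonicalHostName();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a usable address: " + ipLiteral, e);
        }
    }

    public static void main(String[] args) {
        String ip = args.length > 0 ? args[0] : "127.0.0.1";
        System.out.println(ip + " -> " + reverse(ip));
    }
}
```

Running it against an address with no reverse mapping prints the IP on both sides of the arrow, which is the case the question is about.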

Re: Why Hadoop force using DNS?

Posted by Elazar Leibovich <el...@gmail.com>.
This is a reason to force reverse resolution of IPs if they do not appear
in dfs.allow. If the IP appears in dfs.allow, there's no reason to
reverse-resolve it.


On Mon, Jul 29, 2013 at 4:48 PM, Daryn Sharp <da...@yahoo-inc.com> wrote:

>  One reason is that the lists used to accept or reject DNs accept
> hostnames.  If DNS temporarily can't resolve an IP, then an unauthorized
> DN might slip back into the cluster, or a decommissioning node might go
> back into service.
>
>  Daryn
>
>
>  On Jul 29, 2013, at 8:21 AM, 武泽胜 wrote:
>
>  I have the same confusion; a reply from anyone would be much
> appreciated.
>
>   From: Elazar Leibovich <el...@gmail.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Thursday, July 25, 2013 3:51 AM
> To: user <us...@hadoop.apache.org>
> Subject: Why Hadoop force using DNS?
>
>   Looking at the Hadoop source, you can see that Hadoop relies on each
> node having a resolvable name.
>
>  For example, the Hadoop 2 namenode does a reverse lookup of each node that
> connects to it. Also, there's no way to tell a datanode to advertise an
> IP as its address. Setting datanode.network.interface to, say, eth1, would
> cause Hadoop to reverse-lookup the IPs on eth1 and advertise the result.
>
>  Why is that? Using plain IPs is simple to set up, and I can't see a
> reason not to support them.
>
>
>

Re: Why Hadoop force using DNS?

Posted by Daryn Sharp <da...@yahoo-inc.com>.
One reason is that the lists used to accept or reject DNs accept hostnames.  If DNS temporarily can't resolve an IP, then an unauthorized DN might slip back into the cluster, or a decommissioning node might go back into service.

Daryn
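
The failure mode described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the namenode's real logic: the class ExcludeListSketch, the host dn7.example.com, and the address 10.0.0.7 are hypothetical, and the resolver is passed in as a function so that a DNS outage can be simulated without touching the network.

```java
import java.util.Collections;
import java.util.Set;
import java.util.function.Function;

public class ExcludeListSketch {
    // Hypothetical check: a node is rejected when the name its IP resolves to
    // is on the exclude list.
    static boolean isExcluded(String ip, Set<String> excludedHosts,
                              Function<String, String> reverseResolver) {
        String name = reverseResolver.apply(ip);
        return excludedHosts.contains(name);
    }

    public static void main(String[] args) {
        Set<String> excluded = Collections.singleton("dn7.example.com");

        // Healthy resolver: the IP maps back to the excluded hostname.
        Function<String, String> healthy =
                ip -> ip.equals("10.0.0.7") ? "dn7.example.com" : ip;
        // Failing resolver: falls back to the IP text, as
        // InetAddress.getCanonicalHostName() does when no name is found.
        Function<String, String> failing = ip -> ip;

        System.out.println(isExcluded("10.0.0.7", excluded, healthy)); // true
        System.out.println(isExcluded("10.0.0.7", excluded, failing)); // false
    }
}
```

With the failing resolver the excluded node is no longer recognized, because the list holds a name and the lookup only produced an IP.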

On Jul 29, 2013, at 8:21 AM, 武泽胜 wrote:

I have the same confusion; a reply from anyone would be much appreciated.

From: Elazar Leibovich <el...@gmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Thursday, July 25, 2013 3:51 AM
To: user <us...@hadoop.apache.org>
Subject: Why Hadoop force using DNS?

Looking at the Hadoop source, you can see that Hadoop relies on each node having a resolvable name.

For example, the Hadoop 2 namenode does a reverse lookup of each node that connects to it. Also, there's no way to tell a datanode to advertise an IP as its address. Setting datanode.network.interface to, say, eth1, would cause Hadoop to reverse-lookup the IPs on eth1 and advertise the result.

Why is that? Using plain IPs is simple to set up, and I can't see a reason not to support them.


Re: Why Hadoop force using DNS?

Posted by Elazar Leibovich <el...@gmail.com>.
Ease of use is a reason to support names, not to intentionally disallow raw
IPs. Not using names is convenient if you want to set up a temporary cluster
on a group of machines you don't own.

You have user access, but name resolution is not always set up, and as an
unprivileged user you cannot change /etc/hosts.
On Jul 29, 2013 5:46 PM, "Chris Embree" <ce...@gmail.com> wrote:

> Just for clarity, DNS as a service is NOT required.  Name resolution is.
> I use /etc/hosts files to identify all nodes in my clusters.
>
> One of the reasons for using names over IPs is ease of use.  I would much
> rather use a hostname in my XML to identify the NN, JT, etc. than some
> random string of numbers.
>
>
>
>
> On Mon, Jul 29, 2013 at 10:40 AM, Greg Bledsoe <gr...@personal.com> wrote:
>
>> I can third this concern.  What purpose does this complexity-increasing
>> requirement serve?  Why not remove it?
>>
>> Greg Bledsoe
>>
>> From: 武泽胜 <wu...@xiaomi.com>
>> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Date: Mon, 29 Jul 2013 08:21:51 -0500
>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Subject: Re: Why Hadoop force using DNS?
>>
>> I have the same confusion; a reply from anyone would be much
>> appreciated.
>>
>> From: Elazar Leibovich <el...@gmail.com>
>> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Date: Thursday, July 25, 2013 3:51 AM
>> To: user <us...@hadoop.apache.org>
>> Subject: Why Hadoop force using DNS?
>>
>> Looking at the Hadoop source, you can see that Hadoop relies on each
>> node having a resolvable name.
>>
>> For example, the Hadoop 2 namenode does a reverse lookup of each node that
>> connects to it. Also, there's no way to tell a datanode to advertise an
>> IP as its address. Setting datanode.network.interface to, say, eth1, would
>> cause Hadoop to reverse-lookup the IPs on eth1 and advertise the result.
>>
>> Why is that? Using plain IPs is simple to set up, and I can't see a
>> reason not to support them.
>>
>
>

Re: Why Hadoop force using DNS?

Posted by Greg Bledsoe <gr...@personal.com>.
But even if you have permission to change /etc/hosts, /etc/hosts resolution seems to make the reverse lookup unstable, leading to unpredictable results.  DNS still gets used, and if it doesn't match your /etc/hosts file, you have problems.  Or am I missing something?

Greg
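
A rough way to picture the mismatch: the sketch below compares a forward mapping (standing in for /etc/hosts) with a reverse mapping (standing in for DNS). Everything in it is a hypothetical illustration; the class ResolutionConsistencySketch, the name node1, and the addresses are assumptions, and plain maps replace the real resolvers so the behavior is repeatable.

```java
import java.util.HashMap;
import java.util.Map;

public class ResolutionConsistencySketch {
    // True when the forward lookup of a name and the reverse lookup of the
    // resulting IP agree with each other.
    static boolean isConsistent(String host,
                                Map<String, String> forward,
                                Map<String, String> reverse) {
        String ip = forward.get(host);
        if (ip == null) {
            return false; // the name does not resolve at all
        }
        return host.equals(reverse.get(ip));
    }

    public static void main(String[] args) {
        // Forward mapping, as an /etc/hosts entry might define it.
        Map<String, String> hostsFile = new HashMap<>();
        hostsFile.put("node1", "10.0.0.5");

        // Reverse mapping, as a DNS server might answer it.
        Map<String, String> dnsReverse = new HashMap<>();
        dnsReverse.put("10.0.0.5", "ip-10-0-0-5.internal");
        System.out.println(isConsistent("node1", hostsFile, dnsReverse)); // false

        // Once both sources agree, the check passes.
        dnsReverse.put("10.0.0.5", "node1");
        System.out.println(isConsistent("node1", hostsFile, dnsReverse)); // true
    }
}
```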

From: Chris Embree <ce...@gmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>, "chris@embree.us" <ch...@embree.us>
Date: Mon, 29 Jul 2013 09:45:22 -0500
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: Why Hadoop force using DNS?

Just for clarity, DNS as a service is NOT required.  Name resolution is.  I use /etc/hosts files to identify all nodes in my clusters.

One of the reasons for using names over IPs is ease of use.  I would much rather use a hostname in my XML to identify the NN, JT, etc. than some random string of numbers.




On Mon, Jul 29, 2013 at 10:40 AM, Greg Bledsoe <gr...@personal.com> wrote:
I can third this concern.  What purpose does this complexity-increasing requirement serve?  Why not remove it?

Greg Bledsoe

From: 武泽胜 <wu...@xiaomi.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Mon, 29 Jul 2013 08:21:51 -0500
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: Why Hadoop force using DNS?

I have the same confusion; a reply from anyone would be much appreciated.

From: Elazar Leibovich <el...@gmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Thursday, July 25, 2013 3:51 AM
To: user <us...@hadoop.apache.org>
Subject: Why Hadoop force using DNS?

Looking at the Hadoop source, you can see that Hadoop relies on each node having a resolvable name.

For example, the Hadoop 2 namenode does a reverse lookup of each node that connects to it. Also, there's no way to tell a datanode to advertise an IP as its address. Setting datanode.network.interface to, say, eth1, would cause Hadoop to reverse-lookup the IPs on eth1 and advertise the result.

Why is that? Using plain IPs is simple to set up, and I can't see a reason not to support them.


Re: Why Hadoop force using DNS?

Posted by Chris Embree <ce...@gmail.com>.
Just for clarity, DNS as a service is NOT required.  Name resolution is.
I use /etc/hosts files to identify all nodes in my clusters.

One of the reasons for using names over IPs is ease of use.  I would much
rather use a hostname in my XML to identify the NN, JT, etc. than some
random string of numbers.




On Mon, Jul 29, 2013 at 10:40 AM, Greg Bledsoe <gr...@personal.com> wrote:

> I can third this concern.  What purpose does this complexity-increasing
> requirement serve?  Why not remove it?
>
> Greg Bledsoe
>
> From: 武泽胜 <wu...@xiaomi.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Mon, 29 Jul 2013 08:21:51 -0500
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Subject: Re: Why Hadoop force using DNS?
>
> I have the same confusion; a reply from anyone would be much
> appreciated.
>
> From: Elazar Leibovich <el...@gmail.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Thursday, July 25, 2013 3:51 AM
> To: user <us...@hadoop.apache.org>
> Subject: Why Hadoop force using DNS?
>
> Looking at the Hadoop source, you can see that Hadoop relies on each
> node having a resolvable name.
>
> For example, the Hadoop 2 namenode does a reverse lookup of each node that
> connects to it. Also, there's no way to tell a datanode to advertise an
> IP as its address. Setting datanode.network.interface to, say, eth1, would
> cause Hadoop to reverse-lookup the IPs on eth1 and advertise the result.
>
> Why is that? Using plain IPs is simple to set up, and I can't see a reason
> not to support them.
>

Re: Why Hadoop force using DNS?

Posted by Chris Embree <ce...@gmail.com>.
Just for clarity,  DNS as a service is NOT Required.  Name resolution is.
 I use /etc/hosts files to identify all nodes in my clusters.

One of the reasons for using Names over IP's is ease of use.  I would much
rather use a hostname in my XML to identify NN, JT, etc. vs. some random
string of numbers.




On Mon, Jul 29, 2013 at 10:40 AM, Greg Bledsoe <gr...@personal.com> wrote:

> I can third this concern.  What purpose does this complexity increasing
> requirement serve?  Why not remove it?
>
> Greg Bledsoe
>
> From: 武泽胜 <wu...@xiaomi.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Mon, 29 Jul 2013 08:21:51 -0500
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Subject: Re: Why Hadoop force using DNS?
>
> I have the same confusion, anyone who can reply to this will be very
> appreciated.
>
> From: Elazar Leibovich <el...@gmail.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Thursday, July 25, 2013 3:51 AM
> To: user <us...@hadoop.apache.org>
> Subject: Why Hadoop force using DNS?
>
> Looking at Hadoop source you can see that Hadoop relies on the fact each
> node has resolvable name.
>
> For example, Hadoop 2 namenode reverse look the up of each node that
> connects to it. Also, there's no way way to tell a database to advertise an
> UP as it's address. Setting datanode.network.interface to, say, eth1, would
> cause Hadoop to reverse lookup UPs on eth1 and advertise the result.
>
> Why is that? Using plain IPs is simple to set up, and I can't see a reason
> not to support them?
>

Re: Why Hadoop force using DNS?

Posted by Greg Bledsoe <gr...@personal.com>.
I can third this concern. What purpose does this complexity-increasing requirement serve? Why not remove it?

Greg Bledsoe

From: 武泽胜 <wu...@xiaomi.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Mon, 29 Jul 2013 08:21:51 -0500
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Re: Why Hadoop force using DNS?

I have the same question; I would very much appreciate a reply from anyone.

From: Elazar Leibovich <el...@gmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Thursday, July 25, 2013 3:51 AM
To: user <us...@hadoop.apache.org>
Subject: Why Hadoop force using DNS?

Looking at the Hadoop source, you can see that Hadoop relies on each node having a resolvable name.

For example, the Hadoop 2 namenode does a reverse lookup of each node that connects to it. Also, there's no way to tell a datanode to advertise an IP as its address. Setting datanode.network.interface to, say, eth1, would cause Hadoop to reverse lookup IPs on eth1 and advertise the result.

Why is that? Using plain IPs is simple to set up, and I can't see a reason not to support them.

Re: Why Hadoop force using DNS?

Posted by 武泽胜 <wu...@xiaomi.com>.
I have the same question; I would very much appreciate a reply from anyone.

From: Elazar Leibovich <el...@gmail.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Thursday, July 25, 2013 3:51 AM
To: user <us...@hadoop.apache.org>
Subject: Why Hadoop force using DNS?

Looking at the Hadoop source, you can see that Hadoop relies on each node having a resolvable name.

For example, the Hadoop 2 namenode does a reverse lookup of each node that connects to it. Also, there's no way to tell a datanode to advertise an IP as its address. Setting datanode.network.interface to, say, eth1, would cause Hadoop to reverse lookup IPs on eth1 and advertise the result.

Why is that? Using plain IPs is simple to set up, and I can't see a reason not to support them.
