Posted to common-user@hadoop.apache.org by John Martyniak <jo...@avum.com> on 2009/06/09 06:26:32 UTC

Multiple NIC Cards

Hi,

I am creating a small Hadoop (0.19.1) cluster (2 nodes to start); each
of the machines has 2 NIC cards (1 external facing, 1 internal
facing).  It is important that Hadoop run and communicate on the
internal-facing NIC, because the external-facing NIC costs me money
and the internal network is also more protected.

So I found a couple of parameters, both in the mail archives and on
the Hadoop core site, but they don't seem to do anything: the result
is always the same, and it resolves to the external IP and name
(which is in DNS).

  <property>
   <name>dfs.datanode.dns.interface</name>
   <value>en1</value>
   <description>The name of the Network Interface from which a data  
node should
   report its IP address.
   </description>
  </property>

<property>
   <name>dfs.datanode.dns.nameserver</name>
   <value>DN</value>
   <description>The host name or IP address of the name server (DNS)
   which a DataNode should use to determine the host name used by the
   NameNode for communication and display purposes.
   </description>
  </property>

<property>
   <name>mapred.tasktracker.dns.interface</name>
   <value>en1</value>
   <description>The name of the Network Interface from which a task
   tracker should report its IP address.
   </description>
  </property>

<property>
   <name>mapred.tasktracker.dns.nameserver</name>
   <value>DN</value>
   <description>The host name or IP address of the name server (DNS)
   which a TaskTracker should use to determine the host name used by
   the JobTracker for communication and display purposes.
   </description>
  </property>

Any help would be greatly appreciated.

Thank you in advance.

-John


John Martyniak
Before Dawn Solutions, Inc.
9457 S. University Blvd #266
Highlands Ranch, CO 80126
o: 877-499-1562
c: 303-522-1756
e: john@beforedawnsoutions.com
w: http://www.beforedawnsolutions.com


Re: Multiple NIC Cards

Posted by John Martyniak <jo...@avum.com>.
David,

Thanks for the suggestions.

1) Seems like a good possibility; it makes the config a little less
readable, but that is OK.
2) They are in the local /etc/hosts file, and they do resolve locally,
meaning that I can ssh to either machine by using the internal name.
3) This is another possibility, but I would like to avoid it, as it
adds another level of management.

Right now the internal machines are just on a switch, each has an
internal IP address (192.168.1.x), and both machines have themselves
and the other machine in the /etc/hosts file.

-John

On Jun 9, 2009, at 7:30 AM, David B. Ritch wrote:

> I see several possibilities:
>
> 1) Use IP addresses instead of machine names in the config files.
> 2) Make sure that the names are in local /etc/hosts files, and that  
> they
> resolve to the internal IP addresses.
> 3) Set up an internal name server, and make sure that it serves your
> internal addresses and your cluster nodes point to it.
>
> In general, how do you resolve your machine names to the internal  
> addresses?
>
> David
>
> John Martyniak wrote:
>> Hi,
>>
>> I am creating a small Hadoop (0.19.1) cluster (2 nodes to start),  
>> each
>> of the machines has 2 NIC cards (1 external facing, 1 internal
>> facing).  It is important that Hadoop run and communicate on the
>> internal facing NIC (because the external facing NIC costs me money),
>> also the internal is more protected.
>>
>> So I found a couple of parameters both in the mail archives and on  
>> the
>> Hadoop core site, but they don't seem to do anything as the result is
>> always the same, it resolves to the external IP, and name (which is  
>> in
>> DNS).
>>
>> <property>
>>  <name>dfs.datanode.dns.interface</name>
>>  <value>en1</value>
>>  <description>The name of the Network Interface from which a data
>> node should
>>  report its IP address.
>>  </description>
>> </property>
>>
>> <property>
>>  <name>dfs.datanode.dns.nameserver</name>
>>  <value>DN</value>
>>  <description>The host name or IP address of the name server (DNS)
>>  which a DataNode should use to determine the host name used by the
>>  NameNode for communication and display purposes.
>>  </description>
>> </property>
>>
>> <property>
>>  <name>mapred.tasktracker.dns.interface</name>
>>  <value>en1</value>
>>  <description>The name of the Network Interface from which a task
>>  tracker should report its IP address.
>>  </description>
>> </property>
>>
>> <property>
>>  <name>mapred.tasktracker.dns.nameserver</name>
>>  <value>DN</value>
>>  <description>The host name or IP address of the name server (DNS)
>>  which a TaskTracker should use to determine the host name used by
>>  the JobTracker for communication and display purposes.
>>  </description>
>> </property>
>>
>> Any help would be greatly appreciated.
>>
>> Thank you in advance.
>>
>> -John
>>
>>
>> John Martyniak
>> Before Dawn Solutions, Inc.
>> 9457 S. University Blvd #266
>> Highlands Ranch, CO 80126
>> o: 877-499-1562
>> c: 303-522-1756
>> e: john@beforedawnsoutions.com
>> w: http://www.beforedawnsolutions.com
>>
>>
>


Re: Multiple NIC Cards

Posted by John Martyniak <jo...@avum.com>.
Steve,

I missed this part of the email.

So if I change the 0.0.0.0 to either 192.168.1.102 or .103, depending
on which is appropriate, will that solve the problem?

It looks like it lives in 10 places.

-John

On Jun 9, 2009, at 10:17 AM, Steve Loughran wrote:

> > I have some other applications running on these machines, that
> > communicate across the internal network and they work perfectly.
>
> I admire their strength. Multihost systems cause us trouble. That  
> and machines that don't quite know who they are
> http://jira.smartfrog.org/jira/browse/SFOS-5
> https://issues.apache.org/jira/browse/HADOOP-3612
> https://issues.apache.org/jira/browse/HADOOP-3426
> https://issues.apache.org/jira/browse/HADOOP-3613
> https://issues.apache.org/jira/browse/HADOOP-5339
>
> One thing to consider is that some of the various services of Hadoop  
> are bound to 0:0:0:0, which means every Ipv4 address, you really  
> want to bring up everything, including jetty services, on the en0  
> network adapter, by binding them to  192.168.1.102; this will cause  
> anyone trying to talk to them over the other network to fail, which  
> at least find the problem sooner rather than later


Re: Multiple NIC Cards

Posted by John Martyniak <jo...@avum.com>.
I haven't even looked into that at all.

I am just trying to get a simple 2-node cluster working with 2 NICs.

-John

On Jun 9, 2009, at 1:41 PM, Edward Capriolo wrote:

> Also if you are using a topology rack map, make sure you scripts
> responds correctly to every possible hostname or IP address as well.
>
> On Tue, Jun 9, 2009 at 1:19 PM, John Martyniak<jo...@avum.com> wrote:
>> It seems that this is the issue, as there several posts related to  
>> same
>> topic but with no resolution.
>>
>> I guess the thing of it is that it shouldn't use the hostname of  
>> the machine
>> at all.  If I tell it the master is x and it has an IP Address of  
>> x.x.x.102
>> that should be good enough.
>>
>> And if that isn't the case then I should be able to specify which  
>> network
>> adaptor to use as the ip address that it is going to lookup  
>> against, whether
>> it is by DNS or by /etc/hosts.
>>
>> Because I suspect the problem is that I have named the machine as
>> duey.xxxx.com but have told hadoop that machine is called duey- 
>> direct.
>>
>> Is there work around in 0.19.1?  I am using this with Nutch so  
>> don't have an
>> option to upgrade at this time.
>>
>> -John
>>
>>
>> On Jun 9, 2009, at 11:59 AM, Steve Loughran wrote:
>>
>>> John Martyniak wrote:
>>>>
>>>> When I run either of those on either of the two machines, it is  
>>>> trying to
>>>> resolve against the DNS servers configured for the external  
>>>> addresses for
>>>> the box.
>>>> Here is the result
>>>> Server:        xxx.xxx.xxx.69
>>>> Address:    xxx.xxx.xxx.69#53
>>>
>>> OK. in an ideal world, each NIC has a different hostname. Now, that
>>> confuses code that assumes a host has exactly one hostname, not  
>>> zero or two,
>>> and I'm not sure how well Hadoop handles the 2+ situation (I know  
>>> it doesn't
>>> like 0, but hey, its a distributed application). With separate  
>>> hostnames,
>>> you set hadoop up to work on the inner addresses, and give out the  
>>> inner
>>> hostnames of the jobtracker and namenode. As a result, all traffic  
>>> to the
>>> master nodes should be routed on the internal network
>>
>>


Re: Multiple NIC Cards

Posted by Edward Capriolo <ed...@gmail.com>.
Also, if you are using a topology rack map, make sure your script
responds correctly to every possible hostname or IP address as well;
see the sketch below.
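
For reference, the rack map script is wired in with a property like
this (a sketch; the script path is hypothetical):

  <property>
    <name>topology.script.file.name</name>
    <value>/path/to/rack-map.sh</value>
  </property>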

On Tue, Jun 9, 2009 at 1:19 PM, John Martyniak<jo...@avum.com> wrote:
> It seems that this is the issue, as there several posts related to same
> topic but with no resolution.
>
> I guess the thing of it is that it shouldn't use the hostname of the machine
> at all.  If I tell it the master is x and it has an IP Address of x.x.x.102
> that should be good enough.
>
> And if that isn't the case then I should be able to specify which network
> adaptor to use as the ip address that it is going to lookup against, whether
> it is by DNS or by /etc/hosts.
>
> Because I suspect the problem is that I have named the machine as
> duey.xxxx.com but have told hadoop that machine is called duey-direct.
>
> Is there work around in 0.19.1?  I am using this with Nutch so don't have an
> option to upgrade at this time.
>
> -John
>
>
> On Jun 9, 2009, at 11:59 AM, Steve Loughran wrote:
>
>> John Martyniak wrote:
>>>
>>> When I run either of those on either of the two machines, it is trying to
>>> resolve against the DNS servers configured for the external addresses for
>>> the box.
>>> Here is the result
>>> Server:        xxx.xxx.xxx.69
>>> Address:    xxx.xxx.xxx.69#53
>>
>> OK. in an ideal world, each NIC has a different hostname. Now, that
>> confuses code that assumes a host has exactly one hostname, not zero or two,
>> and I'm not sure how well Hadoop handles the 2+ situation (I know it doesn't
>> like 0, but hey, its a distributed application). With separate hostnames,
>> you set hadoop up to work on the inner addresses, and give out the inner
>> hostnames of the jobtracker and namenode. As a result, all traffic to the
>> master nodes should be routed on the internal network
>
>

Re: Multiple NIC Cards

Posted by John Martyniak <jo...@avum.com>.
It seems that this is the issue, as there are several posts related
to the same topic but with no resolution.

I guess the thing of it is that it shouldn't use the hostname of the
machine at all.  If I tell it the master is x and it has an IP
address of x.x.x.102, that should be good enough.

And if that isn't the case, then I should be able to specify which
network adapter to use for the IP address that it is going to look up
against, whether that is by DNS or by /etc/hosts.

Because I suspect the problem is that I have named the machine
duey.xxxx.com but have told Hadoop that the machine is called
duey-direct.

Is there a workaround in 0.19.1?  I am using this with Nutch, so I
don't have the option to upgrade at this time.

-John


On Jun 9, 2009, at 11:59 AM, Steve Loughran wrote:

> John Martyniak wrote:
>> When I run either of those on either of the two machines, it is  
>> trying to resolve against the DNS servers configured for the  
>> external addresses for the box.
>> Here is the result
>> Server:        xxx.xxx.xxx.69
>> Address:    xxx.xxx.xxx.69#53
>
> OK. in an ideal world, each NIC has a different hostname. Now, that  
> confuses code that assumes a host has exactly one hostname, not zero  
> or two, and I'm not sure how well Hadoop handles the 2+ situation (I  
> know it doesn't like 0, but hey, its a distributed application).  
> With separate hostnames, you set hadoop up to work on the inner  
> addresses, and give out the inner hostnames of the jobtracker and  
> namenode. As a result, all traffic to the master nodes should be  
> routed on the internal network


Re: Multiple NIC Cards

Posted by Edward Capriolo <ed...@gmail.com>.
On Tue, Jun 9, 2009 at 11:59 AM, Steve Loughran<st...@apache.org> wrote:
> John Martyniak wrote:
>>
>> When I run either of those on either of the two machines, it is trying to
>> resolve against the DNS servers configured for the external addresses for
>> the box.
>>
>> Here is the result
>> Server:        xxx.xxx.xxx.69
>> Address:    xxx.xxx.xxx.69#53
>
> OK. in an ideal world, each NIC has a different hostname. Now, that confuses
> code that assumes a host has exactly one hostname, not zero or two, and I'm
> not sure how well Hadoop handles the 2+ situation (I know it doesn't like 0,
> but hey, its a distributed application). With separate hostnames, you set
> hadoop up to work on the inner addresses, and give out the inner hostnames
> of the jobtracker and namenode. As a result, all traffic to the master nodes
> should be routed on the internal network
>

Also, a subtle issue that I ran into is full versus partial host
names. Even though all my configuration files reference full host
names like server1.domain.com, the name node web interface will
redirect people to http://server1, probably because that is the
system hostname. In my case it isn't a big deal, but it is something
to consider during setup.

Re: Multiple NIC Cards

Posted by Steve Loughran <st...@apache.org>.
John Martyniak wrote:
> When I run either of those on either of the two machines, it is trying 
> to resolve against the DNS servers configured for the external addresses 
> for the box.
> 
> Here is the result
> Server:        xxx.xxx.xxx.69
> Address:    xxx.xxx.xxx.69#53

OK. in an ideal world, each NIC has a different hostname. Now, that 
confuses code that assumes a host has exactly one hostname, not zero or 
two, and I'm not sure how well Hadoop handles the 2+ situation (I know 
it doesn't like 0, but hey, its a distributed application). With 
separate hostnames, you set hadoop up to work on the inner addresses, 
and give out the inner hostnames of the jobtracker and namenode. As a 
result, all traffic to the master nodes should be routed on the internal 
network

Re: Multiple NIC Cards

Posted by John Martyniak <jo...@avum.com>.
When I run either of those on either of the two machines, it is trying  
to resolve against the DNS servers configured for the external  
addresses for the box.

Here is the result
Server:		xxx.xxx.xxx.69
Address:	xxx.xxx.xxx.69#53

** server can't find 102.1.168.192.in-addr.arpa.: NXDOMAIN

-John



On Jun 9, 2009, at 10:17 AM, Steve Loughran wrote:

> John Martyniak wrote:
>> I am running Mac OS X.
>> So en0 points to the external address and en1 points to the  
>> internal address on both machines.
>> Here is the internal results from duey:
>> en1:  
>> flags=8963<UP,BROADCAST,SMART,RUNNING,PROMISC,SIMPLEX,MULTICAST>  
>> mtu 1500
>>    inet6 fe80::21e:52ff:fef4:65%en1 prefixlen 64 scopeid 0x5
>>    inet 192.168.1.102 netmask 0xffffff00 broadcast 192.168.1.255
>>    ether 00:1e:52:f4:00:65
>>    media: autoselect (1000baseT <full-duplex>) status: active
>
>>    lladdr 00:23:32:ff:fe:1a:20:66
>>    media: autoselect <full-duplex> status: inactive
>>    supported media: autoselect <full-duplex>
>> Here are the internal results from huey:
>> en1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu  
>> 1500
>>    inet6 fe80::21e:52ff:fef3:f489%en1 prefixlen 64 scopeid 0x5
>>    inet 192.168.1.103 netmask 0xffffff00 broadcast 192.168.1.255
>
> what does
>  nslookup 192.168.1.103
> and
>  nslookup 192.168.1.102
> say?
>
> There really ought to be different names for them.
>
>
> > I have some other applications running on these machines, that
> > communicate across the internal network and they work perfectly.
>
> I admire their strength. Multihost systems cause us trouble. That  
> and machines that don't quite know who they are
> http://jira.smartfrog.org/jira/browse/SFOS-5
> https://issues.apache.org/jira/browse/HADOOP-3612
> https://issues.apache.org/jira/browse/HADOOP-3426
> https://issues.apache.org/jira/browse/HADOOP-3613
> https://issues.apache.org/jira/browse/HADOOP-5339
>
> One thing to consider is that some of the various services of Hadoop  
> are bound to 0:0:0:0, which means every Ipv4 address, you really  
> want to bring up everything, including jetty services, on the en0  
> network adapter, by binding them to  192.168.1.102; this will cause  
> anyone trying to talk to them over the other network to fail, which  
> at least find the problem sooner rather than later
>


Re: Multiple NIC Cards

Posted by JQ Hadoop <jq...@gmail.com>.
The address of the JobTracker (NameNode) is specified using
*mapred.job.tracker* (*fs.default.name*) in the configuration. When
the JobTracker (NameNode) starts, it will listen on the address
specified by *mapred.job.tracker* (*fs.default.name*); and when a
TaskTracker (DataNode) starts, it will talk to the address specified
by *mapred.job.tracker* (*fs.default.name*) through RPC. So there is
no confusion (about the communication between TaskTracker and
JobTracker, as well as between DataNode and NameNode) even for
multi-homed nodes, as long as those two addresses are correctly
specified.
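
For example, the relevant hadoop-site.xml entries might look like
this (a sketch; the internal hostnames and ports are illustrative,
reusing the nn.internal / jt.internal naming suggested in this
thread):

  <property>
    <name>fs.default.name</name>
    <value>hdfs://nn.internal:9000/</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>jt.internal:9001</value>
  </property>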

On the other hand, when a TaskTracker (DataNode) starts, it will also
listen on its own service addresses, which are usually specified in
the configuration as *0.0.0.0* (e.g., *mapred.task.tracker.http.address*
and *dfs.datanode.address*); that is, it will accept connections on
all the NICs in the node. In addition, the TaskTracker (DataNode) will
regularly send status messages to the JobTracker (NameNode), which
contain its hostname. Consequently, when a Map or Reduce task obtains
the addresses of the TaskTrackers (DataNodes) from the JobTracker
(NameNode), e.g., for copying the Map output or reading an HDFS block,
it will get the hostnames specified in the status messages and talk to
the TaskTrackers (DataNodes) using those hostnames.

The hostname specified in the status messages is determined roughly as
in the code below (as of Hadoop 0.19.1), which can be a little tricky
for multi-homed nodes.

    // Paraphrased from org.apache.hadoop.net.DNS and its callers;
    // getIPs() and reverseDns() come from that class.
    String hostname = conf.get("slave.host.name");
    if (hostname == null) {
      // "interface" is a reserved word in Java, so call it strInterface
      String strInterface =
          conf.get("mapred.tasktracker.dns.interface", "default");
      String nameserver =
          conf.get("mapred.tasktracker.dns.nameserver", "default");
      if (strInterface.equals("default"))
        hostname = InetAddress.getLocalHost().getCanonicalHostName();
      else {
        String[] ips = getIPs(strInterface);
        Vector<String> hosts = new Vector<String>();
        for (int i = 0; i < ips.length; i++) {
          // reverse-resolve every address on the chosen interface
          hosts.add(reverseDns(InetAddress.getByName(ips[i]), nameserver));
        }
        if (hosts.size() == 0)
          hostname = InetAddress.getLocalHost().getCanonicalHostName();
        else
          hostname = hosts.get(0);  // the first reverse-DNS name wins
      }
    }

I think the easiest way for multiple NICs is probably to start each
TaskTracker (DataNode) with an appropriate *slave.host.name* at its
command line, which can be done in bin/slaves.sh; see the sketch below.
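
An alternative sketch is to pin it per node in that node's
hadoop-site.xml (duey-direct is the internal name used in this
thread):

  <property>
    <name>slave.host.name</name>
    <value>duey-direct</value>  <!-- this node's internal hostname -->
  </property>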



On Thu, Jun 11, 2009 at 11:35 AM, John Martyniak <
john@beforedawnsolutions.com> wrote:

> So it turns out the reason that I was getting the duey.local. was because
> that is what was in the reverse DNS on the nameserver from a previous test.
>  So that is fixed, and now the machine says duey.local.xxx.com.
>
> The only remaining issue is the trailing "." (Period) that is required by
> DNS to make the name fully qualified.
>
> So not sure if this is a bug in the Hadoop uses this information or some
> other issue.
>
> If anybody has run across this issue before any help would be greatly
> appreciated.
>
> Thank you,
>
> -John
>
> On Jun 10, 2009, at 9:21 PM, Matt Massie wrote:
>
>  If you look at the documentation for the getCanonicalHostName() function
>> (thanks, Steve)...
>>
>>
>> http://java.sun.com/javase/6/docs/api/java/net/InetAddress.html#getCanonicalHostName()
>>
>> you'll see two Java security properties (networkaddress.cache.ttl and
>> networkaddress.cache.negative.ttl).
>>
>> You might take a look at your /etc/nsswitch.conf configuration as well to
>> learn how hosts are resolved on your machine, e.g...
>>
>> $ grep hosts /etc/nsswitch.conf
>> hosts:      files dns
>>
>> and lastly, you may want to check if you are running nscd (the NameService
>> cache daemon).  If you are, take a look at /etc/nscd.conf for the caching
>> policy it's using.
>>
>> Good luck.
>>
>> -Matt
>>
>>
>>
>> On Jun 10, 2009, at 1:09 PM, John Martyniak wrote:
>>
>>  That is what I thought also, is that it needs to keep that information
>>> somewhere, because it needs to be able to communicate with all of the
>>> servers.
>>>
>>> So I deleted the /tmp/had* and /tmp/hs* directories, removed the log
>>> files, and grepped for the duey name in all files in config.  And the
>>> problem still exists.  Originally I thought that it might have had something
>>> to do with multiple entries in the .ssh/authorized_keys file but removed
>>> everything there.  And the problem still existed.
>>>
>>> So I think that I am going to grab a new install of hadoop 0.19.1, delete
>>> the existing one and start out fresh to see if that changes anything.
>>>
>>> Wish me luck:)
>>>
>>> -John
>>>
>>> On Jun 10, 2009, at 12:30 PM, Steve Loughran wrote:
>>>
>>>  John Martyniak wrote:
>>>>
>>>>> Does hadoop "cache" the server names anywhere?  Because I changed to
>>>>> using DNS for name resolution, but when I go to the nodes view, it is trying
>>>>> to view with the old name.  And I changed the hadoop-site.xml file so that
>>>>> it no longer has any of those values.
>>>>>
>>>>
>>>> in SVN head, we try and get Java to tell us what is going on
>>>>
>>>> http://svn.apache.org/viewvc/hadoop/core/trunk/src/core/org/apache/hadoop/net/DNS.java
>>>>
>>>> This uses InetAddress.getLocalHost().getCanonicalHostName() to get the
>>>> value, which is cached for life of the process. I don't know of anything
>>>> else, but wouldn't be surprised -the Namenode has to remember the machines
>>>> where stuff was stored.
>>>>
>>>>
>>>>
>>> John Martyniak
>>> President/CEO
>>> Before Dawn Solutions, Inc.
>>> 9457 S. University Blvd #266
>>> Highlands Ranch, CO 80126
>>> o: 877-499-1562
>>> c: 303-522-1756
>>> e: john@beforedawnsoutions.com
>>> w: http://www.beforedawnsolutions.com
>>>
>>>
>>
> John Martyniak
> President/CEO
> Before Dawn Solutions, Inc.
> 9457 S. University Blvd #266
> Highlands Ranch, CO 80126
> o: 877-499-1562
> c: 303-522-1756
> e: john@beforedawnsoutions.com
> w: http://www.beforedawnsolutions.com
>
>

Re: Multiple NIC Cards

Posted by John Martyniak <jo...@beforedawnsolutions.com>.
So it turns out the reason that I was getting duey.local. was that it
was what was in the reverse DNS on the nameserver from a previous
test.  So that is fixed, and now the machine says duey.local.xxx.com.

The only remaining issue is the trailing "." (period) that is required
by DNS to make the name fully qualified.

So I am not sure if this is a bug in how Hadoop uses this information
or some other issue.

If anybody has run across this issue before, any help would be greatly
appreciated.

Thank you,

-John

On Jun 10, 2009, at 9:21 PM, Matt Massie wrote:

> If you look at the documentation for the getCanonicalHostName()  
> function (thanks, Steve)...
>
> http://java.sun.com/javase/6/docs/api/java/net/InetAddress.html#getCanonicalHostName()
>
> you'll see two Java security properties (networkaddress.cache.ttl  
> and networkaddress.cache.negative.ttl).
>
> You might take a look at your /etc/nsswitch.conf configuration as  
> well to learn how hosts are resolved on your machine, e.g...
>
> $ grep hosts /etc/nsswitch.conf
> hosts:      files dns
>
> and lastly, you may want to check if you are running nscd (the  
> NameService cache daemon).  If you are, take a look at /etc/ 
> nscd.conf for the caching policy it's using.
>
> Good luck.
>
> -Matt
>
>
>
> On Jun 10, 2009, at 1:09 PM, John Martyniak wrote:
>
>> That is what I thought also, is that it needs to keep that  
>> information somewhere, because it needs to be able to communicate  
>> with all of the servers.
>>
>> So I deleted the /tmp/had* and /tmp/hs* directories, removed the  
>> log files, and grepped for the duey name in all files in config.   
>> And the problem still exists.  Originally I thought that it might  
>> have had something to do with multiple entries in the .ssh/ 
>> authorized_keys file but removed everything there.  And the problem  
>> still existed.
>>
>> So I think that I am going to grab a new install of hadoop 0.19.1,  
>> delete the existing one and start out fresh to see if that changes  
>> anything.
>>
>> Wish me luck:)
>>
>> -John
>>
>> On Jun 10, 2009, at 12:30 PM, Steve Loughran wrote:
>>
>>> John Martyniak wrote:
>>>> Does hadoop "cache" the server names anywhere?  Because I changed  
>>>> to using DNS for name resolution, but when I go to the nodes  
>>>> view, it is trying to view with the old name.  And I changed the  
>>>> hadoop-site.xml file so that it no longer has any of those values.
>>>
>>> in SVN head, we try and get Java to tell us what is going on
>>> http://svn.apache.org/viewvc/hadoop/core/trunk/src/core/org/apache/hadoop/net/DNS.java
>>>
>>> This uses InetAddress.getLocalHost().getCanonicalHostName() to get  
>>> the value, which is cached for life of the process. I don't know  
>>> of anything else, but wouldn't be surprised -the Namenode has to  
>>> remember the machines where stuff was stored.
>>>
>>>
>>
>> John Martyniak
>> President/CEO
>> Before Dawn Solutions, Inc.
>> 9457 S. University Blvd #266
>> Highlands Ranch, CO 80126
>> o: 877-499-1562
>> c: 303-522-1756
>> e: john@beforedawnsoutions.com
>> w: http://www.beforedawnsolutions.com
>>
>

John Martyniak
President/CEO
Before Dawn Solutions, Inc.
9457 S. University Blvd #266
Highlands Ranch, CO 80126
o: 877-499-1562
c: 303-522-1756
e: john@beforedawnsoutions.com
w: http://www.beforedawnsolutions.com


Re: Multiple NIC Cards

Posted by John Martyniak <jo...@beforedawnsolutions.com>.
Matt,

Thanks for the suggestion.

I had actually forgotten about local DNS caching.  I am using a Mac,
so I used

dscacheutil -flushcache

to clear the cache, and also investigated the resolution ordering.
Everything seems to be in order.

Except I still get a bogus result.

It is using the old name, except with a trailing period.  So it is
using duey.local. when it should be using duey.local.xxxxx.com (which
is the internal name).



-John

On Jun 10, 2009, at 9:21 PM, Matt Massie wrote:

> If you look at the documentation for the getCanonicalHostName()  
> function (thanks, Steve)...
>
> http://java.sun.com/javase/6/docs/api/java/net/InetAddress.html#getCanonicalHostName()
>
> you'll see two Java security properties (networkaddress.cache.ttl  
> and networkaddress.cache.negative.ttl).
>
> You might take a look at your /etc/nsswitch.conf configuration as  
> well to learn how hosts are resolved on your machine, e.g...
>
> $ grep hosts /etc/nsswitch.conf
> hosts:      files dns
>
> and lastly, you may want to check if you are running nscd (the  
> NameService cache daemon).  If you are, take a look at /etc/ 
> nscd.conf for the caching policy it's using.
>
> Good luck.
>
> -Matt
>
>
>
> On Jun 10, 2009, at 1:09 PM, John Martyniak wrote:
>
>> That is what I thought also, is that it needs to keep that  
>> information somewhere, because it needs to be able to communicate  
>> with all of the servers.
>>
>> So I deleted the /tmp/had* and /tmp/hs* directories, removed the  
>> log files, and grepped for the duey name in all files in config.   
>> And the problem still exists.  Originally I thought that it might  
>> have had something to do with multiple entries in the .ssh/ 
>> authorized_keys file but removed everything there.  And the problem  
>> still existed.
>>
>> So I think that I am going to grab a new install of hadoop 0.19.1,  
>> delete the existing one and start out fresh to see if that changes  
>> anything.
>>
>> Wish me luck:)
>>
>> -John
>>
>> On Jun 10, 2009, at 12:30 PM, Steve Loughran wrote:
>>
>>> John Martyniak wrote:
>>>> Does hadoop "cache" the server names anywhere?  Because I changed  
>>>> to using DNS for name resolution, but when I go to the nodes  
>>>> view, it is trying to view with the old name.  And I changed the  
>>>> hadoop-site.xml file so that it no longer has any of those values.
>>>
>>> in SVN head, we try and get Java to tell us what is going on
>>> http://svn.apache.org/viewvc/hadoop/core/trunk/src/core/org/apache/hadoop/net/DNS.java
>>>
>>> This uses InetAddress.getLocalHost().getCanonicalHostName() to get  
>>> the value, which is cached for life of the process. I don't know  
>>> of anything else, but wouldn't be surprised -the Namenode has to  
>>> remember the machines where stuff was stored.
>>>
>>>
>>
>> John Martyniak
>> President/CEO
>> Before Dawn Solutions, Inc.
>> 9457 S. University Blvd #266
>> Highlands Ranch, CO 80126
>> o: 877-499-1562
>> c: 303-522-1756
>> e: john@beforedawnsoutions.com
>> w: http://www.beforedawnsolutions.com
>>
>

John Martyniak
President/CEO
Before Dawn Solutions, Inc.
9457 S. University Blvd #266
Highlands Ranch, CO 80126
o: 877-499-1562
c: 303-522-1756
e: john@beforedawnsoutions.com
w: http://www.beforedawnsolutions.com


Re: Multiple NIC Cards

Posted by Matt Massie <ma...@cloudera.com>.
If you look at the documentation for the getCanonicalHostName()  
function (thanks, Steve)...

http://java.sun.com/javase/6/docs/api/java/net/InetAddress.html#getCanonicalHostName()

you'll see two Java security properties (networkaddress.cache.ttl and  
networkaddress.cache.negative.ttl).
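
If you want to rule out JVM-level caching entirely, here is a minimal
sketch (the class name is mine; the assumption is that these are Java
*security* properties, set through java.security.Security before the
first lookup happens in the JVM):

  import java.net.InetAddress;
  import java.security.Security;

  public class FlushJvmDnsCache {
    public static void main(String[] args) throws Exception {
      // Disable the positive and negative DNS caches before any
      // name lookups have happened in this JVM.
      Security.setProperty("networkaddress.cache.ttl", "0");
      Security.setProperty("networkaddress.cache.negative.ttl", "0");
      System.out.println(
          InetAddress.getLocalHost().getCanonicalHostName());
    }
  }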

You might take a look at your /etc/nsswitch.conf configuration as well  
to learn how hosts are resolved on your machine, e.g...

$ grep hosts /etc/nsswitch.conf
hosts:      files dns

and lastly, you may want to check if you are running nscd (the  
NameService cache daemon).  If you are, take a look at /etc/nscd.conf  
for the caching policy it's using.

Good luck.

-Matt



On Jun 10, 2009, at 1:09 PM, John Martyniak wrote:

> That is what I thought also, is that it needs to keep that  
> information somewhere, because it needs to be able to communicate  
> with all of the servers.
>
> So I deleted the /tmp/had* and /tmp/hs* directories, removed the log  
> files, and grepped for the duey name in all files in config.  And  
> the problem still exists.  Originally I thought that it might have  
> had something to do with multiple entries in the .ssh/ 
> authorized_keys file but removed everything there.  And the problem  
> still existed.
>
> So I think that I am going to grab a new install of hadoop 0.19.1,  
> delete the existing one and start out fresh to see if that changes  
> anything.
>
> Wish me luck:)
>
> -John
>
> On Jun 10, 2009, at 12:30 PM, Steve Loughran wrote:
>
>> John Martyniak wrote:
>>> Does hadoop "cache" the server names anywhere?  Because I changed  
>>> to using DNS for name resolution, but when I go to the nodes view,  
>>> it is trying to view with the old name.  And I changed the hadoop- 
>>> site.xml file so that it no longer has any of those values.
>>
>> in SVN head, we try and get Java to tell us what is going on
>> http://svn.apache.org/viewvc/hadoop/core/trunk/src/core/org/apache/hadoop/net/DNS.java
>>
>> This uses InetAddress.getLocalHost().getCanonicalHostName() to get  
>> the value, which is cached for life of the process. I don't know of  
>> anything else, but wouldn't be surprised -the Namenode has to  
>> remember the machines where stuff was stored.
>>
>>
>
> John Martyniak
> President/CEO
> Before Dawn Solutions, Inc.
> 9457 S. University Blvd #266
> Highlands Ranch, CO 80126
> o: 877-499-1562
> c: 303-522-1756
> e: john@beforedawnsoutions.com
> w: http://www.beforedawnsolutions.com
>


Re: Multiple NIC Cards

Posted by John Martyniak <jo...@beforedawnsolutions.com>.
That is what I thought also: it needs to keep that information
somewhere, because it needs to be able to communicate with all of the
servers.

So I deleted the /tmp/had* and /tmp/hs* directories, removed the log
files, and grepped for the duey name in all of the config files.  And
the problem still exists.  Originally I thought that it might have had
something to do with multiple entries in the .ssh/authorized_keys
file, but I removed everything there, and the problem still existed.

So I think that I am going to grab a fresh install of Hadoop 0.19.1,
delete the existing one, and start out fresh to see if that changes
anything.

Wish me luck :)

-John

On Jun 10, 2009, at 12:30 PM, Steve Loughran wrote:

> John Martyniak wrote:
>> Does hadoop "cache" the server names anywhere?  Because I changed  
>> to using DNS for name resolution, but when I go to the nodes view,  
>> it is trying to view with the old name.  And I changed the hadoop- 
>> site.xml file so that it no longer has any of those values.
>
> in SVN head, we try and get Java to tell us what is going on
> http://svn.apache.org/viewvc/hadoop/core/trunk/src/core/org/apache/hadoop/net/DNS.java
>
> This uses InetAddress.getLocalHost().getCanonicalHostName() to get  
> the value, which is cached for life of the process. I don't know of  
> anything else, but wouldn't be surprised -the Namenode has to  
> remember the machines where stuff was stored.
>
>

John Martyniak
President/CEO
Before Dawn Solutions, Inc.
9457 S. University Blvd #266
Highlands Ranch, CO 80126
o: 877-499-1562
c: 303-522-1756
e: john@beforedawnsoutions.com
w: http://www.beforedawnsolutions.com


Re: Multiple NIC Cards

Posted by Steve Loughran <st...@apache.org>.
John Martyniak wrote:
> Does hadoop "cache" the server names anywhere?  Because I changed to 
> using DNS for name resolution, but when I go to the nodes view, it is 
> trying to view with the old name.  And I changed the hadoop-site.xml 
> file so that it no longer has any of those values.
> 

In SVN head, we try to get Java to tell us what is going on:
http://svn.apache.org/viewvc/hadoop/core/trunk/src/core/org/apache/hadoop/net/DNS.java

This uses InetAddress.getLocalHost().getCanonicalHostName() to get the
value, which is cached for the life of the process. I don't know of
anything else, but I wouldn't be surprised if there were; the Namenode
has to remember the machines where stuff was stored.
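
A quick standalone check (not part of Hadoop) of what that call will
return on a given node:

  import java.net.InetAddress;

  public class WhoAmI {
    public static void main(String[] args) throws Exception {
      // Prints the address and the canonical name the JVM resolves
      // for this host -- the same value Hadoop's DNS.java falls
      // back to.
      InetAddress addr = InetAddress.getLocalHost();
      System.out.println(addr.getHostAddress() + " -> "
          + addr.getCanonicalHostName());
    }
  }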



Re: Multiple NIC Cards

Posted by John Martyniak <jo...@beforedawnsolutions.com>.
Does Hadoop "cache" the server names anywhere?  Because I changed to
using DNS for name resolution, but when I go to the nodes view, it
still shows the old name.  And I changed the hadoop-site.xml file so
that it no longer has any of those values.

Any help would be appreciated.

Thank you,

-John

On Jun 9, 2009, at 9:24 PM, John Martyniak wrote:

>
> So I setup a dns server that is for the internal network.  changed  
> all of the names to duey.local, and created a master zone for .local  
> on the DNS.  Put the domains server as the first one in /etc/ 
> resolv.conf file, added it to the interface.  I changed the hostname  
> of the machine that it is running on from duey.xxxx.com to  
> duey.local.  Checked that the dns resolves, and it does. Ran  
> nslookup and returns the name of the machine given the ip address.
>
> changed all of the names from the IP Addresses to duey.local, in my  
> hadoop-site.xml, changed the names in the masters and slaves files.
>
> Deleted all of the logs, deleted the /tmp directory stuff.
>
> Then restarted hadoop.  And much to my surprise.....it still didn't  
> work.
>
> I really thought that this would work as it seems to be the  
> consensus that the issue is the resolution of the name.
>
> Any other thoughts would be greatly appreciated.
>
> -John
>
>
>
>
>
> On Jun 9, 2009, at 3:17 PM, Raghu Angadi wrote:
>
>>
>> I still need to go through the whole thread. but we feel your pain.
>>
>> First, please try setting fs.default.name to namenode internal ip  
>> on the datanodes. This should make NN to attach internal ip so the  
>> datanodes (assuming your routing is correct). NameNode webUI should  
>> list internal ips for datanode. You might have to temporarily  
>> change NameNode code to listen on 0.0.0.0.
>>
>> That said, The issues you are facing are pretty unfortunate. As  
>> Steve mentioned Hadoop is all confused about hostname/ip and there  
>> is unecessary reliance on hostname and reverse DNS look ups in many  
>> many places.
>>
>> At least fairly straight fwd set ups with multiple NICs should be  
>> handled well.
>>
>> dfs.datanode.dns.interface should work like you expected (but not  
>> very surprised it didn't).
>>
>> Another thing you could try is setting dfs.datanode.address to the  
>> internal ip address (this might already be discussed in the  
>> thread). This should at least get all the bulk datatransfers happen  
>> over internal NICs. One way to make sure is to hover on the  
>> datanode node on NameNode webUI.. it shows the ip address.
>>
>> good luck.
>>
>> It might be better document your pains and findings in a Jira (with  
>> most of the details in one or more comments rather than in  
>> description).
>>
>> Raghu.
>>
>> John Martyniak wrote:
>>> So I changed all of the 0.0.0.0 on one machine to point to the  
>>> 192.168.1.102 address.
>>> And still it picks up the hostname and ip address of the external  
>>> network.
>>> I am kind of at my wits end with this, as I am not seeing a  
>>> solution yet, except to take the machines off of the external  
>>> network and leave them on the internal network which isn't an  
>>> option.
>>> Has anybody had this problem before?  What was the solution?
>>> -John
>>> On Jun 9, 2009, at 10:17 AM, Steve Loughran wrote:
>>>> One thing to consider is that some of the various services of  
>>>> Hadoop are bound to 0:0:0:0, which means every Ipv4 address, you  
>>>> really want to bring up everything, including jetty services, on  
>>>> the en0 network adapter, by binding them to  192.168.1.102; this  
>>>> will cause anyone trying to talk to them over the other network  
>>>> to fail, which at least find the problem sooner rather than later
>>
>

John Martyniak
President/CEO
Before Dawn Solutions, Inc.
9457 S. University Blvd #266
Highlands Ranch, CO 80126
o: 877-499-1562
c: 303-522-1756
e: john@beforedawnsoutions.com
w: http://www.beforedawnsolutions.com


Re: Multiple NIC Cards

Posted by John Martyniak <jo...@avum.com>.
So I set up a DNS server for the internal network: I changed all of
the names to duey.local and created a master zone for .local on the
DNS server.  I put that server first in the /etc/resolv.conf file and
added it to the interface.  I changed the hostname of the machine
that it is running on from duey.xxxx.com to duey.local, checked that
the DNS resolves, and it does; nslookup returns the name of the
machine given the IP address.

I changed all of the names from the IP addresses to duey.local in my
hadoop-site.xml, and changed the names in the masters and slaves files.

Deleted all of the logs, deleted the /tmp directory stuff.

Then restarted Hadoop.  And much to my surprise... it still didn't
work.

I really thought that this would work, as it seems to be the
consensus that the issue is the resolution of the name.

Any other thoughts would be greatly appreciated.

-John





On Jun 9, 2009, at 3:17 PM, Raghu Angadi wrote:

>
> I still need to go through the whole thread. but we feel your pain.
>
> First, please try setting fs.default.name to namenode internal ip on  
> the datanodes. This should make NN to attach internal ip so the  
> datanodes (assuming your routing is correct). NameNode webUI should  
> list internal ips for datanode. You might have to temporarily change  
> NameNode code to listen on 0.0.0.0.
>
> That said, The issues you are facing are pretty unfortunate. As  
> Steve mentioned Hadoop is all confused about hostname/ip and there  
> is unecessary reliance on hostname and reverse DNS look ups in many  
> many places.
>
> At least fairly straight fwd set ups with multiple NICs should be  
> handled well.
>
> dfs.datanode.dns.interface should work like you expected (but not  
> very surprised it didn't).
>
> Another thing you could try is setting dfs.datanode.address to the  
> internal ip address (this might already be discussed in the thread).  
> This should at least get all the bulk datatransfers happen over  
> internal NICs. One way to make sure is to hover on the datanode node  
> on NameNode webUI.. it shows the ip address.
>
> good luck.
>
> It might be better document your pains and findings in a Jira (with  
> most of the details in one or more comments rather than in  
> description).
>
> Raghu.
>
> John Martyniak wrote:
>> So I changed all of the 0.0.0.0 on one machine to point to the  
>> 192.168.1.102 address.
>> And still it picks up the hostname and ip address of the external  
>> network.
>> I am kind of at my wits end with this, as I am not seeing a  
>> solution yet, except to take the machines off of the external  
>> network and leave them on the internal network which isn't an option.
>> Has anybody had this problem before?  What was the solution?
>> -John
>> On Jun 9, 2009, at 10:17 AM, Steve Loughran wrote:
>>> One thing to consider is that some of the various services of  
>>> Hadoop are bound to 0:0:0:0, which means every Ipv4 address, you  
>>> really want to bring up everything, including jetty services, on  
>>> the en0 network adapter, by binding them to  192.168.1.102; this  
>>> will cause anyone trying to talk to them over the other network to  
>>> fail, which at least find the problem sooner rather than later
>


Re: Multiple NIC Cards

Posted by John Martyniak <jo...@avum.com>.
Raghu,

Thanks for the suggestions.

So I made those changes, and on both the Map/Reduce and NameNode web
UIs the machines are listed using the external IP address.

So I don't think that worked.  I am going to try it again after
clearing out everything in the /tmp directory.

-John

On Jun 9, 2009, at 3:17 PM, Raghu Angadi wrote:

>
> I still need to go through the whole thread. but we feel your pain.
>
> First, please try setting fs.default.name to namenode internal ip on  
> the datanodes. This should make NN to attach internal ip so the  
> datanodes (assuming your routing is correct). NameNode webUI should  
> list internal ips for datanode. You might have to temporarily change  
> NameNode code to listen on 0.0.0.0.
>
> That said, The issues you are facing are pretty unfortunate. As  
> Steve mentioned Hadoop is all confused about hostname/ip and there  
> is unecessary reliance on hostname and reverse DNS look ups in many  
> many places.
>
> At least fairly straight fwd set ups with multiple NICs should be  
> handled well.
>
> dfs.datanode.dns.interface should work like you expected (but not  
> very surprised it didn't).
>
> Another thing you could try is setting dfs.datanode.address to the  
> internal ip address (this might already be discussed in the thread).  
> This should at least get all the bulk datatransfers happen over  
> internal NICs. One way to make sure is to hover on the datanode node  
> on NameNode webUI.. it shows the ip address.
>
> good luck.
>
> It might be better document your pains and findings in a Jira (with  
> most of the details in one or more comments rather than in  
> description).
>
> Raghu.
>
> John Martyniak wrote:
>> So I changed all of the 0.0.0.0 on one machine to point to the  
>> 192.168.1.102 address.
>> And still it picks up the hostname and ip address of the external  
>> network.
>> I am kind of at my wits end with this, as I am not seeing a  
>> solution yet, except to take the machines off of the external  
>> network and leave them on the internal network which isn't an option.
>> Has anybody had this problem before?  What was the solution?
>> -John
>> On Jun 9, 2009, at 10:17 AM, Steve Loughran wrote:
>>> One thing to consider is that some of the various services of  
>>> Hadoop are bound to 0:0:0:0, which means every Ipv4 address, you  
>>> really want to bring up everything, including jetty services, on  
>>> the en0 network adapter, by binding them to  192.168.1.102; this  
>>> will cause anyone trying to talk to them over the other network to  
>>> fail, which at least find the problem sooner rather than later
>


Re: Multiple NIC Cards

Posted by Raghu Angadi <ra...@yahoo-inc.com>.
I still need to go through the whole thread, but we feel your pain.

First, please try setting fs.default.name to the namenode's internal
IP on the datanodes. This should make the NN attach internal IPs to
the datanodes (assuming your routing is correct). The NameNode web UI
should then list internal IPs for the datanodes. You might have to
temporarily change the NameNode code to listen on 0.0.0.0.

That said, the issues you are facing are pretty unfortunate. As Steve
mentioned, Hadoop is confused about hostname/IP, and there is
unnecessary reliance on hostnames and reverse DNS lookups in many,
many places.

At least fairly straightforward setups with multiple NICs should be
handled well.

dfs.datanode.dns.interface should work like you expected (but I am
not very surprised it didn't).

Another thing you could try is setting dfs.datanode.address to the
internal IP address (this might already be discussed in the thread).
This should at least make all the bulk data transfers happen over the
internal NICs. One way to make sure is to hover over the datanode on
the NameNode web UI; it shows the IP address.
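
A sketch of both suggestions in hadoop-site.xml (9000 is an
illustrative NameNode port; 50010 is the usual datanode data-transfer
port; the internal IPs are the ones from this thread):

  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.1.102:9000/</value>
  </property>
  <property>
    <name>dfs.datanode.address</name>
    <value>192.168.1.103:50010</value>  <!-- this datanode's internal IP -->
  </property>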

good luck.

It might be better to document your pains and findings in a Jira (with
most of the details in one or more comments rather than in the
description).

Raghu.

John Martyniak wrote:
> So I changed all of the 0.0.0.0 on one machine to point to the 
> 192.168.1.102 address.
> 
> And still it picks up the hostname and ip address of the external network.
> 
> I am kind of at my wits end with this, as I am not seeing a solution 
> yet, except to take the machines off of the external network and leave 
> them on the internal network which isn't an option.
> 
> Has anybody had this problem before?  What was the solution?
> 
> -John
> 
> On Jun 9, 2009, at 10:17 AM, Steve Loughran wrote:
> 
>> One thing to consider is that some of the various services of Hadoop 
>> are bound to 0:0:0:0, which means every Ipv4 address, you really want 
>> to bring up everything, including jetty services, on the en0 network 
>> adapter, by binding them to  192.168.1.102; this will cause anyone 
>> trying to talk to them over the other network to fail, which at least 
>> find the problem sooner rather than later
> 


Re: Multiple NIC Cards

Posted by John Martyniak <jo...@avum.com>.
So I changed all of the 0.0.0.0 entries on one machine to point to
the 192.168.1.102 address.

And still it picks up the hostname and IP address of the external
network.

I am kind of at my wits' end with this, as I am not seeing a solution
yet, except to take the machines off of the external network and
leave them on the internal network, which isn't an option.

Has anybody had this problem before?  What was the solution?

-John

On Jun 9, 2009, at 10:17 AM, Steve Loughran wrote:

> One thing to consider is that some of the various services of Hadoop  
> are bound to 0:0:0:0, which means every Ipv4 address, you really  
> want to bring up everything, including jetty services, on the en0  
> network adapter, by binding them to  192.168.1.102; this will cause  
> anyone trying to talk to them over the other network to fail, which  
> at least find the problem sooner rather than later


Re: Multiple NIC Cards

Posted by Steve Loughran <st...@apache.org>.
John Martyniak wrote:
> I am running Mac OS X.
> 
> So en0 points to the external address and en1 points to the internal 
> address on both machines.
> 
> Here is the internal results from duey:
> en1: flags=8963<UP,BROADCAST,SMART,RUNNING,PROMISC,SIMPLEX,MULTICAST> 
> mtu 1500
>     inet6 fe80::21e:52ff:fef4:65%en1 prefixlen 64 scopeid 0x5
>     inet 192.168.1.102 netmask 0xffffff00 broadcast 192.168.1.255
>     ether 00:1e:52:f4:00:65
>     media: autoselect (1000baseT <full-duplex>) status: active

>     lladdr 00:23:32:ff:fe:1a:20:66
>     media: autoselect <full-duplex> status: inactive
>     supported media: autoselect <full-duplex>
> 
> Here are the internal results from huey:
> en1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>     inet6 fe80::21e:52ff:fef3:f489%en1 prefixlen 64 scopeid 0x5
>     inet 192.168.1.103 netmask 0xffffff00 broadcast 192.168.1.255

what does
   nslookup 192.168.1.103
and
   nslookup 192.168.1.102
say?

There really ought to be different names for them.


 > I have some other applications running on these machines, that
 > communicate across the internal network and they work perfectly.

I admire their strength. Multihost systems cause us trouble. That and 
machines that don't quite know who they are
http://jira.smartfrog.org/jira/browse/SFOS-5
https://issues.apache.org/jira/browse/HADOOP-3612
https://issues.apache.org/jira/browse/HADOOP-3426
https://issues.apache.org/jira/browse/HADOOP-3613
https://issues.apache.org/jira/browse/HADOOP-5339

One thing to consider is that some of the various services of Hadoop
are bound to 0.0.0.0, which means every IPv4 address. You really want
to bring up everything, including the jetty services, on the internal
(en1) network adapter, by binding them to 192.168.1.102; this will
cause anyone trying to talk to them over the other network to fail,
which will at least surface the problem sooner rather than later.
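
A sketch of what rebinding the jetty (web UI) services might look
like in hadoop-site.xml (these are 0.19-era bind-address properties
with their default ports; the IP is duey's internal en1 address from
this thread):

  <property>
    <name>dfs.http.address</name>
    <value>192.168.1.102:50070</value>
  </property>
  <property>
    <name>dfs.datanode.http.address</name>
    <value>192.168.1.102:50075</value>
  </property>
  <property>
    <name>mapred.job.tracker.http.address</name>
    <value>192.168.1.102:50030</value>
  </property>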


Re: Multiple NIC Cards

Posted by John Martyniak <jo...@avum.com>.
I am running Mac OS X.

So en0 points to the external address and en1 points to the internal  
address on both machines.

Here is the internal results from duey:
en1: flags=8963<UP,BROADCAST,SMART,RUNNING,PROMISC,SIMPLEX,MULTICAST>  
mtu 1500
	inet6 fe80::21e:52ff:fef4:65%en1 prefixlen 64 scopeid 0x5
	inet 192.168.1.102 netmask 0xffffff00 broadcast 192.168.1.255
	ether 00:1e:52:f4:00:65
	media: autoselect (1000baseT <full-duplex>) status: active
	supported media: autoselect 10baseT/UTP <half-duplex> 10baseT/UTP  
<full-duplex> 10baseT/UTP <full-duplex,hw-loopback> 10baseT/UTP <full- 
duplex,flow-control> 100baseTX <half-duplex> 100baseTX <full-duplex>  
100baseTX <full-duplex,hw-loopback> 100baseTX <full-duplex,flow- 
control> 1000baseT <full-duplex> 1000baseT <full-duplex,hw-loopback>  
1000baseT <full-duplex,flow-control>
fw0: flags=8822<BROADCAST,SMART,SIMPLEX,MULTICAST> mtu 4078
	lladdr 00:23:32:ff:fe:1a:20:66
	media: autoselect <full-duplex> status: inactive
	supported media: autoselect <full-duplex>

Here are the internal results from huey:
en1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	inet6 fe80::21e:52ff:fef3:f489%en1 prefixlen 64 scopeid 0x5
	inet 192.168.1.103 netmask 0xffffff00 broadcast 192.168.1.255
	ether 00:1e:52:f3:f4:89
	media: autoselect (1000baseT <full-duplex>) status: active
	supported media: autoselect 10baseT/UTP <half-duplex> 10baseT/UTP  
<full-duplex> 10baseT/UTP <full-duplex,hw-loopback> 10baseT/UTP <full- 
duplex,flow-control> 100baseTX <half-duplex> 100baseTX <full-duplex>  
100baseTX <full-duplex,hw-loopback> 100baseTX <full-duplex,flow- 
control> 1000baseT <full-duplex> 1000baseT <full-duplex,hw-loopback>  
1000baseT <full-duplex,flow-control>

I have some other applications running on these machines, that  
communicate across the internal network and they work perfectly.

-John

On Jun 9, 2009, at 9:45 AM, Steve Loughran wrote:

> John Martyniak wrote:
>> My original names where huey-direct and duey-direct, both names in  
>> the /etc/hosts file on both machines.
>> Are nn.internal and jt.interal special names?
>
> no, just examples on a multihost network when your external names  
> could be something completely different.
>
> What does /sbin/ifconfig say on each of the hosts?
>


Re: Multiple NIC Cards

Posted by Steve Loughran <st...@apache.org>.
John Martyniak wrote:
> My original names where huey-direct and duey-direct, both names in the 
> /etc/hosts file on both machines.
> 
> Are nn.internal and jt.interal special names?

No, just examples on a multihost network, where your external names
could be something completely different.

What does /sbin/ifconfig say on each of the hosts?


Re: Multiple NIC Cards

Posted by John Martyniak <jo...@avum.com>.
My original names were huey-direct and duey-direct; both names are in
the /etc/hosts file on both machines.

Are nn.internal and jt.internal special names?

-John


On Jun 9, 2009, at 9:26 AM, Steve Loughran wrote:

> John Martyniak wrote:
>> David,
>> For the Option #1.
>> I just changed the names to the IP Addresses, and it still comes up  
>> as the external name and ip address in the log files, and on the  
>> job tracker screen.
>> So option 1 is a no go.
>> When I change the "dfs.datanode.dns.interface" values it doesn't  
>> seem to do anything.  When I was search archived mail, this seemed  
>> to be a the approach to change the NIC card being used for  
>> resolution.  But when I change it nothing happens, I even put in  
>> bogus values and still no issues.
>> -John
>
>
> I've been having similar but different fun with Hadoop-on-VMs,  
> there's a lot of assumption that DNS and rDNS all works consistently  
> in the code. Do you have separate internal and external hostnames?  
> In which case, can you bring up the job tracker as jt.internal ,  
> namenode as nn.internal (so the full HDFS URl is something like  
> hdfs://nn.internal/ ) , etc, etc.?


Re: Multiple NIC Cards

Posted by Steve Loughran <st...@apache.org>.
John Martyniak wrote:
> David,
> 
> For the Option #1.
> 
> I just changed the names to the IP Addresses, and it still comes up as 
> the external name and ip address in the log files, and on the job 
> tracker screen.
> 
> So option 1 is a no go.
> 
> When I change the "dfs.datanode.dns.interface" values it doesn't seem to 
> do anything.  When I was search archived mail, this seemed to be a the 
> approach to change the NIC card being used for resolution.  But when I 
> change it nothing happens, I even put in bogus values and still no issues.
> 
> -John


I've been having similar but different fun with Hadoop-on-VMs; there's
a lot of assumption in the code that DNS and rDNS all work
consistently. Do you have separate internal and external hostnames? In
which case, can you bring up the job tracker as jt.internal and the
namenode as nn.internal (so the full HDFS URL is something like
hdfs://nn.internal/), etc.?

Re: Multiple NIC Cards

Posted by John Martyniak <jo...@avum.com>.
David,

For Option #1:

I just changed the names to the IP addresses, and it still comes up
as the external name and IP address in the log files and on the job
tracker screen.

So option 1 is a no-go.

When I change the "dfs.datanode.dns.interface" value, it doesn't seem
to do anything.  When I was searching the archived mail, this seemed
to be the approach for changing the NIC used for resolution.  But when
I change it nothing happens; I even put in bogus values and still saw
no errors.

-John

On Jun 9, 2009, at 7:30 AM, David B. Ritch wrote:

> I see several possibilities:
>
> 1) Use IP addresses instead of machine names in the config files.
> 2) Make sure that the names are in local /etc/hosts files, and that  
> they
> resolve to the internal IP addresses.
> 3) Set up an internal name server, and make sure that it serves your
> internal addresses and your cluster nodes point to it.
>
> In general, how do you resolve your machine names to the internal  
> addresses?
>
> David
>
> John Martyniak wrote:
>> Hi,
>>
>> I am creating a small Hadoop (0.19.1) cluster (2 nodes to start),  
>> each
>> of the machines has 2 NIC cards (1 external facing, 1 internal
>> facing).  It is important that Hadoop run and communicate on the
>> internal facing NIC (because the external facing NIC costs me money),
>> also the internal is more protected.
>>
>> So I found a couple of parameters both in the mail archives and on  
>> the
>> Hadoop core site, but they don't seem to do anything as the result is
>> always the same, it resolves to the external IP, and name (which is  
>> in
>> DNS).
>>
>> <property>
>>  <name>dfs.datanode.dns.interface</name>
>>  <value>en1</value>
>>  <description>The name of the Network Interface from which a data
>> node should
>>  report its IP address.
>>  </description>
>> </property>
>>
>> <property>
>>  <name>dfs.datanode.dns.nameserver</name>
>>  <value>DN</value>
>>  <description>The host name or IP address of the name server (DNS)
>>  which a DataNode should use to determine the host name used by the
>>  NameNode for communication and display purposes.
>>  </description>
>> </property>
>>
>> <property>
>>  <name>mapred.tasktracker.dns.interface</name>
>>  <value>en1</value>
>>  <description>The name of the Network Interface from which a task
>>  tracker should report its IP address.
>>  </description>
>> </property>
>>
>> <property>
>>  <name>mapred.tasktracker.dns.nameserver</name>
>>  <value>DN</value>
>>  <description>The host name or IP address of the name server (DNS)
>>  which a TaskTracker should use to determine the host name used by
>>  the JobTracker for communication and display purposes.
>>  </description>
>> </property>
>>
>> Any help would be greatly appreciated.
>>
>> Thank you in advance.
>>
>> -John
>>
>>
>> John Martyniak
>> Before Dawn Solutions, Inc.
>> 9457 S. University Blvd #266
>> Highlands Ranch, CO 80126
>> o: 877-499-1562
>> c: 303-522-1756
>> e: john@beforedawnsoutions.com
>> w: http://www.beforedawnsolutions.com
>>
>>
>


Re: Multiple NIC Cards

Posted by "David B. Ritch" <Da...@gmail.com>.
I see several possibilities:

1) Use IP addresses instead of machine names in the config files.
2) Make sure that the names are in local /etc/hosts files, and that they
resolve to the internal IP addresses (see the example below).
3) Set up an internal name server, and make sure that it serves your
internal addresses and your cluster nodes point to it.
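
For example, a minimal /etc/hosts for the two nodes in this thread
might look like this on each machine (a sketch; the names and
addresses are the internal ones mentioned in the thread):

192.168.1.102   duey-direct
192.168.1.103   huey-direct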

In general, how do you resolve your machine names to the internal addresses?

David

John Martyniak wrote:
> Hi,
>
> I am creating a small Hadoop (0.19.1) cluster (2 nodes to start), each
> of the machines has 2 NIC cards (1 external facing, 1 internal
> facing).  It is important that Hadoop run and communicate on the
> internal facing NIC (because the external facing NIC costs me money),
> also the internal is more protected.
>
> So I found a couple of parameters both in the mail archives and on the
> Hadoop core site, but they don't seem to do anything as the result is
> always the same, it resolves to the external IP, and name (which is in
> DNS).
>
>  <property>
>   <name>dfs.datanode.dns.interface</name>
>   <value>en1</value>
>   <description>The name of the Network Interface from which a data
> node should
>   report its IP address.
>   </description>
>  </property>
>
> <property>
>   <name>dfs.datanode.dns.nameserver</name>
>   <value>DN</value>
>   <description>The host name or IP address of the name server (DNS)
>   which a DataNode should use to determine the host name used by the
>   NameNode for communication and display purposes.
>   </description>
>  </property>
>
> <property>
>   <name>mapred.tasktracker.dns.interface</name>
>   <value>en1</value>
>   <description>The name of the Network Interface from which a task
>   tracker should report its IP address.
>   </description>
>  </property>
>
> <property>
>   <name>mapred.tasktracker.dns.nameserver</name>
>   <value>DN</value>
>   <description>The host name or IP address of the name server (DNS)
>   which a TaskTracker should use to determine the host name used by
>   the JobTracker for communication and display purposes.
>   </description>
>  </property>
>
> Any help would be greatly appreciated.
>
> Thank you in advance.
>
> -John
>
>
> John Martyniak
> Before Dawn Solutions, Inc.
> 9457 S. University Blvd #266
> Highlands Ranch, CO 80126
> o: 877-499-1562
> c: 303-522-1756
> e: john@beforedawnsoutions.com
> w: http://www.beforedawnsolutions.com
>
>