Posted to common-user@hadoop.apache.org by "Ramesh.Ramasamy" <ra...@gmail.com> on 2009/10/19 15:40:20 UTC

editing etc hosts files of a cluster

Hi,

I have a cluster setup with 3 nodes, and I'm adding the hostname details (in
/etc/hosts) manually on each node. This doesn't seem like an effective approach.
How is this scenario handled in big clusters?

Is there a simple way to add the hostname details on all the nodes by
editing a single entry/file/script?

Thanks and Regards,
Ramesh


-- 
View this message in context: http://www.nabble.com/editing-etc-hosts-files-of-a-cluster-tp25958579p25958579.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.


One day sold out -- Hadoop Conference Japan 2009

Posted by Mikio Uzawa <m_...@amber.plala.or.jp>.
Hi all,

The conference will take place Nov. 13 in Tokyo.
http://jclouds.wordpress.com/2009/10/19/one-day-sold-out-hadoop-conference-japan-2009/

One day sold out -- It's amazing!

/mikio




Re: editing etc hosts files of a cluster

Posted by Allen Wittenauer <aw...@linkedin.com>.
Everything can be made to work at a small scale.  As the grid grows,
well...


On 10/20/09 10:32 AM, "David Ritch" <da...@gmail.com> wrote:

> I also prefer to avoid custom software, and follow standards.  We use Puppet
> to manage our node configuration (including hadoop config files), and adding
> one more file to the configuration is trivial.
> 
> I prefer not to run additional daemons on all my nodes when I can avoid it.
> Replicating our hosts file allows us to avoid running named on all the
> nodes.
> 
> David
> 
> On Tue, Oct 20, 2009 at 1:15 PM, Allen Wittenauer
> <aw...@linkedin.com>wrote:
> 
>> Any time you deal with pushing files around, you also have to deal with the
>> repercussions of when the file fails to get to its destination or it fails
>> to get there in a timely manner. [Hai hadoop config files.] If you use an
>> interface alias/vip/multi-a/whatever to deal with namenode availability,
>> then the host information becomes even more critical.
>> 
>> Rather than build something custom, I chose to use well known, off the
>> shelf
>> software to deal with keeping host information relatively in-sync:  bind.
>> 
>> 
>> On 10/19/09 8:09 PM, "David B. Ritch" <da...@gmail.com> wrote:
>> 
>>> Most of the communication and name lookups within a cluster refer to
>>> other nodes within that same cluster.  It is usually not a big deal to
>>> put all the systems from a cluster in a single hosts file, and rsync it
>>> around the cluster.  (Consider using prsync, which comes with pssh,
>>> http://www.theether.org/pssh/, or your favorite cluster management
>>> software.)
>>> Editing each individually clearly doesn't scale; but editing it once and
>>> replicating it does.
>>> 
>>> Is a large hosts file less efficient than nscd or a caching DNS server
>>> for nodes within the cluster?
>>> 
>>> Thanks,
>>> 
>>> David
>>> 
>>> On 10/19/2009 8:02 PM, Edward Capriolo wrote:
>>>> On Mon, Oct 19, 2009 at 7:17 PM, Allen Wittenauer
>>>> <aw...@linkedin.com> wrote:
>>>> 
>>>>> 
>>>>> 
>>>>> On 10/19/09 11:46 AM, "Edward Capriolo" <ed...@gmail.com> wrote:
>>>>> 
>>>>> 
>>>>>> I am interested in your post. What has caused you to run caching DNS
>>>>>> servers on each of your nodes? Is this a hadoop specific problem or a
>>>>>> problem  specific to your implementation?
>>>>>> 
>>>>> Hadoop does a -tremendous- amount of hostname lookups.  If you don't
>> have
>>>>> either nscd or a local DNS caching server, you are likely throwing what
>>>>> could be some significant performance gains away.
>>>>> 
>>>>> 
>>>>>> My assumption here is that a hadoop cluster of say 1000 nodes would
>>>>>> repeatedly talk to the same 1000 nodes.
>>>>>> 
>>>>> ... and that's the catch!  Every node running the DFSClient code or
>> being
>>>>> called out from a map/reduce task is a potential hostname that would
>> need be
>>>>> resolved.  Just think about something like distcp.
>>>>> 
>>>>> Also note that this is before we talk about monitoring, any other
>> naming
>>>>> services, CNAMEs, multi-As, etc, that get built as a normal part of
>> running
>>>>> an infrastructure.
>>>>> 
>>>>> 
>>>>>> Are you saying that nscd is
>>>>>> inadequacy to handle the size of the cache, or nscd is not very
>>>>>> efficient? What exactly is the reason you are running a caching DNS
>>>>>> server on each node?
>>>>>> 
>>>>> In the case of Yahoo!, we had (or, at least, a perception) that we had
>> or
>>>>> were going to have jobs that did a lot of direct DNS lookups and/or
>>>>> accessed/referenced things outside of the local grid.  Also note that a
>> DNS
>>>>> caching server is going to store more information about hostnames than
>> a
>>>>> simple host to IP service like nscd.
>>>>> 
>>>>> Hypothetical:  Let's say I'm building rules for a spam filter and part
>> of my
>>>>> process is to look up the MX record for a given host.  nscd isn't going
>> to
>>>>> help you there.
>>>>> 
>>>>> In the case of LinkedIn, the jury is still out.  I suspect we don't
>> have
>>>>> nscd.conf tuned correctly.  Our grid isn't that big, our connections
>> in/out
>>>>> are fairly small, etc. It has been one of the things on my todo list
>> since I
>>>>> got hired here 2 months ago. :)
>>>>> 
>>>>> [For the record, I'm not one of those crazy people who turns off nscd
>>>>> because I had a bad experience with a  broken version five years ago.
>>  In
>>>>> the case of Yahoo!, I was the crazy person who started insisting we
>> turn it
>>>>> on, albeit not for hosts.]
>>>>> 
>>>>> 
>>>>> 
>>>> Cool thanks for the info.
>>>> 
>>>> I have found NSCD to be absolutely essential in most/all situations.
>>>> Whenever I would truss processes on OS'es without NSCD (say freebsd
>>>> 6.2) I would see numerous repeated 'stat' against /etc/passwd and
>>>> /etc/group.
>>>> 
>>>> If you are doing users and groups through LDAP nscd is super important
>>>> as well. Your not going to want to make a series of lookups each stat.
>>>> 
>>>> I would think the most efficient implementation would be nscd and a
>>>> local caching server in that case. NSCD should be very efficient since
>>>> it is done through libraries, dns lookups have to open sockets
>>>> (overhead). However I can see your point nscd can not do other types
>>>> of records.
>>>> 
>>>> 
>>> 
>> 
>> 


Re: editing etc hosts files of a cluster

Posted by David Ritch <da...@gmail.com>.
I also prefer to avoid custom software, and follow standards.  We use Puppet
to manage our node configuration (including hadoop config files), and adding
one more file to the configuration is trivial.

I prefer not to run additional daemons on all my nodes when I can avoid it.
Replicating our hosts file allows us to avoid running named on all the
nodes.

David

On Tue, Oct 20, 2009 at 1:15 PM, Allen Wittenauer
<aw...@linkedin.com>wrote:

> Any time you deal with pushing files around, you also have to deal with the
> repercussions of when the file fails to get to its destination or it fails
> to get there in a timely manner. [Hai hadoop config files.] If you use an
> interface alias/vip/multi-a/whatever to deal with namenode availability,
> then the host information becomes even more critical.
>
> Rather than build something custom, I chose to use well known, off the
> shelf
> software to deal with keeping host information relatively in-sync:  bind.
>
>
> On 10/19/09 8:09 PM, "David B. Ritch" <da...@gmail.com> wrote:
>
> > Most of the communication and name lookups within a cluster refer to
> > other nodes within that same cluster.  It is usually not a big deal to
> > put all the systems from a cluster in a single hosts file, and rsync it
> > around the cluster.  (Consider using prsync, which comes with pssh,
> > http://www.theether.org/pssh/, or your favorite cluster management
> > software.)
> > Editing each individually clearly doesn't scale; but editing it once and
> > replicating it does.
> >
> > Is a large hosts file less efficient than nscd or a caching DNS server
> > for nodes within the cluster?
> >
> > Thanks,
> >
> > David
> >
> > On 10/19/2009 8:02 PM, Edward Capriolo wrote:
> >> On Mon, Oct 19, 2009 at 7:17 PM, Allen Wittenauer
> >> <aw...@linkedin.com> wrote:
> >>
> >>>
> >>>
> >>> On 10/19/09 11:46 AM, "Edward Capriolo" <ed...@gmail.com> wrote:
> >>>
> >>>
> >>>> I am interested in your post. What has caused you to run caching DNS
> >>>> servers on each of your nodes? Is this a hadoop specific problem or a
> >>>> problem  specific to your implementation?
> >>>>
> >>> Hadoop does a -tremendous- amount of hostname lookups.  If you don't
> have
> >>> either nscd or a local DNS caching server, you are likely throwing what
> >>> could be some significant performance gains away.
> >>>
> >>>
> >>>> My assumption here is that a hadoop cluster of say 1000 nodes would
> >>>> repeatedly talk to the same 1000 nodes.
> >>>>
> >>> ... and that's the catch!  Every node running the DFSClient code or
> being
> >>> called out from a map/reduce task is a potential hostname that would
> need be
> >>> resolved.  Just think about something like distcp.
> >>>
> >>> Also note that this is before we talk about monitoring, any other
> naming
> >>> services, CNAMEs, multi-As, etc, that get built as a normal part of
> running
> >>> an infrastructure.
> >>>
> >>>
> >>>> Are you saying that nscd is
> >>>> inadequacy to handle the size of the cache, or nscd is not very
> >>>> efficient? What exactly is the reason you are running a caching DNS
> >>>> server on each node?
> >>>>
> >>> In the case of Yahoo!, we had (or, at least, a perception) that we had
> or
> >>> were going to have jobs that did a lot of direct DNS lookups and/or
> >>> accessed/referenced things outside of the local grid.  Also note that a
> DNS
> >>> caching server is going to store more information about hostnames than
> a
> >>> simple host to IP service like nscd.
> >>>
> >>> Hypothetical:  Let's say I'm building rules for a spam filter and part
> of my
> >>> process is to look up the MX record for a given host.  nscd isn't going
> to
> >>> help you there.
> >>>
> >>> In the case of LinkedIn, the jury is still out.  I suspect we don't
> have
> >>> nscd.conf tuned correctly.  Our grid isn't that big, our connections
> in/out
> >>> are fairly small, etc. It has been one of the things on my todo list
> since I
> >>> got hired here 2 months ago. :)
> >>>
> >>> [For the record, I'm not one of those crazy people who turns off nscd
> >>> because I had a bad experience with a  broken version five years ago.
>  In
> >>> the case of Yahoo!, I was the crazy person who started insisting we
> turn it
> >>> on, albeit not for hosts.]
> >>>
> >>>
> >>>
> >> Cool thanks for the info.
> >>
> >> I have found NSCD to be absolutely essential in most/all situations.
> >> Whenever I would truss processes on OS'es without NSCD (say freebsd
> >> 6.2) I would see numerous repeated 'stat' against /etc/passwd and
> >> /etc/group.
> >>
> >> If you are doing users and groups through LDAP nscd is super important
> >> as well. Your not going to want to make a series of lookups each stat.
> >>
> >> I would think the most efficient implementation would be nscd and a
> >> local caching server in that case. NSCD should be very efficient since
> >> it is done through libraries, dns lookups have to open sockets
> >> (overhead). However I can see your point nscd can not do other types
> >> of records.
> >>
> >>
> >
>
>

Re: editing etc hosts files of a cluster

Posted by Steve Loughran <st...@apache.org>.
David B. Ritch wrote:
> Most of the communication and name lookups within a cluster refer to
> other nodes within that same cluster.  It is usually not a big deal to
> put all the systems from a cluster in a single hosts file, and rsync it
> around the cluster.  (Consider using prsync, which comes with pssh,
> http://www.theether.org/pssh/, or your favorite cluster management
> software.)
> Editing each individually clearly doesn't scale; but editing it once and
> replicating it does.
> 
> Is a large hosts file less efficient than nscd or a caching DNS server
> for nodes within the cluster?
> 

Pro
  * removes the DNS server as a SPOF
  * works on clusters without DNS servers (virtual ones, for example)
  * lets you set up private hostnames ("namenode", "jobtracker") that 
don't change
  * lets you keep the cluster config under SCM

Con
  * harder to push out changes
  * weird errors when your cluster is inconsistent


We could do a lot more in Hadoop to detect and report DNS problems; 
contributions here would be very welcome. They are a dog to test, though.
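A hedged sketch of the kind of consistency check alluded to here: verify that the address a node's name resolves to maps back to the same name. It uses getent so /etc/hosts is consulted as well as DNS; "localhost" stands in for a real cluster hostname.

```shell
# Forward lookup: name -> address; reverse lookup: address -> name.
# A mismatch is exactly the "different IPs for the same host" class of bug.
name=localhost
addr=$(getent hosts "$name" | awk '{print $1; exit}')
back=$(getent hosts "$addr" | awk '{print $2; exit}')
echo "forward: $name -> $addr, reverse: $addr -> $back"
```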


Re: editing etc hosts files of a cluster

Posted by Allen Wittenauer <aw...@linkedin.com>.
Any time you deal with pushing files around, you also have to deal with the
repercussions of when the file fails to get to its destination or it fails
to get there in a timely manner. [Hai hadoop config files.] If you use an
interface alias/vip/multi-a/whatever to deal with namenode availability,
then the host information becomes even more critical.

Rather than build something custom, I chose to use well known, off the shelf
software to deal with keeping host information relatively in-sync:  bind.
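A minimal sketch of that approach: keep host data in one BIND zone file under version control instead of pushing /etc/hosts around. The zone name, addresses, and SOA values below are invented for illustration.

```shell
# One authoritative zone file replaces N copies of /etc/hosts.
# Edit here, bump the serial, reload named; slaves pick it up via zone transfer.
cat > cluster.example.zone <<'EOF'
$TTL 300
@       IN SOA ns1.cluster.example. admin.cluster.example. (1 3600 600 86400 300)
        IN NS  ns1.cluster.example.
ns1     IN A   10.0.0.1
worker1 IN A   10.0.0.2
EOF
grep -c 'IN A' cluster.example.zone
# Validate before loading, e.g.:  named-checkzone cluster.example cluster.example.zone
```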


On 10/19/09 8:09 PM, "David B. Ritch" <da...@gmail.com> wrote:

> Most of the communication and name lookups within a cluster refer to
> other nodes within that same cluster.  It is usually not a big deal to
> put all the systems from a cluster in a single hosts file, and rsync it
> around the cluster.  (Consider using prsync, which comes with pssh,
> http://www.theether.org/pssh/, or your favorite cluster management
> software.)
> Editing each individually clearly doesn't scale; but editing it once and
> replicating it does.
> 
> Is a large hosts file less efficient than nscd or a caching DNS server
> for nodes within the cluster?
> 
> Thanks,
> 
> David
> 
> On 10/19/2009 8:02 PM, Edward Capriolo wrote:
>> On Mon, Oct 19, 2009 at 7:17 PM, Allen Wittenauer
>> <aw...@linkedin.com> wrote:
>>   
>>> 
>>> 
>>> On 10/19/09 11:46 AM, "Edward Capriolo" <ed...@gmail.com> wrote:
>>> 
>>>     
>>>> I am interested in your post. What has caused you to run caching DNS
>>>> servers on each of your nodes? Is this a hadoop specific problem or a
>>>> problem  specific to your implementation?
>>>>       
>>> Hadoop does a -tremendous- amount of hostname lookups.  If you don't have
>>> either nscd or a local DNS caching server, you are likely throwing what
>>> could be some significant performance gains away.
>>> 
>>>     
>>>> My assumption here is that a hadoop cluster of say 1000 nodes would
>>>> repeatedly talk to the same 1000 nodes.
>>>>       
>>> ... and that's the catch!  Every node running the DFSClient code or being
>>> called out from a map/reduce task is a potential hostname that would need be
>>> resolved.  Just think about something like distcp.
>>> 
>>> Also note that this is before we talk about monitoring, any other naming
>>> services, CNAMEs, multi-As, etc, that get built as a normal part of running
>>> an infrastructure.
>>> 
>>>     
>>>> Are you saying that nscd is
>>>> inadequacy to handle the size of the cache, or nscd is not very
>>>> efficient? What exactly is the reason you are running a caching DNS
>>>> server on each node?
>>>>       
>>> In the case of Yahoo!, we had (or, at least, a perception) that we had or
>>> were going to have jobs that did a lot of direct DNS lookups and/or
>>> accessed/referenced things outside of the local grid.  Also note that a DNS
>>> caching server is going to store more information about hostnames than a
>>> simple host to IP service like nscd.
>>> 
>>> Hypothetical:  Let's say I'm building rules for a spam filter and part of my
>>> process is to look up the MX record for a given host.  nscd isn't going to
>>> help you there.
>>> 
>>> In the case of LinkedIn, the jury is still out.  I suspect we don't have
>>> nscd.conf tuned correctly.  Our grid isn't that big, our connections in/out
>>> are fairly small, etc. It has been one of the things on my todo list since I
>>> got hired here 2 months ago. :)
>>> 
>>> [For the record, I'm not one of those crazy people who turns off nscd
>>> because I had a bad experience with a  broken version five years ago.  In
>>> the case of Yahoo!, I was the crazy person who started insisting we turn it
>>> on, albeit not for hosts.]
>>> 
>>> 
>>>     
>> Cool thanks for the info.
>> 
>> I have found NSCD to be absolutely essential in most/all situations.
>> Whenever I would truss processes on OS'es without NSCD (say freebsd
>> 6.2) I would see numerous repeated 'stat' against /etc/passwd and
>> /etc/group.
>> 
>> If you are doing users and groups through LDAP nscd is super important
>> as well. Your not going to want to make a series of lookups each stat.
>> 
>> I would think the most efficient implementation would be nscd and a
>> local caching server in that case. NSCD should be very efficient since
>> it is done through libraries, dns lookups have to open sockets
>> (overhead). However I can see your point nscd can not do other types
>> of records.
>> 
>>   
> 


Re: editing etc hosts files of a cluster

Posted by "David B. Ritch" <da...@gmail.com>.
Most of the communication and name lookups within a cluster refer to
other nodes within that same cluster.  It is usually not a big deal to
put all the systems from a cluster in a single hosts file, and rsync it
around the cluster.  (Consider using prsync, which comes with pssh,
http://www.theether.org/pssh/, or your favorite cluster management
software.)
Editing each individually clearly doesn't scale; but editing it once and
replicating it does.
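A hedged sketch of the "edit once, replicate" workflow: build one hosts fragment from a node list, then push it out with prsync from the pssh package mentioned above. The node names, addresses, and target path are made up for illustration.

```shell
# One source of truth: a node list of "address shortname" pairs.
cat > nodes.txt <<'EOF'
10.0.0.1 master
10.0.0.2 worker1
10.0.0.3 worker2
EOF

# Expand into /etc/hosts-style lines with a fully qualified alias.
awk '{ printf "%-15s %s %s.cluster.local\n", $1, $2, $2 }' nodes.txt > hosts.cluster
cat hosts.cluster

# Replicate to every node in parallel (illustrative invocation), e.g.:
#   prsync -h <(cut -d' ' -f2 nodes.txt) hosts.cluster /etc/hosts
```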

Is a large hosts file less efficient than nscd or a caching DNS server
for nodes within the cluster?

Thanks,

David

On 10/19/2009 8:02 PM, Edward Capriolo wrote:
> On Mon, Oct 19, 2009 at 7:17 PM, Allen Wittenauer
> <aw...@linkedin.com> wrote:
>   
>>
>>
>> On 10/19/09 11:46 AM, "Edward Capriolo" <ed...@gmail.com> wrote:
>>
>>     
>>> I am interested in your post. What has caused you to run caching DNS
>>> servers on each of your nodes? Is this a hadoop specific problem or a
>>> problem  specific to your implementation?
>>>       
>> Hadoop does a -tremendous- amount of hostname lookups.  If you don't have
>> either nscd or a local DNS caching server, you are likely throwing what
>> could be some significant performance gains away.
>>
>>     
>>> My assumption here is that a hadoop cluster of say 1000 nodes would
>>> repeatedly talk to the same 1000 nodes.
>>>       
>> ... and that's the catch!  Every node running the DFSClient code or being
>> called out from a map/reduce task is a potential hostname that would need be
>> resolved.  Just think about something like distcp.
>>
>> Also note that this is before we talk about monitoring, any other naming
>> services, CNAMEs, multi-As, etc, that get built as a normal part of running
>> an infrastructure.
>>
>>     
>>> Are you saying that nscd is
>>> inadequacy to handle the size of the cache, or nscd is not very
>>> efficient? What exactly is the reason you are running a caching DNS
>>> server on each node?
>>>       
>> In the case of Yahoo!, we had (or, at least, a perception) that we had or
>> were going to have jobs that did a lot of direct DNS lookups and/or
>> accessed/referenced things outside of the local grid.  Also note that a DNS
>> caching server is going to store more information about hostnames than a
>> simple host to IP service like nscd.
>>
>> Hypothetical:  Let's say I'm building rules for a spam filter and part of my
>> process is to look up the MX record for a given host.  nscd isn't going to
>> help you there.
>>
>> In the case of LinkedIn, the jury is still out.  I suspect we don't have
>> nscd.conf tuned correctly.  Our grid isn't that big, our connections in/out
>> are fairly small, etc. It has been one of the things on my todo list since I
>> got hired here 2 months ago. :)
>>
>> [For the record, I'm not one of those crazy people who turns off nscd
>> because I had a bad experience with a  broken version five years ago.  In
>> the case of Yahoo!, I was the crazy person who started insisting we turn it
>> on, albeit not for hosts.]
>>
>>
>>     
> Cool thanks for the info.
>
> I have found NSCD to be absolutely essential in most/all situations.
> Whenever I would truss processes on OS'es without NSCD (say freebsd
> 6.2) I would see numerous repeated 'stat' against /etc/passwd and
> /etc/group.
>
> If you are doing users and groups through LDAP nscd is super important
> as well. Your not going to want to make a series of lookups each stat.
>
> I would think the most efficient implementation would be nscd and a
> local caching server in that case. NSCD should be very efficient since
> it is done through libraries, dns lookups have to open sockets
> (overhead). However I can see your point nscd can not do other types
> of records.
>
>   


Re: editing etc hosts files of a cluster

Posted by Edward Capriolo <ed...@gmail.com>.
On Mon, Oct 19, 2009 at 7:17 PM, Allen Wittenauer
<aw...@linkedin.com> wrote:
>
>
>
> On 10/19/09 11:46 AM, "Edward Capriolo" <ed...@gmail.com> wrote:
>
>> I am interested in your post. What has caused you to run caching DNS
>> servers on each of your nodes? Is this a hadoop specific problem or a
>> problem  specific to your implementation?
>
> Hadoop does a -tremendous- amount of hostname lookups.  If you don't have
> either nscd or a local DNS caching server, you are likely throwing what
> could be some significant performance gains away.
>
>> My assumption here is that a hadoop cluster of say 1000 nodes would
>> repeatedly talk to the same 1000 nodes.
>
> ... and that's the catch!  Every node running the DFSClient code or being
> called out from a map/reduce task is a potential hostname that would need be
> resolved.  Just think about something like distcp.
>
> Also note that this is before we talk about monitoring, any other naming
> services, CNAMEs, multi-As, etc, that get built as a normal part of running
> an infrastructure.
>
>> Are you saying that nscd is
>> inadequacy to handle the size of the cache, or nscd is not very
>> efficient? What exactly is the reason you are running a caching DNS
>> server on each node?
>
> In the case of Yahoo!, we had (or, at least, a perception) that we had or
> were going to have jobs that did a lot of direct DNS lookups and/or
> accessed/referenced things outside of the local grid.  Also note that a DNS
> caching server is going to store more information about hostnames than a
> simple host to IP service like nscd.
>
> Hypothetical:  Let's say I'm building rules for a spam filter and part of my
> process is to look up the MX record for a given host.  nscd isn't going to
> help you there.
>
> In the case of LinkedIn, the jury is still out.  I suspect we don't have
> nscd.conf tuned correctly.  Our grid isn't that big, our connections in/out
> are fairly small, etc. It has been one of the things on my todo list since I
> got hired here 2 months ago. :)
>
> [For the record, I'm not one of those crazy people who turns off nscd
> because I had a bad experience with a  broken version five years ago.  In
> the case of Yahoo!, I was the crazy person who started insisting we turn it
> on, albeit not for hosts.]
>
>
Cool, thanks for the info.

I have found NSCD to be absolutely essential in most/all situations.
Whenever I would truss processes on OSes without NSCD (say FreeBSD
6.2) I would see numerous repeated stat() calls against /etc/passwd and
/etc/group.

If you are doing users and groups through LDAP, nscd is super important
as well. You're not going to want to make a series of lookups on each stat.

I would think the most efficient implementation would be nscd plus a
local caching server in that case. NSCD should be very efficient since
it is done through libraries, while DNS lookups have to open sockets
(overhead). However, I can see your point that nscd cannot handle other
record types.
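The truss observation above can be reproduced in miniature with getent, which drives the same NSS machinery an uncached process hits on every call; the nsswitch.conf grep just shows which backends (files, ldap, dns, ...) each lookup has to walk when nothing caches the result.

```shell
# Resolve a user through NSS; with no cache, every such call re-reads the
# configured sources (e.g. /etc/passwd, or LDAP over the network).
getent passwd root | cut -d: -f1

# Which backends does each passwd lookup consult? (tolerate a missing line)
grep '^passwd' /etc/nsswitch.conf || true
```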

Re: editing etc hosts files of a cluster

Posted by Allen Wittenauer <aw...@linkedin.com>.


On 10/19/09 11:46 AM, "Edward Capriolo" <ed...@gmail.com> wrote:

> I am interested in your post. What has caused you to run caching DNS
> servers on each of your nodes? Is this a hadoop specific problem or a
> problem  specific to your implementation?

Hadoop does a -tremendous- number of hostname lookups.  If you don't have
either nscd or a local DNS caching server, you are likely throwing away what
could be some significant performance gains.

> My assumption here is that a hadoop cluster of say 1000 nodes would
> repeatedly talk to the same 1000 nodes.

... and that's the catch!  Every node running the DFSClient code or being
called out to from a map/reduce task is a potential hostname that would need
to be resolved.  Just think about something like distcp.

Also note that this is before we talk about monitoring, any other naming
services, CNAMEs, multi-As, etc, that get built as a normal part of running
an infrastructure.

> Are you saying that nscd is
> inadequacy to handle the size of the cache, or nscd is not very
> efficient? What exactly is the reason you are running a caching DNS
> server on each node?

In the case of Yahoo!, we had (or at least perceived that we had or were
going to have) jobs that did a lot of direct DNS lookups and/or
accessed/referenced things outside of the local grid.  Also note that a DNS
caching server is going to store more information about hostnames than a
simple host-to-IP service like nscd.

Hypothetical:  Let's say I'm building rules for a spam filter and part of my
process is to look up the MX record for a given host.  nscd isn't going to
help you there.

In the case of LinkedIn, the jury is still out.  I suspect we don't have
nscd.conf tuned correctly.  Our grid isn't that big, our connections in/out
are fairly small, etc. It has been one of the things on my todo list since I
got hired here 2 months ago. :)

[For the record, I'm not one of those crazy people who turns off nscd
because of a bad experience with a broken version five years ago.  In
the case of Yahoo!, I was the crazy person who started insisting we turn it
on, albeit not for hosts.]


Re: editing etc hosts files of a cluster

Posted by Edward Capriolo <ed...@gmail.com>.
On Mon, Oct 19, 2009 at 2:36 PM, Allen Wittenauer
<aw...@linkedin.com> wrote:
>
> A bit more specific:
>
> At Yahoo!, we had either every server as a DNS slave or a DNS caching
> server.
>
> In the case of LinkedIn, we're running Solaris so nscd is significantly
> better than its Linux counterpart.  However, we still seem to be blowing out
> the cache too much.  So we'll likely switch to DNS caching servers here as
> well.
>
> On 10/19/09 6:45 AM, "Last-chance Architect" <ar...@galatea.com> wrote:
>
>> DNS ;)
>>
>> Ramesh.Ramasamy wrote:
>>> Hi,
>>>
>>> I have a cluster setup with 3 nodes, and I'm adding hostname details (in
>>> /etc/hosts) manually in each node. Seems it is not an effective approach.
>>> How this scenario is handled in big clusters?
>>>
>>> Is there any simple of way to add the hostname details in all the nodes by
>>> editing a single entry/file/script?
>>>
>>> Thanks and Regards,
>>> Ramesh
>>>
>>>
>
>

Allen,

I am interested in your post. What has caused you to run caching DNS
servers on each of your nodes? Is this a Hadoop-specific problem or a
problem specific to your implementation?

My assumption here is that a Hadoop cluster of, say, 1000 nodes would
repeatedly talk to the same 1000 nodes. Are you saying that nscd is
inadequate to handle the size of the cache, or that nscd is not very
efficient? What exactly is the reason you are running a caching DNS
server on each node?

Thank you,
Edward

Re: editing etc hosts files of a cluster

Posted by Steve Loughran <st...@apache.org>.
Allen Wittenauer wrote:
> A bit more specific:
> 
> At Yahoo!, we had either every server as a DNS slave or a DNS caching
> server.  
> 
> In the case of LinkedIn, we're running Solaris so nscd is significantly
> better than its Linux counterpart.  However, we still seem to be blowing out
> the cache too much.  So we'll likely switch to DNS caching servers here as
> well. 

the standard Hadoop scripts don't tune DNS caching in the JVM, so Hadoop 
doesn't notice DNS entries changing; that adds extra complexity to the 
DNS-lookup-failure class of bugs: the situation where the TT and forked 
jobs see different IP addresses for the same hosts.
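One hedged way to act on this, assuming a Sun/Oracle JVM and the standard hadoop-env.sh mechanism for passing JVM flags: cap the JVM's positive DNS cache TTL so daemons eventually re-resolve changed entries. sun.net.inetaddr.ttl is a Sun-specific system property; the 60-second value is an illustrative choice, not a recommendation.

```shell
# In hadoop-env.sh (or equivalent): bound the JVM DNS cache at 60 seconds
# instead of the default, so changed DNS entries are noticed within a minute.
export HADOOP_OPTS="${HADOOP_OPTS:-} -Dsun.net.inetaddr.ttl=60"
echo "$HADOOP_OPTS"
```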

Re: editing etc hosts files of a cluster

Posted by Allen Wittenauer <aw...@linkedin.com>.
A bit more specific:

At Yahoo!, every server was either a DNS slave or a DNS caching
server.

In the case of LinkedIn, we're running Solaris so nscd is significantly
better than its Linux counterpart.  However, we still seem to be blowing out
the cache too much.  So we'll likely switch to DNS caching servers here as
well. 

On 10/19/09 6:45 AM, "Last-chance Architect" <ar...@galatea.com> wrote:

> DNS ;)
> 
> Ramesh.Ramasamy wrote:
>> Hi,
>> 
>> I have a cluster setup with 3 nodes, and I'm adding hostname details (in
>> /etc/hosts) manually in each node. Seems it is not an effective approach.
>> How this scenario is handled in big clusters?
>> 
>> Is there any simple of way to add the hostname details in all the nodes by
>> editing a single entry/file/script?
>> 
>> Thanks and Regards,
>> Ramesh
>> 
>> 


Re: editing etc hosts files of a cluster

Posted by Last-chance Architect <ar...@galatea.com>.
DNS ;)

Ramesh.Ramasamy wrote:
> Hi,
> 
> I have a cluster setup with 3 nodes, and I'm adding hostname details (in
> /etc/hosts) manually in each node. Seems it is not an effective approach.
> How this scenario is handled in big clusters?
> 
> Is there any simple of way to add the hostname details in all the nodes by
> editing a single entry/file/script? 
> 
> Thanks and Regards,
> Ramesh
> 
> 

-- 
***************************
The 'Last-Chance' Architect
www.galatea.com
(US) +1 303 731 3116
(UK) +44 20 8144 4367
***************************