Posted to hdfs-user@hadoop.apache.org by Bryan Beaudreault <bb...@hubspot.com> on 2013/01/11 18:01:28 UTC

services requiring topology conf

The documentation on topology conf (topology.script.file.name) is a little
sparse, and while we have it working in our cluster I am trying to make it
a little easier to configure.

Currently we upload a python file and conf file to every node in our
cluster.  However I have a feeling that it is only needed on the
NameNode(s) and perhaps JobTracker.  I checked the code for DataNode and
see no reference to this configuration parameter, but I wanted to check
with you all before I stop updating the conf on every one of my nodes.

Can anyone confirm whether these configuration files only need to be
present on the NameNode/JobTracker, or do they need to be on every node in
a cluster?

Thanks

Re: services requiring topology conf

Posted by Bryan Beaudreault <bb...@hubspot.com>.

Thanks Adam!


On Fri, Jan 11, 2013 at 12:15 PM, Adam Faris <af...@linkedin.com> wrote:

> A patch was submitted for topology documentation, but it doesn't appear to
> have made it to any releases.  This svn link may help starting at line 1294.
>
> http://svn.apache.org/viewvc?view=revision&revision=1411359
>
> Assuming you are using Hadoop 1.x and not YARN, the topology script only
> needs to be on the NameNode and JobTracker.  As you have noticed, it doesn't
> hurt anything if you copy the script everywhere, as the TaskTracker and
> DataNode processes will ignore it.  Try looking at pdsh for controlling
> compute nodes and pushing files, but be careful: if you type a bad
> command, it's going to get run everywhere. http://code.google.com/p/pdsh/
>
> -- Adam
>
>
> On Jan 11, 2013, at 9:01 AM, Bryan Beaudreault <bb...@hubspot.com>
> wrote:
>
> > The documentation on topology conf (topology.script.file.name) is a
> > little sparse, and while we have it working in our cluster I am trying to
> > make it a little easier to configure.
> >
> > Currently we upload a python file and conf file to every node in our
> > cluster.  However I have a feeling that it is only needed on the
> > NameNode(s) and perhaps JobTracker.  I checked the code for DataNode and
> > see no reference to this configuration parameter, but I wanted to check
> > with you all before I stop updating the conf on every one of my nodes.
> >
> > Can anyone confirm whether these configuration files only need to be
> > present on the NameNode/JobTracker, or do they need to be on every node in
> > a cluster?
> >
> > Thanks
>
>

Re: services requiring topology conf

Posted by Adam Faris <af...@linkedin.com>.

A patch was submitted for topology documentation, but it doesn't appear to have made it to any releases.  This svn link may help starting at line 1294.

http://svn.apache.org/viewvc?view=revision&revision=1411359  

Assuming you are using Hadoop 1.x and not YARN, the topology script only needs to be on the NameNode and JobTracker.  As you have noticed, it doesn't hurt anything if you copy the script everywhere, as the TaskTracker and DataNode processes will ignore it.  Try looking at pdsh for controlling compute nodes and pushing files, but be careful: if you type a bad command, it's going to get run everywhere. http://code.google.com/p/pdsh/

-- Adam

On Jan 11, 2013, at 9:01 AM, Bryan Beaudreault <bb...@hubspot.com> wrote:

> The documentation on topology conf (topology.script.file.name) is a little sparse, and while we have it working in our cluster I am trying to make it a little easier to configure.
> 
> Currently we upload a python file and conf file to every node in our cluster.  However I have a feeling that it is only needed on the NameNode(s) and perhaps JobTracker.  I checked the code for DataNode and see no reference to this configuration parameter, but I wanted to check with you all before I stop updating the conf on every one of my nodes.
> 
> Can anyone confirm whether these configuration files only need to be present on the NameNode/JobTracker, or do they need to be on every node in a cluster?
> 
> Thanks
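
For readers who have not seen one, here is a rough sketch of the kind of Python script that topology.script.file.name can point to. It is not the script from this thread. It assumes the usual contract: Hadoop runs the script with one or more hostnames or IP addresses as arguments and reads one rack path per argument from standard output. The conf file location (/etc/hadoop/conf/topology.data), its "host rack" line format, and the helper names are all hypothetical, chosen only for illustration.

```python
#!/usr/bin/env python
"""Illustrative rack-topology script (a sketch, not the poster's actual script)."""
import sys

# Hypothetical location of the host-to-rack mapping file.
CONF_PATH = "/etc/hadoop/conf/topology.data"
# Rack reported for any node missing from the mapping.
DEFAULT_RACK = "/default-rack"


def parse_rack_lines(lines):
    """Build a {host_or_ip: rack} dict from lines such as '10.0.0.1 /dc1/rack1'.

    Blank lines and anything after a '#' are ignored (assumed file format).
    """
    mapping = {}
    for line in lines:
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2:
            mapping[fields[0]] = fields[1]
    return mapping


def resolve(nodes, mapping):
    """Return one rack path per requested node, in the same order."""
    return [mapping.get(node, DEFAULT_RACK) for node in nodes]


def main(argv):
    try:
        with open(CONF_PATH) as handle:
            mapping = parse_rack_lines(handle)
    except IOError:
        # Without a readable conf file, every node falls back to the default rack.
        mapping = {}
    # One rack per argument, separated by whitespace.
    print(" ".join(resolve(argv[1:], mapping)))


if __name__ == "__main__":
    main(sys.argv)
```

In line with Adam's answer, on Hadoop 1.x a script like this and its conf file would only need to be installed on the NameNode and JobTracker hosts.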
