Posted to common-dev@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2008/10/09 15:34:44 UTC

[jira] Created: (HADOOP-4383) Add standard interface/methods for all services to query IPC and HTTP addresses and ports

Add standard interface/methods for all services to query IPC and HTTP addresses and ports
-----------------------------------------------------------------------------------------

                 Key: HADOOP-4383
                 URL: https://issues.apache.org/jira/browse/HADOOP-4383
             Project: Hadoop Core
          Issue Type: Improvement
          Components: dfs, mapred
    Affects Versions: 0.20.0
            Reporter: Steve Loughran
            Priority: Minor


This is something I've ended up doing in subclasses of all the services: adding methods to get at the IPC and HTTP ports and addresses. Some services have exported methods for this (JobTracker), others have package-private member variables (NameNode), while others don't let you get at all the data (DataNode keeps the HTTP server private).

A uniform way to query any service for its live port and address values would make some aspects of service management much easier, such as feeding those values into HTTP page monitoring tools.
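
As a rough illustration of the request, a uniform query interface might look something like the sketch below. Everything named here (ServiceAddresses, getIpcAddress, getHttpAddress, MonitorUrls) is hypothetical and not part of Hadoop, and the host name and port numbers are example values only.

```java
import java.net.InetSocketAddress;

/** Demo entry point: queries a stand-in service and prints a URL a page monitor could poll. */
class ServiceAddressesDemo {
    public static void main(String[] args) {
        // Example host and ports only; a real service would report its live bindings.
        ServiceAddresses nameNode = new ServiceAddresses() {
            @Override
            public InetSocketAddress getIpcAddress() {
                return InetSocketAddress.createUnresolved("nn.example.org", 8020);
            }

            @Override
            public InetSocketAddress getHttpAddress() {
                return InetSocketAddress.createUnresolved("nn.example.org", 50070);
            }
        };
        System.out.println(MonitorUrls.httpUrl(nameNode)); // http://nn.example.org:50070/
    }
}

/** Hypothetical interface (not a Hadoop API) that every service would implement. */
interface ServiceAddresses {
    /** Live IPC address, or null if the service has no IPC endpoint or is not running. */
    InetSocketAddress getIpcAddress();

    /** Live HTTP address, or null if the service has no HTTP endpoint or is not running. */
    InetSocketAddress getHttpAddress();
}

/** Hypothetical helper that turns a live HTTP address into a URL for a monitoring tool. */
final class MonitorUrls {
    private MonitorUrls() {
    }

    static String httpUrl(ServiceAddresses service) {
        InetSocketAddress http = service.getHttpAddress();
        if (http == null) {
            throw new IllegalStateException("service has no live HTTP address");
        }
        return "http://" + http.getHostString() + ":" + http.getPort() + "/";
    }
}
```

With an interface of this shape, a management tool could treat every daemon the same way instead of special-casing each service class.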

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4383) Add standard interface/methods for all services to query IPC and HTTP addresses and ports

Posted by "Jeff Hammerbacher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714489#action_12714489 ] 

Jeff Hammerbacher commented on HADOOP-4383:
-------------------------------------------

Hey Todd,

Leveraging a "core J2EE pattern" (http://java.sun.com/blueprints/corej2eepatterns/Patterns/ServiceLocator.html) to make services inherit from a common interface sounds like a great idea! It should simplify service management considerably. Go for it.

Later,
Jeff



[jira] Commented: (HADOOP-4383) Add standard interface/methods for all services to query IPC and HTTP addresses and ports

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647250#action_12647250 ] 

Steve Loughran commented on HADOOP-4383:
----------------------------------------

SmartFrog is currently under the LGPL, but anything I push into the Apache codebase is relicensed under the Apache license; I need to do some paperwork for such events. Have a look at the interface and see if you are happy with the idea, and I will sort things out.



[jira] Commented: (HADOOP-4383) Add standard interface/methods for all services to query IPC and HTTP addresses and ports

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638288#action_12638288 ] 

Steve Loughran commented on HADOOP-4383:
----------------------------------------

This could be integrated with HADOOP-3628 to have a ServiceWithPorts class that exposes the various methods. In my own code I've patched a ServiceInfo interface on top that has the named methods, plus one that returns the number of live workers for a service.



[jira] Commented: (HADOOP-4383) Add standard interface/methods for all services to query IPC and HTTP addresses and ports

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638344#action_12638344 ] 

Raghu Angadi commented on HADOOP-4383:
--------------------------------------

Could you attach an approximate patch for this JIRA alone, if it is not much work to separate it from HADOOP-3628? It does not even need to compile. Thanks.



[jira] Commented: (HADOOP-4383) Add standard interface/methods for all services to query IPC and HTTP addresses and ports

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647070#action_12647070 ] 

Konstantin Shvachko commented on HADOOP-4383:
---------------------------------------------

Is your code under a GNU license?



[jira] Commented: (HADOOP-4383) Add standard interface/methods for all services to query IPC and HTTP addresses and ports

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715088#action_12715088 ] 

Todd Lipcon commented on HADOOP-4383:
-------------------------------------

Hey Steve,

I see this and the service lifecycle interface as orthogonal tasks. Making services extend an interface is important if you want to use some sort of Java-based container to handle cluster lifecycle management (the "choreography of booting" as you nicely described it). There are some people, however, who already do this choreography using some different external (non-Java) tools where that common Java interface is unnecessary (for example, using unix tools to manage the services via init scripts).

Regarding the J2EE pattern, I think Jeff was just pointing it out as an example of this pattern that we should look to for inspiration rather than suggesting any actual integration of parts of the J2EE stack.

-Todd



[jira] Commented: (HADOOP-4383) Add standard interface/methods for all services to query IPC and HTTP addresses and ports

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713322#action_12713322 ] 

Todd Lipcon commented on HADOOP-4383:
-------------------------------------

Rather than adding interfaces to the specific daemons, I'd like to propose going the other direction and factoring this into a ServiceLocator interface. This would provide the traditional "name service" role which many distributed systems implicitly assume exists.

When any daemon opens up a server, it would register it (IP(s), port, protocol type) with the ServiceLocator. When any daemon (or client) wants to connect to a specific endpoint, it queries the ServiceLocator to find it. The initial implementation of this interface would simply perform lookups in Configuration, maintaining the status quo, but I foresee a lot of other very useful potential implementations:
 - The J2EE-ish solution (I'm not a big J2EE guy, but I think JMS or JNDI are the appropriate TLAs here?)
 - ZooKeeper
 - mdns (aka zeroconf)
 - RFC 2136 dynamic DNS updates
 - Organization specific service locators (eg SmartFrog)
 - Amazon Elastic IP (eg automatically attach an elastic IP to the NN when the NN boots)

Making this nicely pluggable through contrib jars would do well to allow flexibility while keeping core clean.

This should solve several goals in parallel:
  - Factors out common code regarding "bind address" configurations, wildcard addresses, localhost vs wildcard vs external IPs, etc
  - Reduces the reliance on "writing back" into Conf objects at service start time, which I think most people would agree is a somewhat dirty practice.
  - Provides pluggable methods we'll need later if we look into automatic failover of master daemons
  - Provides better integration with external systems already in use in various organizations (eg SmartFrog, Thrift-based service directories, etc)
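
A minimal sketch of how the register/lookup proposal above might be shaped, assuming a two-method interface. The ServiceLocator signature, the ConfigServiceLocator class, and the "service.<name>.<protocol>.address" key format are all hypothetical; java.util.Properties stands in for Hadoop's Configuration so the example is self-contained, and treating register as a no-op in the configuration-backed case is only one possible reading of "maintaining the status quo".

```java
import java.net.InetSocketAddress;
import java.util.Properties;

/** Demo entry point: resolves a NameNode IPC endpoint from static configuration. */
class ServiceLocatorDemo {
    public static void main(String[] args) {
        Properties conf = new Properties();
        // Hypothetical key format and example address.
        conf.setProperty("service.namenode.ipc.address", "nn.example.org:8020");

        ServiceLocator locator = new ConfigServiceLocator(conf);
        System.out.println(locator.lookup("namenode", "ipc"));
    }
}

/** Hypothetical pluggable name-service interface; not an existing Hadoop API. */
interface ServiceLocator {
    /** Called by a daemon once it has opened a server endpoint. */
    void register(String service, String protocol, InetSocketAddress address);

    /** Called by a daemon or client to find an endpoint; returns null if none is known. */
    InetSocketAddress lookup(String service, String protocol);
}

/** Status-quo style implementation: endpoints come from static configuration. */
final class ConfigServiceLocator implements ServiceLocator {
    private final Properties conf;

    ConfigServiceLocator(Properties conf) {
        this.conf = conf;
    }

    private static String key(String service, String protocol) {
        return "service." + service + "." + protocol + ".address";
    }

    @Override
    public void register(String service, String protocol, InetSocketAddress address) {
        // Assumption: static configuration is never written back to, so this is a no-op.
        // A ZooKeeper- or DNS-backed locator would publish the address here instead.
    }

    @Override
    public InetSocketAddress lookup(String service, String protocol) {
        String value = conf.getProperty(key(service, protocol));
        if (value == null) {
            return null;
        }
        int colon = value.lastIndexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("expected host:port but got " + value);
        }
        String host = value.substring(0, colon);
        int port = Integer.parseInt(value.substring(colon + 1));
        // Left unresolved so that this sketch performs no DNS lookup.
        return InetSocketAddress.createUnresolved(host, port);
    }
}
```

Other backends from the list above would implement the same two methods, which is what would let them ship as pluggable contrib jars.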



[jira] Commented: (HADOOP-4383) Add standard interface/methods for all services to query IPC and HTTP addresses and ports

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715080#action_12715080 ] 

Steve Loughran commented on HADOOP-4383:
----------------------------------------

Jeff, HADOOP-3628 adds a common base class for services; I am busy running the tests to see that it works with SVN_HEAD.

The ServiceLocator pattern in J2EE is subtly different, as J2EE is pretty weak on most aspects of operations; we also have to deal with the choreography problems of booting a large cluster, in which the ops team reserves the right to move stuff in order to maintain availability.

This is something I can help with. Getting that base lifecycle stuff in there would be the first step; then we can deal with binding/rebinding and config. Oh, and all the tests.





[jira] Commented: (HADOOP-4383) Add standard interface/methods for all services to query IPC and HTTP addresses and ports

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638544#action_12638544 ] 

Steve Loughran commented on HADOOP-4383:
----------------------------------------

I could certainly separate it; the main issue is that my copy of Hadoop has diverged somewhat, due mainly to the lifecycle work. I'd have to check out a new version and build the patch against that.

What I've done in my code is add an interface in our own codebase
http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/components/hadoop/src/org/smartfrog/services/hadoop/core/ServiceInfo.java?view=markup
and then patch my subclasses of the various services:

http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/components/hadoop/src/org/apache/hadoop/hdfs/server/namenode/ExtNameNode.java?view=markup
http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/components/hadoop/src/org/apache/hadoop/mapred/ExtJobTracker.java?view=markup
http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/components/hadoop/src/org/apache/hadoop/mapred/ExtTaskTracker.java?view=markup

One trouble spot here is the datanode, which doesn't expose enough information to get the HTTP port; I return an error value:
http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/components/hadoop/src/org/apache/hadoop/hdfs/server/datanode/ExtDataNode.java?view=markup

Putting this stuff in an interface that is visible to all classes would be best, though it runs up against the interface/class issue. Patching the methods into a subclass of the service class would make it consistent and avoid adding an extra interface, which is just my current way to add the methods to every service without changing the Hadoop codebase itself.
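
The workaround described above (an interface kept outside the Hadoop codebase, implemented by a subclass of each daemon, with an error value where the base class hides the data) could be sketched roughly as follows. This is not the SmartFrog code: PortInfo, TrackerLike, DataNodeLike and their Ext subclasses are hypothetical stand-ins, and the choice of -1 as the error value is an assumption, since the comment does not say which value is returned.

```java
/** Demo entry point: shows one service that can report both ports and one that cannot. */
class PortInfoDemo {
    public static void main(String[] args) {
        System.out.println(describe(new ExtTrackerLike(9001, 50030)));
        System.out.println(describe(new ExtDataNodeLike(50020)));
    }

    static String describe(PortInfo service) {
        int http = service.getHttpPort();
        String httpText = (http == PortInfo.UNKNOWN_PORT) ? "unknown" : String.valueOf(http);
        return "ipc=" + service.getIpcPort() + " http=" + httpText;
    }
}

/** Hypothetical interface kept outside the daemon codebase. */
interface PortInfo {
    /** Assumed error value for a port that the base class does not expose. */
    int UNKNOWN_PORT = -1;

    int getIpcPort();

    int getHttpPort();
}

/** Stand-in for a daemon whose port fields are visible to subclasses. */
class TrackerLike {
    protected final int ipcPort;
    protected final int httpPort;

    TrackerLike(int ipcPort, int httpPort) {
        this.ipcPort = ipcPort;
        this.httpPort = httpPort;
    }
}

/** Subclass that adds the query methods without touching the base class. */
class ExtTrackerLike extends TrackerLike implements PortInfo {
    ExtTrackerLike(int ipcPort, int httpPort) {
        super(ipcPort, httpPort);
    }

    @Override
    public int getIpcPort() {
        return ipcPort;
    }

    @Override
    public int getHttpPort() {
        return httpPort;
    }
}

/** Stand-in for a daemon that keeps its HTTP details private. */
class DataNodeLike {
    protected final int ipcPort;
    private final int httpPort = 50075; // example value; not reachable from subclasses

    DataNodeLike(int ipcPort) {
        this.ipcPort = ipcPort;
    }
}

/** Subclass that can only report the IPC port and returns the error value for HTTP. */
class ExtDataNodeLike extends DataNodeLike implements PortInfo {
    ExtDataNodeLike(int ipcPort) {
        super(ipcPort);
    }

    @Override
    public int getIpcPort() {
        return ipcPort;
    }

    @Override
    public int getHttpPort() {
        return UNKNOWN_PORT;
    }
}
```

The sketch also shows why the approach is only a stopgap: a subclass cannot recover anything the base class keeps private, so callers have to handle the error value.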
