Posted to common-dev@hadoop.apache.org by Stefan Groschupf <sg...@media-style.com> on 2006/02/17 01:06:57 UTC

find a tasktracker that process a named task

Hi,
this has been on my mind for some time.
Since the global locking discussion on the nutch list, I have wanted
to share my thoughts on this list.

First let me give a use case:
We have some nutch index search servers running as MapRunnable tasks.
The problem: how does a client that wants to connect to these search
servers know all the hosts that run this runnable, and how can it
track changes?

I once implemented a kind of prototype for this use case and did it
like this:
I started an RPC server on the search client side (my Tomcat host),
and all search servers registered themselves there. The client then
used this registry to look up the search servers.

I think there could be a better, more general solution.
Is this already possible, or what do people think about extending
the jobtracker API to query the tasktrackers that run a named task?
This would be a smarter solution for the use case described. People
would also be able to start a map runnable that runs a locking
service and then connect to it.
However, I agree with Doug's point of view that centralized locking
makes no sense for this specific fetching use case.

Any comments?
Stefan


Re: find a tasktracker that process a named task

Posted by Stefan Groschupf <sg...@media-style.com>.
Thanks, that is a cool idea. Would it still be possible to provide
status information, such as the number of queries a search server is
currently processing?
I don't think so, since there is only one status and no status
protocol, right? Using status for things like load messages could help
to decide whether it makes sense to start more search servers
automatically at times of high load.
But maybe this can also be handled by the search server client. I will
add this to my todo list and try it out. :)
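
One possible way to live with a single status string is to pack both
the address and a load figure into it. The sketch below is only an
illustration: the class name SearchServerStatus and the
"host:port queries=N" format are made up for this example and are not
a Hadoop convention.

```java
// Hypothetical helper, not part of Hadoop: packs an address and a query
// count into one status string of the assumed form "host:port queries=N".
class SearchServerStatus {
    final String host;
    final int port;
    final long queries;

    SearchServerStatus(String host, int port, long queries) {
        this.host = host;
        this.port = port;
        this.queries = queries;
    }

    // Builds the string a search server would pass to setStatus().
    String format() {
        return host + ":" + port + " queries=" + queries;
    }

    // Parses a status string produced by format(); throws
    // IllegalArgumentException if the string has another shape.
    static SearchServerStatus parse(String status) {
        String[] parts = status.trim().split("\\s+");
        if (parts.length != 2 || !parts[1].startsWith("queries=")) {
            throw new IllegalArgumentException("unexpected status: " + status);
        }
        int colon = parts[0].lastIndexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("missing host:port in: " + status);
        }
        String host = parts[0].substring(0, colon);
        int port = Integer.parseInt(parts[0].substring(colon + 1));
        long queries = Long.parseLong(parts[1].substring("queries=".length()));
        return new SearchServerStatus(host, port, queries);
    }

    public static void main(String[] args) {
        // Example host name and numbers are placeholders.
        String status = new SearchServerStatus("search1.example.com", 9001, 42).format();
        System.out.println(status);
        SearchServerStatus parsed = parse(status);
        System.out.println(parsed.host + " " + parsed.port + " " + parsed.queries);
    }
}
```

A client could then sort or filter the parsed entries by the queries
field, for example to decide whether more search servers are needed.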

Stefan

On 17.02.2006 at 01:14, Doug Cutting wrote:

> Stefan Groschupf wrote:
>> First let me give a use case:
>> We have some nutch index search servers running as MapRunnable tasks.
>> The problem: how does a client that wants to connect to these search
>> servers know all the hosts that run this runnable, and how can it
>> track changes?
>
> Here's a hack I've contemplated: you can call setStatus() with the  
> host:port, then use JobSubmissionProtocol.getMapTaskReports() to  
> retrieve all of the status strings.  Voila.
>
> Doug
>

---------------------------------------------
George Orwell was an Optimist
blog: http://www.find23.org
company: http://www.media-style.com



Re: find a tasktracker that process a named task

Posted by Doug Cutting <cu...@apache.org>.
Stefan Groschupf wrote:
> First let me give a use case:
> We have some nutch index search servers running as MapRunnable tasks.
> The problem: how does a client that wants to connect to these search
> servers know all the hosts that run this runnable, and how can it
> track changes?

Here's a hack I've contemplated: you can call setStatus() with the 
host:port, then use JobSubmissionProtocol.getMapTaskReports() to 
retrieve all of the status strings.  Voila.

Doug
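
A minimal sketch of the client side of this hack, assuming each search
server has already called setStatus() with its "host:port" and that the
client has collected the status strings from the task reports into a
plain list. The class SearchServerDiscovery and its method are
hypothetical helpers, not Hadoop API; fetching the reports themselves is
left out because it needs a running jobtracker.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: turns task status strings
// of the assumed form "host:port" into socket addresses.
class SearchServerDiscovery {

    // Returns one address per well-formed "host:port" status.
    // Anything else (null, progress messages, bad ports) is skipped,
    // since a task may report another status before it publishes its address.
    static List<InetSocketAddress> parseStatuses(List<String> statuses) {
        List<InetSocketAddress> servers = new ArrayList<>();
        for (String status : statuses) {
            if (status == null) {
                continue;
            }
            String s = status.trim();
            int colon = s.lastIndexOf(':');
            if (colon <= 0 || colon == s.length() - 1) {
                continue; // no host or no port part
            }
            try {
                int port = Integer.parseInt(s.substring(colon + 1));
                if (port < 1 || port > 65535) {
                    continue;
                }
                // Unresolved: no DNS lookup happens while parsing.
                servers.add(InetSocketAddress.createUnresolved(s.substring(0, colon), port));
            } catch (NumberFormatException e) {
                // Port part is not a number, so this is not an address.
            }
        }
        return servers;
    }

    public static void main(String[] args) {
        // Placeholder statuses standing in for what the task reports would return.
        List<String> statuses = List.of(
                "search1.example.com:9001", "initializing", "search2.example.com:9002");
        for (InetSocketAddress server : parseStatuses(statuses)) {
            System.out.println(server.getHostString() + ":" + server.getPort());
        }
    }
}
```

Polling the reports periodically and re-running the parse would give
the client a way to track changes in the set of search servers.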