Posted to common-dev@hadoop.apache.org by Samprita Hegde <sa...@gmail.com> on 2009/04/22 15:47:44 UTC

Decoupling Intertracker protocol

Hi,
   I am trying to assess the feasibility of using shared spaces for
communicating Task Completion Events in Hadoop. To do this, I am trying
to replace the InterTracker Protocol with a co-ordination space, so that one
thread in a TaskTracker puts the MapTask Completion Events onto the space
and another thread receives these events and sends them to the Reduce Tasks
launched by that TaskTracker. Even the JobTracker can subscribe to these
events to make decisions about scheduling, restarting sluggish tasks,
etc.
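
A minimal sketch of the two-thread idea above, assuming an in-memory queue
as a stand-in for the co-ordination space. MapDoneEvent, EventSpace and
SharedSpaceSketch are hypothetical names made up for illustration; they are
not Hadoop or Comet classes, and a real space would be distributed and would
let several subscribers (e.g. the JobTracker) read the same event.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for a map-task completion event
// (NOT Hadoop's TaskCompletionEvent).
final class MapDoneEvent {
    final String taskId;
    final String trackerHost;

    MapDoneEvent(String taskId, String trackerHost) {
        this.taskId = taskId;
        this.trackerHost = trackerHost;
    }
}

// Hypothetical in-memory "space". Assumption: a single consumer, so a
// FIFO queue is enough for the sketch.
final class EventSpace {
    private final BlockingQueue<MapDoneEvent> queue = new LinkedBlockingQueue<>();

    void put(MapDoneEvent e) {
        queue.add(e);
    }

    MapDoneEvent take() throws InterruptedException {
        return queue.take();
    }
}

final class SharedSpaceSketch {
    // One thread publishes completion events; another receives them and
    // "forwards" them (here: records them) for the local reduce tasks.
    static List<String> run(int mapTasks) {
        EventSpace space = new EventSpace();
        // Written only by the receiver thread, read only after join().
        List<String> delivered = new ArrayList<>();

        Thread publisher = new Thread(() -> {
            for (int i = 0; i < mapTasks; i++) {
                space.put(new MapDoneEvent("map_" + i, "tracker-a"));
            }
        });
        Thread receiver = new Thread(() -> {
            try {
                for (int i = 0; i < mapTasks; i++) {
                    MapDoneEvent e = space.take();
                    delivered.add(e.taskId + "@" + e.trackerHost);
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });

        receiver.start();
        publisher.start();
        try {
            publisher.join();
            receiver.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", ex);
        }
        return delivered;
    }

    public static void main(String[] args) {
        System.out.println(run(3));
    }
}
```

With a single publisher the events arrive in the order they were put, which
keeps the sketch deterministic; a real space would not necessarily give that
guarantee.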

Currently all the information seems to be sent via the heartbeat message in
the InterTracker Protocol. Is there a way to decouple only some parts of
the heartbeat message and put them onto the space (especially the
TaskCompletionEvent and TaskStatus)? That way the task completion events
could be exchanged directly among the TaskTrackers and not through the
JobTracker.
I am not sure whether this strategy is good for large-scale MapReduce
applications, but it might work well for small-scale MapReduce jobs.

More information on the co-ordination space that I want to use can be found
here: http://www.caip.rutgers.edu/~zhljenny/comet.htm. I am still going
through Hadoop's code and trying to understand the various protocols between
the processes. If you have any good documentation on Hadoop's
architecture, it would be really helpful for me.

Thanks a lot in advance,
Samprita Hegde

Re: Decoupling Intertracker protocol

Posted by Samprita Hegde <sa...@gmail.com>.
Hello All,
    Thanks a lot for your feedback. I will be working on this during the
summer break and will keep you posted.
Thanks again,
Samprita


On Sat, Apr 25, 2009 at 2:47 AM, Arun C Murthy <ac...@yahoo-inc.com> wrote:

>
> On Apr 22, 2009, at 7:17 PM, Samprita Hegde wrote:
>
>> Hi,
>>   I am trying to assess the feasibility of using shared spaces for
>> communicating Task Completion Events in Hadoop. To do this, I am trying
>> to replace the InterTracker Protocol with a co-ordination space, so that
>> one thread in a TaskTracker puts the MapTask Completion Events onto the
>> space and another thread receives these events and sends them to the
>> Reduce Tasks launched by that TaskTracker. Even the JobTracker can
>> subscribe to these events to make decisions about scheduling, restarting
>> sluggish tasks, etc.
>
> It's an interesting experiment, please keep us posted.
>
>> Currently all the information seems to be sent via the heartbeat message
>> in the InterTracker Protocol. Is there a way to decouple only some parts
>> of the heartbeat message and put them onto the space (especially the
>> TaskCompletionEvent and TaskStatus)? That way the task completion events
>> could be exchanged directly among the TaskTrackers and not through the
>> JobTracker.
>
> You would need to fix the TaskTrackers to update the shared-space rather
> than send it to the JobTracker.
>
> One pertinent point to remember is that you do need some global arbitration
> for e.g. deciding which among the concurrently successful speculative tasks
> are to be declared as 'successful' etc.
>
> Arun
>
>> I am not sure whether this strategy is good for large-scale MapReduce
>> applications, but it might work well for small-scale MapReduce jobs.
>>
>> More information on the co-ordination space that I want to use can be
>> found here: http://www.caip.rutgers.edu/~zhljenny/comet.htm. I am still
>> going through Hadoop's code and trying to understand the various
>> protocols between the processes. If you have any good documentation on
>> Hadoop's architecture, it would be really helpful for me.
>>
>> Thanks a lot in advance,
>> Samprita Hegde

Re: Decoupling Intertracker protocol

Posted by Ted Dunning <te...@gmail.com>.
I think that this kind of strategy could easily scale to clusters of
thousands of nodes and possibly much more (if you have a good communication
substrate).

I looked at the Comet documentation and it really looks like they went a
long ways to make sure that they are a pure p2p coordination layer.  That is
definitely an interesting research project, but it makes it much, much less
interesting in terms of realistic systems, if only because the resulting
system is so much more complex.

This sounds like an academic project, though, so comparing a p2p
coordination layer with state of the commercial art alternatives is probably
a good thing.  I would recommend that if you want a very interesting result,
you should implement your coordination with BOTH ZooKeeper and Comet and
figure out what the differences are.
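
One way to set up such a comparison, sketched under assumptions: hide the
coordination layer behind a small interface so the same driver code runs
against either backend. CoordinationSpace, InMemorySpace and
BackendComparisonSketch are hypothetical names; neither ZooKeeper nor Comet
exposes this interface, and real adapters for them would have to be written
against their actual client APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical abstraction over a coordination backend.
interface CoordinationSpace {
    void publish(String event);

    void subscribe(Consumer<String> listener);
}

// In-memory stand-in, used only to exercise the interface. A ZooKeeper-
// or Comet-backed adapter would implement the same two methods.
final class InMemorySpace implements CoordinationSpace {
    private final List<Consumer<String>> listeners = new ArrayList<>();

    @Override
    public synchronized void publish(String event) {
        // Deliver the event to every subscriber registered so far.
        for (Consumer<String> listener : listeners) {
            listener.accept(event);
        }
    }

    @Override
    public synchronized void subscribe(Consumer<String> listener) {
        listeners.add(listener);
    }
}

final class BackendComparisonSketch {
    // The driver is written once and never refers to a concrete backend.
    static List<String> drive(CoordinationSpace space) {
        List<String> seen = new ArrayList<>();
        space.subscribe(seen::add);
        space.publish("map_0 done");
        space.publish("map_1 done");
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(drive(new InMemorySpace()));
    }
}
```

Keeping the driver backend-agnostic means any measured difference comes from
the coordination layer itself rather than from the calling code.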

On Fri, Apr 24, 2009 at 11:47 PM, Arun C Murthy <ac...@yahoo-inc.com> wrote:

>> I am not sure whether this strategy is good for large-scale MapReduce
>> applications, but it might work well for small-scale MapReduce jobs.
>>
>> More information on the co-ordination space that I want to use can be
>> found here: http://www.caip.rutgers.edu/~zhljenny/comet.htm.


-- 
Ted Dunning, CTO
DeepDyve

111 West Evelyn Ave. Ste. 202
Sunnyvale, CA 94086
www.deepdyve.com
858-414-0013 (m)
408-773-0220 (fax)

Re: Decoupling Intertracker protocol

Posted by Ted Dunning <te...@gmail.com>.
This sounds like a perfect application of ZooKeeper.


On Fri, Apr 24, 2009 at 11:47 PM, Arun C Murthy <ac...@yahoo-inc.com> wrote:

> On Apr 22, 2009, at 7:17 PM, Samprita Hegde wrote:
>
>> Currently all the information seems to be sent via the heartbeat message
>> in the InterTracker Protocol. Is there a way to decouple only some parts
>> of the heartbeat message and put them onto the space (especially the
>> TaskCompletionEvent and TaskStatus)? That way the task completion events
>> could be exchanged directly among the TaskTrackers and not through the
>> JobTracker.
>
> You would need to fix the TaskTrackers to update the shared-space rather
> than send it to the JobTracker.
>
> One pertinent point to remember is that you do need some global arbitration
> for e.g. deciding which among the concurrently successful speculative tasks
> are to be declared as 'successful' etc.
>



-- 
Ted Dunning, CTO
DeepDyve

Re: Decoupling Intertracker protocol

Posted by Arun C Murthy <ac...@yahoo-inc.com>.
On Apr 22, 2009, at 7:17 PM, Samprita Hegde wrote:

> Hi,
>   I am trying to assess the feasibility of using shared spaces for
> communicating Task Completion Events in Hadoop. To do this, I am trying
> to replace the InterTracker Protocol with a co-ordination space, so that
> one thread in a TaskTracker puts the MapTask Completion Events onto the
> space and another thread receives these events and sends them to the
> Reduce Tasks launched by that TaskTracker. Even the JobTracker can
> subscribe to these events to make decisions about scheduling, restarting
> sluggish tasks, etc.

It's an interesting experiment, please keep us posted.

> Currently all the information seems to be sent via the heartbeat message
> in the InterTracker Protocol. Is there a way to decouple only some parts
> of the heartbeat message and put them onto the space (especially the
> TaskCompletionEvent and TaskStatus)? That way the task completion events
> could be exchanged directly among the TaskTrackers and not through the
> JobTracker.

You would need to fix the TaskTrackers to update the shared-space  
rather than send it to the JobTracker.

One pertinent point to remember is that you do need some global  
arbitration for e.g. deciding which among the concurrently successful  
speculative tasks are to be declared as 'successful' etc.
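
As a rough illustration of that arbitration (a sketch, not how the
JobTracker does it): one simple policy is "first attempt to report success
wins", which needs a single atomic decision point. SpeculativeArbiter and
its methods are hypothetical names; only ConcurrentHashMap.putIfAbsent is a
real JDK call.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical arbiter: decides which speculative attempt of a task is
// declared successful. Assumed policy: the first claim wins.
final class SpeculativeArbiter {
    // task id -> attempt id that was declared the winner
    private final ConcurrentMap<String, String> winners = new ConcurrentHashMap<>();

    // Returns true only for the first claim made for taskId.
    // putIfAbsent is atomic, so two concurrent claims cannot both win.
    boolean claimSuccess(String taskId, String attemptId) {
        return winners.putIfAbsent(taskId, attemptId) == null;
    }

    // Returns the winning attempt, or null if nothing has been claimed yet.
    String winnerOf(String taskId) {
        return winners.get(taskId);
    }

    public static void main(String[] args) {
        SpeculativeArbiter arbiter = new SpeculativeArbiter();
        System.out.println(arbiter.claimSuccess("map_7", "attempt_0")); // true
        System.out.println(arbiter.claimSuccess("map_7", "attempt_1")); // false
        System.out.println(arbiter.winnerOf("map_7"));                  // attempt_0
    }
}
```

In a shared-space design this decision point has to live somewhere all
TaskTrackers agree on, which is why it stays a global concern even if the
events themselves are exchanged peer to peer.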

Arun

> I am not sure whether this strategy is good for large-scale MapReduce
> applications, but it might work well for small-scale MapReduce jobs.
>
> More information on the co-ordination space that I want to use can be
> found here: http://www.caip.rutgers.edu/~zhljenny/comet.htm. I am still
> going through Hadoop's code and trying to understand the various
> protocols between the processes. If you have any good documentation on
> Hadoop's architecture, it would be really helpful for me.
>
> Thanks a lot in advance,
> Samprita Hegde