Posted to dev@hama.apache.org by Lin Chia-Hung <cl...@googlemail.com> on 2010/12/16 04:55:33 UTC

Task distribution issue

While working on BSPPeer fault tolerance
(https://issues.apache.org/jira/browse/HAMA-199), I came across
another issue: task distribution in HAMA is currently done by the
GroomServer requesting tasks from the BSPMaster, similar to the
approach used in Hadoop MapReduce. This strategy is discussed at
https://issues.apache.org/jira/browse/MAPREDUCE-278 with regard to,
e.g., memory footprint and race conditions. Although this issue does
not directly relate to BSPPeer fault tolerance, the strategy
(GroomServer requests tasks from BSPMaster) may cause problems,
e.g. a task cannot be rescheduled to the expected node.

So I would like to know whether there is any chance that the HAMA
task distribution mechanism could move toward proactive task
assignment instead?
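
To make the contrast concrete, here is a minimal sketch of the two
strategies. It is illustrative only: PullMaster, PushMaster,
request_task and assign are hypothetical names, not HAMA or Hadoop
APIs, and the "least loaded groom" fallback is an assumption chosen
just to keep the example complete.

```python
from collections import deque


class PullMaster:
    """Pull model: grooms ask the master for work (hypothetical sketch)."""

    def __init__(self, tasks):
        self.pending = deque(tasks)

    def request_task(self, groom):
        # The next pending task goes to whichever groom asks first;
        # the master does not choose the node.
        if not self.pending:
            return None
        return (self.pending.popleft(), groom)


class PushMaster:
    """Push model: the master decides placement and assigns proactively."""

    def __init__(self, grooms):
        self.inbox = {g: [] for g in grooms}

    def assign(self, task, preferred):
        # The master picks the target node itself. Assumed fallback:
        # the least loaded groom when the preferred one is unknown.
        if preferred in self.inbox:
            target = preferred
        else:
            target = min(self.inbox, key=lambda g: len(self.inbox[g]))
        self.inbox[target].append(task)
        return target


pull = PullMaster(["task-1"])
# groom-b happens to ask first, so it receives the task even if the
# master would have preferred groom-a.
pull_placement = pull.request_task("groom-b")

push = PushMaster(["groom-a", "groom-b"])
# The master places the task on the node it expects.
push_placement = push.assign("task-1", preferred="groom-a")
```

In the pull variant, placement depends on request timing; in the push
variant, the master controls where a rescheduled task lands.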

Thanks

Re: Task distribution issue

Posted by Chia-Hung Lin <cl...@googlemail.com>.

Yes. My understanding is that HADOOP-815 concerns the memory
footprint of JobInProgress, and HADOOP-600 describes the JobTracker
losing track of tasks in the TaskTracker due to a race condition.
From the discussion, the root cause appears to be related to how the
JobTracker communicates with the TaskTracker; therefore, I was
thinking that HAMA can learn from Hadoop's previous experience.

Thanks for helping schedule the task killing, re-attempting functions, and related issues.

Sincerely,
ChiaHung



2010/12/16 Edward J. Yoon <ed...@apache.org>:
> Hi,
>
> According to my understanding, it looks like a refactoring issue to
> reduce the complexity and the memory footprint of the master server
> by removing back-calls of TaskInProgress and map structures. Right?
>
> I think that's a nice idea. But we haven't implemented the task
> killing and re-attempting functions yet. So I would propose
> scheduling it for the 0.2.1 release.
>
> -Edward

Re: Task distribution issue

Posted by "Edward J. Yoon" <ed...@apache.org>.

Hi,

According to my understanding, it looks like a refactoring issue to
reduce the complexity and the memory footprint of the master server
by removing back-calls of TaskInProgress and map structures. Right?

I think that's a nice idea. But we haven't implemented the task
killing and re-attempting functions yet. So I would propose
scheduling it for the 0.2.1 release.

-Edward




-- 
Best Regards, Edward J. Yoon
edwardyoon@apache.org
http://blog.udanax.org