Posted to dev@hama.apache.org by "Edward J. Yoon" <ed...@apache.org> on 2011/07/07 12:48:48 UTC

About HAMA-410

Hi,

To support multi-tasks, I'm thinking about merging BSPPeer and Task.
Communication would then occur directly among Tasks, so I think
there's no need to manage BSPPeers inside GroomServer.

Can we think together about the potential side effects of this decision?

Thanks.

-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: About HAMA-410

Posted by "Edward J. Yoon" <ed...@apache.org>.
> What about ZooKeeper's barrier sync? Can it deal with these
> multiple tasks? Would each task be a znode?

Hmm, good question.

I think ZK has to manage all tasks. Otherwise, it will increase the
complexity of the program.
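
To make "ZK manages all tasks" a bit more concrete, here is a minimal sketch of the bookkeeping a barrier needs when every task registers itself, much like one child znode per task. It is plain Java with no ZooKeeper client involved, and TaskBarrier, enter() and isComplete() are hypothetical names chosen for illustration, not Hama or ZooKeeper APIs.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: an in-memory stand-in for a barrier node with one entry per task.
// TaskBarrier, enter() and isComplete() are hypothetical names, not Hama or ZooKeeper APIs.
final class TaskBarrier {
  // Number of tasks that must arrive before the barrier opens.
  private final int expectedTasks;
  // One entry per task that has arrived, comparable to one child znode per task.
  private final Set<String> arrived = ConcurrentHashMap.newKeySet();

  TaskBarrier(int expectedTasks) {
    this.expectedTasks = expectedTasks;
  }

  /** A task announces that it reached the barrier; repeated calls by the same task count once. */
  void enter(String taskId) {
    arrived.add(taskId);
  }

  /** True once every expected task has entered. */
  boolean isComplete() {
    return arrived.size() >= expectedTasks;
  }
}
```

In this model the barrier only needs the expected task count and the set of task IDs that have arrived; it does not need to know which groom a task runs on.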

On Fri, Jul 8, 2011 at 11:46 AM, Edward J. Yoon <ed...@apache.org> wrote:
> Just FYI,
>
> To better understand, refer the diagram, described in 0.2 user guide:
>
> http://incubator.apache.org/hama/docs/r0.2.0/ApacheHama-0.2_UserGuide.pdf
>
> On Fri, Jul 8, 2011 at 6:33 AM, Thomas Jungblut
> <th...@googlemail.com> wrote:
>> Yes, that is already implemented in the latest patch.
>> That is quite okay, I would be +1 to let each task be a BSPPeer. Or
>> actually, has a BSPPeer, maybe we are going to add some kind of JVM reuse,
>> then we just have to set a new BSPPeer instead of swapping the whole task.
>>
>> Overall I thought of this cascading design:
>>
>>>BSPMaster
>> ->Groom1
>> -->Task1
>> -->Task2
>> ->Groom2
>> -->Task3
>>
>> So each task can directly communicate with other tasks using RPC. (Altough
>> I'm not a great friend of this RPC stuff [1])
>> Grooms are only there to communicate with each task, for pinging tasks to be
>> alive. And the BSPMaster is responsible to keep track of the availability of
>> the grooms.
>>
>> We should take care of syncs and use them as sparse as possible, since they
>> tend to be a large bottleneck.
>> What about the barrier sync of zookeeper? Does he can deal with these
>> multiple tasks? Would each task be a znode?
>>
>> [1]:
>> https://issues.apache.org/jira/browse/HAMA-358?focusedCommentId=13059229&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13059229
>>
>> 2011/7/7 Edward J. Yoon <ed...@apache.org>:
>>> Invoked (child) process will become a BSPPeer.
>>>
>>> On Thu, Jul 7, 2011 at 9:14 PM, Thomas Jungblut
>>> <th...@googlemail.com> wrote:
>>>> Just for clarification:
>>>> What is your plan now?
>>>> To setup a BSPPeer for several tasks on a server (groom) or is the
>>>> groom now the one and only BSPPeer?
>>>>
>>>> 2011/7/7 Edward J. Yoon <ed...@apache.org>:
>>>>> Hi,
>>>>>
>>>>> To support multi-tasks, I'm thinking about merging BSPPeer and Task.
>>>>> Then, communication will be occurred among Tasks directly. I think,
>>>>> there's no need to manage BSPPeers inside GroomServer.
>>>>>
>>>>> Can we think about the latent side-effects from this decision, together?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> --
>>>>> Best Regards, Edward J. Yoon
>>>>> @eddieyoon
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Thomas Jungblut
>>>> Berlin
>>>>
>>>> mobile: 0170-3081070
>>>>
>>>> business: thomas.jungblut@testberichte.de
>>>> private: thomas.jungblut@gmail.com
>>>>
>>>
>>>
>>>
>>> --
>>> Best Regards, Edward J. Yoon
>>> @eddieyoon
>>>
>>
>>
>>
>> --
>> Thomas Jungblut
>> Berlin
>>
>> mobile: 0170-3081070
>>
>> business: thomas.jungblut@testberichte.de
>> private: thomas.jungblut@gmail.com
>>
>
>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon
>



-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: About HAMA-410

Posted by Thomas Jungblut <th...@googlemail.com>.
Yeah :) So the design is good ;D

> Hmm, good question.
>
> I think ZK has to manage all tasks. Otherwise, it will increase the
> complexity of the program.
>

You're right.




-- 
Thomas Jungblut
Berlin

mobile: 0170-3081070

business: thomas.jungblut@testberichte.de
private: thomas.jungblut@gmail.com

Re: About HAMA-410

Posted by "Edward J. Yoon" <ed...@apache.org>.
Just FYI,

For a better understanding, refer to the diagram in the 0.2 user guide:

http://incubator.apache.org/hama/docs/r0.2.0/ApacheHama-0.2_UserGuide.pdf




-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: About HAMA-410

Posted by Thomas Jungblut <th...@googlemail.com>.
Yes, that is already implemented in the latest patch.
That is quite okay; I would be +1 on letting each task be a BSPPeer. Or
rather, have a BSPPeer: if we add some kind of JVM reuse later,
we would just have to set a new BSPPeer instead of swapping the whole task.

Overall I thought of this cascading design:

>BSPMaster
->Groom1
-->Task1
-->Task2
->Groom2
-->Task3

So each task can communicate directly with other tasks using RPC. (Although
I'm not a great friend of this RPC stuff [1].)
Grooms are only there to communicate with each task, pinging tasks to check
that they are alive. And the BSPMaster is responsible for keeping track of the
availability of the grooms.

We should take care with syncs and use them as sparingly as possible, since they
tend to be a large bottleneck.
What about ZooKeeper's barrier sync? Can it deal with these
multiple tasks? Would each task be a znode?

[1]:
https://issues.apache.org/jira/browse/HAMA-358?focusedCommentId=13059229&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13059229




-- 
Thomas Jungblut
Berlin

mobile: 0170-3081070

business: thomas.jungblut@testberichte.de
private: thomas.jungblut@gmail.com

Re: About HAMA-410

Posted by "Edward J. Yoon" <ed...@apache.org>.
The invoked (child) process will become a BSPPeer.




-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: About HAMA-410

Posted by Thomas Jungblut <th...@googlemail.com>.
Just for clarification:
What is your plan now?
To set up one BSPPeer for several tasks on a server (groom), or is the
groom now the one and only BSPPeer?




-- 
Thomas Jungblut
Berlin

mobile: 0170-3081070

business: thomas.jungblut@testberichte.de
private: thomas.jungblut@gmail.com

Re: About HAMA-410

Posted by Thomas Jungblut <th...@googlemail.com>.
So we have two scenarios:
let the peers communicate with each other directly, or route over
the groom(s).

First scenario:
We speak directly to the tasks and don't add routing through the
grooms, but RPC is currently our bottleneck. So I doubt this would be faster
than the second scenario.

Let's have a look at the performance using RPC over the grooms (second
scenario):
If every task sends its messages to the local groom, we have some kind
of IPC, which should be quite fast.
On the groom we can then batch these messages and transfer them to the other
grooms in larger and maybe compressed packages/binary files.
The groom that receives a package must deliver the messages back to its
tasks.
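
As a rough illustration of the batching step in the second scenario, here is a minimal sketch that groups outgoing messages by destination groom so that each groom receives one bundle per flush. GroomBatcher, enqueue() and drain() are hypothetical names for this example, messages are plain strings, and compression and the actual transfer are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: GroomBatcher is a hypothetical helper, not part of Hama.
final class GroomBatcher {
  // Destination groom -> messages waiting to be shipped together in one bundle.
  private final Map<String, List<String>> pending = new LinkedHashMap<>();

  /** Called for every message a local task hands to its groom. */
  void enqueue(String destGroom, String message) {
    pending.computeIfAbsent(destGroom, k -> new ArrayList<>()).add(message);
  }

  /** Returns one bundle per destination groom and resets the batcher. */
  Map<String, List<String>> drain() {
    Map<String, List<String>> bundles = new LinkedHashMap<>(pending);
    pending.clear();
    return bundles;
  }
}
```

A groom would call drain() once per superstep (or once a size threshold is reached) and send each bundle in a single transfer instead of issuing one RPC per message.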

> How does a BSPPeer distinguish the peers involved in its own
> computation? For instance, each GroomServer has 3 tasks, where tasks are
> divided into 3 groups: {1,4,7}, {2,5,8} and {3,6,9}. How do they
> communicate without, e.g., falsely syncing with different peers?
>

But I agree with ChiaHung: what if a user configures one task while the
cluster is configured for 15?
How does ZooKeeper know which peers it actually needs to sync? There is a
problem with different jobs running at the same time. Without knowing which
task belongs to which job this cannot work, and ZooKeeper in particular doesn't
have that information. So we have to add the jobID to the ZKNode, or
something equivalent...
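
One possible shape for "adding the jobID to the ZKNode" is to scope the barrier path by job and superstep, so that tasks of different jobs never wait on the same node. The sketch below only builds path strings; the "/bsp" root, the path layout and the BarrierPaths helper are assumptions made for illustration, not the layout used in the patch.

```java
// Illustrative only: BarrierPaths and the "/bsp" layout are hypothetical.
final class BarrierPaths {
  // Assumed root znode under which all barrier nodes would live.
  static final String ROOT = "/bsp";

  private BarrierPaths() {}

  /** Parent node for one superstep of one job: /bsp/<jobId>/<superstep>. */
  static String superstepNode(String jobId, long superstep) {
    return ROOT + "/" + jobId + "/" + superstep;
  }

  /** Node a single task would create when it reaches the barrier. */
  static String taskNode(String jobId, long superstep, String taskId) {
    return superstepNode(jobId, superstep) + "/" + taskId;
  }
}
```

With such a layout, a task would compare the number of children under its own job's superstep node with the number of tasks in that job, regardless of how many task slots the cluster is configured for.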

Greetings.

2011/7/8 Edward J. Yoon <ed...@apache.org>

> Hi,
>
> Let's assume that BSPPeer1 sends a message to BSPPeer7.
>
> Currently, BSPPeer1 sends the message to GroomServerA first, and then
> GroomServerA sends it to GroomServerC. Finally, BSPPeer7 receives that
> message from GroomServerC.
>
> > From the GroomServer source, it seems that BSPPeer and Task perform
> > different roles where Task takes responsibility of task execution and
> > BSPPeer in communication (sync, send). What's the benefit of merging two
> > different roles into one?
>
> So again, communication will occur directly among the invoked (child)
> processes: BSPPeer1 <-> BSPPeer7.
>
> P.S. The reason why we don't use multiple threads inside
> GroomServer is related to killing a job/task.
>
> > How do a BSPPeer distinguish other peers only related to computation
> > itself involved in? For instance, each GroomServer has 3 tasks where tasks
> > are divided into 3 groups including {1,4,7}, {2,5,8} and {3,6,9}. How do
> > they communicate e.g without falsely sync with different peers?
>
> There's no change. BSPPeer knows all peer names, and the barrier will be
> managed by ZK.
>
> On Fri, Jul 8, 2011 at 4:22 PM, ChiaHung Lin <ch...@nuk.edu.tw> wrote:
> > This looks ok from the perspective of executing function. In addition, I
> > have a few questions and would like to gain more ideas on how it may work
> > after the refactoring.
> >
> > From the GroomServer source, it seems that BSPPeer and Task perform
> > different roles where Task takes responsibility of task execution and
> > BSPPeer in communication (sync, send). What's the benefit of merging two
> > different roles into one?
> >
> > How do a BSPPeer distinguish other peers only related to computation
> > itself involved in? For instance, each GroomServer has 3 tasks where tasks
> > are divided into 3 groups including {1,4,7}, {2,5,8} and {3,6,9}. How do
> > they communicate e.g without falsely sync with different peers?
> >
> > GroomServerA    GroomServerB    GroomServerC
> > BSPPeer1        BSPPeer4        BSPPeer7
> > BSPPeer2        BSPPeer5        BSPPeer8
> > BSPPeer3        BSPPeer6        BSPPeer9
> >
> > -----Original message-----
> > From:Edward J. Yoon <ed...@apache.org>
> > To:hama-dev@incubator.apache.org
> > Date:Thu, 7 Jul 2011 19:48:48 +0900
> > Subject:About HAMA-410
> >
> > Hi,
> >
> > To support multi-tasks, I'm thinking about merging BSPPeer and Task.
> > Then, communication will be occurred among Tasks directly. I think,
> > there's no need to manage BSPPeers inside GroomServer.
> >
> > Can we think about the latent side-effects from this decision, together?
> >
> > Thanks.
> >
> > --
> > Best Regards, Edward J. Yoon
> > @eddieyoon
> >
> >
> > --
> > ChiaHung Lin
> > Department of Information Management
> > National University of Kaohsiung
> > Taiwan
> >
>
>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon
>



-- 
Thomas Jungblut
Berlin

mobile: 0170-3081070

business: thomas.jungblut@testberichte.de
private: thomas.jungblut@gmail.com

Re: About HAMA-410

Posted by "Edward J. Yoon" <ed...@apache.org>.
I'll commit that patch. Then we have to remove the limit on the number of
tasks per groom and test. (I've been learning Maven recently :P)

TRUNK can be improved or reverted at any time if there is a problem.

On Tue, Jul 12, 2011 at 2:43 PM, Thomas Jungblut
<th...@googlemail.com> wrote:
> BTW, I think, I have to split task into smaller tasks.
>>
> It would be great if you can do this, I'd really like to do some tasks. But
> I don't want to force you to merge everything together, since you've already
> made a lot in your patch.
>
>
> 2011/7/12 Edward J. Yoon <ed...@apache.org>:
>> It's not a problem if they use different port numbers.
>>
>> This is good discussion.
>>
>> BTW, I think, I have to split task into smaller tasks.
>>
>> Sent from my iPad
>>
>> On Jul 11, 2011, at 6:06 PM, "ChiaHung Lin" <ch...@nuk.edu.tw> wrote:
>>
>>> A bit more questions.
>>>
>>> Suppose the BSPPeer1, on GroomServer A, talks to BSPPeer7 at GroomServer
>>> C. Now when BSPPeer2, also on GroomServer A, wants to synchronize with
>>> BSPPeer8. How will GroomServer C know which peer (e.g. {7,8,9}) to be
>>> synchronized with BSPPeer2 from GroomServer A?
>>>
>>> The current implementation in trunk seems only identify peerName, which
>>> consists of host:port value. Therefore, during the sync() stage, the
>>> outgoingqueue probably would be confused which task/ BSPPeer the message to
>>> be deliver. This potentially might have issue when performing checkpoint.
>>> For checkpointing, the state e.g an incoming message is needed to be saved
>>> to persistent storage so that in the recovery stage, previous state can be
>>> rollback.
>>>
>>>
>>> -----Original message-----
>>> From:Edward J. Yoon <ed...@apache.org>
>>> To:hama-dev@incubator.apache.org,chl501@nuk.edu.tw
>>> Date:Fri, 8 Jul 2011 17:51:05 +0900
>>> Subject:Re: About HAMA-410
>>>
>>> Hi,
>>>
>>> Let's assume that BSPPeer1 send a message to BSPPeer7.
>>>
>>> Currently, BSPPeer1 send a message to GroomServerA first, and then
>>> GroomServerA send to GroomServerC. Finally, BSPPeer7 will receive that
>>> message from GroomServerC.
>>>
>>>> From the GroomServer source, it seems that BSPPeer and Task perform
> different roles where Task takes responsibility of task execution and
> BSPPeer in communication (sync, send). What's the benefit of mering two
> different roles into one?
>>>
>>> So again, the communication will be occurred among Invoked (child)
>>> processes directly. BSPPeer1 <-> BSPPeer7.
>>>
>>> P.S., The reason why we don't use the multi-threads inside
>>> GroomServer, is related with killing job/task.
>>>
>>>> How do a BSPPeer distinguish other peers only related to computation
> itself involved in? For instance, each GroomServer has 3 tasks where tasks
> are divided into 3 groups including {1,4,7}, {2,5,8} and {3,6,9}. How do
> they communicate e.g without falsely sync with different peers?
>>>
>>> There's no change. BSPPeer knows all peer names, and barrier will be
>>> managed by ZK.
>>>
>>> On Fri, Jul 8, 2011 at 4:22 PM, ChiaHung Lin <ch...@nuk.edu.tw> wrote:
>>>> This looks ok from the perspective of executing function. In addition, I
> have a few questions and would like to gain more ideas on how it may work
> after refactored.
>>>>
>>>> From the GroomServer source, it seems that BSPPeer and Task perform
> different roles where Task takes responsibility of task execution and
> BSPPeer in communication (sync, send). What's the benefit of mering two
> different roles into one?
>>>>
>>>> How do a BSPPeer distinguish other peers only related to computation
> itself involved in? For instance, each GroomServer has 3 tasks where tasks
> are divided into 3 groups including {1,4,7}, {2,5,8} and {3,6,9}. How do
> they communicate e.g without falsely sync with different peers?
>>>>
>>>> GroomServerA    GroomServerB    GroomServerC
>>>> BSPPeer1        BSPPeer4        BSPPeer7
>>>> BSPPeer2        BSPPeer5        BSPPeer8
>>>> BSPPeer3        BSPPeer6        BSPPeer9
>>>>
>>>> -----Original message-----
>>>> From:Edward J. Yoon <ed...@apache.org>
>>>> To:hama-dev@incubator.apache.org
>>>> Date:Thu, 7 Jul 2011 19:48:48 +0900
>>>> Subject:About HAMA-410
>>>>
>>>> Hi,
>>>>
>>>> To support multi-tasks, I'm thinking about merging BSPPeer and Task.
>>>> Then, communication will be occurred among Tasks directly. I think,
>>>> there's no need to manage BSPPeers inside GroomServer.
>>>>
>>>> Can we think about the latent side-effects from this decision, together?
>>>>
>>>> Thanks.
>>>>
>>>> --
>>>> Best Regards, Edward J. Yoon
>>>> @eddieyoon
>>>>
>>>>
>>>> --
>>>> ChiaHung Lin
>>>> Department of Information Management
>>>> National University of Kaohsiung
>>>> Taiwan
>>>>
>>>
>>>
>>>
>>> --
>>> Best Regards, Edward J. Yoon
>>> @eddieyoon
>>>
>>>
>>> --
>>> ChiaHung Lin
>>> Department of Information Management
>>> National University of Kaohsiung
>>> Taiwan
>>
>
>
>
> --
> Thomas Jungblut
> Berlin
>
> mobile: 0170-3081070
>
> business: thomas.jungblut@testberichte.de
> private: thomas.jungblut@gmail.com
>



-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: About HAMA-410

Posted by Thomas Jungblut <th...@googlemail.com>.
> BTW, I think, I have to split task into smaller tasks.
>
It would be great if you could do this; I'd really like to take on some tasks. But
I don't want to force you to merge everything together, since you've already
done a lot in your patch.





-- 
Thomas Jungblut
Berlin

mobile: 0170-3081070

business: thomas.jungblut@testberichte.de
private: thomas.jungblut@gmail.com

Re: About HAMA-410

Posted by "Edward J. Yoon" <ed...@apache.org>.
It's not a problem if they use different port numbers.
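
To spell out the port argument: if every task on a host listens on its own port, the host:port peer name is already unique per task. A minimal sketch under that assumption follows; PeerRegistry and its methods are hypothetical names, and the host names and port numbers in the usage note are made up.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: PeerRegistry is a hypothetical helper, not part of Hama.
final class PeerRegistry {
  // Peer name ("host:port") -> ID of the task that owns it.
  private final Map<String, String> taskByPeer = new HashMap<>();

  static String peerName(String host, int port) {
    return host + ":" + port;
  }

  /** Registers a task under its peer name; rejects a second task on the same host:port. */
  void register(String host, int port, String taskId) {
    String name = peerName(host, port);
    String previous = taskByPeer.putIfAbsent(name, taskId);
    if (previous != null) {
      throw new IllegalStateException(name + " is already used by " + previous);
    }
  }

  /** Looks up the task behind a peer name, or null if none is registered. */
  String taskFor(String peerName) {
    return taskByPeer.get(peerName);
  }
}
```

For example, registering ("groomC", 61001, "task7") and ("groomC", 61002, "task8") yields two distinct peer names on the same host, while a second registration on groomC:61001 is rejected.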

This is a good discussion.

BTW, I think I have to split the task into smaller tasks.

Sent from my iPad


Re: About HAMA-410

Posted by ChiaHung Lin <ch...@nuk.edu.tw>.
A bit more questions. 

Suppose BSPPeer1, on GroomServer A, talks to BSPPeer7 on GroomServer C. Now suppose BSPPeer2, also on GroomServer A, wants to synchronize with BSPPeer8. How will GroomServer C know which of its peers (7, 8 or 9) should be synchronized with BSPPeer2 from GroomServer A?

The current implementation in trunk seems to identify a peer only by peerName, which consists of a host:port value. Therefore, during the sync() stage, the outgoing queue could be confused about which task/BSPPeer a message should be delivered to. This might also cause issues when performing checkpointing: the state, e.g. an incoming message, needs to be saved to persistent storage so that the previous state can be restored in the recovery stage.
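
To make the concern concrete, here is a minimal Java sketch. It is not Hama's actual code: the class OutgoingQueues, its methods, and the host names and ports are all hypothetical, and the only thing taken from trunk is the idea that messages are queued per peerName (host:port). If two peers on the same GroomServer share one name, their messages land in the same queue; if each task had its own port, they would stay apart.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; these are not Hama classes.
final class OutgoingQueues {
    // Messages waiting to be sent, grouped by destination peer name ("host:port").
    private final Map<String, List<String>> queues = new HashMap<>();

    void send(String peerName, String message) {
        queues.computeIfAbsent(peerName, k -> new ArrayList<>()).add(message);
    }

    List<String> messagesFor(String peerName) {
        return queues.getOrDefault(peerName, new ArrayList<String>());
    }

    int destinationCount() {
        return queues.size();
    }

    public static void main(String[] args) {
        // Case 1 (assumed): one host:port per GroomServer, so BSPPeer7 and
        // BSPPeer8 cannot be told apart by name.
        OutgoingQueues shared = new OutgoingQueues();
        shared.send("groomC:61000", "for BSPPeer7");
        shared.send("groomC:61000", "for BSPPeer8");
        System.out.println(shared.destinationCount()); // 1

        // Case 2 (assumed): every task listens on its own port.
        OutgoingQueues perTask = new OutgoingQueues();
        perTask.send("groomC:61001", "for BSPPeer7");
        perTask.send("groomC:61002", "for BSPPeer8");
        System.out.println(perTask.destinationCount()); // 2
    }
}
```

In the first case a checkpoint of the queue could not say which of the two peers each message belongs to, which is the situation I am worried about.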
 

-----Original message-----
From:Edward J. Yoon <ed...@apache.org>
To:hama-dev@incubator.apache.org,chl501@nuk.edu.tw
Date:Fri, 8 Jul 2011 17:51:05 +0900
Subject:Re: About HAMA-410

Hi,

Let's assume that BSPPeer1 sends a message to BSPPeer7.

Currently, BSPPeer1 sends the message to GroomServerA first, and then
GroomServerA sends it to GroomServerC. Finally, BSPPeer7 receives the
message from GroomServerC.

> From the GroomServer source, it seems that BSPPeer and Task perform different roles: Task is responsible for task execution and BSPPeer for communication (sync, send). What's the benefit of merging two different roles into one?

So again, the communication will occur directly among the invoked
(child) processes: BSPPeer1 <-> BSPPeer7.

P.S. The reason we don't use multiple threads inside GroomServer is
related to killing jobs/tasks.

> How does a BSPPeer distinguish the peers involved in its own computation from the others? For instance, suppose each GroomServer has 3 tasks and the tasks are divided into 3 groups: {1,4,7}, {2,5,8} and {3,6,9}. How do they communicate, e.g. without falsely syncing with peers in a different group?

Nothing changes here. Each BSPPeer knows all peer names, and the
barrier will be managed by ZK.

On Fri, Jul 8, 2011 at 4:22 PM, ChiaHung Lin <ch...@nuk.edu.tw> wrote:
> This looks OK from the perspective of execution. In addition, I have a few questions and would like to get a better idea of how it may work after refactoring.
>
> From the GroomServer source, it seems that BSPPeer and Task perform different roles: Task is responsible for task execution and BSPPeer for communication (sync, send). What's the benefit of merging two different roles into one?
>
> How does a BSPPeer distinguish the peers involved in its own computation from the others? For instance, suppose each GroomServer has 3 tasks and the tasks are divided into 3 groups: {1,4,7}, {2,5,8} and {3,6,9}. How do they communicate, e.g. without falsely syncing with peers in a different group?
>
> GroomServerA    GroomServerB    GroomServerC
> BSPPeer1        BSPPeer4        BSPPeer7
> BSPPeer2        BSPPeer5        BSPPeer8
> BSPPeer3        BSPPeer6        BSPPeer9
>
> -----Original message-----
> From:Edward J. Yoon <ed...@apache.org>
> To:hama-dev@incubator.apache.org
> Date:Thu, 7 Jul 2011 19:48:48 +0900
> Subject:About HAMA-410
>
> Hi,
>
> To support multi-tasks, I'm thinking about merging BSPPeer and Task.
> Then, communication will be occurred among Tasks directly. I think,
> there's no need to manage BSPPeers inside GroomServer.
>
> Can we think about the latent side-effects from this decision, together?
>
> Thanks.
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon
>
>
> --
> ChiaHung Lin
> Department of Information Management
> National University of Kaohsiung
> Taiwan
>



-- 
Best Regards, Edward J. Yoon
@eddieyoon


--
ChiaHung Lin
Department of Information Management
National University of Kaohsiung
Taiwan

Re: About HAMA-410

Posted by "Edward J. Yoon" <ed...@apache.org>.
Hi,

Let's assume that BSPPeer1 sends a message to BSPPeer7.

Currently, BSPPeer1 sends the message to GroomServerA first, and then
GroomServerA sends it to GroomServerC. Finally, BSPPeer7 receives the
message from GroomServerC.

> From the GroomServer source, it seems that BSPPeer and Task perform different roles: Task is responsible for task execution and BSPPeer for communication (sync, send). What's the benefit of merging two different roles into one?

So again, the communication will occur directly among the invoked
(child) processes: BSPPeer1 <-> BSPPeer7.
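
As a rough illustration of the two paths (a sketch only: the class
RoutePaths and its methods are made up for this mail and are not part
of Hama), the relayed route takes three hops and the direct route one:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; these are not Hama classes.
final class RoutePaths {
    // Current path: the message is relayed through both GroomServers.
    static List<String> viaGrooms(String srcPeer, String srcGroom,
                                  String dstGroom, String dstPeer) {
        return Arrays.asList(srcPeer, srcGroom, dstGroom, dstPeer);
    }

    // Proposed path: the child processes talk to each other directly.
    static List<String> direct(String srcPeer, String dstPeer) {
        return Arrays.asList(srcPeer, dstPeer);
    }

    // Number of network hops along a path.
    static int hops(List<String> path) {
        return path.size() - 1;
    }

    public static void main(String[] args) {
        List<String> relayed =
            viaGrooms("BSPPeer1", "GroomServerA", "GroomServerC", "BSPPeer7");
        List<String> direct = direct("BSPPeer1", "BSPPeer7");
        System.out.println(relayed + " -> " + hops(relayed) + " hops"); // 3 hops
        System.out.println(direct + " -> " + hops(direct) + " hops");   // 1 hops
    }
}
```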

P.S. The reason we don't use multiple threads inside GroomServer is
related to killing jobs/tasks.

> How does a BSPPeer distinguish the peers involved in its own computation from the others? For instance, suppose each GroomServer has 3 tasks and the tasks are divided into 3 groups: {1,4,7}, {2,5,8} and {3,6,9}. How do they communicate, e.g. without falsely syncing with peers in a different group?

Nothing changes here. Each BSPPeer knows all peer names, and the
barrier will be managed by ZK.
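
To show the barrier idea only, here is a small in-process sketch. It
assumes nothing about our ZK code: java.util.concurrent.CyclicBarrier
stands in for the ZooKeeper barrier, threads stand in for the child
processes, and BarrierSketch/runSupersteps are hypothetical names. The
point is that the barrier size comes from the full list of peer names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; CyclicBarrier is a stand-in for ZK.
final class BarrierSketch {
    // Runs the given number of supersteps and returns how many barriers
    // were passed by all peers together.
    static int runSupersteps(List<String> allPeerNames, int supersteps) {
        AtomicInteger passed = new AtomicInteger();
        // The barrier action runs once each time every peer has arrived.
        CyclicBarrier barrier = new CyclicBarrier(
            allPeerNames.size(), () -> passed.incrementAndGet());

        Thread[] peers = new Thread[allPeerNames.size()];
        for (int i = 0; i < peers.length; i++) {
            peers[i] = new Thread(() -> {
                try {
                    for (int s = 0; s < supersteps; s++) {
                        // Local computation and message sends would go here.
                        barrier.await(); // sync(): wait for all other peers
                    }
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
            }, allPeerNames.get(i));
            peers[i].start();
        }
        for (Thread t : peers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return passed.get();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("BSPPeer1", "BSPPeer4", "BSPPeer7");
        System.out.println(runSupersteps(names, 2)); // 2
    }
}
```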

On Fri, Jul 8, 2011 at 4:22 PM, ChiaHung Lin <ch...@nuk.edu.tw> wrote:
> This looks OK from the perspective of execution. In addition, I have a few questions and would like to get a better idea of how it may work after refactoring.
>
> From the GroomServer source, it seems that BSPPeer and Task perform different roles: Task is responsible for task execution and BSPPeer for communication (sync, send). What's the benefit of merging two different roles into one?
>
> How does a BSPPeer distinguish the peers involved in its own computation from the others? For instance, suppose each GroomServer has 3 tasks and the tasks are divided into 3 groups: {1,4,7}, {2,5,8} and {3,6,9}. How do they communicate, e.g. without falsely syncing with peers in a different group?
>
> GroomServerA    GroomServerB    GroomServerC
> BSPPeer1        BSPPeer4        BSPPeer7
> BSPPeer2        BSPPeer5        BSPPeer8
> BSPPeer3        BSPPeer6        BSPPeer9
>
> -----Original message-----
> From:Edward J. Yoon <ed...@apache.org>
> To:hama-dev@incubator.apache.org
> Date:Thu, 7 Jul 2011 19:48:48 +0900
> Subject:About HAMA-410
>
> Hi,
>
> To support multi-tasks, I'm thinking about merging BSPPeer and Task.
> Then, communication will be occurred among Tasks directly. I think,
> there's no need to manage BSPPeers inside GroomServer.
>
> Can we think about the latent side-effects from this decision, together?
>
> Thanks.
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon
>
>
> --
> ChiaHung Lin
> Department of Information Management
> National University of Kaohsiung
> Taiwan
>



-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: About HAMA-410

Posted by ChiaHung Lin <ch...@nuk.edu.tw>.
This looks OK from the perspective of execution. In addition, I have a few questions and would like to get a better idea of how it may work after refactoring.

From the GroomServer source, it seems that BSPPeer and Task perform different roles: Task is responsible for task execution and BSPPeer for communication (sync, send). What's the benefit of merging two different roles into one?

How does a BSPPeer distinguish the peers involved in its own computation from the others? For instance, suppose each GroomServer has 3 tasks and the tasks are divided into 3 groups: {1,4,7}, {2,5,8} and {3,6,9}. How do they communicate, e.g. without falsely syncing with peers in a different group?

GroomServerA	GroomServerB	GroomServerC
BSPPeer1	BSPPeer4	BSPPeer7
BSPPeer2	BSPPeer5	BSPPeer8
BSPPeer3	BSPPeer6	BSPPeer9
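
One way to picture what I am asking for is the lookup below. This is only a sketch under my own assumptions: the class PeerGroups, the job ids ("job_1" etc.) and the idea of keying groups by job are hypothetical and do not describe the current GroomServer code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; these are not Hama classes.
final class PeerGroups {
    private final Map<String, List<String>> peersByJob = new HashMap<>();
    private final Map<String, String> jobOfPeer = new HashMap<>();

    void register(String jobId, String peerName) {
        peersByJob.computeIfAbsent(jobId, k -> new ArrayList<>()).add(peerName);
        jobOfPeer.put(peerName, jobId);
    }

    // Peers that the given peer is allowed to sync with (its own group).
    List<String> syncGroup(String peerName) {
        String jobId = jobOfPeer.get(peerName);
        if (jobId == null) {
            return Collections.emptyList();
        }
        return Collections.unmodifiableList(peersByJob.get(jobId));
    }

    // Builds the 3 x 3 layout from the table above.
    static PeerGroups example() {
        PeerGroups groups = new PeerGroups();
        for (int n = 1; n <= 9; n++) {
            int group = (n - 1) % 3 + 1; // 1,4,7 -> 1; 2,5,8 -> 2; 3,6,9 -> 3
            groups.register("job_" + group, "BSPPeer" + n);
        }
        return groups;
    }

    public static void main(String[] args) {
        // prints [BSPPeer2, BSPPeer5, BSPPeer8]
        System.out.println(example().syncGroup("BSPPeer2"));
    }
}
```

With something like this, BSPPeer2 would only ever wait for BSPPeer5 and BSPPeer8, never for BSPPeer7 or BSPPeer9, even though they live on the same GroomServers.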

-----Original message-----
From:Edward J. Yoon <ed...@apache.org>
To:hama-dev@incubator.apache.org
Date:Thu, 7 Jul 2011 19:48:48 +0900
Subject:About HAMA-410

Hi,

To support multi-tasks, I'm thinking about merging BSPPeer and Task.
Then, communication will be occurred among Tasks directly. I think,
there's no need to manage BSPPeers inside GroomServer.

Can we think about the latent side-effects from this decision, together?

Thanks.

-- 
Best Regards, Edward J. Yoon
@eddieyoon


--
ChiaHung Lin
Department of Information Management
National University of Kaohsiung
Taiwan