Posted to dev@hama.apache.org by Thomas Jungblut <th...@googlemail.com> on 2011/10/13 14:05:51 UTC

Master Task

Hey all,

I (theoretically) ran into a problem with our examples that require a master
task, e.g. PiEstimator, SSSP/PageRank and so on.
Looking at the code, the cluster status that is used to "elect" a master
task simply returns the groom names.
Since peers are now created per task and not globally for a single groom,
these names won't match anymore.
The same problem exists with YARN, since we don't know how
these tasks are named, and currently no ClusterStatus is supported.

My first question: is this just a theoretical worry of mine?
If not, my proposal would be to pre-launch one task in the cluster, get its
name and put it into the configuration under a generic key (bsp.master.task).
Then launch the other tasks with the "new" configuration.
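
The proposal above can be sketched roughly as follows. This is only an
illustration under stated assumptions: a plain Map stands in for the job
configuration, MasterTaskConf and its methods are hypothetical names, and
bsp.master.task is the key proposed in this mail, not an existing setting.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a plain Map stands in for the job configuration.
final class MasterTaskConf {
  // Key proposed in this mail; hypothetical, not an existing setting.
  static final String MASTER_TASK_KEY = "bsp.master.task";

  // Called once, after the pre-launched task has reported its peer name.
  // Returns a copy of the configuration that names the master task.
  static Map<String, String> withMaster(Map<String, String> conf, String masterPeerName) {
    Map<String, String> copy = new HashMap<>(conf);
    copy.put(MASTER_TASK_KEY, masterPeerName);
    return copy;
  }

  // Called inside each task: true if this peer is the configured master.
  // Returns false when no master has been configured.
  static boolean isMaster(Map<String, String> conf, String ownPeerName) {
    return ownPeerName.equals(conf.get(MASTER_TASK_KEY));
  }
}
```

The remaining tasks would then be launched with the map returned by
withMaster, so that each of them can compare its own name against the key.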

-- 
Thomas Jungblut
Berlin <th...@gmail.com>

Re: Master Task

Posted by Thomas Jungblut <th...@googlemail.com>.
This would be okay, but we have to make sure that the name at each index
is the same on every task.
In my experience this is not always the case.

2011/10/17 Edward J. Yoon <ed...@apache.org>

> How about we add a method
>
> + public String getPeerName(int index);
>
> to the BSPPeer class?
>
> It's hard to specify a hostname for a special task (e.g. the master task or
> a secondary task) in a job configuration, because we'll provide an automated
> task scheduler.



-- 
Thomas Jungblut
Berlin <th...@gmail.com>

Re: Master Task

Posted by "Edward J. Yoon" <ed...@apache.org>.
How about we add a method

+ public String getPeerName(int index);

to the BSPPeer class?
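
One possible reading of this proposal, as a minimal sketch: the index is only
useful if every task maps it to the same name, so the sketch sorts a copy of
the names first. SortedPeers is a hypothetical helper for illustration and not
part of BSPPeer; it assumes every task sees the same set of names, possibly in
a different order.

```java
import java.util.Arrays;

// Sketch only: SortedPeers is a hypothetical helper, not part of BSPPeer.
final class SortedPeers {
  private final String[] sorted;

  SortedPeers(String[] allPeerNames) {
    // Sort a copy so that every task derives the same index -> name mapping,
    // regardless of the order in which the names were delivered.
    this.sorted = allPeerNames.clone();
    Arrays.sort(this.sorted);
  }

  // Returns the peer name at the given index of the sorted list.
  String getPeerName(int index) {
    return sorted[index];
  }

  int getNumPeers() {
    return sorted.length;
  }
}
```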

It's hard to specify a hostname for a special task (e.g. the master task or
a secondary task) in a job configuration, because we'll provide an automated
task scheduler.

On Fri, Oct 14, 2011 at 8:48 PM, Thomas Jungblut
<th...@googlemail.com> wrote:
> Hey, thanks Edward,
>
> the worst part is that we ourselves are the user (in the case of the examples).
> So do we want to provide a function for that?
> We could use getAllPeerNames()[0] as the master task, but ZooKeeper is not
> consistent in the ordering of the peers, so there would be collisions.
>
> Or is there a master-election algorithm for BSP? :P



-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: Master Task

Posted by Thomas Jungblut <th...@googlemail.com>.
Hey, thanks Edward,

the worst part is that we ourselves are the user (in the case of the examples).
So do we want to provide a function for that?
We could use getAllPeerNames()[0] as the master task, but ZooKeeper is not
consistent in the ordering of the peers, so there would be collisions.

Or is there a master-election algorithm for BSP? :P
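
A minimal sketch of one simple election rule that does not depend on the
ordering: every task picks the lexicographically smallest peer name. This
assumes that all tasks see the same set of names and that only the order
differs. MasterElection is a hypothetical helper, not Hama API.

```java
import java.util.Arrays;
import java.util.Collections;

// Sketch only: MasterElection is a hypothetical helper, not Hama API.
final class MasterElection {
  // Picks the lexicographically smallest peer name. The result does not
  // depend on the order of the input array. Throws NoSuchElementException
  // if the array is empty.
  static String electMaster(String[] allPeerNames) {
    return Collections.min(Arrays.asList(allPeerNames));
  }

  // True if the calling peer is the elected master.
  static boolean isMaster(String ownPeerName, String[] allPeerNames) {
    return ownPeerName.equals(electMaster(allPeerNames));
  }
}
```

Inside bsp(), each task could call isMaster with its own name and the array
it got from getAllPeerNames(), and exactly one task would get true.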

2011/10/14 Edward J. Yoon <ed...@apache.org>

> IMO, the user has to elect the master task in the bsp() function, considering
> advanced job schedulers for concurrent jobs and multi-user systems.



-- 
Thomas Jungblut
Berlin <th...@gmail.com>

Re: Master Task

Posted by "Edward J. Yoon" <ed...@apache.org>.
IMO, the user has to elect the master task in the bsp() function, considering
advanced job schedulers for concurrent jobs and multi-user systems.




-- 
Best Regards, Edward J. Yoon
@eddieyoon