Posted to dev@reef.apache.org by Markus Weimer <ma...@weimo.de> on 2015/04/16 20:12:02 UTC

REEF Project ideas

Hi,

REEF is used in a couple of academic settings. It could make sense for
us to collect some project ideas for students. I started that collection
over on the wiki:

https://cwiki.apache.org/confluence/display/REEF/Project+Ideas

Please have a look and, most importantly, add ideas :-)

Markus

Re: REEF Project ideas

Posted by Byung-Gon Chun <bg...@gmail.com>.
On Tue, Jun 30, 2015 at 8:02 AM, Markus Weimer <ma...@weimo.de> wrote:

> Hi,
>
> > We have a long-running distributed key-value service, which we will turn
> > into a parameter service.
>
> Awesome. Any plans for open sourcing it?
>

Yes, at some point.


>
> > Markus and Yingda, it's also interesting to think about running a
> > parameter server as a REEF service that runs together with tasks as you
> > discussed. How beneficial is this setup?
>
> Good question. Initially, I thought this might be useful for small jobs.
> But then, one wouldn't need a parameter server at all.
>
> I can come up with a reason to run the PS as part of the job, though:
> That way, it enjoys fate and bill sharing with the job that needs it.
> Otherwise, one would have to add cleanup (and billing) logic to the
> parameter-as-a-service.
>
>
This makes sense. Running the PS as part of the job avoids inter-process
communication between PS and task, which seems to be a great benefit.
But most work I have seen runs the PS as a separate service.




> Markus
>



-- 
Byung-Gon Chun

Re: REEF Project ideas

Posted by Markus Weimer <ma...@weimo.de>.
Hi,

> We have a long-running distributed key-value service, which we will turn
> into a parameter service.

Awesome. Any plans for open sourcing it?

> Markus and Yingda, it's also interesting to think about running a parameter
> server as a REEF service that runs together with tasks as you discussed.
> How beneficial is this setup?

Good question. Initially, I thought this might be useful for small jobs.
But then, one wouldn't need a parameter server at all.

I can come up with a reason to run the PS as part of the job, though:
That way, it enjoys fate and bill sharing with the job that needs it.
Otherwise, one would have to add cleanup (and billing) logic to the
parameter-as-a-service.
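
A minimal sketch of the fate-sharing point, assuming the PS is started and stopped by the job itself. This is plain Python, not REEF code; `InJobParameterServer` and `job_with_parameter_server` are hypothetical names used only for illustration.

```python
from contextlib import contextmanager


class InJobParameterServer:
    """Hypothetical PS that lives inside the job that uses it."""

    def __init__(self):
        self.running = False
        self.params = {}

    def start(self):
        self.running = True

    def stop(self):
        # Releasing state here stands in for freeing the job's resources.
        self.running = False
        self.params.clear()


@contextmanager
def job_with_parameter_server():
    # The PS shares the job's fate: it is stopped however the job ends.
    ps = InJobParameterServer()
    ps.start()
    try:
        yield ps
    finally:
        ps.stop()


servers = []
try:
    with job_with_parameter_server() as ps:
        servers.append(ps)
        ps.params["w"] = 1.0
        raise RuntimeError("simulated job failure")
except RuntimeError:
    pass
```

Even though the job fails, the PS is torn down with it, so no separate cleanup logic is needed. A standalone service would have to detect the dead job and reclaim its state itself.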

Markus

Re: REEF Project ideas

Posted by Byung-Gon Chun <bg...@gmail.com>.
We have a long-running distributed key-value service, which we will turn
into a parameter service. Parameter servers are typically shared by
multiple ML jobs, and the number of servers is typically smaller than
the number of ML tasks.
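
A minimal sketch of that sharing pattern, assuming keys are hash-partitioned across a small, fixed set of servers. This is plain Python, not REEF code and not the service mentioned above; `ParameterShard`, `ShardedParameterServer`, `push` and `pull` are hypothetical names.

```python
import zlib
from collections import defaultdict


class ParameterShard:
    """One server holding a slice of the key space (hypothetical)."""

    def __init__(self):
        self.values = defaultdict(float)

    def push(self, key, delta):
        # Apply an additive update sent by a task.
        self.values[key] += delta

    def pull(self, key):
        # .get avoids inserting a key that was only read.
        return self.values.get(key, 0.0)


class ShardedParameterServer:
    """Routes each key to one of a few shards (hypothetical)."""

    def __init__(self, num_servers):
        self.shards = [ParameterShard() for _ in range(num_servers)]

    def shard_for(self, key):
        # crc32 is stable across processes, unlike the built-in hash() on str.
        index = zlib.crc32(key.encode("utf-8")) % len(self.shards)
        return self.shards[index]

    def push(self, key, delta):
        self.shard_for(key).push(key, delta)

    def pull(self, key):
        return self.shard_for(key).pull(key)


# Two servers shared by eight tasks, each pushing the same two updates.
ps = ShardedParameterServer(num_servers=2)
for task_id in range(8):
    ps.push("w0", 0.5)
    ps.push("w1", 1.0)
```

Each task only needs the key to find the right server, so the number of servers can stay fixed while the number of tasks grows.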

Markus and Yingda, it's also interesting to think about running a parameter
server as a REEF service that runs together with tasks as you discussed.
How beneficial is this setup?



On Mon, Jun 29, 2015 at 1:56 PM, Yingda Chen <yd...@gmail.com> wrote:

> Yes, server/worker/coordinator can each be implemented as a special kind of
> Evaluator.
>
> As for the optimization mentioned, it would be nice if the number of
> servers were of the same order as the number of workers. Depending on the
> particular algorithm and/or the ratio of sample data size vs. model size,
> that may or may not be true, though.
>
> -Yingda
>
> On Sat, Jun 27, 2015 at 3:26 PM, Markus Weimer <ma...@weimo.de> wrote:
>
> > On 2015-06-25 21:44, Yingda Chen wrote:
> > > I think with the current group communication layer, building a
> > > parameter server with minimal functionality on REEF would be
> > > straightforward too...
> >
> > Agreed. Ideally, such a server would be implemented as an Evaluator-side
> > service such that it can run on the very same machines that also run the
> > Tasks.
> >
> > Markus
> >
>



-- 
Byung-Gon Chun

Re: REEF Project ideas

Posted by Yingda Chen <yd...@gmail.com>.
Yes, server/worker/coordinator can each be implemented as a special kind of
Evaluator.

As for the optimization mentioned, it would be nice if the number of
servers were of the same order as the number of workers. Depending on the
particular algorithm and/or the ratio of sample data size vs. model size,
that may or may not be true, though.

-Yingda

On Sat, Jun 27, 2015 at 3:26 PM, Markus Weimer <ma...@weimo.de> wrote:

> On 2015-06-25 21:44, Yingda Chen wrote:
> > I think with the current group communication layer, building a
> > parameter server with minimal functionality on REEF would be
> > straightforward too...
>
> Agreed. Ideally, such a server would be implemented as an Evaluator-side
> service such that it can run on the very same machines that also run the
> Tasks.
>
> Markus
>

Re: REEF Project ideas

Posted by Markus Weimer <ma...@weimo.de>.
On 2015-06-25 21:44, Yingda Chen wrote:
> I think with the current group communication layer, building a
> parameter server with minimal functionality on REEF would be
> straightforward too...

Agreed. Ideally, such a server would be implemented as an Evaluator-side
service such that it can run on the very same machines that also run the
Tasks.

Markus

Re: REEF Project ideas

Posted by Yingda Chen <yd...@gmail.com>.
I think with the current group communication layer, building a parameter
server with minimal functionality on REEF would be straightforward too...

On Thu, Apr 16, 2015 at 11:12 AM, Markus Weimer <ma...@weimo.de> wrote:

> Hi,
>
> REEF is used in a couple of academic settings. It could make sense for
> us to collect some project ideas for students. I started that collection
> over on the wiki:
>
> https://cwiki.apache.org/confluence/display/REEF/Project+Ideas
>
> Please have a look and, most importantly, add ideas :-)
>
> Markus
>