Posted to dev@reef.apache.org by Matteo Interlandi <m....@gmail.com> on 2017/03/30 18:44:56 UTC

Introduction and question about REEF local mode

Hi REEF developers,

my name is Matteo Interlandi and I am currently a postdoc at UCLA. I am
getting familiar with REEF and I noticed that local mode spawns separate
processes for the driver and the evaluators. While this is good for
testing, I found it a little difficult to debug and to understand
the flow of execution. I was wondering whether it would be useful to add a
thread-based local mode, as in Apache Spark. Of course, the code paths of
the thread and process modes should overlap as much as possible to help
testing.

I have a hack working in Java, but I would like to know from you what
the clean (REEF) way of doing this would be, or whether you already have
something similar.

Thanks.
--Matteo

Re: Introduction and question about REEF local mode

Posted by Matteo Interlandi <m....@gmail.com>.
Now it works! Thanks Markus

Re: Introduction and question about REEF local mode

Posted by Markus Weimer <ma...@weimo.de>.
On Fri, Apr 7, 2017 at 10:08 AM, Matteo Interlandi
<m....@gmail.com> wrote:
> I am still not able to assign the issue to me

Hmm, can you try again? I moved you around into yet another
`Contributors` group.

Markus

Re: Introduction and question about REEF local mode

Posted by Matteo Interlandi <m....@gmail.com>.
Under my profile I see that I am in the 'jira-users' group. Should I
instead see 'Contributor'? (I am still not able to assign the issue to myself.)

Re: Introduction and question about REEF local mode

Posted by Markus Weimer <ma...@weimo.de>.
On Fri, Apr 7, 2017 at 9:22 AM, Matteo Interlandi
<m....@gmail.com> wrote:
> I haven't found a way (I am not allowed?) to assign the JIRA to me. Could someone please help?

I have added you to the `Contributor` group, which should give you
that permission. Can you try?

Markus

Re: Introduction and question about REEF local mode

Posted by Matteo Interlandi <m....@gmail.com>.
I have created a JIRA [1] for this. I haven't found a way (am I not
allowed?) to assign the JIRA to myself. Could someone please help?

Thanks!

Best,
Matteo

[1]  https://issues.apache.org/jira/browse/REEF-1769

Re: Introduction and question about REEF local mode

Posted by Matteo Interlandi <m....@gmail.com>.
Thanks Markus and Gon. I think the flow is reasonably clear to me now; I will
keep you posted on my progress.

Re: Introduction and question about REEF local mode

Posted by Byung-Gon Chun <bg...@gmail.com>.
Welcome, Matteo!

Markus summed it up very well. One thing to add is that we introduced the
process-based runtime because it better emulates the multi-process nature of
REEF running on resource managers.

As you mentioned, the thread-based local mode is handy for debugging.

Looking forward to your contribution!

-- 
Byung-Gon Chun

Re: Introduction and question about REEF local mode

Posted by Markus Weimer <ma...@weimo.de>.
Welcome to the list, Matteo!

Sergiy has done some work to make it so that Driver and Client can be in
the same process. That way, you can easily step through your Driver in a
debugger, even if the Evaluators are running on a YARN cluster. @Sergiy:
Maybe it is time to write some short documentation on how to do that?

Regarding Evaluators in the same process: We used to have this, and the
local runtime still has vestiges of that support. The `ResourceManager` in
the local runtime uses the `ContainerManager` to ultimately launch the
Evaluators. `ContainerManager` is presently hard-coded to create
`ProcessContainers` (see line 372). To create in-process containers (let's
call them `ThreadContainers`), that code needs to be made configurable.
Maybe we need a new injectable interface `ContainerFactory`, with
implementations for `ProcessContainer` and `ThreadContainer`, which we can
use in that line to make a new `Container` instance?

Markus
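
The `ContainerFactory` idea in the message above might look roughly like the
following Java sketch. It is only an illustration under stated assumptions,
not REEF's actual code: the `Container` interface here is a simplified
stand-in, the method names (`launch`, `awaitCompletion`, `describe`,
`newContainer`, `allocate`) are hypothetical, and plain constructor injection
is used in place of Tang so that the example is self-contained. The process
variant is left as a stub, since forking a JVM is outside the scope of the
sketch.

```java
import java.util.ArrayList;
import java.util.List;

class ContainerFactorySketch {

  /** Hypothetical, simplified stand-in for the local runtime's Container. */
  interface Container {
    void launch(Runnable evaluator);
    void awaitCompletion();
    String describe();
  }

  /** Hypothetical factory the ContainerManager would receive by injection. */
  interface ContainerFactory {
    Container newContainer(String containerId);
  }

  /** Runs the evaluator body on a thread inside the current JVM. */
  static final class ThreadContainer implements Container {
    private final String id;
    private Thread thread;

    ThreadContainer(String id) {
      this.id = id;
    }

    @Override
    public void launch(Runnable evaluator) {
      thread = new Thread(evaluator, "evaluator-" + id);
      thread.start();
    }

    @Override
    public void awaitCompletion() {
      if (thread == null) {
        return; // nothing was launched
      }
      try {
        thread.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt status
      }
    }

    @Override
    public String describe() {
      return "thread:" + id;
    }
  }

  /** Stub only: a real version would fork a child JVM for the evaluator. */
  static final class ProcessContainer implements Container {
    private final String id;

    ProcessContainer(String id) {
      this.id = id;
    }

    @Override
    public void launch(Runnable evaluator) {
      throw new UnsupportedOperationException("process launch omitted in this sketch");
    }

    @Override
    public void awaitCompletion() {
      // Nothing to wait for, because launch() never starts anything here.
    }

    @Override
    public String describe() {
      return "process:" + id;
    }
  }

  /** The manager no longer hard-codes the container type; it asks the factory. */
  static final class ContainerManager {
    private final ContainerFactory factory;
    private final List<Container> containers = new ArrayList<>();

    ContainerManager(ContainerFactory factory) {
      this.factory = factory;
    }

    Container allocate(String containerId) {
      Container container = factory.newContainer(containerId);
      containers.add(container);
      return container;
    }
  }

  public static void main(String[] args) {
    // Choosing thread-based containers is a one-line configuration change.
    ContainerManager manager = new ContainerManager(ThreadContainer::new);
    Container container = manager.allocate("0");
    container.launch(() ->
        System.out.println("evaluator running on " + Thread.currentThread().getName()));
    container.awaitCompletion();
  }
}
```

With this shape, the thread and process modes share everything in
`ContainerManager`, and only the bound factory differs, which is in line with
the goal of keeping the two code paths as overlapping as possible.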


