Posted to dev@spark.apache.org by Renyi Xiong <re...@gmail.com> on 2015/08/07 05:33:34 UTC

SparkR driver side JNI

Why did SparkR eventually choose an inter-process socket solution on the
driver side instead of the in-process JNI shown in one of its docs below
(around page 20)?

https://spark-summit.org/wp-content/uploads/2014/07/SparkR-Interactive-R-Programs-at-Scale-Shivaram-Vankataraman-Zongheng-Yang.pdf

Re: SparkR driver side JNI

Posted by Renyi Xiong <re...@gmail.com>.
Got it! Thanks a lot.

On Fri, Sep 11, 2015 at 11:10 AM, Shivaram Venkataraman <
shivaram@eecs.berkeley.edu> wrote:

> It's possible -- in the sense that a lot of designs are possible. But
> AFAIK there are no clean interfaces for getting all the arguments /
> SparkConf options from spark-submit, and it's all the trickier to
> handle scenarios where the first JVM has already created a
> SparkContext that you want to use from R. The inter-process
> communication is cleaner, pretty lightweight, and handles all the
> scenarios.
>
> Thanks
> Shivaram

Re: SparkR driver side JNI

Posted by Shivaram Venkataraman <sh...@eecs.berkeley.edu>.
It's possible -- in the sense that a lot of designs are possible. But
AFAIK there are no clean interfaces for getting all the arguments /
SparkConf options from spark-submit, and it's all the trickier to
handle scenarios where the first JVM has already created a
SparkContext that you want to use from R. The inter-process
communication is cleaner, pretty lightweight, and handles all the
scenarios.

Thanks
Shivaram
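The handoff being described -- the process that owns the spark-submit
arguments serves them over a local socket to whichever process needs them --
can be sketched as a toy model. This is not SparkR's actual protocol; the
function names, the JSON encoding, and the port-passing scheme below are all
assumptions made for illustration, written in Python rather than R/Scala:

```python
import json
import socket
import threading

def serve_conf(conf, host="127.0.0.1"):
    """Serve a config dict over a local TCP socket; returns the chosen port.
    Stands in for the JVM side, which owns the spark-submit arguments."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))          # let the OS pick a free port
    srv.listen(1)
    port = srv.getsockname()[1]

    def handle():
        # Accept one connection, ship the config, and shut down.
        conn, _ = srv.accept()
        with conn:
            conn.sendall(json.dumps(conf).encode())
        srv.close()

    threading.Thread(target=handle, daemon=True).start()
    return port

def fetch_conf(port, host="127.0.0.1"):
    """Client side, standing in for the R driver: connect and read the config."""
    with socket.create_connection((host, port)) as c:
        buf = b""
        while True:
            chunk = c.recv(4096)
            if not chunk:        # server closed the connection: done
                break
            buf += chunk
    return json.loads(buf.decode())

port = serve_conf({"spark.master": "yarn", "spark.app.name": "demo"})
conf = fetch_conf(port)
```

The point of the sketch is that neither side needs to share a process or an
address space: whoever holds the configuration serializes it, so the same
mechanism works no matter which process started first.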

On Fri, Sep 11, 2015 at 10:54 AM, Renyi Xiong <re...@gmail.com> wrote:
> forgot to reply all.
>
> I see. But what prevents, e.g., the R driver from getting those
> command-line arguments from spark-submit and setting them via SparkConf
> on the R driver's in-process JVM through JNI?

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: SparkR driver side JNI

Posted by Renyi Xiong <re...@gmail.com>.
forgot to reply all.

I see. But what prevents, e.g., the R driver from getting those
command-line arguments from spark-submit and setting them via SparkConf
on the R driver's in-process JVM through JNI?

On Thu, Sep 10, 2015 at 9:29 PM, Shivaram Venkataraman <
shivaram@eecs.berkeley.edu> wrote:

> Yeah in addition to the downside of having 2 JVMs the command line
> arguments and SparkConf etc. will be set by spark-submit in the first
> JVM which won't be available in the second JVM.
>
> Shivaram

Re: SparkR driver side JNI

Posted by Shivaram Venkataraman <sh...@eecs.berkeley.edu>.
The in-process JNI only works out when the R process comes up first
and we launch a JVM inside it. In many deploy modes like YARN (or,
actually, in anything using spark-submit) the JVM comes up first and we
launch R after that. Using an inter-process solution helps us cover
both use cases.

Thanks
Shivaram
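The "JVM comes up first" case can be sketched as a minimal stand-in: the
parent process (playing the spark-submit JVM) opens a server socket, then
launches a child (playing the R driver) and hands it the port through an
environment variable. This is a hypothetical model in Python, not SparkR's
implementation, and the `EXISTING_BACKEND_PORT` name is made up for
illustration:

```python
import os
import socket
import subprocess
import sys

# Parent stands in for the spark-submit JVM: it comes up first and opens
# a server socket on a free port.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]

# Launch the "R" side as a subprocess, passing the port in the environment
# so the child knows where to connect back.
child_code = (
    "import os, socket\n"
    "port = int(os.environ['EXISTING_BACKEND_PORT'])\n"
    "with socket.create_connection(('127.0.0.1', port)) as c:\n"
    "    c.sendall(b'hello from the R side')\n"
)
env = dict(os.environ, EXISTING_BACKEND_PORT=str(port))
child = subprocess.Popen([sys.executable, "-c", child_code], env=env)

# Parent blocks until the child connects, then reads its greeting.
conn, _ = srv.accept()
msg = conn.recv(1024)
conn.close()
srv.close()
child.wait()
```

In the opposite startup order (R first), the roles simply swap: whichever
process starts first listens, and launches or waits for its peer -- which is
why one socket mechanism can cover both deploy modes.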

On Thu, Aug 6, 2015 at 8:33 PM, Renyi Xiong <re...@gmail.com> wrote:
> Why did SparkR eventually choose an inter-process socket solution on the
> driver side instead of the in-process JNI shown in one of its docs below
> (around page 20)?
>
> https://spark-summit.org/wp-content/uploads/2014/07/SparkR-Interactive-R-Programs-at-Scale-Shivaram-Vankataraman-Zongheng-Yang.pdf
>
>
