Posted to dev@zeppelin.apache.org by tog <gu...@gmail.com> on 2015/07/23 23:41:53 UTC

Interpreter newbie (may be stupid) question

Hi

I have been asking myself a question about the Groovy interpreter. As I make
progress, it is becoming quite similar to the Spark interpreter, except that
I am using the Groovy REPL instead of the Scala REPL.
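
For the curious, the core of such an interpreter is little more than an
embedded Groovy evaluator. Here is a minimal sketch in Java using Groovy's
standard embedding API (groovy.lang.GroovyShell); it illustrates the idea
only and is not the actual interpreter code:

    import groovy.lang.Binding;
    import groovy.lang.GroovyShell;

    public class GroovyEvalSketch {
        public static void main(String[] args) {
            // The Binding holds variables across evaluations, which is
            // what gives the interpreter its REPL-like behaviour.
            Binding binding = new Binding();
            GroovyShell shell = new GroovyShell(binding);

            // Each note paragraph would be evaluated roughly like this.
            shell.evaluate("x = 21");
            Object result = shell.evaluate("x * 2");
            System.out.println(result); // prints 42
        }
    }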

That gives me two options:
  - have the Groovy interpreter grouped with the Spark one - but that group
would become quite fat, and I believe users will choose either Scala or
Groovy/Java but most probably not both
  - keep it separate

That brings me to the naming of the current Spark interpreter: would it be
better to call it the Scala interpreter? The same question could apply to
PySpark versus Spark.

I have the feeling that the language takes precedence over Spark itself -
but I have not been following Zeppelin for long, so please let me know how
it all started, and please share your views.

Cheers

-- 
PGP KeyID: 2048R/EA31CFC9  subkeys.pgp.net

Re: Interpreter newbie (may be stupid) question

Posted by moon soo Lee <mo...@apache.org>.
Hi,

One thing to consider is resource sharing among interpreters in the same
group. Spark, PySpark, and SparkSql are in the same group so that they can
share resources (e.g. the SparkContext, user data, and some logic such as
getting job progress and canceling jobs).
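
To make the sharing concrete, here is a rough sketch of the pattern with
hypothetical class names (Zeppelin's real Interpreter/InterpreterGroup
classes are richer than this): members of one group reuse one backend
context instead of each creating their own.

    // Hypothetical sketch of per-group resource sharing; the names are
    // illustrative, not Zeppelin's actual API.
    class InterpreterGroupSketch {
        // One SparkContext per group, created on first request.
        private Object sharedSparkContext; // stand-in for a real SparkContext

        synchronized Object getSparkContext() {
            if (sharedSparkContext == null) {
                sharedSparkContext = new Object(); // would be: new SparkContext(conf)
            }
            return sharedSparkContext;
        }
    }

    class PySparkLikeInterpreter {
        private final InterpreterGroupSketch group;

        PySparkLikeInterpreter(InterpreterGroupSketch group) {
            this.group = group;
        }

        void open() {
            // Instead of building its own context, the Python-facing
            // interpreter borrows the group's shared one; progress and
            // cancel hooks would consult the same shared backend.
            Object sc = group.getSparkContext();
        }
    }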

Users can still exclude interpreters they are not using from an interpreter
group via the configuration file, without separating the group.
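
Concretely, the zeppelin.interpreters property in conf/zeppelin-site.xml
holds the comma-separated list of interpreter classes to load, so removing
a class from that list disables that interpreter. A trimmed illustration
(class names from the 0.5.x era; check the zeppelin-site.xml.template in
your build for the authoritative list):

    <property>
      <name>zeppelin.interpreters</name>
      <!-- Drop e.g. PySparkInterpreter from this list to exclude it
           without splitting up the group. -->
      <value>org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkSqlInterpreter</value>
    </property>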

I think a single interpreter group currently represents a single
data-processing backend (like Spark, Hive, etc.). A single backend might
have many different interfaces - Scala, Python, and SQL in Spark's case -
and individual Interpreters support those interfaces.
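
In other words, the mapping looks roughly like this (illustrative only):

    spark group -> SparkInterpreter (Scala), PySparkInterpreter (Python),
                   SparkSqlInterpreter (SQL)
    hive group  -> HiveInterpreter (HiveQL)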

From this point of view, I think it is entirely up to the purpose of the
Groovy interpreter.
If the purpose is to provide a Groovy interface (API) for Spark, I think
being in the Spark interpreter group is more appropriate. If the purpose is
to provide a Groovy-language REPL, being in a separate group makes sense.

Thanks,
moon

Re: Interpreter newbie (may be stupid) question

Posted by tog <gu...@gmail.com>.
Thanks, Felix, for your answer.
Although both Java and Scala run on the JVM, as you rightly say, they are
quite different beasts.
I believe that in the end it is a matter of language preference.

Being in the same group, does that mean they are seen as the same
interpreter, or can they be shipped separately?

-- 
PGP KeyID: 2048R/EA31CFC9  subkeys.pgp.net

Re: Interpreter newbie (may be stupid) question

Posted by fe...@hotmail.com.
This is an interesting question that, unfortunately, I don't have an answer to.


The differences between Spark (Scala) and PySpark are significant enough
that I think some might choose one over the other. Spark (Java) or even
Spark Groovy versus Spark Scala are pretty close (all JVM), so that is more
a matter of language preference.


Spark and PySpark interpreters are now in the same interpreter group, BTW.



-- 

PGP KeyID: 2048R/EA31CFC9  subkeys.pgp.net

Re: Interpreter newbie (may be stupid) question

Posted by tog <gu...@gmail.com>.
I agree with that obviously.

Does that mean we could also have Python/Scala interpreters that do not
support Spark? (OK, I agree that would not make much sense.)

The point is that Spark is really polyglot - it has supported Scala,
Python, and Java, and now R. How are we going to reflect that, taking into
account that from a user's perspective what matters is the language plus
the big data/analytics capabilities of Spark?

In the end, my interest is in using the Groovy language to support the
Spark Java API in a REPL - so should I call that interpreter Groovy,
GroovySpark, or JavaSpark?
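
To illustrate, here is the kind of paragraph I would like to support. This
is plain Java against Spark's public Java API (JavaSparkContext); since
Groovy accepts nearly all Java syntax, the same calls would work in a
Groovy REPL, with the lambdas written as Groovy closures. A sketch only,
not the interpreter itself:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import java.util.Arrays;

    public class SparkJavaApiSketch {
        public static void main(String[] args) {
            // Local-mode context, just to show the API surface.
            SparkConf conf = new SparkConf().setAppName("groovy-repl-sketch")
                                            .setMaster("local[*]");
            JavaSparkContext sc = new JavaSparkContext(conf);

            // Sum of squares via the Java API.
            int sum = sc.parallelize(Arrays.asList(1, 2, 3, 4))
                        .map(x -> x * x)
                        .reduce((a, b) -> a + b);
            System.out.println(sum); // prints 30

            sc.stop();
        }
    }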

Are PySpark and Spark seen as separate interpreters in the long term?


-- 
PGP KeyID: 2048R/EA31CFC9  subkeys.pgp.net

Re: Interpreter newbie (may be stupid) question

Posted by fe...@hotmail.com.
Spark is really a superset of Scala,

and likewise

PySpark is a superset of Python.


It looks to me like those two are the exceptions, though. Groovy, Hive,
Flink, Lens, etc. are probably fairly different.




