Posted to dev@zeppelin.apache.org by Corneau Damien <co...@apache.org> on 2015/05/13 10:18:41 UTC

Discussion about Zeppelin Interpreters maintenance

Hi,

I just want to open a discussion about Zeppelin Interpreters.

Currently we accept interpreters and merge them into the Zeppelin
branch as soon as people have them done.

I see more and more issues/posts on the mailing list about problems
with interpreters. While they are a big part of Zeppelin, I think they
will quickly become hard to maintain.

It could be a better approach to separate interpreters from the build
(kinda like plugins/packages) so that people can easily include them
depending on their needs, and to let each creator take care of the issues
related to their own interpreter.

Any thoughts?

Re: Discussion about Zeppelin Interpreters maintenance

Posted by Paul Curtis <pc...@maprtech.com>.

When I wrote about this in another thread, I was experiencing issues with
the Spark-YARN interpreter expecting a different version of some JARs
(specifically netty) than the YARN I wished to integrate. As the
interpreters run in standalone JVMs, it seemed to make sense to me to have
their classpaths distinct from the Zeppelin application.

While I agree it would be easier for deployment to have a unified
classpath, I don't think it's realistic in the Hadoop universe. There are
too many different components and distributions to tie them to one
classpath. Even between versions of Hadoop, there is only so much that can
be accomplished with profiles in the pom to cover all the permutations.
Even with the profiles provided, it was not clear to me which combination
(spark, yarn, hadoop) would provide a clean running application.

paul

On Tue, May 19, 2015 at 1:41 PM, James Carman <ja...@carmanconsulting.com>
wrote:

> On Mon, May 18, 2015 at 11:46 PM Sharad Agarwal <sh...@apache.org> wrote:
>
> >
> > Interpreters should work more like a plugin. All interpreters compile and
> > runtime dependency should be fully isolated regardless of their maturity
> > level. This would keep the core minimal and writing new interpreter
> easier.
> >
>
> +1, if you're wanting to support true "plugins" and you're not isolating
> their classloaders in some way, then you're asking for trouble.
>

-- 
*Paul Curtis*
Senior Sales Engineer
*O: *+1 203-660-0015
*M:* +1 203-539-9705
<http://mapr.com>


Re: Discussion about Zeppelin Interpreters maintenance

Posted by Ted Dunning <te...@gmail.com>.


> On May 19, 2015, at 10:41, James Carman <ja...@carmanconsulting.com> wrote:
> 
>> On Mon, May 18, 2015 at 11:46 PM Sharad Agarwal <sh...@apache.org> wrote:
>> 
>> 
>> Interpreters should work more like a plugin. All interpreters compile and
>> runtime dependency should be fully isolated regardless of their maturity
>> level. This would keep the core minimal and writing new interpreter easier.
>> 
> 
> +1, if you're wanting to support true "plugins" and you're not isolating
> their classloaders in some way, then you're asking for trouble.

Totally true.  Even with separate classloaders, running them inside the same JVM is sometimes insufficient isolation if JNI is involved.

Re: Discussion about Zeppelin Interpreters maintenance

Posted by James Carman <ja...@carmanconsulting.com>.

On Mon, May 18, 2015 at 11:46 PM Sharad Agarwal <sh...@apache.org> wrote:

>
> Interpreters should work more like a plugin. All interpreters compile and
> runtime dependency should be fully isolated regardless of their maturity
> level. This would keep the core minimal and writing new interpreter easier.
>

+1, if you're wanting to support true "plugins" and you're not isolating
their classloaders in some way, then you're asking for trouble.
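
A minimal sketch of what "isolating their classloaders" could look like,
assuming one URLClassLoader per interpreter whose parent is the bootstrap
loader rather than the application loader. The class and method names
(InterpreterIsolationSketch, isolatedLoader, canLoad) are hypothetical and
are not Zeppelin APIs; the jar list is left empty here so the example runs
on its own.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical sketch: a class loader per interpreter, cut off from the application classpath. */
public class InterpreterIsolationSketch {

    /** Builds a loader that sees only the given jars plus the JDK's bootstrap classes. */
    static URLClassLoader isolatedLoader(URL[] interpreterJars) {
        // A null parent means "bootstrap loader only", so classes on the host
        // application's classpath (for example its own netty version) are not visible.
        return new URLClassLoader(interpreterJars, (ClassLoader) null);
    }

    /** Returns true if the loader can resolve the named class. */
    static boolean canLoad(ClassLoader loader, String className) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // An empty jar list stands in for an interpreter's own dependency jars.
        try (URLClassLoader loader = isolatedLoader(new URL[0])) {
            System.out.println(canLoad(loader, "java.lang.String"));
            System.out.println(canLoad(loader, InterpreterIsolationSketch.class.getName()));
        }
    }
}
```

Run as written, the first check is expected to succeed (a JDK class) and
the second to fail (a class that lives only on the application classpath).
A real plugin host would also need a small shared API visible to both
sides, which this sketch leaves out.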

Re: Discussion about Zeppelin Interpreters maintenance

Posted by Sharad Agarwal <sh...@apache.org>.

On Mon, May 18, 2015 at 11:46 PM, Ted Dunning <te...@gmail.com> wrote:

>
>
> I think that even very mature components should stay in contrib for the
> sake of keeping down the dependency noise and simplifying builds.
>

I agree with Ted. Recently we were trying to write a Lens interpreter and we
struggled a lot due to numerous dependency conflicts.

Interpreters should work more like plugins. Every interpreter's compile-time
and runtime dependencies should be fully isolated, regardless of its maturity
level. This would keep the core minimal and make writing a new interpreter
easier.

Re: Discussion about Zeppelin Interpreters maintenance

Posted by Ted Dunning <te...@gmail.com>.

On Mon, May 18, 2015 at 10:57 AM, Roman Shaposhnik <ro...@shaposhnik.org>
wrote:

>  2. keep promoting stuff out of contrib when it reaches a maturity
>    point.
>
> Combination of 1 & 2 effectively take care of both of your points a and b.
>

I don't think so.

I think that even very mature components should stay in contrib for the
sake of keeping down the dependency noise and simplifying builds.

Re: Discussion about Zeppelin Interpreters maintenance

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.

On Sun, May 17, 2015 at 4:09 PM, Ted Dunning <te...@gmail.com> wrote:
> On Sun, May 17, 2015 at 3:57 PM, Roman Shaposhnik <ro...@shaposhnik.org>
> wrote:
>
>> On Sun, May 17, 2015 at 3:53 PM, Ted Dunning <te...@gmail.com>
>> wrote:
>> > I am hearing feedback that Zeppelin is suffering from dependency creep.
>> > The problem is that all of the dependencies for all of the interpreters
>> get
>> > brought in, no matter which interpreters you actually want.
>> >
>> > Leads to problems.  Having a clean chinese wall between the core and the
>> > optional parts helps this a lot.
>>
>> That's how Hadoop contrib was. Stuff in there was there to be exposed,
>> not to affect the core.
>>
>
>
> Hmm... I am talking about things from the other point of view.
>
> What the user sees is impact on their own lives, not the impact on the
> core.  Having everything in the core makes
>
> a) it harder to write a simple interpreter due to increased jar hell
> induced by too many required dependencies that should be optional
>
> b) it harder to use Z because compilation and installation is much more
> complex

I don't see how we're disagreeing. All I'm saying is:
   1. be inclusive and let folks contribute to contrib (no pun intended)
   with the expectation that stuff in there is as-is and is not hooked
   up to things like assemblies, etc.

   2. keep promoting stuff out of contrib when it reaches a maturity
   point.

The combination of 1 & 2 effectively takes care of both of your points a and b.

Thanks,
Roman.

Re: Discussion about Zeppelin Interpreters maintenance

Posted by Ted Dunning <te...@gmail.com>.

On Sun, May 17, 2015 at 3:57 PM, Roman Shaposhnik <ro...@shaposhnik.org>
wrote:

> On Sun, May 17, 2015 at 3:53 PM, Ted Dunning <te...@gmail.com>
> wrote:
> > I am hearing feedback that Zeppelin is suffering from dependency creep.
> > The problem is that all of the dependencies for all of the interpreters
> get
> > brought in, no matter which interpreters you actually want.
> >
> > Leads to problems.  Having a clean chinese wall between the core and the
> > optional parts helps this a lot.
>
> That's how Hadoop contrib was. Stuff in there was there to be exposed,
> not to affect the core.
>


Hmm... I am talking about things from the other point of view.

What the user sees is impact on their own lives, not the impact on the
core.  Having everything in the core makes

a) it harder to write a simple interpreter due to increased jar hell
induced by too many required dependencies that should be optional

b) it harder to use Z because compilation and installation are much more
complex

Re: Discussion about Zeppelin Interpreters maintenance

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.

On Sun, May 17, 2015 at 3:53 PM, Ted Dunning <te...@gmail.com> wrote:
> I am hearing feedback that Zeppelin is suffering from dependency creep.
> The problem is that all of the dependencies for all of the interpreters get
> brought in, no matter which interpreters you actually want.
>
> Leads to problems.  Having a clean chinese wall between the core and the
> optional parts helps this a lot.

That's how Hadoop contrib was. Stuff in there was there to be exposed,
not to affect the core.

Thanks,
Roman.

Re: Discussion about Zeppelin Interpreters maintenance

Posted by Ted Dunning <te...@gmail.com>.

I am hearing feedback that Zeppelin is suffering from dependency creep.
The problem is that all of the dependencies for all of the interpreters get
brought in, no matter which interpreters you actually want.

This leads to problems.  Having a clean Chinese wall between the core and the
optional parts helps a lot.



On Sun, May 17, 2015 at 3:33 PM, Roman Shaposhnik <ro...@shaposhnik.org>
wrote:

> On Thu, May 14, 2015 at 8:44 PM, Alexander Bezzubov <bz...@apache.org>
> wrote:
> > Roman, James, Jim, Jongyoul,
> >
> > thank you guys for a feedback.
> >
> > James's point on community growing though the interpreter
> > contributors\maintainers sounds great indeed and difficulties supporting
> > external "plugins" with all the versions could overweight the benefits of
> > such separation. And especially valuable is experience with Cloundstack
> > that Jim brought in.
> >
> > I really like the idea of mixed model like the one that Jim has
> described:
> > more mature interpreters should indeed belong to the root, as it is
> already
> > now, so we just need to find a technical mean of separating early-stages
> > ones (until the community steps up to say they are important\mature
> enough,
> > by maintaining them).
> > AFAIK one can also see the same pattern in Apache Spark project.
> >
> > Any thought on how such "zeppelin-extras" might work?
>
> Every project I've been part of has always ended up having contrib/ folder
> for stuff that had that exact 'incubation -- not for real use' flavor to
> it.
>
> Thanks,
> Roman.
>

Re: Discussion about Zeppelin Interpreters maintenance

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.

On Thu, May 14, 2015 at 8:44 PM, Alexander Bezzubov <bz...@apache.org> wrote:
> Roman, James, Jim, Jongyoul,
>
> thank you guys for a feedback.
>
> James's point on community growing though the interpreter
> contributors\maintainers sounds great indeed and difficulties supporting
> external "plugins" with all the versions could overweight the benefits of
> such separation. And especially valuable is experience with Cloundstack
> that Jim brought in.
>
> I really like the idea of mixed model like the one that Jim has described:
> more mature interpreters should indeed belong to the root, as it is already
> now, so we just need to find a technical mean of separating early-stages
> ones (until the community steps up to say they are important\mature enough,
> by maintaining them).
> AFAIK one can also see the same pattern in Apache Spark project.
>
> Any thought on how such "zeppelin-extras" might work?

Every project I've been part of has always ended up having a contrib/ folder
for stuff that had that exact 'incubation -- not for real use' flavor to it.

Thanks,
Roman.

Re: Discussion about Zeppelin Interpreters maintenance

Posted by Alexander Bezzubov <bz...@apache.org>.

Roman, James, Jim, Jongyoul,

thank you guys for the feedback.

James's point on growing the community through the interpreter
contributors/maintainers sounds great indeed, and the difficulties of
supporting external "plugins" across all the versions could outweigh the
benefits of such a separation. Especially valuable is the experience with
OpenStack that Jim brought in.

I really like the idea of a mixed model like the one that Jim has described:
more mature interpreters should indeed belong to the root, as is already the
case, so we just need to find a technical means of separating the early-stage
ones (until the community steps up to say they are important/mature enough,
by maintaining them).
AFAIK one can also see the same pattern in the Apache Spark project.

Any thoughts on how such "zeppelin-extras" might work?

On Fri, May 15, 2015 at 1:44 AM, Jim Cooley <ji...@ubixlabs.com> wrote:

> Forgive me for chiming in as I’m new to the group, but we had a similar
> problem with another open source project i worked on: Openstack.  There
> were three key concerns: 1) some of them must be in the project or else it
> is highly likely that they will be inadvertently broken by changes in other
> parts of the code and the onus on maintaining core language should be on
> the person making the change.  a separate project makes this very difficult
> to do/enforce.  2) the set is open ended and some are really very
> early-stage while others are more mature.  pushing the burden to keep
> early-stage projects aligned should be on the early-stage project owners
> and not the core.  3) some interpreters must be part of the core project as
> you need to have at least one, but most likely at least a small group of
> canonical interpreters that define the APIs and provide examples for others
> to follow.
>
> Perhaps something of a mix as we arrived at for that project:
>   + base interpreters and ones that are actively maintained and are
> ‘mature’ could be included in the base project.
>   + have a second project that includes the early-stage interpreters, ones
> that are not mature, or do not have a broad base of people supporting them.
>   + you can of course migrate them from one to the other as they mature,
> as the community steps up to say that they are important (by maintaining
> them), or as they become ‘deprecated’ or cease to be maintained actively.
>
> Just a thought,
>
>
> Jim
>
>
>
> On May 14, 2015, at 9:16 AM, Roman Shaposhnik <ro...@shaposhnik.org>
> wrote:
>
> With my mentor hat on -- huge +1 to what James said.
>
> Thanks,
> Roman.
>
> On Thu, May 14, 2015 at 6:09 AM, James Carman
> <ja...@carmanconsulting.com> wrote:
> > Consider the Apache Camel project.  There are a *ton* of components
> > available within Camel and most everything (with compatible license) is
> > within the Camel project itself, not outside.  Keeping the interpreters
> > outside and independent gets into a nightmare situation when new versions
> > come out, because you don't know which ones work with which versions
> (usually
> > folks end up combining them into some grouping anyway to avoid the
> > fragmentation).  You may want to use two different ones in your notebook,
> > but they support completely different versions of the core.  Also, as I
> > said, it would be much tougher to grow a community around the
> interpreters
> > in isolation.  It would be much better to bring them into the fold and
> grow
> > a large community around all of it together.  Now, that doesn't mean that
> > you couldn't still keep some outside (zeppelin-extras perhaps?).  It also
> > doesn't mean that the core has to follow the same lifecycle as the
> > "canonical" interpreters.
> >
> >
> > On Wed, May 13, 2015 at 9:34 PM Jongyoul Lee <jo...@gmail.com> wrote:
> >
> >> +1 for me. I think we can use a wiki for maintaining third party
> >> interpreters.
> >>
> >> On Thu, May 14, 2015 at 10:23 AM, Alexander Bezzubov <bz...@apache.org>
> >> wrote:
> >>
> >>> Hi,
> >>>
> >>> Damien, thanks for bringing the discussion in, and thank everybody for
> >>> opinions: it is indeed very timely as we see more and more submodules
> >> with
> >>> interpreters implementations and having a consistent strategy on how to
> >>> deal with them is very important now.
> >>>
> >>> What I see right now is:
> >>> - *number of interpreters is potentially unbounded*
> >>>   that means that bringing all of them to the root repo is neither
> >>> feasible nor a maintainable solution: we need then not only to make
> sure
> >>> that patches with such contributions are good but also that ALL
> >> maintainers
> >>> will stick around long enough to support further evolution of API until
> >> at
> >>> least a stable release like 1.0
> >>>  As zeppelin is 0.x and is far from being stable on API level -  this
> >> will
> >>> bring us to the situation when the project does not compile because of
> >> some
> >>> interpret is not update really quick.
> >>>
> >>> - *example of other successful extendable systems*
> >>>   Other systems like Apache Corduva (while being a Phonegap) or
> Homebrew
> >>> in early days all had a repos with extension code but then moved to
> just
> >> a
> >>> 'registry' model, meaning that they just host a registry in 'npm'
> fashion
> >>> and only provide a tools and a workflow to use\contribute those
> >> extensions.
> >>>   I.e Ipython does not host all kernels in main repo rather just list
> >> all
> >>> them in the project wiki
> >>>
> >>> - *importance of keeping Zeppelin codebase small and simple for fast
> >>> iterations*
> >>>  Last but not least, this is kind of implication from the first
> >> statement
> >>> but is very important for longevity of the project. The whole field of
> >>> large scale data analytics systems is very dynamic, and it makes
> perfect
> >>> sense, at least to me to, to keep the focus of the project on
> delivering
> >>> the core value rather then covering all potential applications.
> >>>
> >>> That being said, if this would be the vote I would support a
> >>> "registry-like" model, with an external codebases for interpreters AND
> >> some
> >>> tools on zeppelin side to simplify management like
> install\update\delete
> >>> interpreters i.e from Maven or local\remote filesystem URL.
> >>>
> >>> Those changes would require further discussion in case we have a
> >> consensus
> >>> here.
> >>>
> >>>
> >>> On Wed, May 13, 2015 at 9:13 PM, James Carman <
> >> james@carmanconsulting.com>
> >>> wrote:
> >>>
> >>>> Consider this a huge -1 from me
> >>>> On Wed, May 13, 2015 at 7:39 AM James Carman <
> >> james@carmanconsulting.com
> >>>>
> >>>> wrote:
> >>>>
> >>>>> Why not invite the contributors to be committers?  You don't want to
> >>>> leave
> >>>>> the contributors out in the cold.  Likewise, the interpreters are
> >> less
> >>>>> likely to attract new contributions outside the ASF.  You want to
> >>> build a
> >>>>> community around the code.  Vote them in!
> >>>>>
> >>>>> On Wed, May 13, 2015 at 4:18 AM Corneau Damien <
> >> corneadoug@apache.org>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I just want to open a discussion about Zeppelin Interpreters.
> >>>>>>
> >>>>>> Currently we are accepting interpreters and merging them to the
> >>> Zeppelin
> >>>>>> Branch when people have some done.
> >>>>>>
> >>>>>> I see more and more issues/post on the mailing list regarding some
> >>>>>> problems
> >>>>>> with interpreters, while it is a big part of Zeppelin, I think it
> >> will
> >>>>>> become quickly hard to maintain.
> >>>>>>
> >>>>>> It could be a better approach to separate Interpreters from the
> >> build
> >>>>>> (kinda like plugins/package), that people can include easily
> >> depending
> >>>> on
> >>>>>> their needs, and let the creator take care of issues related to it.
> >>>>>>
> >>>>>> Any thoughts?
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >>
> >>
> >> --
> >> 이종열, Jongyoul Lee, 李宗烈
> >> http://madeng.net
> >>
>
>

Re: Discussion about Zeppelin Interpreters maintenance

Posted by Jim Cooley <ji...@ubixlabs.com>.

Forgive me for chiming in as I’m new to the group, but we had a similar problem with another open source project I worked on: OpenStack.  There were three key concerns:
  1) Some of them must be in the project, or else it is highly likely that they will be inadvertently broken by changes in other parts of the code; the onus of maintaining core language should be on the person making the change.  A separate project makes this very difficult to do/enforce.
  2) The set is open-ended, and some are really very early-stage while others are more mature.  The burden of keeping early-stage projects aligned should be on the early-stage project owners and not the core.
  3) Some interpreters must be part of the core project, as you need to have at least one, but most likely a small group of canonical interpreters that define the APIs and provide examples for others to follow.

Perhaps something of a mix, as we arrived at for that project:
  + Base interpreters and ones that are actively maintained and are ‘mature’ could be included in the base project.
  + Have a second project that includes the early-stage interpreters, ones that are not mature, or ones that do not have a broad base of people supporting them.
  + You can of course migrate them from one to the other as they mature, as the community steps up to say that they are important (by maintaining them), or as they become ‘deprecated’ or cease to be actively maintained.

Just a thought,


Jim
  


On May 14, 2015, at 9:16 AM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:

With my mentor hat on -- huge +1 to what James said.

Thanks,
Roman.

On Thu, May 14, 2015 at 6:09 AM, James Carman
<ja...@carmanconsulting.com> wrote:
> Consider the Apache Camel project.  There are a *ton* of components
> available within Camel and most everything (with compatible license) is
> within the Camel project itself, not outside.  Keeping the interpreters
> outside and independent gets into a nightmare situation when new versions
> come out, because you don't know which ones work with which versions (usually
> folks end up combining them into some grouping anyway to avoid the
> fragmentation).  You may want to use two different ones in your notebook,
> but they support completely different versions of the core.  Also, as I
> said, it would be much tougher to grow a community around the interpreters
> in isolation.  It would be much better to bring them into the fold and grow
> a large community around all of it together.  Now, that doesn't mean that
> you couldn't still keep some outside (zeppelin-extras perhaps?).  It also
> doesn't mean that the core has to follow the same lifecycle as the
> "canonical" interpreters.
> 
> 
> On Wed, May 13, 2015 at 9:34 PM Jongyoul Lee <jo...@gmail.com> wrote:
> 
>> +1 for me. I think we can use a wiki for maintaining third party
>> interpreters.
>> 
>> On Thu, May 14, 2015 at 10:23 AM, Alexander Bezzubov <bz...@apache.org>
>> wrote:
>> 
>>> Hi,
>>> 
>>> Damien, thanks for bringing the discussion in, and thank everybody for
>>> opinions: it is indeed very timely as we see more and more submodules
>> with
>>> interpreters implementations and having a consistent strategy on how to
>>> deal with them is very important now.
>>> 
>>> What I see right now is:
>>> - *number of interpreters is potentially unbounded*
>>>   that means that bringing all of them to the root repo is neither
>>> feasible nor a maintainable solution: we need then not only to make sure
>>> that patches with such contributions are good but also that ALL
>> maintainers
>>> will stick around long enough to support further evolution of API until
>> at
>>> least a stable release like 1.0
>>>  As zeppelin is 0.x and is far from being stable on API level -  this
>> will
>>> bring us to the situation when the project does not compile because of
>> some
>>> interpret is not update really quick.
>>> 
>>> - *example of other successful extendable systems*
>>>   Other systems like Apache Corduva (while being a Phonegap) or Homebrew
>>> in early days all had a repos with extension code but then moved to just
>> a
>>> 'registry' model, meaning that they just host a registry in 'npm' fashion
>>> and only provide a tools and a workflow to use\contribute those
>> extensions.
>>>   I.e Ipython does not host all kernels in main repo rather just list
>> all
>>> them in the project wiki
>>> 
>>> - *importance of keeping Zeppelin codebase small and simple for fast
>>> iterations*
>>>  Last but not least, this is kind of implication from the first
>> statement
>>> but is very important for longevity of the project. The whole field of
>>> large scale data analytics systems is very dynamic, and it makes perfect
>>> sense, at least to me to, to keep the focus of the project on delivering
>>> the core value rather then covering all potential applications.
>>> 
>>> That being said, if this would be the vote I would support a
>>> "registry-like" model, with an external codebases for interpreters AND
>> some
>>> tools on zeppelin side to simplify management like install\update\delete
>>> interpreters i.e from Maven or local\remote filesystem URL.
>>> 
>>> Those changes would require further discussion in case we have a
>> consensus
>>> here.
>>> 
>>> 
>>> On Wed, May 13, 2015 at 9:13 PM, James Carman <
>> james@carmanconsulting.com>
>>> wrote:
>>> 
>>>> Consider this a huge -1 from me
>>>> On Wed, May 13, 2015 at 7:39 AM James Carman <
>> james@carmanconsulting.com
>>>> 
>>>> wrote:
>>>> 
>>>>> Why not invite the contributors to be committers?  You don't want to
>>>> leave
>>>>> the contributors out in the cold.  Likewise, the interpreters are
>> less
>>>>> likely to attract new contributions outside the ASF.  You want to
>>> build a
>>>>> community around the code.  Vote them in!
>>>>> 
>>>>> On Wed, May 13, 2015 at 4:18 AM Corneau Damien <
>> corneadoug@apache.org>
>>>>> wrote:
>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> I just want to open a discussion about Zeppelin Interpreters.
>>>>>> 
>>>>>> Currently we are accepting interpreters and merging them to the
>>> Zeppelin
>>>>>> Branch when people have some done.
>>>>>> 
>>>>>> I see more and more issues/post on the mailing list regarding some
>>>>>> problems
>>>>>> with interpreters, while it is a big part of Zeppelin, I think it
>> will
>>>>>> become quickly hard to maintain.
>>>>>> 
>>>>>> It could be a better approach to separate Interpreters from the
>> build
>>>>>> (kinda like plugins/package), that people can include easily
>> depending
>>>> on
>>>>>> their needs, and let the creator take care of issues related to it.
>>>>>> 
>>>>>> Any thoughts?
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
>> 
>> 
>> --
>> 이종열, Jongyoul Lee, 李宗烈
>> http://madeng.net
>>

Re: Discussion about Zeppelin Interpreters maintenance

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.

With my mentor hat on -- huge +1 to what James said.

Thanks,
Roman.

On Thu, May 14, 2015 at 6:09 AM, James Carman
<ja...@carmanconsulting.com> wrote:
> Consider the Apache Camel project.  There are a *ton* of components
> available within Camel and most everything (with compatible license) is
> within the Camel project itself, not outside.  Keeping the interpreters
> outside and independent gets into a nightmare situation when new versions
> come out, because you don't know which ones work with which versions (usually
> folks end up combining them into some grouping anyway to avoid the
> fragmentation).  You may want to use two different ones in your notebook,
> but they support completely different versions of the core.  Also, as I
> said, it would be much tougher to grow a community around the interpreters
> in isolation.  It would be much better to bring them into the fold and grow
> a large community around all of it together.  Now, that doesn't mean that
> you couldn't still keep some outside (zeppelin-extras perhaps?).  It also
> doesn't mean that the core has to follow the same lifecycle as the
> "canonical" interpreters.
>
>
> On Wed, May 13, 2015 at 9:34 PM Jongyoul Lee <jo...@gmail.com> wrote:
>
>> +1 for me. I think we can use a wiki for maintaining third party
>> interpreters.
>>
>> On Thu, May 14, 2015 at 10:23 AM, Alexander Bezzubov <bz...@apache.org>
>> wrote:
>>
>> > Hi,
>> >
>> > Damien, thanks for bringing the discussion in, and thank everybody for
>> > opinions: it is indeed very timely as we see more and more submodules
>> with
>> > interpreters implementations and having a consistent strategy on how to
>> > deal with them is very important now.
>> >
>> > What I see right now is:
>> >  - *number of interpreters is potentially unbounded*
>> >    that means that bringing all of them to the root repo is neither
>> > feasible nor a maintainable solution: we need then not only to make sure
>> > that patches with such contributions are good but also that ALL
>> maintainers
>> > will stick around long enough to support further evolution of API until
>> at
>> > least a stable release like 1.0
>> >   As zeppelin is 0.x and is far from being stable on API level -  this
>> will
>> > bring us to the situation when the project does not compile because of
>> some
>> > interpret is not update really quick.
>> >
>> >  - *example of other successful extendable systems*
>> >    Other systems like Apache Corduva (while being a Phonegap) or Homebrew
>> > in early days all had a repos with extension code but then moved to just
>> a
>> > 'registry' model, meaning that they just host a registry in 'npm' fashion
>> > and only provide a tools and a workflow to use\contribute those
>> extensions.
>> >    I.e Ipython does not host all kernels in main repo rather just list
>> all
>> > them in the project wiki
>> >
>> >  - *importance of keeping Zeppelin codebase small and simple for fast
>> > iterations*
>> >   Last but not least, this is kind of implication from the first
>> statement
>> > but is very important for longevity of the project. The whole field of
>> > large scale data analytics systems is very dynamic, and it makes perfect
>> > sense, at least to me to, to keep the focus of the project on delivering
>> > the core value rather then covering all potential applications.
>> >
>> > That being said, if this would be the vote I would support a
>> > "registry-like" model, with an external codebases for interpreters AND
>> some
>> > tools on zeppelin side to simplify management like install\update\delete
>> > interpreters i.e from Maven or local\remote filesystem URL.
>> >
>> > Those changes would require further discussion in case we have a
>> consensus
>> > here.
>> >
>> >
>> > On Wed, May 13, 2015 at 9:13 PM, James Carman <
>> james@carmanconsulting.com>
>> > wrote:
>> >
>> > > Consider this a huge -1 from me
>> > > On Wed, May 13, 2015 at 7:39 AM James Carman <
>> james@carmanconsulting.com
>> > >
>> > > wrote:
>> > >
>> > > > Why not invite the contributors to be committers?  You don't want to
>> > > leave
>> > > > the contributors out in the cold.  Likewise, the interpreters are
>> less
>> > > > likely to attract new contributions outside the ASF.  You want to
>> > build a
>> > > > community around the code.  Vote them in!
>> > > >
>> > > > On Wed, May 13, 2015 at 4:18 AM Corneau Damien <
>> corneadoug@apache.org>
>> > > > wrote:
>> > > >
>> > > >> Hi,
>> > > >>
>> > > >> I just want to open a discussion about Zeppelin Interpreters.
>> > > >>
>> > > >> Currently we are accepting interpreters and merging them to the
>> > Zeppelin
>> > > >> Branch when people have some done.
>> > > >>
>> > > >> I see more and more issues/post on the mailing list regarding some
>> > > >> problems
>> > > >> with interpreters, while it is a big part of Zeppelin, I think it
>> will
>> > > >> become quickly hard to maintain.
>> > > >>
>> > > >> It could be a better approach to separate Interpreters from the
>> build
>> > > >> (kinda like plugins/package), that people can include easily
>> depending
>> > > on
>> > > >> their needs, and let the creator take care of issues related to it.
>> > > >>
>> > > >> Any thoughts?
>> > > >>
>> > > >
>> > >
>> >
>>
>>
>>
>> --
>> 이종열, Jongyoul Lee, 李宗烈
>> http://madeng.net
>>

Re: Discussion about Zeppelin Interpreters maintenance

Posted by James Carman <ja...@carmanconsulting.com>.

Consider the Apache Camel project.  There are a *ton* of components
available within Camel and most everything (with compatible license) is
within the Camel project itself, not outside.  Keeping the interpreters
outside and independent gets into a nightmare situation when new versions
come out, because you don't know which ones work with which versions (usually
folks end up combining them into some grouping anyway to avoid the
fragmentation).  You may want to use two different ones in your notebook,
but they support completely different versions of the core.  Also, as I
said, it would be much tougher to grow a community around the interpreters
in isolation.  It would be much better to bring them into the fold and grow
a large community around all of it together.  Now, that doesn't mean that
you couldn't still keep some outside (zeppelin-extras perhaps?).  It also
doesn't mean that the core has to follow the same lifecycle as the
"canonical" interpreters.

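To illustrate the version problem, here is a minimal sketch in Python.
The interpreter names and the version ranges are made up for the example;
nothing here is an existing Zeppelin API. It only shows how two
independently released interpreters can end up with no core version in
common:

```python
def parse_version(text):
    """Turn '0.5.0' into (0, 5, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def compatible(core_version, supported_range):
    """True if core_version lies in the inclusive (low, high) range."""
    low, high = supported_range
    return parse_version(low) <= parse_version(core_version) <= parse_version(high)


# Hypothetical interpreters and the core versions each claims to support.
SUPPORTED = {
    "interpreter-a": ("0.4.0", "0.5.0"),
    "interpreter-b": ("0.6.0", "0.7.0"),
}


def usable_together(core_version, names):
    """True only if every named interpreter supports core_version."""
    return all(compatible(core_version, SUPPORTED[name]) for name in names)
```

With these made-up ranges, no core version lets a notebook use both
interpreters at once, which is the fragmentation I am worried about.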

On Wed, May 13, 2015 at 9:34 PM Jongyoul Lee <jo...@gmail.com> wrote:

> +1 for me. I think we can use a wiki for maintaining third party
> interpreters.
>
> On Thu, May 14, 2015 at 10:23 AM, Alexander Bezzubov <bz...@apache.org>
> wrote:
>
> > Hi,
> >
> > Damien, thanks for starting this discussion, and thanks to everybody
> > for their opinions. It is indeed very timely: we see more and more
> > submodules with interpreter implementations, and having a consistent
> > strategy for dealing with them is very important now.
> >
> > What I see right now is:
> >  - *number of interpreters is potentially unbounded*
> >    That means bringing all of them into the root repo is neither
> > feasible nor maintainable: we would need to make sure not only that
> > patches with such contributions are good, but also that ALL
> > maintainers stick around long enough to support further evolution of
> > the API, at least until a stable release like 1.0.
> >    As Zeppelin is 0.x and far from stable at the API level, this would
> > lead to a situation where the project does not compile because some
> > interpreter has not been updated quickly enough.
> >
> >  - *examples of other successful extensible systems*
> >    Other systems, like Apache Cordova (formerly PhoneGap) or Homebrew
> > in its early days, all had repos with extension code but then moved to
> > a 'registry' model: they just host a registry in 'npm' fashion and
> > only provide tools and a workflow to use/contribute those extensions.
> >    For example, IPython does not host all kernels in its main repo but
> > rather just lists them in the project wiki.
> >
> >  - *importance of keeping Zeppelin codebase small and simple for fast
> > iterations*
> >    Last but not least, this is something of an implication of the
> > first point, but it is very important for the longevity of the
> > project. The whole field of large-scale data analytics systems is very
> > dynamic, and it makes perfect sense, at least to me, to keep the focus
> > of the project on delivering the core value rather than covering all
> > potential applications.
> >
> > That being said, if this were a vote I would support a
> > "registry-like" model, with external codebases for interpreters AND
> > some tools on the Zeppelin side to simplify management
> > (install/update/delete of interpreters), e.g. from Maven or a
> > local/remote filesystem URL.
> >
> > Those changes would require further discussion in case we have a
> > consensus here.
> >
> >
> > On Wed, May 13, 2015 at 9:13 PM, James Carman <
> james@carmanconsulting.com>
> > wrote:
> >
> > > Consider this a huge -1 from me
> > > On Wed, May 13, 2015 at 7:39 AM James Carman <
> james@carmanconsulting.com
> > >
> > > wrote:
> > >
> > > > Why not invite the contributors to be committers?  You don't want to
> > > leave
> > > > the contributors out in the cold.  Likewise, the interpreters are
> less
> > > > likely to attract new contributions outside the ASF.  You want to
> > build a
> > > > community around the code.  Vote them in!
> > > >
> > > > On Wed, May 13, 2015 at 4:18 AM Corneau Damien <
> corneadoug@apache.org>
> > > > wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> I just want to open a discussion about Zeppelin Interpreters.
> > > >>
> > > >> Currently we are accepting interpreters and merging them to the
> > Zeppelin
> > > >> Branch when people have some done.
> > > >>
> > > >> I see more and more issues/post on the mailing list regarding some
> > > >> problems
> > > >> with interpreters, while it is a big part of Zeppelin, I think it
> will
> > > >> become quickly hard to maintain.
> > > >>
> > > >> It could be a better approach to separate Interpreters from the
> build
> > > >> (kinda like plugins/package), that people can include easily
> depending
> > > on
> > > >> their needs, and let the creator take care of issues related to it.
> > > >>
> > > >> Any thoughts?
> > > >>
> > > >
> > >
> >
>
>
>
> --
> 이종열, Jongyoul Lee, 李宗烈
> http://madeng.net
>

Re: Discussion about Zeppelin Interpreters maintenance

Posted by Jongyoul Lee <jo...@gmail.com>.

+1 for me. I think we can use a wiki for maintaining third party
interpreters.

On Thu, May 14, 2015 at 10:23 AM, Alexander Bezzubov <bz...@apache.org> wrote:

> Hi,
>
> Damien, thanks for starting this discussion, and thanks to everybody for
> their opinions. It is indeed very timely: we see more and more submodules
> with interpreter implementations, and having a consistent strategy for
> dealing with them is very important now.
>
> What I see right now is:
>  - *number of interpreters is potentially unbounded*
>    That means bringing all of them into the root repo is neither feasible
> nor maintainable: we would need to make sure not only that patches with
> such contributions are good, but also that ALL maintainers stick around
> long enough to support further evolution of the API, at least until a
> stable release like 1.0.
>    As Zeppelin is 0.x and far from stable at the API level, this would
> lead to a situation where the project does not compile because some
> interpreter has not been updated quickly enough.
>
>  - *examples of other successful extensible systems*
>    Other systems, like Apache Cordova (formerly PhoneGap) or Homebrew in
> its early days, all had repos with extension code but then moved to a
> 'registry' model: they just host a registry in 'npm' fashion and only
> provide tools and a workflow to use/contribute those extensions.
>    For example, IPython does not host all kernels in its main repo but
> rather just lists them in the project wiki.
>
>  - *importance of keeping Zeppelin codebase small and simple for fast
> iterations*
>    Last but not least, this is something of an implication of the first
> point, but it is very important for the longevity of the project. The
> whole field of large-scale data analytics systems is very dynamic, and it
> makes perfect sense, at least to me, to keep the focus of the project on
> delivering the core value rather than covering all potential applications.
>
> That being said, if this were a vote I would support a "registry-like"
> model, with external codebases for interpreters AND some tools on the
> Zeppelin side to simplify management (install/update/delete of
> interpreters), e.g. from Maven or a local/remote filesystem URL.
>
> Those changes would require further discussion in case we have a consensus
> here.
>
>
> On Wed, May 13, 2015 at 9:13 PM, James Carman <ja...@carmanconsulting.com>
> wrote:
>
> > Consider this a huge -1 from me
> > On Wed, May 13, 2015 at 7:39 AM James Carman <james@carmanconsulting.com
> >
> > wrote:
> >
> > > Why not invite the contributors to be committers?  You don't want to
> > leave
> > > the contributors out in the cold.  Likewise, the interpreters are less
> > > likely to attract new contributions outside the ASF.  You want to
> build a
> > > community around the code.  Vote them in!
> > >
> > > On Wed, May 13, 2015 at 4:18 AM Corneau Damien <co...@apache.org>
> > > wrote:
> > >
> > >> Hi,
> > >>
> > >> I just want to open a discussion about Zeppelin Interpreters.
> > >>
> > >> Currently we are accepting interpreters and merging them to the
> Zeppelin
> > >> Branch when people have some done.
> > >>
> > >> I see more and more issues/post on the mailing list regarding some
> > >> problems
> > >> with interpreters, while it is a big part of Zeppelin, I think it will
> > >> become quickly hard to maintain.
> > >>
> > >> It could be a better approach to separate Interpreters from the build
> > >> (kinda like plugins/package), that people can include easily depending
> > on
> > >> their needs, and let the creator take care of issues related to it.
> > >>
> > >> Any thoughts?
> > >>
> > >
> >
>



-- 
이종열, Jongyoul Lee, 李宗烈
http://madeng.net

Re: Discussion about Zeppelin Interpreters maintenance

Posted by Alexander Bezzubov <bz...@apache.org>.

Hi,

Damien, thanks for starting this discussion, and thanks to everybody for
their opinions. It is indeed very timely: we see more and more submodules
with interpreter implementations, and having a consistent strategy for
dealing with them is very important now.

What I see right now is:
 - *number of interpreters is potentially unbounded*
   That means bringing all of them into the root repo is neither feasible
nor maintainable: we would need to make sure not only that patches with
such contributions are good, but also that ALL maintainers stick around
long enough to support further evolution of the API, at least until a
stable release like 1.0.
   As Zeppelin is 0.x and far from stable at the API level, this would
lead to a situation where the project does not compile because some
interpreter has not been updated quickly enough.

 - *examples of other successful extensible systems*
   Other systems, like Apache Cordova (formerly PhoneGap) or Homebrew in
its early days, all had repos with extension code but then moved to a
'registry' model: they just host a registry in 'npm' fashion and only
provide tools and a workflow to use/contribute those extensions.
   For example, IPython does not host all kernels in its main repo but
rather just lists them in the project wiki.

 - *importance of keeping Zeppelin codebase small and simple for fast
iterations*
   Last but not least, this is something of an implication of the first
point, but it is very important for the longevity of the project. The
whole field of large-scale data analytics systems is very dynamic, and it
makes perfect sense, at least to me, to keep the focus of the project on
delivering the core value rather than covering all potential applications.

That being said, if this were a vote I would support a "registry-like"
model, with external codebases for interpreters AND some tools on the
Zeppelin side to simplify management (install/update/delete of
interpreters), e.g. from Maven or a local/remote filesystem URL.
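
To make the idea a bit more concrete, below is a minimal sketch in Python
of what such management tooling could look like. Every name in it
(InterpreterEntry, InterpreterRegistry, the "mvn:" source string) is
hypothetical and only for illustration; it keeps the registry in memory
and does not download or load anything:

```python
from dataclasses import dataclass, field


@dataclass
class InterpreterEntry:
    """One registry record; every field name here is hypothetical."""
    name: str
    version: str
    source: str  # e.g. Maven coordinates or a local/remote filesystem URL


@dataclass
class InterpreterRegistry:
    """In-memory stand-in for install/update/delete tooling."""
    entries: dict = field(default_factory=dict)

    def install(self, entry):
        # Refuse to silently overwrite an interpreter that is already there.
        if entry.name in self.entries:
            raise ValueError(f"{entry.name} is already installed")
        self.entries[entry.name] = entry

    def update(self, name, version):
        # Keep the same source, record the new version.
        current = self.entries[name]  # KeyError if not installed
        self.entries[name] = InterpreterEntry(name, version, current.source)

    def delete(self, name):
        # Raises KeyError if the interpreter was never installed.
        del self.entries[name]


registry = InterpreterRegistry()
registry.install(
    InterpreterEntry("example-interpreter", "0.1.0",
                     "mvn:org.example:example-interpreter")
)
registry.update("example-interpreter", "0.2.0")
```

A real tool would also have to fetch the artifact and place it where
Zeppelin can load it; that part is left out here on purpose.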

Those changes would require further discussion in case we have a consensus
here.


On Wed, May 13, 2015 at 9:13 PM, James Carman <ja...@carmanconsulting.com>
wrote:

> Consider this a huge -1 from me
> On Wed, May 13, 2015 at 7:39 AM James Carman <ja...@carmanconsulting.com>
> wrote:
>
> > Why not invite the contributors to be committers?  You don't want to
> leave
> > the contributors out in the cold.  Likewise, the interpreters are less
> > likely to attract new contributions outside the ASF.  You want to build a
> > community around the code.  Vote them in!
> >
> > On Wed, May 13, 2015 at 4:18 AM Corneau Damien <co...@apache.org>
> > wrote:
> >
> >> Hi,
> >>
> >> I just want to open a discussion about Zeppelin Interpreters.
> >>
> >> Currently we are accepting interpreters and merging them to the Zeppelin
> >> Branch when people have some done.
> >>
> >> I see more and more issues/post on the mailing list regarding some
> >> problems
> >> with interpreters, while it is a big part of Zeppelin, I think it will
> >> become quickly hard to maintain.
> >>
> >> It could be a better approach to separate Interpreters from the build
> >> (kinda like plugins/package), that people can include easily depending
> on
> >> their needs, and let the creator take care of issues related to it.
> >>
> >> Any thoughts?
> >>
> >
>

Re: Discussion about Zeppelin Interpreters maintenance

Posted by James Carman <ja...@carmanconsulting.com>.

Consider this a huge -1 from me
On Wed, May 13, 2015 at 7:39 AM James Carman <ja...@carmanconsulting.com>
wrote:

> Why not invite the contributors to be committers?  You don't want to leave
> the contributors out in the cold.  Likewise, the interpreters are less
> likely to attract new contributions outside the ASF.  You want to build a
> community around the code.  Vote them in!
>
> On Wed, May 13, 2015 at 4:18 AM Corneau Damien <co...@apache.org>
> wrote:
>
>> Hi,
>>
>> I just want to open a discussion about Zeppelin Interpreters.
>>
>> Currently we are accepting interpreters and merging them to the Zeppelin
>> Branch when people have some done.
>>
>> I see more and more issues/post on the mailing list regarding some
>> problems
>> with interpreters, while it is a big part of Zeppelin, I think it will
>> become quickly hard to maintain.
>>
>> It could be a better approach to separate Interpreters from the build
>> (kinda like plugins/package), that people can include easily depending on
>> their needs, and let the creator take care of issues related to it.
>>
>> Any thoughts?
>>
>

Re: Discussion about Zeppelin Interpreters maintenance

Posted by James Carman <ja...@carmanconsulting.com>.

Why not invite the contributors to be committers?  You don't want to leave
the contributors out in the cold.  Likewise, the interpreters are less
likely to attract new contributions outside the ASF.  You want to build a
community around the code.  Vote them in!

On Wed, May 13, 2015 at 4:18 AM Corneau Damien <co...@apache.org>
wrote:

> Hi,
>
> I just want to open a discussion about Zeppelin Interpreters.
>
> Currently we are accepting interpreters and merging them to the Zeppelin
> Branch when people have some done.
>
> I see more and more issues/post on the mailing list regarding some problems
> with interpreters, while it is a big part of Zeppelin, I think it will
> become quickly hard to maintain.
>
> It could be a better approach to separate Interpreters from the build
> (kinda like plugins/package), that people can include easily depending on
> their needs, and let the creator take care of issues related to it.
>
> Any thoughts?
>

Re: Discussion about Zeppelin Interpreters maintenance

Posted by moon soo Lee <mo...@apache.org>.

+1 from me.

One small downside is that when there are Interpreter API changes, it will
be much more difficult to get all interpreters updated.

On Wed, May 13, 2015 at 5:18 PM Corneau Damien <co...@apache.org>
wrote:

> Hi,
>
> I just want to open a discussion about Zeppelin Interpreters.
>
> Currently we are accepting interpreters and merging them to the Zeppelin
> Branch when people have some done.
>
> I see more and more issues/post on the mailing list regarding some problems
> with interpreters, while it is a big part of Zeppelin, I think it will
> become quickly hard to maintain.
>
> It could be a better approach to separate Interpreters from the build
> (kinda like plugins/package), that people can include easily depending on
> their needs, and let the creator take care of issues related to it.
>
> Any thoughts?
>