Posted to users@zeppelin.apache.org by Mina Lee <mi...@apache.org> on 2017/01/18 03:11:55 UTC

[DISCUSS] Release package size

Hi all,

Zeppelin is about to start the 0.7.0 release process, so I would like to
discuss binary package distribution.

Every time we distribute a new binary package, the size of the
zeppelin-0.x.x-bin-all.tgz package gets bigger:
   - zeppelin-0.6.0-bin-all.tgz: 506M
   - zeppelin-0.6.1-bin-all.tgz: 517M
   - zeppelin-0.6.2-bin-all.tgz: 547M
   - zeppelin-0.7.0-bin-all.tgz: 720M (Expected)
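
To put the numbers above in perspective, here is a small sketch that computes the release-to-release growth. It uses only the four sizes listed above; the variable and function names are illustrative, and the 0.7.0 figure is the expected size, not a measured one.

```python
# Sizes in MB, copied from the list above; 0.7.0 is the expected size.
sizes_mb = {
    "0.6.0": 506,
    "0.6.1": 517,
    "0.6.2": 547,
    "0.7.0": 720,
}


def growth_percent(old: int, new: int) -> float:
    """Relative growth from old to new, in percent."""
    return (new - old) / old * 100.0


versions = list(sizes_mb)

# Growth between each pair of consecutive releases, rounded to one decimal.
growth = {
    f"{prev}->{cur}": round(growth_percent(sizes_mb[prev], sizes_mb[cur]), 1)
    for prev, cur in zip(versions, versions[1:])
}

# Growth across the whole range, 0.6.0 to the expected 0.7.0.
overall = round(growth_percent(sizes_mb["0.6.0"], sizes_mb["0.7.0"]), 1)

for step, pct in growth.items():
    print(f"{step}: +{pct}%")
print(f"0.6.0->0.7.0 overall: +{overall}%")
```

The jump from 0.6.2 to the expected 0.7.0 size is much larger than the earlier steps, which is what motivates the proposal.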

This is mostly because the number of interpreters supported by Zeppelin
keeps growing, and there is a high chance that we will support more
interpreters in the near future.
So instead of asking the Apache infra team to increase the limit,
I would like to suggest that, starting with the 0.7.0 release, we ship only
zeppelin-0.7.0-bin-netinst.tgz, which includes only the Spark interpreter.
One concern is that users need one more step to install the interpreters
they use,
but I believe it can be done easily with a single command [1].
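
As a rough illustration of that extra step, the sketch below builds the install command for a chosen set of interpreters. It assumes the script path `./bin/install-interpreter.sh` and the `--name` and `--all` flags described in the installation guide linked as [1]; please check them against the release you actually use. The helper name `install_command` is hypothetical and only builds the command, it does not run it.

```python
# Assumption: script path and flags follow the interpreter installation
# guide linked as [1]; verify them against your Zeppelin release.
INSTALL_SCRIPT = "./bin/install-interpreter.sh"


def install_command(interpreters=None):
    """Build the argument list for installing interpreters.

    With no names given, fall back to installing all community-managed
    interpreters.
    """
    if not interpreters:
        return [INSTALL_SCRIPT, "--all"]
    # The guide passes several interpreter names as one comma-separated value.
    return [INSTALL_SCRIPT, "--name", ",".join(interpreters)]


cmd = install_command(["md", "jdbc", "python"])
print(" ".join(cmd))
```

The list form can be passed to `subprocess.run` from the Zeppelin home directory if you want to script the installation.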

FYI, here is the link to a similar discussion [2] we had last June on the
mailing list.

Regards,
Mina

[1]
http://zeppelin.apache.org/docs/0.6.2/manual/interpreterinstallation.html#install-specific-interpreters
[2]
https://lists.apache.org/thread.html/4b54c034cf8d691655156e0cb647243180c57a6829d97aa3c085b63c@%3Cusers.zeppelin.apache.org%3E

Re: [DISCUSS] Release package size

Posted by moon soo Lee <mo...@apache.org>.
Hi,

+1 for releasing the netinst package only.

Regarding a binary package that includes only some interpreters, such as
spark, markdown, and jdbc: we discussed having a minimal package in [1].
I still think it's very difficult to decide which interpreters should be
included and which should not. For example, I would prefer to have 'sh' and
'python' included too, and other people might have other opinions. It's also
difficult to explain why some interpreters are included in the binary release
while others are not, unless we have a policy that everyone agrees on.

Regarding 3rd party interpreters:
Nothing stops anyone from building an interpreter in a separate project.
Zeppelin's interpreter installation script [2] supports 3rd party
interpreters, and Zeppelin is already capable of loading 3rd party interpreter
binaries. However, I haven't seen many people using this feature. I also have
some ideas about how we can encourage people to build 3rd party interpreters.
Let's open a separate thread and discuss it there.

Thanks,
moon

[1]
https://lists.apache.org/thread.html/4b54c034cf8d691655156e0cb647243180c57a6829d97aa3c085b63c@%3Cusers.zeppelin.apache.org%3E
[2]
http://zeppelin.apache.org/docs/0.7.0-SNAPSHOT/manual/interpreterinstallation.html#3rd-party-interpreters


On Tue, Jan 17, 2017 at 8:05 PM Jeff Zhang <zj...@gmail.com> wrote:

>
> Another thing I'd like to discuss is whether we should move most of the
> interpreters out of the Zeppelin project to somewhere else, just as Spark
> does with spark-packages. There are 2 benefits:
>
> 1. Keep the Zeppelin project much smaller
> 2. Each interpreter's improvements won't be blocked by Zeppelin releases.
> Each interpreter can have its own release cycle as long as
> zeppelin-interpreter doesn't break compatibility.
>
> If it makes sense, I can open another thread to discuss it.
>
>
>
>
> On Wed, Jan 18, 2017 at 11:55 AM, Jun Kim <i2...@gmail.com> wrote:
>
> +1 for Jeff's idea! I also use the three interpreters mainly :)
>
> On Wed, Jan 18, 2017 at 12:52 PM, Jeff Zhang <zj...@gmail.com> wrote:
>
>
> How about also including the markdown and jdbc interpreters, if this won't
> make the binary distribution much bigger? I guess spark, markdown, and jdbc
> are the top 3 interpreters in Zeppelin.
>
>
>
> On Wed, Jan 18, 2017 at 11:33 AM, Ahyoung Ryu <ah...@apache.org> wrote:
>
> Thanks Mina always!
> +1 for releasing only netinst package.
>
> On Wed, Jan 18, 2017 at 12:29 PM, Prabhjyot Singh <
> prabhjyotsingh@apache.org> wrote:
>
> +1
>
> I don't think it's a problem now, but if it keeps increasing, then in
> subsequent releases we can ship Zeppelin with a few interpreters and mark
> the others as plugins that can be downloaded later, with instructions on how
> to configure them.
>
> On Jan 18, 2017 8:54 AM, "Jun Kim" <i2...@gmail.com> wrote:
>
> +1
>
> I think it won't be a problem if we announce it clearly.
> Maybe we can do that next to the download button here (
> http://zeppelin.apache.org/download.html)
> The message could be "NOTE: only the Spark interpreter is included since
> 0.7.0. If you want other interpreters, please see the interpreter
> installation guide"
>
> On Wed, Jan 18, 2017 at 12:14 PM, Jeff Zhang <zj...@gmail.com> wrote:
>
>
> +1, we should also mention it in the release notes and in the 0.7 docs
>
>
>

Re: [DISCUSS] Release package size

Posted by Mina Lee <mi...@apache.org>.
Decision making is taking more time than I expected, and
I think this shouldn't be a blocker for 0.7.0.

We can take more time to decide which interpreters should be included or
excluded.
Until then, I am just going to go with our current packages:
zeppelin-bin-all and zeppelin-bin-netinst.

Moon's suggestion looks good too.
Here is a summary of the interpreters that would be included under each option:
 a. Min package includes interpreters whose binary size is less than 10MB
      > angular, bigquery, hdfs, kylin, livy, md, postgresql, python, sh
 b. Min package includes interpreters with 5 or more JIRA issues created per
month.
      > Need to track. This could add overhead to the release process.
 c. Min package includes/excludes interpreters that the community decides on
via formal vote.
     > md, jdbc, spark (based on this mailing thread)



On Fri, Jan 20, 2017 at 5:57 PM moon soo Lee <mo...@apache.org> wrote:

> Hi,
>
> I think we need a policy to decide which interpreters go into the
> zeppelin-bin-min package, and we should make applying that policy part of
> the release process.
> Right now I cannot see any consistent rule beyond "it seems" or "I
> guess", and I have no idea how I would answer if somebody asks 'why is
> python not in the min package?' or 'why is xxx not in the min package?'.
>
> If we really want a min package, we must have a policy that gives
> everyone the same expectation about what goes into the min package and what
> does not. Once we agree on a policy, we can make it part of the release
> process.
>
> So, why don't we try to define the policy together? Here are some ideas I
> can throw out.
>
>  a. Min package includes interpreters whose binary size is less than 10MB
>  b. Min package includes interpreters with 5 or more JIRA issues created per
> month.
>  c. Min package includes/excludes interpreters that the community decides on
> via formal vote.
>
> "10MB" and "5 or more" are numbers I just made up. We can change them to
> more reasonable numbers.
> Also, a, b, and c are just possible examples. We can refine them, use only
> one, use all three, or add more.
>
> My point is that we need to give everyone the same expectation about what
> goes into the min package and what does not.
> What do you think?
>
> Thanks,
> moon
>
> On Thu, Jan 19, 2017 at 12:47 AM Mina Lee <mi...@apache.org> wrote:
>
> Thank you for sharing your opinions, guys.
>
> I like Eric's approach.
> We are planning to provide an official Docker image managed by the community.
> There is ongoing work [1] on it, and I can focus on this after the 0.7.0
> release.
>
> It seems that the majority prefers a binary package with the most-used
> interpreters, such as spark, md, and jdbc.
> I think we can gradually move to providing only the netinst package once
> the Docker image is ready.
> For the upcoming 0.7.0 release, I'd like to distribute two binary packages:
>   - zeppelin-bin-min (spark, jdbc, md)
>   - zeppelin-bin-netinst (spark only)
>
> [1] https://github.com/apache/zeppelin/pull/1761
>
> Thanks,
> Mina
>
> On Thu, Jan 19, 2017 at 1:57 AM Jongyoul Lee <jo...@gmail.com> wrote:
>
> I'd like to deploy netinst only. And it's a good idea for Apache Zeppelin
> to support an official Docker image with all possible interpreters.
>
> On Wed, Jan 18, 2017 at 7:42 PM, Eric Pugh <
> epugh@opensourceconnections.com> wrote:
>
> Can I throw out an alternate approach? I feel like the key value of the
> “-all” option is to simplify the life of someone who is new to Zeppelin.
> If you’re a sophisticated Zeppelin user, then picking and choosing
> interpreters is easy, and you grok why you want to do that….
>
> However, for myself, when I want to demo Zeppelin, I go straight to one of
> the Docker images, specifically
> https://github.com/dylanmei/docker-zeppelin because it bundles in
> everything.
>
> Would providing a similar Docker image on the “Get Zeppelin” page that
> bundles in all the dependencies and interpreters solve the “how do I try
> Zeppelin in 5 minutes” challenge? The “Get Zeppelin” page is a rather
> daunting page!
>
> Eric
>
>
> On Jan 18, 2017, at 12:00 AM, Mohit Jaggi <mo...@gmail.com> wrote:
>
> Including ALL interpreters is not feasible, not because of download size,
> which is easily increased, but because we wouldn't want to couple the release
> cycles, as Jeff pointed out. IMHO a few of the most popular ones should
> be included. Yes, it is just one extra step, but if a computer can do it, why
> make a human suffer? :-)
> Re: spark-packages, Spark does include important and mature functionality
> in its assembly, e.g. the CSV parser was merged into core Spark when it
> matured. I believe Z should do the same.
>

Re: [DISCUSS] Release package size

Posted by moon soo Lee <mo...@apache.org>.
Hi,

I think we need a policy to decide which interpreters go into the
zeppelin-bin-min package, and we should make applying that policy part of
the release process.
Right now I cannot see any consistent rule beyond "it seems" or "I
guess", and I have no idea how I would answer if somebody asks 'why is
python not in the min package?' or 'why is xxx not in the min package?'.

If we really want a min package, we must have a policy that gives everyone
the same expectation about what goes into the min package and what does not.
Once we agree on a policy, we can make it part of the release process.

So, why don't we try to define the policy together? Here are some ideas I
can throw out.

 a. Min package includes interpreters whose binary size is less than 10MB
 b. Min package includes interpreters with 5 or more JIRA issues created per
month.
 c. Min package includes/excludes interpreters that the community decides on
via formal vote.

"10MB" and "5 or more" are numbers I just made up. We can change them to
more reasonable numbers.
Also, a, b, and c are just possible examples. We can refine them, use only
one, use all three, or add more.
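
To make the idea of a written-down policy concrete, here is a minimal sketch of how criteria a, b, and c could be expressed as rules and then used alone or combined. Everything in it is hypothetical: the `Interpreter` record, the rule functions, and especially the per-interpreter sizes, JIRA counts, and vote outcomes are invented for illustration and are not measurements from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interpreter:
    name: str
    size_mb: float         # binary size of the interpreter
    jira_per_month: float  # JIRA issues created per month
    voted_in: bool         # outcome of a formal community vote


# Thresholds from options a and b; the thread says these are made-up numbers.
MAX_SIZE_MB = 10
MIN_JIRA_PER_MONTH = 5


def small_enough(i: Interpreter) -> bool:   # option a
    return i.size_mb < MAX_SIZE_MB


def active_enough(i: Interpreter) -> bool:  # option b
    return i.jira_per_month >= MIN_JIRA_PER_MONTH


def voted(i: Interpreter) -> bool:          # option c
    return i.voted_in


def min_package(candidates, rules, require_all=False):
    """Names of interpreters that pass any rule (or every rule)."""
    combine = all if require_all else any
    return sorted(i.name for i in candidates if combine(r(i) for r in rules))


# Invented example data, NOT real project figures.
candidates = [
    Interpreter("spark", 200.0, 12.0, True),
    Interpreter("md", 2.0, 1.0, True),
    Interpreter("jdbc", 15.0, 6.0, True),
    Interpreter("sh", 1.0, 0.5, False),
]

print(min_package(candidates, [small_enough]))
print(min_package(candidates, [voted]))
print(min_package(candidates, [small_enough, active_enough, voted]))
```

Whether the rules are combined with "any" or "all" changes the result a lot, so that choice would need to be part of whatever policy is agreed on.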

My point is that we need to give everyone the same expectation about what
goes into the min package and what does not.
What do you think?

Thanks,
moon


Re: [DISCUSS] Release package size

Posted by Mina Lee <mi...@apache.org>.
Thank you for sharing your opinions, guys.

I like Eric's approach.
We are planning to provide an official Docker image managed by the community.
There is ongoing work [1] on it, and I can focus on this after the 0.7.0
release.

It seems that the majority prefers a binary package with the most-used
interpreters, such as spark, md, and jdbc.
I think we can gradually move to providing only the netinst package once the
Docker image is ready.
For the upcoming 0.7.0 release, I'd like to distribute two binary packages:
  - zeppelin-bin-min (spark, jdbc, md)
  - zeppelin-bin-netinst (spark only)

[1] https://github.com/apache/zeppelin/pull/1761

Thanks,
Mina


Re: [DISCUSS] Release package size

Posted by Jongyoul Lee <jo...@gmail.com>.
I'd like to deploy netinst only. And it's a good idea for Apache Zeppelin
to support an official Docker image with all possible interpreters.

On Wed, Jan 18, 2017 at 7:42 PM, Eric Pugh <ep...@opensourceconnections.com>
wrote:

> Can I throw out an alternate approach?   I feel like the key value of the
> “-all” option is to simplify the life of someone who is new to Zeppelin.
>  If you’re a sophisticated Zeppelin user, then picking and choosing
> interpreters is easy, and you you grok why you want to do that….
>
> However, for myself, when I want to demo Zeppelin, I go straight to one of
> the Docker images, specifically https://github.
> com/dylanmei/docker-zeppelin because it bundles in everything.
>
> Would providing a similar Docker image on the “Get Zeppelin” page that
> bundles in all the dependencies and interpreters solve the “how do I try
> Zeppelin in 5 minutes” challenge?  The “Get Zeppelin” page is rather
> daunting page!
>
> Eric
>
>


-- 
이종열, Jongyoul Lee, 李宗烈
http://madeng.net

Re: [DISCUSS] Release package size

Posted by Eric Pugh <ep...@opensourceconnections.com>.
Can I throw out an alternate approach? I feel like the key value of the “-all” option is to simplify the life of someone who is new to Zeppelin. If you’re a sophisticated Zeppelin user, then picking and choosing interpreters is easy, and you grok why you want to do that….

However, for myself, when I want to demo Zeppelin, I go straight to one of the Docker images, specifically https://github.com/dylanmei/docker-zeppelin, because it bundles in everything.

Would providing a similar Docker image on the “Get Zeppelin” page that bundles in all the dependencies and interpreters solve the “how do I try Zeppelin in 5 minutes” challenge? The “Get Zeppelin” page is a rather daunting page!

Eric


> On Jan 18, 2017, at 12:00 AM, Mohit Jaggi <mo...@gmail.com> wrote:
> 
> Including ALL interpreters is not feasible, not because of download size, since that limit is easily increased, but because we wouldn't want to couple the release cycles, as Jeff pointed out. IMHO a few of the most popular ones should be included. Yes, it is just one extra step, but if a computer can do it, why make a human suffer? :-)
> Re: spark-packages, Spark does include important and mature functionality in its assembly, e.g. the CSV parser was merged into core Spark when it matured. I believe Z should do the same.
> 
> Sent from my iPhone
> 


_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com <http://www.opensourceconnections.com/> | My Free/Busy <http://tinyurl.com/eric-cal>  
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>	
This e-mail and all contents, including attachments, is considered to be Company Confidential unless explicitly stated otherwise, regardless of whether attachments are marked as such.


Re: [DISCUSS] Release package size

Posted by Mohit Jaggi <mo...@gmail.com>.
Including ALL interpreters is not feasible, not because of download size, since that limit is easily increased, but because we wouldn't want to couple the release cycles, as Jeff pointed out. IMHO a few of the most popular ones should be included. Yes, it is just one extra step, but if a computer can do it, why make a human suffer? :-)
Re: spark-packages, Spark does include important and mature functionality in its assembly, e.g. the CSV parser was merged into core Spark when it matured. I believe Z should do the same.

Sent from my iPhone

> On Jan 17, 2017, at 8:05 PM, Jeff Zhang <zj...@gmail.com> wrote:
> 
> 
> Another thing I'd like to discuss is whether we should move most of the interpreters out of the Zeppelin project to somewhere else, just as Spark does with spark-packages. Two benefits:
> 
> 1. It keeps the Zeppelin project much smaller.
> 2. Each interpreter's improvements won't be blocked by the Zeppelin release. Interpreters can have their own release cycles as long as zeppelin-interpreter doesn't break compatibility.
> 
> If it makes sense, I can open another thread to discuss it.

Re: [DISCUSS] Release package size

Posted by Jeff Zhang <zj...@gmail.com>.
Another thing I'd like to discuss is whether we should move most of the
interpreters out of the Zeppelin project to somewhere else, just as Spark
does with spark-packages. Two benefits:

1. It keeps the Zeppelin project much smaller.
2. Each interpreter's improvements won't be blocked by the Zeppelin
release. Interpreters can have their own release cycles as long as
zeppelin-interpreter doesn't break compatibility.

If it makes sense, I can open another thread to discuss it.




On Wed, Jan 18, 2017 at 11:55 AM, Jun Kim <i2...@gmail.com> wrote:

> +1 for Jeff's idea! I also mainly use those three interpreters :)

Re: [DISCUSS] Release package size

Posted by Jun Kim <i2...@gmail.com>.
+1 for Jeff's idea! I also mainly use those three interpreters :)

On Wed, Jan 18, 2017 at 12:52 PM, Jeff Zhang <zj...@gmail.com> wrote:

>
> How about also including the markdown and jdbc interpreters, if this won't
> make the binary distribution much bigger? I guess spark, markdown, and jdbc
> are the top 3 interpreters in Zeppelin.
--
Taejun Kim

Data Mining Lab.
School of Electrical and Computer Engineering
University of Seoul

Re: [DISCUSS] Release package size

Posted by Jeff Zhang <zj...@gmail.com>.
How about also including the markdown and jdbc interpreters, if this won't
make the binary distribution much bigger? I guess spark, markdown, and jdbc
are the top 3 interpreters in Zeppelin.



On Wed, Jan 18, 2017 at 11:33 AM, Ahyoung Ryu <ah...@apache.org> wrote:

> Thanks as always, Mina!
> +1 for releasing only the netinst package.

Re: [DISCUSS] Release package size

Posted by Ahyoung Ryu <ah...@apache.org>.
Thanks as always, Mina!
+1 for releasing only the netinst package.

On Wed, Jan 18, 2017 at 12:29 PM, Prabhjyot Singh <prabhjyotsingh@apache.org> wrote:

> +1
>
> I don't think it's a problem now, but if it keeps increasing, then in
> subsequent releases we can ship Zeppelin with a few interpreters and mark
> the others as plugins that can be downloaded later, with instructions on how
> to configure them.

Re: [DISCUSS] Release package size

Posted by Prabhjyot Singh <pr...@apache.org>.
+1

I don't think it's a problem now, but if it keeps increasing, then in
subsequent releases we can ship Zeppelin with a few interpreters and mark
the others as plugins that can be downloaded later, with instructions on how
to configure them.

On Jan 18, 2017 8:54 AM, "Jun Kim" <i2...@gmail.com> wrote:

> +1
>
> I think it won't be a problem if we give clear notice.
> Maybe we can do that next to the download button here
> (http://zeppelin.apache.org/download.html).
> A message may be "NOTE: only spark interpreter included since 0.7.0. If
> you want other interpreters, please see interpreter installation guide"

Re: [DISCUSS] Release package size

Posted by Jun Kim <i2...@gmail.com>.
+1

I think it won't be a problem if we give clear notice.
Maybe we can do that next to the download button here
(http://zeppelin.apache.org/download.html).
A message may be "NOTE: only spark interpreter included since 0.7.0. If you
want other interpreters, please see interpreter installation guide"

On Wed, Jan 18, 2017 at 12:14 PM, Jeff Zhang <zj...@gmail.com> wrote:

>
> +1, we should also mention it in the release notes and in the 0.7 docs.
--
Taejun Kim

Data Mining Lab.
School of Electrical and Computer Engineering
University of Seoul

Re: [DISCUSS] Release package size

Posted by Jeff Zhang <zj...@gmail.com>.
+1, we should also mention it in the release notes and in the 0.7 docs.



On Wed, Jan 18, 2017 at 11:12 AM, Mina Lee <mi...@apache.org> wrote:

> Hi all,
>
> Zeppelin is about to start 0.7.0 release process, I would like to discuss
> about binary package distribution.
>
> Every time we distribute new binary package, size of the
> zeppelin-0.x.x-bin-all.tgz package is getting bigger:
>    - zeppelin-0.6.0-bin-all.tgz: 506M
>    - zeppelin-0.6.1-bin-all.tgz: 517M
>    - zeppelin-0.6.2-bin-all.tgz: 547M
>    - zeppelin-0.7.0-bin-all.tgz: 720M (Expected)
>
> Mostly it is because the number of interpreters supported by zeppelin
> keeps growing,
> and there is high chance that we support more interpreters in the near
> future.
> So instead of asking apache infra team to increase limit,
> I would like to suggest to have only zeppelin-0.7.0-bin-netinst.tgz, which
> only includes spark interpreter from 0.7.0 release.
> One concern is that users need one more step to install the interpreters
> they use,
> but I believe it can be done easily with single line of command [1].
>
> FYI, attaching the link of similar discussion [2] we had last June in
> mailing list.
>
> Regards,
> Mina
>
> [1]
> http://zeppelin.apache.org/docs/0.6.2/manual/interpreterinstallation.html#install-specific-interpreters
> <http://zeppelin.apache.org/docs/0.6.2/manual/interpreterinstallation.html>
> [2]
> https://lists.apache.org/thread.html/4b54c034cf8d691655156e0cb647243180c57a6829d97aa3c085b63c@%3Cusers.zeppelin.apache.org%3E
>