Posted to general@incubator.apache.org by Sean Busbey <bu...@apache.org> on 2017/05/31 13:03:40 UTC

[VOTE] Livy to enter Apache Incubator

Hi folks!

I'm calling a vote to accept "Livy" into the Apache Incubator.

The full proposal is available below, and is also available in the wiki:

https://wiki.apache.org/incubator/LivyProposal

For additional context, please see the discussion thread:

https://s.apache.org/incubator-livy-proposal-thread

Please cast your vote:

[ ] +1, bring Livy into Incubator
[ ] -1, do not bring Livy into Incubator, because...

The vote will be open for at least 72 hours, and only votes from the
Incubator PMC are binding.

I start with my vote:
+1

----

= Abstract =

Livy is a web service that exposes a REST interface for managing
long-running Apache Spark contexts in your cluster. With Livy, new
applications that require fine-grained interaction with many Spark contexts
can be built on top of Apache Spark.

= Proposal =

Livy is an open-source REST service for Apache Spark. Livy enables
applications to submit Spark applications and retrieve results without a
co-location requirement on the Spark cluster. 

We propose to contribute the Livy codebase and associated artifacts (e.g.
documentation, web-site content, etc.) to the Apache Software Foundation.

= Background =

Apache Spark is a fast and general purpose distributed compute engine, with
a versatile API. It enables processing of large quantities of static data
distributed over a cluster of machines, as well as processing of continuous
streams of data. It is the preferred distributed data processing engine for
data engineering, stream processing and data science workloads. Each Spark
application uses a construct called the SparkContext, which is the
application’s connection or entry point to the Spark engine. Each Spark
application will have its own SparkContext.

Livy enables clients to interact with one or more Spark sessions through the
Livy Server, which acts as a proxy layer. Livy clients have fine-grained
control over the lifecycle of the Spark sessions, as well as the ability to
submit jobs and retrieve results, all over HTTP. Clients have two modes of
interaction:

 * RPC Client API, available in Java and Python, which allows results to be
   retrieved as Java or Python objects. The serialization and
   deserialization of the results is handled by the Livy framework.
 * HTTP-based API that allows submission of code snippets, and retrieval of
   the results in different formats.
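
As a rough illustration of the HTTP-based mode, the sketch below builds (but
does not send) the three requests a client would typically issue: create a
session, submit a code snippet, and poll for its result. The endpoint paths
(/sessions and /sessions/{id}/statements) follow the REST API documented by
the Livy project and are not spelled out in this proposal; the host, port
and helper function names are assumptions made for the example.

```python
import json
from urllib.request import Request

# Assumed address of a Livy server; adjust for a real deployment.
LIVY_URL = "http://localhost:8998"

JSON_HEADERS = {"Content-Type": "application/json"}


def create_session_request(base_url, kind="spark"):
    """Build the request that asks Livy to start an interactive session."""
    body = json.dumps({"kind": kind}).encode("utf-8")
    return Request(f"{base_url}/sessions", data=body,
                   headers=JSON_HEADERS, method="POST")


def submit_statement_request(base_url, session_id, code):
    """Build the request that submits a code snippet to a running session."""
    body = json.dumps({"code": code}).encode("utf-8")
    return Request(f"{base_url}/sessions/{session_id}/statements", data=body,
                   headers=JSON_HEADERS, method="POST")


def statement_result_request(base_url, session_id, statement_id):
    """Build the request that polls for the result of a submitted snippet."""
    url = f"{base_url}/sessions/{session_id}/statements/{statement_id}"
    return Request(url, method="GET")


# Hypothetical session and statement ids, used only to show the URL shapes.
for req in (
    create_session_request(LIVY_URL),
    submit_statement_request(LIVY_URL, 0, "sc.parallelize(1 to 10).sum()"),
    statement_result_request(LIVY_URL, 0, 0),
):
    print(req.get_method(), req.full_url)
```

To actually talk to a server, each request object can be passed to
urllib.request.urlopen and the JSON response decoded; that step is left out
here so the sketch runs without a cluster.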

Multi-tenant resource allocation and security: Livy enables multiple
independent Spark sessions to be managed simultaneously. Multiple clients
can also interact simultaneously with the same Spark session and share the
resources of that Spark session. Livy can also enforce secure, authenticated
communication between the clients and their respective Spark sessions.

More information on Livy can be found at the existing open source website:
http://livy.io/

= Rationale =

Users want to use Spark’s powerful processing engine and API as the data
processing backend for interactive applications. However, the job submission
and application interaction mechanisms built into Apache Spark are
insufficient and cumbersome for multi-user interactive applications.

The primary mechanism for applications to submit Spark jobs is via
spark-submit
(http://spark.apache.org/docs/latest/submitting-applications.html), which is
available as a command line tool as well as a programmatic API. However,
spark-submit has the following limitations that make it difficult to build
interactive applications:

 * It is slow: each invocation of spark-submit involves a setup phase where
   cluster resources are acquired, new processes are forked, etc. This setup
   phase runs for many seconds, or even minutes, and hence is too slow for
   interactive applications.
 * It is cumbersome and lacks flexibility: application code and dependencies
   have to be pre-compiled and submitted as jars, and cannot be submitted
   interactively.
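
For contrast, a batch submission through spark-submit might look roughly
like the sketch below, which only assembles the command line. The flags
shown (--class, --master, --deploy-mode) are standard spark-submit options;
the jar path, class name and application argument are hypothetical. Every
such invocation needs a pre-built jar and pays the setup cost described
above.

```python
import shlex


def spark_submit_command(app_jar, main_class, master="yarn",
                         deploy_mode="cluster", app_args=()):
    """Assemble the argument list for one spark-submit invocation."""
    return [
        "spark-submit",
        "--class", main_class,         # entry point inside the pre-compiled jar
        "--master", master,            # cluster manager to submit to
        "--deploy-mode", deploy_mode,  # where the driver process runs
        app_jar,
        *app_args,
    ]


# Hypothetical application jar and main class, for illustration only.
cmd = spark_submit_command("/path/to/my-app.jar", "com.example.MyApp",
                           app_args=["2017-05-31"])
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run on a machine that has
Spark installed.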

Apache Spark comes with an ODBC/JDBC server, which can be used to submit SQL
queries to Spark. However, this solution is limited to SQL and does not
allow the client to leverage the rest of the Spark API, such as RDDs, MLlib
and Streaming.

A third way of using Spark is via its command-line shell, which allows the
interactive submission of snippets of Spark code. However, the shell entails
running Spark code on the client machine and hence is not a viable mechanism
for remote clients to submit Spark jobs.

Livy solves the limitations of the above three mechanisms, and provides the
full Spark API as a multi-tenant service to remote clients. 

Since the open source release of Livy in late 2015, we have seen tremendous
interest among a diverse set of application developers and ISVs that want to
build applications with Apache Spark. To make Livy a robust and flexible
solution that will enable a broad and growing set of applications, it is
important to grow a large and varied community of contributors.

= Initial Goals =

  * Move existing codebase, website, documentation and mailing lists to
    Apache-hosted infrastructure
  * Work with the infrastructure team to implement and approve our code
    review, build, and testing workflows in the context of the ASF
  * Incremental development and releases per Apache guidelines

= Current Status =

The Livy project began at Cloudera, as a part of the Hue project. Cloudera
soon realized the broad applicability of Livy, and separated it out into an
independent project in Nov 2015.

== Releases ==

Livy has undergone two public releases, tagged here: 

 * https://github.com/cloudera/livy/releases/tag/v0.2.0
 * https://github.com/cloudera/livy/releases/tag/v0.3.0

Tarballs and zip files were created for each release and hosted on GitHub.
Upon joining the incubator, we will adopt a more typical ASF release
process.

== Source ==

Livy’s source is currently hosted on GitHub at:
https://github.com/cloudera/livy

This repository will be transitioned to Apache’s git hosting during
incubation.

== Code review ==

Livy’s code reviews are currently public and hosted on GitHub as pull
request reviews at: https://github.com/cloudera/livy/pulls
The Livy developer community so far is happy with GitHub pull request
reviews and hopes to continue this after being admitted to the ASF.

== Issue Tracking ==

Livy’s bug and feature tracking is hosted on JIRA at:
https://issues.cloudera.org/projects/LIVY/summary
This JIRA instance contains bugs and development discussion dating back one
year and will provide an initial seed for the ASF JIRA.

== Community Discussion ==

Livy has several public discussion forums:

 * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-dev
 * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-user

== Development Practices ==

The Livy project follows a review before commit philosophy. Every commit
automatically runs through the unit tests and generates coverage reports
presented as a pull request comment. Our experience with this process leads
us to believe that it helps ease new contributors into the project. They get
feedback quickly on common mistakes, lowering the burden on reviewers. Those
same reviewers get to lead by example, showing the new contributors that we
value feedback within our community even when changes are done by more
experienced folks.

== Meritocracy ==

We believe strongly in meritocracy when electing committers and PMC members.
In the past few months, the project has added two new committers from two
different organisations, in recognition of their significant contributions
to the project. We will encourage contributions and participation of all
types, and ensure that contributors are appropriately recognized.

== Community ==

Though Livy is relatively new as a standalone open source project, it has
already seen promising growth in its community across several organizations:

 * Cloudera is the original development sponsor for Livy.
 * Microsoft pushed the development of the interpreter, fixing high
   availability issues and adding additional features.
 * Hortonworks has contributed the security features to Livy, allowing
   Kerberos and impersonation to work with Spark.
 * IBM is starting to make contributions to the Livy project.
 * A number of other patches have been contributed by community members.

Livy currently relies on Google Groups for mailing lists. These lists have
been active since the end of 2015/start of 2016. Currently, Livy’s user
mailing list has 173 subscribers and has hosted a total of 227 topic
threads. Livy’s developer list has 49 subscribers and has hosted 79 topic
threads.

== Core Developers ==

The early contributions to Livy were made by Cloudera engineers. In 2016,
engineers from Microsoft and Hortonworks joined the core developer
community. 

== Alignment ==

Livy is built upon Apache Spark, and other Apache projects like Apache
Hadoop YARN. It’s used as a building block by Apache Zeppelin. These
community connections combined with our focus on development practices that
emphasize community engagement with a path to meritocratic recognition
naturally align us with the ASF.

= Known Risks =

== Orphaned Products ==

The risk of Livy being abandoned is low because it is supported by three
major big-data software vendors. Moreover, Livy is already used to power
multiple releases of services and products used in production.

== Inexperience with Open Source ==

Several of the initial committers are experienced open source developers,
and some are committers and/or PMC members on other ASF projects (Spark,
YARN).

== Homogenous Developers ==

The project already has a diverse developer base. It has contributions from
3 major organisations (Cloudera, Microsoft and Hortonworks), and is used in
diverse applications, in diverse settings (On-Prem and Cloud).

== Reliance on salaried Developers ==

The contributions to the Livy project to date have been made by salaried
engineers from Cloudera, Microsoft and Hortonworks. One of the individuals
on the initial committer list has since left Microsoft and is currently
unaffiliated. The remaining contributors are from Cloudera and Hortonworks.
Since there are at least two major organizations involved, the risk of
reliance on a single group of salaried developers is mitigated. The Livy
user base is diverse, with users from across the globe, including users from
academic settings. We aim to further diversify the Livy user and contributor
base.

== Relationships with other Apache projects ==

Livy is closely tied to the Apache Spark project and currently addresses the
scenarios for a REST based batch and interactive gateway for Spark jobs on
YARN. Given the growing number of integrations with Livy, keeping it outside
of Apache Spark aligns with the desire of the Apache Spark community to
reduce the number of external dependencies in the Spark project.
Specifically, the Apache Spark community has previously expressed a desire
to keep job servers independent from the project.<<FootNote(See, for
example, discussion of the Ooyala Spark Job Server in SPARK-818)>>
Furthermore, while Livy’s common usage is closely tied to Spark deployments
right now, its core building blocks can be reused elsewhere. Livy’s Remote
REPL could be used as a library for interactive scenarios in non-Spark
projects. In the future, integrations with cluster managers like Apache
Mesos and others could also be added.

The features provided by Livy have already been integrated with existing
projects like Jupyter and Apache Zeppelin for their interactive Spark use
cases. This validates the need for a project like Livy and provides an
active downstream user base that the Livy community can interact with to
seed future interest in the project.

Livy serves a similar purpose to Apache Toree (incubating) but differs in
making session management, security and impersonation a focal design point.

== An Excessive Fascination with the Apache Brand ==

The primary motivation for submitting Livy to the ASF is to grow a diverse
and strong community. We wish to encourage diverse organisations, including
ISVs, to adopt Livy and contribute to Livy without any concerns about
ownership or licensing.

= Documentation =

Documentation can be found on the Livy website http://livy.io/

The Livy web site is version controlled on the ‘gh-pages’ branch of the
above repository.
Additional documentation is provided on the GitHub wiki:
https://github.com/cloudera/livy/wiki
APIs are documented within the source code as JavaDoc-style documentation
comments.

= Initial Source =

The initial source code for Livy is hosted at
https://github.com/cloudera/livy 

= Source and Intellectual Property submission plan =

The Livy codebase and web site are currently hosted on GitHub and will be
transitioned to the ASF repositories during incubation. Livy is already
licensed under the Apache 2.0 license. Cloudera has collected ICLAs and
CCLAs from all committers. There are, however, some recent contributions
from authors who have not signed the CCLA and ICLA. If necessary for a
successful SGA, we’ll seek the necessary documentation or replace the
contributions.

The “Livy” name is not a registered trademark. We will need to do a
trademark search and make sure it is available for the Apache Foundation
prior to graduation.

Cloudera currently owns the domain name: http://livy.io/. Once all the
documentation has moved over to ASF infrastructure, the main landing page
will become livy.incubator.apache.org and the old domain will just act as a
redirect.

= External Dependencies =

The list below covers the non-Apache dependencies of the project and their
licenses.

 * Jetty: Apache 2.0
 * Dropwizard Metrics: Apache 2.0
 * FasterXML Jackson: Apache 2.0
 * Netty: Apache 2.0
 * Scala: BSD
 * Py4J: BSD
 * Scalatra: BSD

Build/test-only dependencies:

 * Mockito: MIT
 * JUnit: Eclipse

= Required Resources =

== Mailing Lists ==

 * private@livy.incubator.apache.org (PPMC)
 * dev@livy.incubator.apache.org (dev mailing list)
 * user@livy.incubator.apache.org (User questions)
 * commits@livy.incubator.apache.org (subscribers shouldn’t be able to post)
 * issues@livy.incubator.apache.org (subscribers shouldn’t be able to post)

== Git Repository ==

git://git.apache.org/incubator-livy

== Issue Tracking ==

We would like to import our current JIRA project into the ASF JIRA, such
that our historical commit messages and code comments continue to reference
the appropriate bug numbers.

= Initial Committers =

 * Marcelo Vanzin (vanzin@cloudera.com)
 * Alex Man (alex@alexman.space)
 * Jeff Zhang (zjffdu@gmail.com)
 * Saisai Shao (sshao@hortonworks.com)
 * Kostas Sakellis (kostas@cloudera.com)

= Affiliations =

The initial set of committers includes people employed by Cloudera and
Hortonworks as well as one currently independent contributor.

= Additional Interested Contributors =

Those interested in getting involved with the project as we enter incubation
are encouraged to list themselves here.

  * Ismaël Mejía (iemejia@apache.org)

= Sponsors =

== Champion ==

Sean Busbey (busbey@apache.org)

== Nominated Mentors ==

 * Bikas Saha (bikas@apache.org)
 * Brock Noland (brock@phdata.io)
 * Luciano Resende (lresende@apache.org)

== Sponsoring Entity ==

We ask that the Incubator PMC sponsor this proposal.


---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org

Re: [VOTE] Livy to enter Apache Incubator

Posted by Pierre Smits <pi...@gmail.com>.

+1 (from the cheap seats).

Best regards,

Pierre Smits

ORRTIZ.COM <http://www.orrtiz.com>
OFBiz based solutions & services

OFBiz Extensions Marketplace
http://oem.ofbizci.net/oci-2/

On Wed, May 31, 2017 at 10:18 PM, tim shea <ti...@oracle.com> wrote:

> +1 (non-binding)
>
> Great project (and I've used it).
>
>
> On 5/31/17 11:59 AM, Kostas Sakellis wrote:
>
>> +1 (non-binding)
>>
>> On Wed, May 31, 2017 at 11:46 AM, Andrew Purtell <
>> andrew.purtell@gmail.com>
>> wrote:
>>
>> +1 (binding)
>>>
>>> On May 31, 2017, at 6:03 AM, Sean Busbey <bu...@apache.org> wrote:
>>>>
>>>> Hi folks!
>>>>
>>>> I'm calling a vote to accept "Livy" into the Apache Incubator.
>>>>
>>>> The full proposal is available below, and is also available in the wiki:
>>>>
>>>> https://wiki.apache.org/incubator/LivyProposal
>>>>
>>>> For additional context, please see the discussion thread:
>>>>
>>>> https://s.apache.org/incubator-livy-proposal-thread
>>>>
>>>> Please cast your vote:
>>>>
>>>> [ ] +1, bring Livy into Incubator
>>>> [ ] -1, do not bring Livy into Incubator, because...
>>>>
>>>> The vote will open at least for 72 hours and only votes from the
>>>>
>>> Incubator
>>>
>>>> PMC are binding.
>>>>
>>>> I start with my vote:
>>>> +1
>>>>
>>>> ----
>>>>
>>>> = Abstract =
>>>>
>>>> Livy is web service that exposes a REST interface for managing long
>>>>
>>> running
>>>
>>>> Apache Spark contexts in your cluster. With Livy, new applications can
>>>> be
>>>> built on top of Apache Spark that require fine grained interaction with
>>>>
>>> many
>>>
>>>> Spark contexts.
>>>>
>>>> = Proposal =
>>>>
>>>> Livy is an open-source REST service for Apache Spark. Livy enables
>>>> applications to submit Spark applications and retrieve results without a
>>>> co-location requirement on the Spark cluster.
>>>>
>>>> We propose to contribute the Livy codebase and associated artifacts
>>>> (e.g.
>>>> documentation, web-site context etc) to the Apache Software Foundation.
>>>>
>>>> = Background =
>>>>
>>>> Apache Spark is a fast and general purpose distributed compute engine,
>>>>
>>> with
>>>
>>>> a versatile API. It enables processing of large quantities of static
>>>> data
>>>> distributed over a cluster of machines, as well as processing of
>>>>
>>> continuous
>>>
>>>> streams of data. It is the preferred distributed data processing engine
>>>>
>>> for
>>>
>>>> data engineering, stream processing and data science workloads. Each
>>>>
>>> Spark
>>>
>>>> application uses a construct called the SparkContext, which is the
>>>> application’s connection or entry point to the Spark engine. Each Spark
>>>> application will have its own SparkContext.
>>>>
>>>> Livy enables clients to interact with one or more Spark sessions through
>>>>
>>> the
>>>
>>>> Livy Server, which acts as a proxy layer. Livy Clients have fine grained
>>>> control over the lifecycle of the Spark sessions, as well as the ability
>>>>
>>> to
>>>
>>>> submit jobs and retrieve results, all over HTTP. Clients have two modes
>>>>
>>> of
>>>
>>>> interaction: RPC Client API, available in Java and Python, which allows
>>>> results to be retrieved as Java or Python objects. The serialization and
>>>> deserialization of the results is handled by the Livy framework. HTTP
>>>>
>>> based
>>>
>>>> API that allows submission of code snippets, and retrieval of the
>>>>
>>> results in
>>>
>>>> different formats.
>>>>
>>>> Multi-tenant resource allocation and security: Livy enables multiple
>>>> independent Spark sessions to be managed simultaneously. Multiple
>>>> clients
>>>> can also interact simultaneously with the same Spark session and share
>>>>
>>> the
>>>
>>>> resources of that Spark session. Livy can also enforce secure,
>>>>
>>> authenticated
>>>
>>>> communication between the clients and their respective Spark sessions.
>>>>
>>>> More information on Livy can be found at the existing open source
>>>>
>>> website:
>>>
>>>> http://livy.io/
>>>>
>>>> = Rationale =
>>>>
>>>> Users want to use Spark’s powerful processing engine and API as the data
>>>> processing backend for interactive applications. However, the job
>>>>
>>> submission
>>>
>>>> and application interaction mechanisms built into Apache Spark are
>>>> insufficient and cumbersome for multi-user interactive applications.
>>>>
>>>> The primary mechanism for applications to submit Spark jobs is via
>>>> spark-submit
>>>> (http://spark.apache.org/docs/latest/submitting-applications.html),
>>>>
>>> which is
>>>
>>>> available as a command line tool as well as a programmatic API. However,
>>>> spark-submit has the following limitations that make it difficult to
>>>>
>>> build
>>>
>>>> interactive applications: It is slow: each invocation of spark-submit
>>>> involves a setup phase where cluster resources are acquired, new
>>>>
>>> processes
>>>
>>>> are forked, etc. This setup phase runs for many seconds, or even
>>>> minutes,
>>>> and hence is too slow for interactive applications. It is cumbersome and
>>>> lacks flexibility: application code and dependencies have to be
>>>>
>>> pre-compiled
>>>
>>>> and submitted as jars, and can not be submitted interactively.
>>>>
>>>> Apache Spark comes with an ODBC/JDBC server, which can be used to submit
>>>>
>>> SQL
>>>
>>>> queries to Spark. However, this solution is limited to SQL and does not
>>>> allow the client to leverage the rest of the Spark API, such as RDDs,
>>>>
>>> MLlib
>>>
>>>> and Streaming.
>>>>
>>>> A third way of using Spark is via its command-line shell, which allows
>>>>
>>> the
>>>
>>>> interactive submission of snippets of Spark code. However, the shell
>>>>
>>> entails
>>>
>>>> running Spark code on the client machine and hence is not a viable
>>>>
>>> mechanism
>>>
>>>> for remote clients to submit Spark jobs.
>>>>
>>>> Livy solves the limitations of the above three mechanisms, and provides
>>>>
>>> the
>>>
>>>> full Spark API as a multi-tenant service to remote clients.
>>>>
>>>> Since the open source release of Livy in late 2015, we have seen
>>>>
>>> tremendous
>>>
>>>> interest among a diverse set of application developers and ISVs that
>>>>
>>> want to
>>>
>>>> build applications with Apache Spark. To make Livy a robust and flexible
>>>> solution that will enable a broad and growing set of applications, it is
>>>> important to grow a large and varied community of contributors.
>>>>
>>>> = Initial Goals =
>>>>
>>>>   * Move existing codebase, website, documentation and mailing lists to
>>>>     Apache-hosted infrastructure
>>>>   * Work with the infrastructure team to implement and approve our code
>>>>     review, build, and testing workflows in the context of the ASF
>>>>   * Incremental development and releases per Apache guidelines
>>>>
>>>> = Current Status =
>>>>
>>>> The Livy project began at Cloudera, as a part of the Hue project.
>>>>
>>> Cloudera
>>>
>>>> soon realized the broad applicability of Livy, and separated it out into
>>>>
>>> an
>>>
>>>> independent project in Nov 2015.
>>>>
>>>> == Releases ==
>>>>
>>>> Livy has undergone two public releases, tagged here:
>>>>
>>>> * https://github.com/cloudera/livy/releases/tag/v0.2.0
>>>> * https://github.com/cloudera/livy/releases/tag/v0.3.0
>>>>
>>>> Tarballs and zip files were created for each release and hosted on
>>>>
>>> github.
>>>
>>>> Upon joining the incubator, we will adopt a more typical ASF release
>>>> process.
>>>>
>>>> == Source ==
>>>>
>>>> Livy’s source is currently hosted on Github at:
>>>> https://github.com/cloudera/livy
>>>>
>>>> This repository will be transitioned to Apache’s git hosting during
>>>> incubation.
>>>>
>>>> == Code review ==
>>>>
>>>> Livy’s code reviews are currently public and hosted on github as pull
>>>> request reviews at: https://github.com/cloudera/livy/pulls
>>>> The Livy developer community so far is happy with github pull request
>>>> reviews and hopes to continue this after being admitted to the ASF.
>>>>
>>>> == Issue Tracking ==
>>>>
>>>> Livy’s bug and feature tracking is hosted on JIRA at:
>>>> https://issues.cloudera.org/projects/LIVY/summary
>>>> This JIRA instance contains bugs and development discussion dating back
>>>> 1
>>>> year and will provide an initial seed for the ASF JIRA
>>>>
>>>> == Community Discussion ==
>>>>
>>>> Livy has several public discussion forums:
>>>>
>>>> * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-dev
>>>> * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-user
>>>>
>>>> == Development Practices ==
>>>>
>>>> The Livy project follows a review before commit philosophy. Every commit
>>>> automatically runs through the unit tests and generates coverage reports
>>>> presented as a pull request comment. Our experience with this process
>>>>
>>> leads
>>>
>>>> us to believe that it helps ease new contributors into the project. They
>>>>
>>> get
>>>
>>>> feedback quickly on common mistakes, lowering the burden on reviewers.
>>>>
>>> Those
>>>
>>>> same reviewers get to lead by example, showing the new contributors that
>>>>
>>> we
>>>
>>>> value feedback within our community even when changes are done by more
>>>> experienced folks.
>>>>
>>>> == Meritocracy ==
>>>>
>>>> We believe strongly in meritocracy when electing committers and PMC
>>>>
>>> members.
>>>
>>>> In the past few months, the project has added two new committers from
>>>> two
>>>> different organisations, in recognition of their significant
>>>>
>>> contributions
>>>
>>>> to the project. We will encourage contributions and participation of all
>>>> types, and ensure that contributors are appropriately recognized.
>>>>
>>>> == Community ==
>>>>
>>>> Though Livy is relatively new as a standalone open source project, it
>>>> has
>>>> already seen promising growth in its community across several
>>>>
>>> organizations:
>>>
>>>> Cloudera is the original development sponsor for Livy
>>>> Microsoft pushed the development of the interpreter fixing high
>>>>
>>> availability
>>>
>>>> issues and adding additional features.
>>>> Hortonworks has contributed the security features to Livy allowing
>>>>
>>> kerberos
>>>
>>>> and impersonation to work with Spark
>>>> IBM is starting to make contributions to the Livy project
>>>> A number of other patches contributed by community members
>>>>
>>>> Livy currently relies on Google Groups for mailing lists. These lists
>>>>
>>> have
>>>
>>>> been active since the end of 2015/start of 2016. Currently, Livy’s user
>>>> mailing list has 173 subscribers and has hosted a total of 227 topic
>>>> threads. Livy’s developer list has 49 subscribers and has hosted 79
>>>> topic
>>>> threads.
>>>>
>>>> == Core Developers ==
>>>>
>>>> The early contributions to Livy were made by Cloudera engineers. In
>>>> 2016,
>>>> engineers from Microsoft and Hortonworks joined the core developer
>>>> community.
>>>>
>>>> == Alignment ==
>>>>
>>>> Livy is built upon Apache Spark, and other Apache projects like Apache
>>>> Hadoop YARN. It’s used as a building block by Apache Zeppelin. These
>>>> community connections combined with our focus on development practices
>>>>
>>> that
>>>
>>>> emphasize community engagement with a path to meritocratic recognition
>>>> naturally align us with the ASF.
>>>>
>>>> = Known Risks =
>>>>
>>>> == Orphaned Products ==
>>>>
>>>> The risk of Livy being abandoned is low because it is supported by three
>>>> major big-data software vendors. Moreover, Livy is already used to power
>>>> multiple releases of services and products used in production.
>>>>
>>>> == Inexperience with Open Source ==
>>>>
>>>> Several of the initial committers are experienced open source
>>>> developers,
>>>> several being committers and/or PMC members on other ASF projects
>>>> (Spark,
>>>> YARN).
>>>>
>>>> == Homogenous Developers ==
>>>>
>>>> The project already has a diverse developer base. It has contributions
>>>>
>>> from
>>>
>>>> 3 major organisations (Cloudera, Microsoft and Hortonworks), and is used
>>>>
>>> in
>>>
>>>> diverse applications, in diverse settings (On-Prem and Cloud).
>>>>
>>>> == Reliance on salaried Developers ==
>>>>
>>>> The contributions to the Livy project to date have been made by salaried
>>>> engineers from Cloudera, Microsoft and Hortonworks. One of the
>>>>
>>> individuals
>>>
>>>> on the initial committer list has since left Microsoft and is currently
>>>> unaffiliated. The remaining contributors are from Cloudera and
>>>>
>>> Hortonworks.
>>>
>>>> Since there are at least two major organizations involved, the risk of
>>>> reliance on a single group of salaried developers is mitigated. The Livy
>>>> user base is diverse, with users from across the globe, including users
>>>>
>>> from
>>>
>>>> academic settings. We aim to further diversify the Livy user and
>>>>
>>> contributor
>>>
>>>> base.
>>>>
>>>> == Relationships with other Apache projects ==
>>>>
>>>> Livy is closely tied to the Apache Spark project and currently addresses
>>>>
>>> the
>>>
>>>> scenarios for a REST based batch and interactive gateway for Spark jobs
>>>>
>>> on
>>>
>>>> YARN. Given the growing number of integrations with Livy, keeping it
>>>>
>>> outside
>>>
>>>> of Apache Spark aligns with the desire of the Apache Spark community to
>>>> reduce the number of external dependencies in the Spark project.
>>>> Specifically, the Apache Spark community has previously expressed a
>>>>
>>> desire
>>>
>>>> to keep job servers independent from the project.<<FootNote(See, for
>>>> example, discussion of the Ooyala Spark Job Server in SPARK-818)>>
>>>> Furthermore, while Livy common usage is closely tied to Spark
>>>> deployments
>>>> right now, its core building blocks can be reused elsewhere.  Livy’s
>>>>
>>> Remote
>>>
>>>> REPL could be used as a library for interactive scenarios in non-Spark
>>>> projects. In the future, integrations with cluster managers like Apache
>>>> Mesos and others could also be added.
>>>>
>>>> The features provided by Livy have already been integrated with existing
>>>> projects like Jupyter and Apache Zeppelin for their interactive Spark
>>>> use
>>>> cases. This validates the need for a project like Livy and provides an
>>>> active downstream user base that the Livy community can interact with to
>>>> seed future interest in the project.
>>>>
>>>> Livy serves a similar purpose to Apache Toree (incubating) but differs
>>>> in
>>>> making session management, security and impersonation a focal design
>>>>
>>> point.
>>>
>>>> == An Excessive Fascination with the Apache Brand ==
>>>>
>>>> The primary motivation for submitting Livy to the ASF is to grow a
>>>>
>>> diverse
>>>
>>>> and strong community. We wish to encourage diverse organisations,
>>>>
>>> including
>>>
>>>> ISVs, to adopt Livy and contribute to Livy without any concerns about
>>>> ownership or licensing.
>>>>
>>>> = Documentation =
>>>>
>>>> Documentation can be found on the Livy website http://livy.io/
>>>>
>>>> The Livy web site is version controlled on the ‘gh-pages’ branch of the
>>>> above repository.
>>>> Additional documentation is provided on the github wiki:
>>>> https://github.com/cloudera/livy/wiki
>>>> APis are documented within the source code as JavaDoc style
>>>> documentation
>>>> comments.
>>>>
>>>> = Initial Source =
>>>>
>>>> The initial source code for Livy is hosted at
>>>> https://github.com/cloudera/livy
>>>>
>>>> = Source and Intellectual Property submission plan =
>>>>
>>>> The Livy codebase and web site is currently hosted on GitHub and will be
>>>> transitioned to the ASF repositories during incubation. Livy is already
>>>> licensed under the Apache 2.0 license. Cloudera has collected ICLAs and
>>>> CCLAs from all committers. There are, however, some contributions
>>>>
>>> recently
>>>
>>>> from authors that have not signed the CCLA and ICLA. If necessary for a
>>>> successful SGA, we’ll seek the necessary documentation or replace the
>>>> contributions.
>>>>
>>>> The “Livy” name is not a registered trademark. We will need to do a
>>>> trademark search and make sure it is available for the Apache Foundation
>>>> prior to graduation.
>>>>
>>>> Cloudera currently owns the domain name: http://livy.io/. Once all the
>>>> documentation has moved over to ASF infrastructure, the main landing page
>>>> will become livy.incubator.apache.org and the old domain will just act as a
>>>> redirect.
>>>>
>>>> = External Dependencies =
>>>>
>>>> The list below covers the non-Apache dependencies of the project and their
>>>> licenses.
>>>>
>>>> * Jetty: Apache 2.0
>>>> * Dropwizard Metrics: Apache 2.0
>>>> * FasterXML Jackson: Apache 2.0
>>>> * Netty: Apache 2.0
>>>> * Scala: BSD
>>>> * Py4J: BSD
>>>> * Scalatra: BSD
>>>>
>>>> Build/test-only dependencies:
>>>>
>>>> * Mockito: MIT
>>>> * JUnit: Eclipse
>>>>
>>>> = Required Resources =
>>>>
>>>> == Mailing Lists ==
>>>>
>>>> * private@livy.incubator.apache.org (PPMC)
>>>> * dev@livy.incubator.apache.org (dev mailing list)
>>>> * user@livy.incubator.apache.org (User questions)
>>>> * commits@livy.incubator.apache.org (subscribers shouldn’t be able to post)
>>>> * issues@livy.incubator.apache.org (subscribers shouldn’t be able to post)
>>>>
>>>> == Git Repository ==
>>>>
>>>> git://git.apache.org/incubator-livy
>>>>
>>>> == Issue Tracking ==
>>>>
>>>> We would like to import our current JIRA project into the ASF JIRA, such
>>>> that our historical commit messages and code comments continue to reference
>>>> the appropriate bug numbers.
>>>>
>>>> = Initial Committers =
>>>>
>>>> * Marcelo Vanzin (vanzin@cloudera.com)
>>>> * Alex Man (alex@alexman.space)
>>>> * Jeff Zhang (zjffdu@gmail.com)
>>>> * Saisai Shao (sshao@hortonworks.com)
>>>> * Kostas Sakellis (kostas@cloudera.com)
>>>>
>>>> = Affiliations =
>>>>
>>>> The initial set of committers includes people employed by Cloudera and
>>>> Hortonworks as well as one currently independent contributor.
>>>>
>>>> = Additional Interested Contributors =
>>>>
>>>> Those interested in getting involved with the project as we enter incubation
>>>> are encouraged to list themselves here.
>>>>
>>>>   * Ismaël Mejía (iemejia@apache.org)
>>>>
>>>> = Sponsors =
>>>>
>>>> == Champion ==
>>>>
>>>> Sean Busbey (busbey@apache.org)
>>>>
>>>> == Nominated Mentors ==
>>>>
>>>> * Bikas Saha (bikas@apache.org)
>>>> * Brock Noland (brock@phdata.io)
>>>> * Luciano Resende (lresende@apache.org)
>>>>
>>>> == Sponsoring Entity ==
>>>>
>>>> We ask that the Incubator PMC sponsor this proposal.

Re: [VOTE] Livy to enter Apache Incubator

Posted by tim shea <ti...@oracle.com>.

+1 (non-binding)

Great project (and I've used it).

On 5/31/17 11:59 AM, Kostas Sakellis wrote:
> +1 (non-binding)
>
> On Wed, May 31, 2017 at 11:46 AM, Andrew Purtell <an...@gmail.com>
> wrote:
>
>> +1 (binding)
>>
>>> On May 31, 2017, at 6:03 AM, Sean Busbey <bu...@apache.org> wrote:
>>>
>>> Hi folks!
>>>
>>> I'm calling a vote to accept "Livy" into the Apache Incubator.
>>>
>>> The full proposal is available below, and is also available in the wiki:
>>>
>>> https://wiki.apache.org/incubator/LivyProposal
>>>
>>> For additional context, please see the discussion thread:
>>>
>>> https://s.apache.org/incubator-livy-proposal-thread
>>>
>>> Please cast your vote:
>>>
>>> [ ] +1, bring Livy into Incubator
>>> [ ] -1, do not bring Livy into Incubator, because...
>>>
>>> The vote will be open for at least 72 hours and only votes from the
>>> Incubator PMC are binding.
>>>
>>> I start with my vote:
>>> +1
>>>
>>> ----
>>>
>>> = Abstract =
>>>
>>> Livy is a web service that exposes a REST interface for managing
>>> long-running Apache Spark contexts in your cluster. With Livy, new
>>> applications that require fine-grained interaction with many Spark contexts
>>> can be built on top of Apache Spark.
>>>
>>> = Proposal =
>>>
>>> Livy is an open-source REST service for Apache Spark. Livy enables
>>> applications to submit Spark applications and retrieve results without a
>>> co-location requirement on the Spark cluster.
>>>
>>> We propose to contribute the Livy codebase and associated artifacts (e.g.
>>> documentation, web-site context etc) to the Apache Software Foundation.
>>>
>>> = Background =
>>>
>>> Apache Spark is a fast and general purpose distributed compute engine, with
>>> a versatile API. It enables processing of large quantities of static data
>>> distributed over a cluster of machines, as well as processing of continuous
>>> streams of data. It is the preferred distributed data processing engine for
>>> data engineering, stream processing and data science workloads. Each Spark
>>> application uses a construct called the SparkContext, which is the
>>> application’s connection or entry point to the Spark engine. Each Spark
>>> application will have its own SparkContext.
>>>
>>> Livy enables clients to interact with one or more Spark sessions through the
>>> Livy Server, which acts as a proxy layer. Livy clients have fine-grained
>>> control over the lifecycle of the Spark sessions, as well as the ability to
>>> submit jobs and retrieve results, all over HTTP. Clients have two modes of
>>> interaction:
>>>
>>>   * RPC Client API, available in Java and Python, which allows results to be
>>>     retrieved as Java or Python objects. The serialization and
>>>     deserialization of the results is handled by the Livy framework.
>>>   * HTTP based API that allows submission of code snippets, and retrieval of
>>>     the results in different formats.
>>>
>>> Multi-tenant resource allocation and security: Livy enables multiple
>>> independent Spark sessions to be managed simultaneously. Multiple clients
>>> can also interact simultaneously with the same Spark session and share the
>>> resources of that Spark session. Livy can also enforce secure, authenticated
>>> communication between the clients and their respective Spark sessions.
>>>
>>> More information on Livy can be found at the existing open source website:
>>> http://livy.io/
>>>
>>> = Rationale =
>>>
>>> Users want to use Spark’s powerful processing engine and API as the data
>>> processing backend for interactive applications. However, the job submission
>>> and application interaction mechanisms built into Apache Spark are
>>> insufficient and cumbersome for multi-user interactive applications.
>>>
>>> The primary mechanism for applications to submit Spark jobs is via
>>> spark-submit
>>> (http://spark.apache.org/docs/latest/submitting-applications.html), which is
>>> available as a command line tool as well as a programmatic API. However,
>>> spark-submit has the following limitations that make it difficult to build
>>> interactive applications:
>>>
>>>   * It is slow: each invocation of spark-submit involves a setup phase where
>>>     cluster resources are acquired, new processes are forked, etc. This setup
>>>     phase runs for many seconds, or even minutes, and hence is too slow for
>>>     interactive applications.
>>>   * It is cumbersome and lacks flexibility: application code and dependencies
>>>     have to be pre-compiled and submitted as jars, and cannot be submitted
>>>     interactively.
>>>
>>> Apache Spark comes with an ODBC/JDBC server, which can be used to submit SQL
>>> queries to Spark. However, this solution is limited to SQL and does not
>>> allow the client to leverage the rest of the Spark API, such as RDDs, MLlib
>>> and Streaming.
>>>
>>> A third way of using Spark is via its command-line shell, which allows the
>>> interactive submission of snippets of Spark code. However, the shell entails
>>> running Spark code on the client machine and hence is not a viable mechanism
>>> for remote clients to submit Spark jobs.
>>>
>>> Livy solves the limitations of the above three mechanisms, and provides the
>>> full Spark API as a multi-tenant service to remote clients.
>>>
>>> Since the open source release of Livy in late 2015, we have seen tremendous
>>> interest among a diverse set of application developers and ISVs that want to
>>> build applications with Apache Spark. To make Livy a robust and flexible
>>> solution that will enable a broad and growing set of applications, it is
>>> important to grow a large and varied community of contributors.
>>>
>>> = Initial Goals =
>>>
>>>   * Move existing codebase, website, documentation and mailing lists to
>>>     Apache-hosted infrastructure
>>>   * Work with the infrastructure team to implement and approve our code
>>>     review, build, and testing workflows in the context of the ASF
>>>   * Incremental development and releases per Apache guidelines
>>>
>>> = Current Status =
>>>
>>> The Livy project began at Cloudera, as a part of the Hue project. Cloudera
>>> soon realized the broad applicability of Livy, and separated it out into an
>>> independent project in Nov 2015.
>>>
>>> == Releases ==
>>>
>>> Livy has undergone two public releases, tagged here:
>>>
>>> * https://github.com/cloudera/livy/releases/tag/v0.2.0
>>> * https://github.com/cloudera/livy/releases/tag/v0.3.0
>>>
>>> Tarballs and zip files were created for each release and hosted on GitHub.
>>> Upon joining the incubator, we will adopt a more typical ASF release
>>> process.
>>>
>>> == Source ==
>>>
>>> Livy’s source is currently hosted on GitHub at:
>>> https://github.com/cloudera/livy
>>>
>>> This repository will be transitioned to Apache’s git hosting during
>>> incubation.
>>>
>>> == Code review ==
>>>
>>> Livy’s code reviews are currently public and hosted on GitHub as pull
>>> request reviews at: https://github.com/cloudera/livy/pulls
>>> The Livy developer community so far is happy with GitHub pull request
>>> reviews and hopes to continue this after being admitted to the ASF.
>>>
>>> == Issue Tracking ==
>>>
>>> Livy’s bug and feature tracking is hosted on JIRA at:
>>> https://issues.cloudera.org/projects/LIVY/summary
>>> This JIRA instance contains bugs and development discussion dating back one
>>> year and will provide an initial seed for the ASF JIRA.
>>>
>>> == Community Discussion ==
>>>
>>> Livy has several public discussion forums:
>>>
>>> * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-dev
>>> * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-user
>>>
>>> == Development Practices ==
>>>
>>> The Livy project follows a review before commit philosophy. Every commit
>>> automatically runs through the unit tests and generates coverage reports
>>> presented as a pull request comment. Our experience with this process leads
>>> us to believe that it helps ease new contributors into the project. They get
>>> feedback quickly on common mistakes, lowering the burden on reviewers. Those
>>> same reviewers get to lead by example, showing the new contributors that we
>>> value feedback within our community even when changes are done by more
>>> experienced folks.
>>>
>>> == Meritocracy ==
>>>
>>> We believe strongly in meritocracy when electing committers and PMC members.
>>> In the past few months, the project has added two new committers from two
>>> different organisations, in recognition of their significant contributions
>>> to the project. We will encourage contributions and participation of all
>>> types, and ensure that contributors are appropriately recognized.
>>>
>>> == Community ==
>>>
>>> Though Livy is relatively new as a standalone open source project, it has
>>> already seen promising growth in its community across several organizations:
>>>
>>>   * Cloudera is the original development sponsor for Livy.
>>>   * Microsoft pushed the development of the interpreter, fixing high
>>>     availability issues and adding additional features.
>>>   * Hortonworks has contributed the security features to Livy, allowing
>>>     Kerberos and impersonation to work with Spark.
>>>   * IBM is starting to make contributions to the Livy project.
>>>   * A number of other patches have been contributed by community members.
>>>
>>> Livy currently relies on Google Groups for mailing lists. These lists have
>>> been active since the end of 2015/start of 2016. Currently, Livy’s user
>>> mailing list has 173 subscribers and has hosted a total of 227 topic
>>> threads. Livy’s developer list has 49 subscribers and has hosted 79 topic
>>> threads.
>>>
>>> == Core Developers ==
>>>
>>> The early contributions to Livy were made by Cloudera engineers. In 2016,
>>> engineers from Microsoft and Hortonworks joined the core developer
>>> community.
>>>
>>> == Alignment ==
>>>
>>> Livy is built upon Apache Spark, and other Apache projects like Apache
>>> Hadoop YARN. It’s used as a building block by Apache Zeppelin. These
>>> community connections combined with our focus on development practices that
>>> emphasize community engagement with a path to meritocratic recognition
>>> naturally align us with the ASF.
>>>
>>> = Known Risks =
>>>
>>> == Orphaned Products ==
>>>
>>> The risk of Livy being abandoned is low because it is supported by three
>>> major big-data software vendors. Moreover, Livy is already used to power
>>> multiple releases of services and products used in production.
>>>
>>> == Inexperience with Open Source ==
>>>
>>> Several of the initial committers are experienced open source developers,
>>> several being committers and/or PMC members on other ASF projects (Spark,
>>> YARN).
>>>
>>> == Homogeneous Developers ==
>>>
>>> The project already has a diverse developer base. It has contributions from
>>> 3 major organisations (Cloudera, Microsoft and Hortonworks), and is used in
>>> diverse applications, in diverse settings (On-Prem and Cloud).
>>>
>>> == Reliance on salaried Developers ==
>>>
>>> The contributions to the Livy project to date have been made by salaried
>>> engineers from Cloudera, Microsoft and Hortonworks. One of the individuals
>>> on the initial committer list has since left Microsoft and is currently
>>> unaffiliated. The remaining contributors are from Cloudera and Hortonworks.
>>> Since there are at least two major organizations involved, the risk of
>>> reliance on a single group of salaried developers is mitigated. The Livy
>>> user base is diverse, with users from across the globe, including users from
>>> academic settings. We aim to further diversify the Livy user and contributor
>>> base.
>>>
>>> == Relationships with other Apache projects ==
>>>
>>> Livy is closely tied to the Apache Spark project and currently addresses the
>>> scenarios for a REST based batch and interactive gateway for Spark jobs on
>>> YARN. Given the growing number of integrations with Livy, keeping it outside
>>> of Apache Spark aligns with the desire of the Apache Spark community to
>>> reduce the number of external dependencies in the Spark project.
>>> Specifically, the Apache Spark community has previously expressed a desire
>>> to keep job servers independent from the project (see, for example, the
>>> discussion of the Ooyala Spark Job Server in SPARK-818).
>>> Furthermore, while Livy’s common usage is closely tied to Spark deployments
>>> right now, its core building blocks can be reused elsewhere. Livy’s Remote
>>> REPL could be used as a library for interactive scenarios in non-Spark
>>> projects. In the future, integrations with cluster managers like Apache
>>> Mesos and others could also be added.
>>>
>>> The features provided by Livy have already been integrated with existing
>>> projects like Jupyter and Apache Zeppelin for their interactive Spark use
>>> cases. This validates the need for a project like Livy and provides an
>>> active downstream user base that the Livy community can interact with to
>>> seed future interest in the project.
>>>
>>> Livy serves a similar purpose to Apache Toree (incubating) but differs in
>>> making session management, security and impersonation a focal design point.
>>>
>>> == An Excessive Fascination with the Apache Brand ==
>>>
>>> The primary motivation for submitting Livy to the ASF is to grow a diverse
>>> and strong community. We wish to encourage diverse organisations, including
>>> ISVs, to adopt Livy and contribute to Livy without any concerns about
>>> ownership or licensing.
>>>
>>> = Documentation =
>>>
>>> Documentation can be found on the Livy website http://livy.io/
>>>
>>> The Livy web site is version controlled on the ‘gh-pages’ branch of the
>>> above repository.
>>> Additional documentation is provided on the GitHub wiki:
>>> https://github.com/cloudera/livy/wiki
>>> APIs are documented within the source code as JavaDoc style documentation
>>> comments.
>>>
>>> = Initial Source =
>>>
>>> The initial source code for Livy is hosted at
>>> https://github.com/cloudera/livy
>>>
>>> = Source and Intellectual Property submission plan =
>>>
>>> The Livy codebase and web site are currently hosted on GitHub and will be
>>> transitioned to the ASF repositories during incubation. Livy is already
>>> licensed under the Apache 2.0 license. Cloudera has collected ICLAs and
>>> CCLAs from all committers. There are, however, some recent contributions
>>> from authors that have not signed the CCLA and ICLA. If necessary for a
>>> successful SGA, we’ll seek the necessary documentation or replace the
>>> contributions.
>>>
>>> The “Livy” name is not a registered trademark. We will need to do a
>>> trademark search and make sure it is available for the Apache Foundation
>>> prior to graduation.
>>>
>>> Cloudera currently owns the domain name: http://livy.io/. Once all the
>>> documentation has moved over to ASF infrastructure, the main landing page
>>> will become livy.incubator.apache.org and the old domain will just act as a
>>> redirect.
>>>
>>> = External Dependencies =
>>>
>>> The list below covers the non-Apache dependencies of the project and their
>>> licenses.
>>>
>>> * Jetty: Apache 2.0
>>> * Dropwizard Metrics: Apache 2.0
>>> * FasterXML Jackson: Apache 2.0
>>> * Netty: Apache 2.0
>>> * Scala: BSD
>>> * Py4J: BSD
>>> * Scalatra: BSD
>>>
>>> Build/test-only dependencies:
>>>
>>> * Mockito: MIT
>>> * JUnit: Eclipse
>>>
>>> = Required Resources =
>>>
>>> == Mailing Lists ==
>>>
>>> * private@livy.incubator.apache.org (PPMC)
>>> * dev@livy.incubator.apache.org (dev mailing list)
>>> * user@livy.incubator.apache.org (User questions)
>>> * commits@livy.incubator.apache.org (subscribers shouldn’t be able to post)
>>> * issues@livy.incubator.apache.org (subscribers shouldn’t be able to post)
>>>
>>> == Git Repository ==
>>>
>>> git://git.apache.org/incubator-livy
>>>
>>> == Issue Tracking ==
>>>
>>> We would like to import our current JIRA project into the ASF JIRA, such
>>> that our historical commit messages and code comments continue to reference
>>> the appropriate bug numbers.
>>>
>>> = Initial Committers =
>>>
>>> * Marcelo Vanzin (vanzin@cloudera.com)
>>> * Alex Man (alex@alexman.space)
>>> * Jeff Zhang (zjffdu@gmail.com)
>>> * Saisai Shao (sshao@hortonworks.com)
>>> * Kostas Sakellis (kostas@cloudera.com)
>>>
>>> = Affiliations =
>>>
>>> The initial set of committers includes people employed by Cloudera and
>>> Hortonworks as well as one currently independent contributor.
>>>
>>> = Additional Interested Contributors =
>>>
>>> Those interested in getting involved with the project as we enter incubation
>>> are encouraged to list themselves here.
>>>
>>>   * Ismaël Mejía (iemejia@apache.org)
>>>
>>> = Sponsors =
>>>
>>> == Champion ==
>>>
>>> Sean Busbey (busbey@apache.org)
>>>
>>> == Nominated Mentors ==
>>>
>>> * Bikas Saha (bikas@apache.org)
>>> * Brock Noland (brock@phdata.io)
>>> * Luciano Resende (lresende@apache.org)
>>>
>>> == Sponsoring Entity ==
>>>
>>> We ask that the Incubator PMC sponsor this proposal.


Re: [VOTE] Livy to enter Apache Incubator

Posted by Saisai Shao <sa...@gmail.com>.

+1 (non-binding)

Re: [VOTE] Livy to enter Apache Incubator

Posted by Jeff Zhang <zj...@gmail.com>.

+1 (non-binding)


On Thu, Jun 1, 2017 at 3:13 AM, Brock Noland <br...@apache.org> wrote:

> +1 (binding)
>
> On Wed, May 31, 2017 at 1:59 PM, Kostas Sakellis <ko...@cloudera.com>
> wrote:
>
> > +1 (non-binding)
> >
> > On Wed, May 31, 2017 at 11:46 AM, Andrew Purtell <andrew.purtell@gmail.com>
> > wrote:
> >
> > > +1 (binding)
> > >
> > > > On May 31, 2017, at 6:03 AM, Sean Busbey <bu...@apache.org> wrote:
> > > >
> > > > Hi folks!
> > > >
> > > > I'm calling a vote to accept "Livy" into the Apache Incubator.
> > > >
> > > > > The full proposal is available below, and is also available in the wiki:
> > > >
> > > > https://wiki.apache.org/incubator/LivyProposal
> > > >
> > > > For additional context, please see the discussion thread:
> > > >
> > > > https://s.apache.org/incubator-livy-proposal-thread
> > > >
> > > > Please cast your vote:
> > > >
> > > > [ ] +1, bring Livy into Incubator
> > > > [ ] -1, do not bring Livy into Incubator, because...
> > > >
> > > > > The vote will be open for at least 72 hours and only votes from the
> > > > > Incubator PMC are binding.
> > > >
> > > > I start with my vote:
> > > > +1
> > > >
> > > > ----
> > > >
> > > > = Abstract =
> > > >
> > > > > Livy is a web service that exposes a REST interface for managing
> > > > > long-running Apache Spark contexts in your cluster. With Livy, new
> > > > > applications that require fine-grained interaction with many Spark
> > > > > contexts can be built on top of Apache Spark.
> > > > >
> > > > > = Proposal =
> > > > >
> > > > > Livy is an open-source REST service for Apache Spark. Livy enables
> > > > > applications to submit Spark applications and retrieve results without a
> > > > > co-location requirement on the Spark cluster.
> > > > >
> > > > > We propose to contribute the Livy codebase and associated artifacts (e.g.
> > > > > documentation, web-site context etc) to the Apache Software Foundation.
> > > >
> > > > = Background =
> > > >
> > > > > Apache Spark is a fast and general purpose distributed compute engine, with
> > > > > a versatile API. It enables processing of large quantities of static data
> > > > > distributed over a cluster of machines, as well as processing of continuous
> > > > > streams of data. It is the preferred distributed data processing engine for
> > > > > data engineering, stream processing and data science workloads. Each Spark
> > > > > application uses a construct called the SparkContext, which is the
> > > > > application’s connection or entry point to the Spark engine. Each Spark
> > > > > application will have its own SparkContext.
> > > >
> > > > > Livy enables clients to interact with one or more Spark sessions through the
> > > > > Livy Server, which acts as a proxy layer. Livy clients have fine-grained
> > > > > control over the lifecycle of the Spark sessions, as well as the ability to
> > > > > submit jobs and retrieve results, all over HTTP. Clients have two modes of
> > > > > interaction:
> > > > >
> > > > >  * RPC Client API, available in Java and Python, which allows results to be
> > > > >    retrieved as Java or Python objects. The serialization and
> > > > >    deserialization of the results is handled by the Livy framework.
> > > > >  * HTTP based API that allows submission of code snippets, and retrieval of
> > > > >    the results in different formats.
> > > >
> > > > > Multi-tenant resource allocation and security: Livy enables multiple
> > > > > independent Spark sessions to be managed simultaneously. Multiple clients
> > > > > can also interact simultaneously with the same Spark session and share the
> > > > > resources of that Spark session. Livy can also enforce secure, authenticated
> > > > > communication between the clients and their respective Spark sessions.
> > > > >
> > > > > More information on Livy can be found at the existing open source website:
> > > > > http://livy.io/
> > > >
> > > > = Rationale =
> > > >
> > > > > Users want to use Spark’s powerful processing engine and API as the data
> > > > > processing backend for interactive applications. However, the job submission
> > > > > and application interaction mechanisms built into Apache Spark are
> > > > > insufficient and cumbersome for multi-user interactive applications.
> > > >
> > > > > The primary mechanism for applications to submit Spark jobs is via
> > > > > spark-submit
> > > > > (http://spark.apache.org/docs/latest/submitting-applications.html), which is
> > > > > available as a command line tool as well as a programmatic API. However,
> > > > > spark-submit has the following limitations that make it difficult to build
> > > > > interactive applications:
> > > > >
> > > > >  * It is slow: each invocation of spark-submit involves a setup phase where
> > > > >    cluster resources are acquired, new processes are forked, etc. This setup
> > > > >    phase runs for many seconds, or even minutes, and hence is too slow for
> > > > >    interactive applications.
> > > > >  * It is cumbersome and lacks flexibility: application code and dependencies
> > > > >    have to be pre-compiled and submitted as jars, and cannot be submitted
> > > > >    interactively.
> > > >
> > > > > Apache Spark comes with an ODBC/JDBC server, which can be used to submit SQL
> > > > > queries to Spark. However, this solution is limited to SQL and does not
> > > > > allow the client to leverage the rest of the Spark API, such as RDDs, MLlib
> > > > > and Streaming.
> > > > >
> > > > > A third way of using Spark is via its command-line shell, which allows the
> > > > > interactive submission of snippets of Spark code. However, the shell entails
> > > > > running Spark code on the client machine and hence is not a viable mechanism
> > > > > for remote clients to submit Spark jobs.
> > > > >
> > > > > Livy solves the limitations of the above three mechanisms, and provides the
> > > > > full Spark API as a multi-tenant service to remote clients.
> > > >
> > > > > Since the open source release of Livy in late 2015, we have seen tremendous
> > > > > interest among a diverse set of application developers and ISVs that want to
> > > > > build applications with Apache Spark. To make Livy a robust and flexible
> > > > > solution that will enable a broad and growing set of applications, it is
> > > > > important to grow a large and varied community of contributors.
> > > > >
> > > > > = Initial Goals =
> > > > >
> > > > >  * Move existing codebase, website, documentation and mailing lists to
> > > > >    Apache-hosted infrastructure
> > > > >  * Work with the infrastructure team to implement and approve our code
> > > > >    review, build, and testing workflows in the context of the ASF
> > > > >  * Incremental development and releases per Apache guidelines
> > > >
> > > > = Current Status =
> > > >
> > > > > The Livy project began at Cloudera, as a part of the Hue project. Cloudera
> > > > > soon realized the broad applicability of Livy, and separated it out into an
> > > > > independent project in Nov 2015.
> > > >
> > > > == Releases ==
> > > >
> > > > Livy has undergone two public releases, tagged here:
> > > >
> > > > * https://github.com/cloudera/livy/releases/tag/v0.2.0
> > > > * https://github.com/cloudera/livy/releases/tag/v0.3.0
> > > >
> > > > > Tarballs and zip files were created for each release and hosted on GitHub.
> > > > > Upon joining the incubator, we will adopt a more typical ASF release
> > > > > process.
> > > > >
> > > > > == Source ==
> > > > >
> > > > > Livy’s source is currently hosted on GitHub at:
> > > > https://github.com/cloudera/livy
> > > >
> > > > This repository will be transitioned to Apache’s git hosting during
> > > > incubation.
> > > >
> > > > == Code review ==
> > > >
> > > > > Livy’s code reviews are currently public and hosted on GitHub as pull
> > > > > request reviews at: https://github.com/cloudera/livy/pulls
> > > > > The Livy developer community so far is happy with GitHub pull request
> > > > > reviews and hopes to continue this after being admitted to the ASF.
> > > >
> > > > == Issue Tracking ==
> > > >
> > > > Livy’s bug and feature tracking is hosted on JIRA at:
> > > > https://issues.cloudera.org/projects/LIVY/summary
> > > > > This JIRA instance contains bugs and development discussion dating back one
> > > > > year and will provide an initial seed for the ASF JIRA.
> > > >
> > > > == Community Discussion ==
> > > >
> > > > Livy has several public discussion forums:
> > > >
> > > > * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-dev
> > > > * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-user
> > > >
> > > > == Development Practices ==
> > > >
> > > > > The Livy project follows a review before commit philosophy. Every commit
> > > > > automatically runs through the unit tests and generates coverage reports
> > > > > presented as a pull request comment. Our experience with this process leads
> > > > > us to believe that it helps ease new contributors into the project. They get
> > > > > feedback quickly on common mistakes, lowering the burden on reviewers. Those
> > > > > same reviewers get to lead by example, showing the new contributors that we
> > > > > value feedback within our community even when changes are done by more
> > > > > experienced folks.
> > > > >
> > > > > == Meritocracy ==
> > > > >
> > > > > We believe strongly in meritocracy when electing committers and PMC members.
> > > > > In the past few months, the project has added two new committers from two
> > > > > different organisations, in recognition of their significant contributions
> > > > > to the project. We will encourage contributions and participation of all
> > > > > types, and ensure that contributors are appropriately recognized.
> > > >
> > > > == Community ==
> > > >
> > > > > Though Livy is relatively new as a standalone open source project, it has
> > > > > already seen promising growth in its community across several
> > > > > organizations:
> > > > >
> > > > >  * Cloudera is the original development sponsor for Livy.
> > > > >  * Microsoft pushed the development of the interpreter, fixing high
> > > > >    availability issues and adding additional features.
> > > > >  * Hortonworks has contributed the security features to Livy, allowing
> > > > >    Kerberos and impersonation to work with Spark.
> > > > >  * IBM is starting to make contributions to the Livy project.
> > > > >  * A number of other patches have been contributed by community members.
> > > > >
> > > > > Livy currently relies on Google Groups for mailing lists. These lists have
> > > > > been active since the end of 2015/start of 2016. Currently, Livy’s user
> > > > > mailing list has 173 subscribers and has hosted a total of 227 topic
> > > > > threads. Livy’s developer list has 49 subscribers and has hosted 79 topic
> > > > > threads.
> > > >
> > > > == Core Developers ==
> > > >
> > > > The early contributions to Livy were made by Cloudera engineers. In
> > > > 2016, engineers from Microsoft and Hortonworks joined the core
> > > > developer community.
> > > >
> > > > == Alignment ==
> > > >
> > > > Livy is built upon Apache Spark, and other Apache projects like
> > > > Apache Hadoop YARN. It’s used as a building block by Apache Zeppelin.
> > > > These community connections combined with our focus on development
> > > > practices that emphasize community engagement with a path to
> > > > meritocratic recognition naturally align us with the ASF.
> > > >
> > > > = Known Risks =
> > > >
> > > > == Orphaned Products ==
> > > >
> > > > The risk of Livy being abandoned is low because it is supported by
> > > > three major big-data software vendors. Moreover, Livy is already used
> > > > to power multiple releases of services and products used in
> > > > production.
> > > >
> > > > == Inexperience with Open Source ==
> > > >
> > > > Several of the initial committers are experienced open source
> > > > developers, several being committers and/or PMC members on other ASF
> > > > projects (Spark, YARN).
> > > >
> > > > == Homogenous Developers ==
> > > >
> > > > The project already has a diverse developer base. It has
> > > > contributions from 3 major organisations (Cloudera, Microsoft and
> > > > Hortonworks), and is used in diverse applications, in diverse
> > > > settings (On-Prem and Cloud).
> > > >
> > > > == Reliance on salaried Developers ==
> > > >
> > > > The contributions to the Livy project to date have been made by
> > > > salaried engineers from Cloudera, Microsoft and Hortonworks. One of
> > > > the individuals on the initial committer list has since left
> > > > Microsoft and is currently unaffiliated. The remaining contributors
> > > > are from Cloudera and Hortonworks. Since there are at least two major
> > > > organizations involved, the risk of reliance on a single group of
> > > > salaried developers is mitigated. The Livy user base is diverse, with
> > > > users from across the globe, including users from academic settings.
> > > > We aim to further diversify the Livy user and contributor base.
> > > >
> > > > == Relationships with other Apache projects ==
> > > >
> > > > Livy is closely tied to the Apache Spark project and currently
> > > > addresses the scenarios for a REST-based batch and interactive
> > > > gateway for Spark jobs on YARN. Given the growing number of
> > > > integrations with Livy, keeping it outside of Apache Spark aligns
> > > > with the desire of the Apache Spark community to reduce the number of
> > > > external dependencies in the Spark project. Specifically, the Apache
> > > > Spark community has previously expressed a desire to keep job servers
> > > > independent from the project.<<FootNote(See, for example, discussion
> > > > of the Ooyala Spark Job Server in SPARK-818)>> Furthermore, while
> > > > Livy’s common usage is closely tied to Spark deployments right now,
> > > > its core building blocks can be reused elsewhere. Livy’s Remote REPL
> > > > could be used as a library for interactive scenarios in non-Spark
> > > > projects. In the future, integrations with cluster managers like
> > > > Apache Mesos and others could also be added.
> > > >
> > > > The features provided by Livy have already been integrated with
> > > > existing projects like Jupyter and Apache Zeppelin for their
> > > > interactive Spark use cases. This validates the need for a project
> > > > like Livy and provides an active downstream user base that the Livy
> > > > community can interact with to seed future interest in the project.
> > > >
> > > > Livy serves a similar purpose to Apache Toree (incubating) but
> > > > differs in making session management, security and impersonation a
> > > > focal design point.
> > > >
> > > > == An Excessive Fascination with the Apache Brand ==
> > > >
> > > > The primary motivation for submitting Livy to the ASF is to grow a
> > > > diverse and strong community. We wish to encourage diverse
> > > > organisations, including ISVs, to adopt Livy and contribute to Livy
> > > > without any concerns about ownership or licensing.
> > > >
> > > > = Documentation =
> > > >
> > > > Documentation can be found on the Livy website http://livy.io/
> > > >
> > > > The Livy web site is version controlled on the ‘gh-pages’ branch of
> > > > the above repository.
> > > > Additional documentation is provided on the github wiki:
> > > > https://github.com/cloudera/livy/wiki
> > > > APIs are documented within the source code as JavaDoc style
> > > > documentation comments.
> > > >
> > > > = Initial Source =
> > > >
> > > > The initial source code for Livy is hosted at
> > > > https://github.com/cloudera/livy
> > > >
> > > > = Source and Intellectual Property submission plan =
> > > >
> > > > The Livy codebase and web site are currently hosted on GitHub and
> > > > will be transitioned to the ASF repositories during incubation. Livy
> > > > is already licensed under the Apache 2.0 license. Cloudera has
> > > > collected ICLAs and CCLAs from all committers. There are, however,
> > > > some recent contributions from authors that have not signed the CCLA
> > > > and ICLA. If necessary for a successful SGA, we’ll seek the necessary
> > > > documentation or replace the contributions.
> > > >
> > > > The “Livy” name is not a registered trademark. We will need to do a
> > > > trademark search and make sure it is available for the Apache
> > > > Foundation prior to graduation.
> > > >
> > > > Cloudera currently owns the domain name: http://livy.io/. Once all
> > > > the documentation has moved over to ASF infrastructure, the main
> > > > landing page will become livy.incubator.apache.org and the old domain
> > > > will just act as a redirect.
> > > >
> > > > = External Dependencies =
> > > >
> > > > The list below covers the non-Apache dependencies of the project and
> > > > their licenses.
> > > >
> > > > * Jetty: Apache 2.0
> > > > * Dropwizard Metrics: Apache 2.0
> > > > * FasterXML Jackson: Apache 2.0
> > > > * Netty: Apache 2.0
> > > > * Scala: BSD
> > > > * Py4J: BSD
> > > > * Scalatra: BSD
> > > >
> > > > Build/test-only dependencies:
> > > >
> > > > * Mockito: MIT
> > > > * JUnit: Eclipse
> > > >
> > > > = Required Resources =
> > > >
> > > > == Mailing Lists ==
> > > >
> > > > * private@livy.incubator.apache.org (PPMC)
> > > > * dev@livy.incubator.apache.org (dev mailing list)
> > > > * user@livy.incubator.apache.org (User questions)
> > > > * commits@livy.incubator.apache.org (subscribers shouldn’t be able
> > > >   to post)
> > > > * issues@livy.incubator.apache.org (subscribers shouldn’t be able to
> > > >   post)
> > > >
> > > > == Git Repository ==
> > > >
> > > > git://git.apache.org/incubator-livy
> > > >
> > > > == Issue Tracking ==
> > > >
> > > > We would like to import our current JIRA project into the ASF JIRA,
> > > > such that our historical commit messages and code comments continue
> > > > to reference the appropriate bug numbers.
> > > >
> > > > = Initial Committers =
> > > >
> > > > * Marcelo Vanzin (vanzin@cloudera.com)
> > > > * Alex Man (alex@alexman.space)
> > > > * Jeff Zhang (zjffdu@gmail.com)
> > > > * Saisai Shao (sshao@hortonworks.com)
> > > > * Kostas Sakellis (kostas@cloudera.com)
> > > >
> > > > = Affiliations =
> > > >
> > > > The initial set of committers includes people employed by Cloudera
> > > > and Hortonworks as well as one currently independent contributor.
> > > >
> > > > = Additional Interested Contributors =
> > > >
> > > > Those interested in getting involved with the project as we enter
> > > > incubation are encouraged to list themselves here.
> > > >
> > > >  * Ismaël Mejía (iemejia@apache.org)
> > > >
> > > > = Sponsors =
> > > >
> > > > == Champion ==
> > > >
> > > > Sean Busbey (busbey@apache.org)
> > > >
> > > > == Nominated Mentors ==
> > > >
> > > > * Bikas Saha (bikas@apache.org)
> > > > * Brock Noland (brock@phdata.io)
> > > > * Luciano Resende (lresende@apache.org)
> > > >
> > > > == Sponsoring Entity ==
> > > >
> > > > We ask that the Incubator PMC sponsor this proposal.

Re: [VOTE] Livy to enter Apache Incubator

Posted by Brock Noland <br...@apache.org>.

+1 (binding)

On Wed, May 31, 2017 at 1:59 PM, Kostas Sakellis <ko...@cloudera.com>
wrote:

> +1 (non-binding)
>
> On Wed, May 31, 2017 at 11:46 AM, Andrew Purtell <andrew.purtell@gmail.com>
> wrote:
>
> > +1 (binding)

Re: [VOTE] Livy to enter Apache Incubator

Posted by Kostas Sakellis <ko...@cloudera.com>.

+1 (non-binding)

On Wed, May 31, 2017 at 11:46 AM, Andrew Purtell <an...@gmail.com>
wrote:

> +1 (binding)
>
> > On May 31, 2017, at 6:03 AM, Sean Busbey <bu...@apache.org> wrote:
> >
> > Hi folks!
> >
> > I'm calling a vote to accept "Livy" into the Apache Incubator.
> >
> > The full proposal is available below, and is also available in the wiki:
> >
> > https://wiki.apache.org/incubator/LivyProposal
> >
> > For additional context, please see the discussion thread:
> >
> > https://s.apache.org/incubator-livy-proposal-thread
> >
> > Please cast your vote:
> >
> > [ ] +1, bring Livy into Incubator
> > [ ] -1, do not bring Livy into Incubator, because...
> >
> > The vote will be open for at least 72 hours and only votes from the
> > Incubator PMC are binding.
> >
> > I start with my vote:
> > +1
> >
> > ----
> >
> > = Abstract =
> >
> > Livy is a web service that exposes a REST interface for managing long
> > running Apache Spark contexts in your cluster. With Livy, new
> > applications can be built on top of Apache Spark that require fine
> > grained interaction with many Spark contexts.
> >
> > = Proposal =
> >
> > Livy is an open-source REST service for Apache Spark. Livy enables
> > applications to submit Spark applications and retrieve results without a
> > co-location requirement on the Spark cluster.
> >
> > We propose to contribute the Livy codebase and associated artifacts (e.g.
> > documentation, web-site content, etc.) to the Apache Software Foundation.
> >
> > = Background =
> >
> > Apache Spark is a fast and general purpose distributed compute engine,
> > with a versatile API. It enables processing of large quantities of
> > static data distributed over a cluster of machines, as well as
> > processing of continuous streams of data. It is the preferred
> > distributed data processing engine for data engineering, stream
> > processing and data science workloads. Each Spark application uses a
> > construct called the SparkContext, which is the application’s connection
> > or entry point to the Spark engine. Each Spark application will have its
> > own SparkContext.
> >
> > Livy enables clients to interact with one or more Spark sessions through
> > the Livy Server, which acts as a proxy layer. Livy Clients have fine
> > grained control over the lifecycle of the Spark sessions, as well as the
> > ability to submit jobs and retrieve results, all over HTTP. Clients have
> > two modes of interaction:
> >
> >  * RPC Client API, available in Java and Python, which allows results to
> >    be retrieved as Java or Python objects. The serialization and
> >    deserialization of the results is handled by the Livy framework.
> >  * HTTP based API that allows submission of code snippets, and retrieval
> >    of the results in different formats.
> >
> > Multi-tenant resource allocation and security: Livy enables multiple
> > independent Spark sessions to be managed simultaneously. Multiple clients
> > can also interact simultaneously with the same Spark session and share
> > the resources of that Spark session. Livy can also enforce secure,
> > authenticated communication between the clients and their respective
> > Spark sessions.
> >
> > More information on Livy can be found at the existing open source
> > website: http://livy.io/
> >
> > = Rationale =
> >
> > Users want to use Spark’s powerful processing engine and API as the data
> > processing backend for interactive applications. However, the job
> > submission and application interaction mechanisms built into Apache
> > Spark are insufficient and cumbersome for multi-user interactive
> > applications.
> >
> > The primary mechanism for applications to submit Spark jobs is via
> > spark-submit
> > (http://spark.apache.org/docs/latest/submitting-applications.html),
> > which is available as a command line tool as well as a programmatic API.
> > However, spark-submit has the following limitations that make it
> > difficult to build interactive applications:
> >
> >  * It is slow: each invocation of spark-submit involves a setup phase
> >    where cluster resources are acquired, new processes are forked, etc.
> >    This setup phase runs for many seconds, or even minutes, and hence is
> >    too slow for interactive applications.
> >  * It is cumbersome and lacks flexibility: application code and
> >    dependencies have to be pre-compiled and submitted as jars, and can
> >    not be submitted interactively.
> >
> > Apache Spark comes with an ODBC/JDBC server, which can be used to submit
> > SQL queries to Spark. However, this solution is limited to SQL and does
> > not allow the client to leverage the rest of the Spark API, such as RDDs,
> > MLlib and Streaming.
> >
> > A third way of using Spark is via its command-line shell, which allows
> > the interactive submission of snippets of Spark code. However, the shell
> > entails running Spark code on the client machine and hence is not a
> > viable mechanism for remote clients to submit Spark jobs.
> >
> > Livy solves the limitations of the above three mechanisms, and provides
> > the full Spark API as a multi-tenant service to remote clients.
> >
> > Since the open source release of Livy in late 2015, we have seen
> > tremendous interest among a diverse set of application developers and
> > ISVs that want to build applications with Apache Spark. To make Livy a
> > robust and flexible solution that will enable a broad and growing set of
> > applications, it is important to grow a large and varied community of
> > contributors.
> >
> > = Initial Goals =
> >
> >  * Move existing codebase, website, documentation and mailing lists to
> >    Apache-hosted infrastructure
> >  * Work with the infrastructure team to implement and approve our code
> >    review, build, and testing workflows in the context of the ASF
> >  * Incremental development and releases per Apache guidelines
> >
> > = Current Status =
> >
> > The Livy project began at Cloudera, as a part of the Hue project.
> > Cloudera soon realized the broad applicability of Livy, and separated it
> > out into an independent project in Nov 2015.
> >
> > == Releases ==
> >
> > Livy has undergone two public releases, tagged here:
> >
> > * https://github.com/cloudera/livy/releases/tag/v0.2.0
> > * https://github.com/cloudera/livy/releases/tag/v0.3.0
> >
> > Tarballs and zip files were created for each release and hosted on
> > GitHub. Upon joining the incubator, we will adopt a more typical ASF
> > release process.
> >
> > == Source ==
> >
> > Livy’s source is currently hosted on GitHub at:
> > https://github.com/cloudera/livy
> >
> > This repository will be transitioned to Apache’s git hosting during
> > incubation.
> >
> > == Code review ==
> >
> > Livy’s code reviews are currently public and hosted on GitHub as pull
> > request reviews at: https://github.com/cloudera/livy/pulls
> > The Livy developer community so far is happy with GitHub pull request
> > reviews and hopes to continue this after being admitted to the ASF.
> >
> > == Issue Tracking ==
> >
> > Livy’s bug and feature tracking is hosted on JIRA at:
> > https://issues.cloudera.org/projects/LIVY/summary
> > This JIRA instance contains bugs and development discussion dating back
> > one year and will provide an initial seed for the ASF JIRA.
> >
> > == Community Discussion ==
> >
> > Livy has several public discussion forums:
> >
> > * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-dev
> > * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-user
> >
> > == Development Practices ==
> >
> > The Livy project follows a review-before-commit philosophy. Every commit
> > automatically runs through the unit tests and generates coverage reports
> > presented as a pull request comment. Our experience with this process
> > leads us to believe that it helps ease new contributors into the project.
> > They get feedback quickly on common mistakes, lowering the burden on
> > reviewers. Those same reviewers get to lead by example, showing the new
> > contributors that we value feedback within our community even when
> > changes are done by more experienced folks.
> >
> > == Meritocracy ==
> >
> > We believe strongly in meritocracy when electing committers and PMC
> > members. In the past few months, the project has added two new committers
> > from two different organisations, in recognition of their significant
> > contributions to the project. We will encourage contributions and
> > participation of all types, and ensure that contributors are
> > appropriately recognized.
> >
> > == Community ==
> >
> > Though Livy is relatively new as a standalone open source project, it has
> > already seen promising growth in its community across several
> > organizations:
> >
> >  * Cloudera is the original development sponsor for Livy.
> >  * Microsoft pushed the development of the interpreter, fixing high
> >    availability issues and adding additional features.
> >  * Hortonworks has contributed the security features to Livy, allowing
> >    Kerberos and impersonation to work with Spark.
> >  * IBM is starting to make contributions to the Livy project.
> >  * A number of other patches have been contributed by community members.
> >
> > Livy currently relies on Google Groups for mailing lists. These lists
> > have been active since the end of 2015/start of 2016. Currently, Livy’s
> > user mailing list has 173 subscribers and has hosted a total of 227 topic
> > threads. Livy’s developer list has 49 subscribers and has hosted 79 topic
> > threads.
> >
> > == Core Developers ==
> >
> > The early contributions to Livy were made by Cloudera engineers. In 2016,
> > engineers from Microsoft and Hortonworks joined the core developer
> > community.
> >
> > == Alignment ==
> >
> > Livy is built upon Apache Spark, and other Apache projects like Apache
> > Hadoop YARN. It’s used as a building block by Apache Zeppelin. These
> > community connections, combined with our focus on development practices
> > that emphasize community engagement with a path to meritocratic
> > recognition, naturally align us with the ASF.
> >
> > = Known Risks =
> >
> > == Orphaned Products ==
> >
> > The risk of Livy being abandoned is low because it is supported by three
> > major big-data software vendors. Moreover, Livy is already used to power
> > multiple releases of services and products used in production.
> >
> > == Inexperience with Open Source ==
> >
> > Several of the initial committers are experienced open source developers,
> > several being committers and/or PMC members on other ASF projects (Spark,
> > YARN).
> >
> > == Homogenous Developers ==
> >
> > The project already has a diverse developer base. It has contributions
> > from 3 major organisations (Cloudera, Microsoft and Hortonworks), and is
> > used in diverse applications, in diverse settings (On-Prem and Cloud).
> >
> > == Reliance on salaried Developers ==
> >
> > The contributions to the Livy project to date have been made by salaried
> > engineers from Cloudera, Microsoft and Hortonworks. One of the
> > individuals on the initial committer list has since left Microsoft and is
> > currently unaffiliated. The remaining contributors are from Cloudera and
> > Hortonworks. Since there are at least two major organizations involved,
> > the risk of reliance on a single group of salaried developers is
> > mitigated. The Livy user base is diverse, with users from across the
> > globe, including users from academic settings. We aim to further
> > diversify the Livy user and contributor base.
> >
> > == Relationships with other Apache projects ==
> >
> > Livy is closely tied to the Apache Spark project and currently addresses
> > the scenarios for a REST based batch and interactive gateway for Spark
> > jobs on YARN. Given the growing number of integrations with Livy, keeping
> > it outside of Apache Spark aligns with the desire of the Apache Spark
> > community to reduce the number of external dependencies in the Spark
> > project. Specifically, the Apache Spark community has previously
> > expressed a desire to keep job servers independent from the
> > project.<<FootNote(See, for example, discussion of the Ooyala Spark Job
> > Server in SPARK-818)>> Furthermore, while Livy’s common usage is closely
> > tied to Spark deployments right now, its core building blocks can be
> > reused elsewhere. Livy’s Remote REPL could be used as a library for
> > interactive scenarios in non-Spark projects. In the future, integrations
> > with cluster managers like Apache Mesos and others could also be added.
> >
> > The features provided by Livy have already been integrated with existing
> > projects like Jupyter and Apache Zeppelin for their interactive Spark use
> > cases. This validates the need for a project like Livy and provides an
> > active downstream user base that the Livy community can interact with to
> > seed future interest in the project.
> >
> > Livy serves a similar purpose to Apache Toree (incubating) but differs in
> > making session management, security and impersonation a focal design
> > point.
> >
> > == An Excessive Fascination with the Apache Brand ==
> >
> > The primary motivation for submitting Livy to the ASF is to grow a
> > diverse and strong community. We wish to encourage diverse organisations,
> > including ISVs, to adopt Livy and contribute to Livy without any concerns
> > about ownership or licensing.
> >
> > = Documentation =
> >
> > Documentation can be found on the Livy website http://livy.io/
> >
> > The Livy web site is version controlled on the ‘gh-pages’ branch of the
> > above repository.
> > Additional documentation is provided on the GitHub wiki:
> > https://github.com/cloudera/livy/wiki
> > APIs are documented within the source code as JavaDoc style documentation
> > comments.
> >
> > = Initial Source =
> >
> > The initial source code for Livy is hosted at
> > https://github.com/cloudera/livy
> >
> > = Source and Intellectual Property submission plan =
> >
> > The Livy codebase and web site are currently hosted on GitHub and will be
> > transitioned to the ASF repositories during incubation. Livy is already
> > licensed under the Apache 2.0 license. Cloudera has collected ICLAs and
> > CCLAs from all committers. There are, however, some recent contributions
> > from authors that have not signed the CCLA and ICLA. If necessary for a
> > successful SGA, we’ll seek the necessary documentation or replace the
> > contributions.
> >
> > The “Livy” name is not a registered trademark. We will need to do a
> > trademark search and make sure it is available for the Apache Foundation
> > prior to graduation.
> >
> > Cloudera currently owns the domain name: http://livy.io/. Once all the
> > documentation has moved over to ASF infrastructure, the main landing page
> > will become livy.incubator.apache.org and the old domain will just act
> > as a redirect.
> >
> > = External Dependencies =
> >
> > The list below covers the non-Apache dependencies of the project and
> > their licenses.
> >
> > * Jetty: Apache 2.0
> > * Dropwizard Metrics: Apache 2.0
> > * FasterXML Jackson: Apache 2.0
> > * Netty: Apache 2.0
> > * Scala: BSD
> > * Py4J: BSD
> > * Scalatra: BSD
> >
> > Build/test-only dependencies:
> >
> > * Mockito: MIT
> > * JUnit: Eclipse
> >
> > = Required Resources =
> >
> > == Mailing Lists ==
> >
> > * private@livy.incubator.apache.org (PPMC)
> > * dev@livy.incubator.apache.org (dev mailing list)
> > * user@livy.incubator.apache.org (User questions)
> > * commits@livy.incubator.apache.org (subscribers shouldn’t be able to post)
> > * issues@livy.incubator.apache.org (subscribers shouldn’t be able to post)
> >
> > == Git Repository ==
> >
> > git://git.apache.org/incubator-livy
> >
> > == Issue Tracking ==
> >
> > We would like to import our current JIRA project into the ASF JIRA, such
> > that our historical commit messages and code comments continue to
> > reference the appropriate bug numbers.
> >
> > = Initial Committers =
> >
> > * Marcelo Vanzin (vanzin@cloudera.com)
> > * Alex Man (alex@alexman.space)
> > * Jeff Zhang (zjffdu@gmail.com)
> > * Saisai Shao (sshao@hortonworks.com)
> > * Kostas Sakellis (kostas@cloudera.com)
> >
> > = Affiliations =
> >
> > The initial set of committers includes people employed by Cloudera and
> > Hortonworks as well as one currently independent contributor.
> >
> > = Additional Interested Contributors =
> >
> > Those interested in getting involved with the project as we enter
> > incubation are encouraged to list themselves here.
> >
> >  * Ismaël Mejía (iemejia@apache.org)
> >
> > = Sponsors =
> >
> > == Champion ==
> >
> > Sean Busbey (busbey@apache.org)
> >
> > == Nominated Mentors ==
> >
> > * Bikas Saha (bikas@apache.org)
> > * Brock Noland (brock@phdata.io)
> > * Luciano Resende (lresende@apache.org)
> >
> > == Sponsoring Entity ==
> >
> > We ask that the Incubator PMC sponsor this proposal.
> >

Re: [VOTE] Livy to enter Apache Incubator

Posted by Andrew Purtell <an...@gmail.com>.

+1 (binding)



Re: [VOTE] Livy to enter Apache Incubator

Posted by Felix Cheung <fe...@apache.org>.

+1

On Thu, Jun 1, 2017 at 11:14 AM Hitesh Shah <hi...@apache.org> wrote:

> +1
>
> -- Hitesh
>
> On Wed, May 31, 2017 at 6:03 AM, Sean Busbey <bu...@apache.org> wrote:
>
> > Hi folks!
> >
> > I'm calling a vote to accept "Livy" into the Apache Incubator.
> >
> > The full proposal is available below, and is also available in the wiki:
> >
> > https://wiki.apache.org/incubator/LivyProposal
> >
> > For additional context, please see the discussion thread:
> >
> > https://s.apache.org/incubator-livy-proposal-thread
> >
> > Please cast your vote:
> >
> > [ ] +1, bring Livy into Incubator
> > [ ] -1, do not bring Livy into Incubator, because...
> >
> > The vote will be open for at least 72 hours and only votes from the
> > Incubator PMC are binding.
> >
> > I start with my vote:
> > +1
> >
> > ----
> >
> > = Abstract =
> >
> > Livy is a web service that exposes a REST interface for managing long
> > running Apache Spark contexts in your cluster. With Livy, new
> > applications can be built on top of Apache Spark that require fine
> > grained interaction with many Spark contexts.
> >
> > = Proposal =
> >
> > Livy is an open-source REST service for Apache Spark. Livy enables
> > applications to submit Spark applications and retrieve results without a
> > co-location requirement on the Spark cluster.
> >
> > We propose to contribute the Livy codebase and associated artifacts (e.g.
> > documentation, web-site content, etc.) to the Apache Software Foundation.
> >
> > = Background =
> >
> > Apache Spark is a fast and general purpose distributed compute engine,
> > with a versatile API. It enables processing of large quantities of static
> > data distributed over a cluster of machines, as well as processing of
> > continuous streams of data. It is the preferred distributed data
> > processing engine for data engineering, stream processing and data
> > science workloads. Each Spark application uses a construct called the
> > SparkContext, which is the application’s connection or entry point to the
> > Spark engine. Each Spark application will have its own SparkContext.
> >
> > Livy enables clients to interact with one or more Spark sessions through
> > the Livy Server, which acts as a proxy layer. Livy Clients have fine
> > grained control over the lifecycle of the Spark sessions, as well as the
> > ability to submit jobs and retrieve results, all over HTTP. Clients have
> > two modes of interaction:
> >
> >  * RPC Client API, available in Java and Python, which allows results to
> >    be retrieved as Java or Python objects. The serialization and
> >    deserialization of the results is handled by the Livy framework.
> >  * HTTP based API that allows submission of code snippets, and retrieval
> >    of the results in different formats.
> >
> > Multi-tenant resource allocation and security: Livy enables multiple
> > independent Spark sessions to be managed simultaneously. Multiple clients
> > can also interact simultaneously with the same Spark session and share
> > the resources of that Spark session. Livy can also enforce secure,
> > authenticated communication between the clients and their respective
> > Spark sessions.
> >
> > More information on Livy can be found at the existing open source
> website:
> > http://livy.io/
> >
> > = Rationale =
> >
> > Users want to use Spark’s powerful processing engine and API as the data
> > processing backend for interactive applications. However, the job
> > submission
> > and application interaction mechanisms built into Apache Spark are
> > insufficient and cumbersome for multi-user interactive applications.
> >
> > The primary mechanism for applications to submit Spark jobs is via
> > spark-submit
> > (http://spark.apache.org/docs/latest/submitting-applications.html),
> which
> > is
> > available as a command line tool as well as a programmatic API. However,
> > spark-submit has the following limitations that make it difficult to
> build
> > interactive applications: It is slow: each invocation of spark-submit
> > involves a setup phase where cluster resources are acquired, new
> processes
> > are forked, etc. This setup phase runs for many seconds, or even minutes,
> > and hence is too slow for interactive applications. It is cumbersome and
> > lacks flexibility: application code and dependencies have to be
> > pre-compiled
> > and submitted as jars, and can not be submitted interactively.
> >
> > Apache Spark comes with an ODBC/JDBC server, which can be used to submit
> > SQL
> > queries to Spark. However, this solution is limited to SQL and does not
> > allow the client to leverage the rest of the Spark API, such as RDDs,
> MLlib
> > and Streaming.
> >
> > A third way of using Spark is via its command-line shell, which allows
> the
> > interactive submission of snippets of Spark code. However, the shell
> > entails
> > running Spark code on the client machine and hence is not a viable
> > mechanism
> > for remote clients to submit Spark jobs.
> >
> > Livy solves the limitations of the above three mechanisms, and provides
> the
> > full Spark API as a multi-tenant service to remote clients.
> >
> > Since the open source release of Livy in late 2015, we have seen
> tremendous
> > interest among a diverse set of application developers and ISVs that want
> > to
> > build applications with Apache Spark. To make Livy a robust and flexible
> > solution that will enable a broad and growing set of applications, it is
> > important to grow a large and varied community of contributors.
> >
> > = Initial Goals =
> >
> >   * Move existing codebase, website, documentation and mailing lists to
> >     Apache-hosted infrastructure
> >   * Work with the infrastructure team to implement and approve our code
> >     review, build, and testing workflows in the context of the ASF
> >   * Incremental development and releases per Apache guidelines
> >
> > = Current Status =
> >
> > The Livy project began at Cloudera, as a part of the Hue project.
> Cloudera
> > soon realized the broad applicability of Livy, and separated it out into
> an
> > independent project in Nov 2015.
> >
> > == Releases ==
> >
> > Livy has undergone two public releases, tagged here:
> >
> >  * https://github.com/cloudera/livy/releases/tag/v0.2.0
> >  * https://github.com/cloudera/livy/releases/tag/v0.3.0
> >
> > Tarballs and zip files were created for each release and hosted on
> github.
> > Upon joining the incubator, we will adopt a more typical ASF release
> > process.
> >
> > == Source ==
> >
> > Livy’s source is currently hosted on Github at:
> > https://github.com/cloudera/livy
> >
> > This repository will be transitioned to Apache’s git hosting during
> > incubation.
> >
> > == Code review ==
> >
> > Livy’s code reviews are currently public and hosted on github as pull
> > request reviews at: https://github.com/cloudera/livy/pulls
> > The Livy developer community so far is happy with github pull request
> > reviews and hopes to continue this after being admitted to the ASF.
> >
> > == Issue Tracking ==
> >
> > Livy’s bug and feature tracking is hosted on JIRA at:
> > https://issues.cloudera.org/projects/LIVY/summary
> > This JIRA instance contains bugs and development discussion dating back 1
> > year and will provide an initial seed for the ASF JIRA
> >
> > == Community Discussion ==
> >
> > Livy has several public discussion forums:
> >
> >  * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-dev
> >  * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-user
> >
> > == Development Practices ==
> >
> > The Livy project follows a review before commit philosophy. Every commit
> > automatically runs through the unit tests and generates coverage reports
> > presented as a pull request comment. Our experience with this process
> leads
> > us to believe that it helps ease new contributors into the project. They
> > get
> > feedback quickly on common mistakes, lowering the burden on reviewers.
> > Those
> > same reviewers get to lead by example, showing the new contributors that
> we
> > value feedback within our community even when changes are done by more
> > experienced folks.
> >
> > == Meritocracy ==
> >
> > We believe strongly in meritocracy when electing committers and PMC
> > members.
> > In the past few months, the project has added two new committers from two
> > different organisations, in recognition of their significant
> contributions
> > to the project. We will encourage contributions and participation of all
> > types, and ensure that contributors are appropriately recognized.
> >
> > == Community ==
> >
> > Though Livy is relatively new as a standalone open source project, it has
> > already seen promising growth in its community across several
> > organizations:
> > Cloudera is the original development sponsor for Livy
> > Microsoft pushed the development of the interpreter fixing high
> > availability
> > issues and adding additional features.
> > Hortonworks has contributed the security features to Livy allowing
> kerberos
> > and impersonation to work with Spark
> > IBM is starting to make contributions to the Livy project
> > A number of other patches contributed by community members
> >
> > Livy currently relies on Google Groups for mailing lists. These lists
> have
> > been active since the end of 2015/start of 2016. Currently, Livy’s user
> > mailing list has 173 subscribers and has hosted a total of 227 topic
> > threads. Livy’s developer list has 49 subscribers and has hosted 79 topic
> > threads.
> >
> > == Core Developers ==
> >
> > The early contributions to Livy were made by Cloudera engineers. In 2016,
> > engineers from Microsoft and Hortonworks joined the core developer
> > community.
> >
> > == Alignment ==
> >
> > Livy is built upon Apache Spark, and other Apache projects like Apache
> > Hadoop YARN. It’s used as a building block by Apache Zeppelin. These
> > community connections combined with our focus on development practices
> that
> > emphasize community engagement with a path to meritocratic recognition
> > naturally align us with the ASF.
> >
> > = Known Risks =
> >
> > == Orphaned Products ==
> >
> > The risk of Livy being abandoned is low because it is supported by three
> > major big-data software vendors. Moreover, Livy is already used to power
> > multiple releases of services and products used in production.
> >
> > == Inexperience with Open Source ==
> >
> > Several of the initial committers are experienced open source developers,
> > several being committers and/or PMC members on other ASF projects (Spark,
> > YARN).
> >
> > == Homogenous Developers ==
> >
> > The project already has a diverse developer base. It has contributions
> from
> > 3 major organisations (Cloudera, Microsoft and Hortonworks), and is used
> in
> > diverse applications, in diverse settings (On-Prem and Cloud).
> >
> > == Reliance on salaried Developers ==
> >
> > The contributions to the Livy project to date have been made by salaried
> > engineers from Cloudera, Microsoft and Hortonworks. One of the
> individuals
> > on the initial committer list has since left Microsoft and is currently
> > unaffiliated. The remaining contributors are from Cloudera and
> Hortonworks.
> > Since there are at least two major organizations involved, the risk of
> > reliance on a single group of salaried developers is mitigated. The Livy
> > user base is diverse, with users from across the globe, including users
> > from
> > academic settings. We aim to further diversify the Livy user and
> > contributor
> > base.
> >
> > == Relationships with other Apache projects ==
> >
> > Livy is closely tied to the Apache Spark project and currently addresses
> > the
> > scenarios for a REST based batch and interactive gateway for Spark jobs
> on
> > YARN. Given the growing number of integrations with Livy, keeping it
> > outside
> > of Apache Spark aligns with the desire of the Apache Spark community to
> > reduce the number of external dependencies in the Spark project.
> > Specifically, the Apache Spark community has previously expressed a
> desire
> > to keep job servers independent from the project.<<FootNote(See, for
> > example, discussion of the Ooyala Spark Job Server in SPARK-818)>>
> > Furthermore, while Livy common usage is closely tied to Spark deployments
> > right now, its core building blocks can be reused elsewhere.  Livy’s
> Remote
> > REPL could be used as a library for interactive scenarios in non-Spark
> > projects. In the future, integrations with cluster managers like Apache
> > Mesos and others could also be added.
> >
> > The features provided by Livy have already been integrated with existing
> > projects like Jupyter and Apache Zeppelin for their interactive Spark use
> > cases. This validates the need for a project like Livy and provides an
> > active downstream user base that the Livy community can interact with to
> > seed future interest in the project.
> >
> > Livy serves a similar purpose to Apache Toree (incubating) but differs in
> > making session management, security and impersonation a focal design
> point.
> >
> > == An Excessive Fascination with the Apache Brand ==
> >
> > The primary motivation for submitting Livy to the ASF is to grow a
> diverse
> > and strong community. We wish to encourage diverse organisations,
> including
> > ISVs, to adopt Livy and contribute to Livy without any concerns about
> > ownership or licensing.
> >
> > = Documentation =
> >
> > Documentation can be found on the Livy website http://livy.io/
> >
> > The Livy web site is version controlled on the ‘gh-pages’ branch of the
> > above repository.
> > Additional documentation is provided on the github wiki:
> > https://github.com/cloudera/livy/wiki
> > APis are documented within the source code as JavaDoc style documentation
> > comments.
> >
> > = Initial Source =
> >
> > The initial source code for Livy is hosted at
> > https://github.com/cloudera/livy
> >
> > = Source and Intellectual Property submission plan =
> >
> > The Livy codebase and web site is currently hosted on GitHub and will be
> > transitioned to the ASF repositories during incubation. Livy is already
> > licensed under the Apache 2.0 license. Cloudera has collected ICLAs and
> > CCLAs from all committers. There are, however, some contributions
> recently
> > from authors that have not signed the CCLA and ICLA. If necessary for a
> > successful SGA, we’ll seek the necessary documentation or replace the
> > contributions.
> >
> > The “Livy” name is not a registered trademark. We will need to do a
> > trademark search and make sure it is available for the Apache Foundation
> > prior to graduation.
> >
> > Cloudera currently owns the domain name: http://livy.io/. Once all the
> > documentation has moved over to ASF infrastructure, the main landing page
> > will become livy.incubator.apache.org and the old domain will just act
> as
> > a
> > redirect.
> >
> > = External Dependencies =
> >
> > The list below covers the non-Apache dependencies of the project and
> their
> > licenses.
> >
> >  * Jetty: Apache 2.0
> >  * Dropwizard Metrics: Apache 2.0
> >  * FasterXML Jackson: Apache 2.0
> >  * Netty: Apache 2.0
> >  * Scala: BSD
> >  * Py4J: BSD
> >  * Scalatra: BSD
> >
> > Build/test-only dependencies:
> >
> >  * Mockito: MIT
> >  * JUnit: Eclipse
> >
> > = Required Resources =
> >
> > == Mailing Lists ==
> >
> >  * private@livy.incubator.apache.org (PPMC)
> >  * dev@livy.incubator.apache.org (dev mailing list)
> >  * user@livy.incubator.apache.org (User questions)
> >  * commits@livy.incubator.apache.org (subscribers shouldn’t be able to
> > post)
> >  * issues@livy.incubator.apache.org (subscribers shouldn’t be able to
> > post)
> >
> > == Git Repository ==
> >
> > git://git.apache.org/incubator-livy
> >
> > == Issue Tracking ==
> >
> > We would like to import our current JIRA project into the ASF JIRA, such
> > that our historical commit message and code comments continue to
> reference
> > the appropriate bug numbers.
> >
> > = Initial Committers =
> >
> >  * Marcelo Vanzin (vanzin@cloudera.com)
> >  * Alex Man (alex@alexman.space)
> >  * Jeff Zhang (zjffdu@gmail.com)
> >  * Saisai Shao (sshao@hortonworks.com)
> >  * Kostas Sakellis (kostas@cloudera.com)
> >
> > = Affiliations =
> >
> > The initial set of committers includes people employed by Cloudera and
> > Hortonworks as well as one currently independent contributor.
> >
> > = Additional Interested Contributors =
> >
> > Those interested in getting involved with the project as we enter
> > incubation
> > are encouraged to list themselves here.
> >
> >   * Ismaël Mejía (iemejia@apache.org)
> >
> > = Sponsors =
> >
> > == Champion ==
> >
> > Sean Busbey (busbey@apache.org)
> >
> > == Nominated Mentors ==
> >
> >  * Bikas Saha (bikas@apache.org)
> >  * Brock Noland (brock@phdata.io)
> >  * Luciano Resende (lresende@apache.org)
> >
> > == Sponsoring Entity ==
> >
> > We ask that the Incubator PMC sponsor this proposal.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> > For additional commands, e-mail: general-help@incubator.apache.org
> >
> >
>

Re: [VOTE] Livy to enter Apache Incubator

Posted by Hitesh Shah <hi...@apache.org>.

+1

-- Hitesh


Re: [VOTE] Livy to enter Apache Incubator

Posted by Ismaël Mejía <ie...@gmail.com>.

A missing backend element with a community around it, definitely a
great project to have at Apache.

+1 (non-binding)

Ismaël


On Wed, May 31, 2017 at 3:29 PM, larry mccay <la...@gmail.com> wrote:
> This will be a great addition.
>
> +1
>


Re: [VOTE] Livy to enter Apache Incubator

Posted by larry mccay <la...@gmail.com>.

This will be a great addition.

+1


Re: [VOTE] Livy to enter Apache Incubator

Posted by Madhawa Kasun Gunasekara <ma...@gmail.com>.

+1 (non binding)

Madhawa

On Thu, Jun 1, 2017 at 6:23 AM, Raphael Bircher <rb...@gmail.com>
wrote:

> +1 (binding)
>
>
> On .05.2017 at 15:03, Sean Busbey <bu...@apache.org> wrote:
>
>> Hi folks!
>>
>> I'm calling a vote to accept "Livy" into the Apache Incubator.
>>
>> The full proposal is available below, and is also available in the wiki:
>>
>> https://wiki.apache.org/incubator/LivyProposal
>>
>> For additional context, please see the discussion thread:
>>
>> https://s.apache.org/incubator-livy-proposal-thread
>>
>> Please cast your vote:
>>
>> [ ] +1, bring Livy into Incubator
>> [ ] -1, do not bring Livy into Incubator, because...
>>
>> The vote will open at least for 72 hours and only votes from the Incubator
>> PMC are binding.
>>
>> I start with my vote:
>> +1
>>
>> ----
>>
>> = Abstract =
>>
>> Livy is a web service that exposes a REST interface for managing long
>> running
>> Apache Spark contexts in your cluster. With Livy, new applications can be
>> built on top of Apache Spark that require fine grained interaction with
>> many
>> Spark contexts.
>>
>> = Proposal =
>>
>> Livy is an open-source REST service for Apache Spark. Livy enables
>> applications to submit Spark applications and retrieve results without a
>> co-location requirement on the Spark cluster.
>>
>> We propose to contribute the Livy codebase and associated artifacts (e.g.
>> documentation, web-site context etc) to the Apache Software Foundation.
>>
>> = Background =
>>
>> Apache Spark is a fast and general purpose distributed compute engine,
>> with
>> a versatile API. It enables processing of large quantities of static data
>> distributed over a cluster of machines, as well as processing of
>> continuous
>> streams of data. It is the preferred distributed data processing engine
>> for
>> data engineering, stream processing and data science workloads. Each Spark
>> application uses a construct called the SparkContext, which is the
>> application’s connection or entry point to the Spark engine. Each Spark
>> application will have its own SparkContext.
>>
>> Livy enables clients to interact with one or more Spark sessions through
>> the
>> Livy Server, which acts as a proxy layer. Livy Clients have fine grained
>> control over the lifecycle of the Spark sessions, as well as the ability
>> to
>> submit jobs and retrieve results, all over HTTP. Clients have two modes of
>> interaction: RPC Client API, available in Java and Python, which allows
>> results to be retrieved as Java or Python objects. The serialization and
>> deserialization of the results is handled by the Livy framework. HTTP
>> based
>> API that allows submission of code snippets, and retrieval of the results
>> in
>> different formats.
>>
>> Multi-tenant resource allocation and security: Livy enables multiple
>> independent Spark sessions to be managed simultaneously. Multiple clients
>> can also interact simultaneously with the same Spark session and share the
>> resources of that Spark session. Livy can also enforce secure,
>> authenticated
>> communication between the clients and their respective Spark sessions.
>>
>> More information on Livy can be found at the existing open source website:
>> http://livy.io/
>>
>> = Rationale =
>>
>> Users want to use Spark’s powerful processing engine and API as the data
>> processing backend for interactive applications. However, the job
>> submission
>> and application interaction mechanisms built into Apache Spark are
>> insufficient and cumbersome for multi-user interactive applications.
>>
>> The primary mechanism for applications to submit Spark jobs is via
>> spark-submit
>> (http://spark.apache.org/docs/latest/submitting-applications.html),
>> which is
>> available as a command line tool as well as a programmatic API. However,
>> spark-submit has the following limitations that make it difficult to build
>> interactive applications: It is slow: each invocation of spark-submit
>> involves a setup phase where cluster resources are acquired, new processes
>> are forked, etc. This setup phase runs for many seconds, or even minutes,
>> and hence is too slow for interactive applications. It is cumbersome and
>> lacks flexibility: application code and dependencies have to be
>> pre-compiled
>> and submitted as jars, and can not be submitted interactively.
>>
>> Apache Spark comes with an ODBC/JDBC server, which can be used to submit
>> SQL
>> queries to Spark. However, this solution is limited to SQL and does not
>> allow the client to leverage the rest of the Spark API, such as RDDs,
>> MLlib
>> and Streaming.
>>
>> A third way of using Spark is via its command-line shell, which allows the
>> interactive submission of snippets of Spark code. However, the shell
>> entails
>> running Spark code on the client machine and hence is not a viable
>> mechanism
>> for remote clients to submit Spark jobs.
>>
>> Livy solves the limitations of the above three mechanisms, and provides
>> the
>> full Spark API as a multi-tenant service to remote clients.
>>
>> Since the open source release of Livy in late 2015, we have seen
>> tremendous
>> interest among a diverse set of application developers and ISVs that want
>> to
>> build applications with Apache Spark. To make Livy a robust and flexible
>> solution that will enable a broad and growing set of applications, it is
>> important to grow a large and varied community of contributors.
>>
>> = Initial Goals =
>>
>>   * Move existing codebase, website, documentation and mailing lists to
>>     Apache-hosted infrastructure
>>   * Work with the infrastructure team to implement and approve our code
>>     review, build, and testing workflows in the context of the ASF
>>   * Incremental development and releases per Apache guidelines
>>
>> = Current Status =
>>
>> The Livy project began at Cloudera, as a part of the Hue project. Cloudera
>> soon realized the broad applicability of Livy, and separated it out into
>> an
>> independent project in Nov 2015.
>>
>> == Releases ==
>>
>> Livy has undergone two public releases, tagged here:
>>
>>  * https://github.com/cloudera/livy/releases/tag/v0.2.0
>>  * https://github.com/cloudera/livy/releases/tag/v0.3.0
>>
>> Tarballs and zip files were created for each release and hosted on github.
>> Upon joining the incubator, we will adopt a more typical ASF release
>> process.
>>
>> == Source ==
>>
>> Livy’s source is currently hosted on Github at:
>> https://github.com/cloudera/livy
>>
>> This repository will be transitioned to Apache’s git hosting during
>> incubation.
>>
>> == Code review ==
>>
>> Livy’s code reviews are currently public and hosted on github as pull
>> request reviews at: https://github.com/cloudera/livy/pulls
>> The Livy developer community so far is happy with github pull request
>> reviews and hopes to continue this after being admitted to the ASF.
>>
>> == Issue Tracking ==
>>
>> Livy’s bug and feature tracking is hosted on JIRA at:
>> https://issues.cloudera.org/projects/LIVY/summary
>> This JIRA instance contains bugs and development discussion dating back 1
>> year and will provide an initial seed for the ASF JIRA
>>
>> == Community Discussion ==
>>
>> Livy has several public discussion forums:
>>
>>  * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-dev
>>  * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-user
>>
>> == Development Practices ==
>>
>> The Livy project follows a review before commit philosophy. Every commit
>> automatically runs through the unit tests and generates coverage reports
>> presented as a pull request comment. Our experience with this process
>> leads
>> us to believe that it helps ease new contributors into the project. They
>> get
>> feedback quickly on common mistakes, lowering the burden on reviewers.
>> Those
>> same reviewers get to lead by example, showing the new contributors that
>> we
>> value feedback within our community even when changes are done by more
>> experienced folks.
>>
>> == Meritocracy ==
>>
>> We believe strongly in meritocracy when electing committers and PMC
>> members.
>> In the past few months, the project has added two new committers from two
>> different organisations, in recognition of their significant contributions
>> to the project. We will encourage contributions and participation of all
>> types, and ensure that contributors are appropriately recognized.
>>
>> == Community ==
>>
>> Though Livy is relatively new as a standalone open source project, it has
>> already seen promising growth in its community across several
>> organizations:
>>
>>  * Cloudera is the original development sponsor for Livy
>>  * Microsoft pushed the development of the interpreter fixing high
>>    availability issues and adding additional features.
>>  * Hortonworks has contributed the security features to Livy allowing
>>    kerberos and impersonation to work with Spark
>>  * IBM is starting to make contributions to the Livy project
>>  * A number of other patches contributed by community members
>>
>> Livy currently relies on Google Groups for mailing lists. These lists have
>> been active since the end of 2015/start of 2016. Currently, Livy’s user
>> mailing list has 173 subscribers and has hosted a total of 227 topic
>> threads. Livy’s developer list has 49 subscribers and has hosted 79
>> topic
>> threads.
>>
>> == Core Developers ==
>>
>> The early contributions to Livy were made by Cloudera engineers. In 2016,
>> engineers from Microsoft and Hortonworks joined the core developer
>> community.
>>
>> == Alignment ==
>>
>> Livy is built upon Apache Spark, and other Apache projects like Apache
>> Hadoop YARN. It’s used as a building block by Apache Zeppelin. These
>> community connections combined with our focus on development practices
>> that
>> emphasize community engagement with a path to meritocratic recognition
>> naturally align us with the ASF.
>>
>> = Known Risks =
>>
>> == Orphaned Products ==
>>
>> The risk of Livy being abandoned is low because it is supported by three
>> major big-data software vendors. Moreover, Livy is already used to power
>> multiple releases of services and products used in production.
>>
>> == Inexperience with Open Source ==
>>
>> Several of the initial committers are experienced open source developers,
>> several being committers and/or PMC members on other ASF projects (Spark,
>> YARN).
>>
>> == Homogenous Developers ==
>>
>> The project already has a diverse developer base. It has contributions
>> from
>> 3 major organisations (Cloudera, Microsoft and Hortonworks), and is used
>> in
>> diverse applications, in diverse settings (On-Prem and Cloud).
>>
>> == Reliance on salaried Developers ==
>>
>> The contributions to the Livy project to date have been made by salaried
>> engineers from Cloudera, Microsoft and Hortonworks. One of the individuals
>> on the initial committer list has since left Microsoft and is currently
>> unaffiliated. The remaining contributors are from Cloudera and
>> Hortonworks.
>> Since there are at least two major organizations involved, the risk of
>> reliance on a single group of salaried developers is mitigated. The Livy
>> user base is diverse, with users from across the globe, including users
>> from
>> academic settings. We aim to further diversify the Livy user and
>> contributor
>> base.
>>
>> == Relationships with other Apache projects ==
>>
>> Livy is closely tied to the Apache Spark project and currently addresses
>> the
>> scenarios for a REST based batch and interactive gateway for Spark jobs on
>> YARN. Given the growing number of integrations with Livy, keeping it
>> outside
>> of Apache Spark aligns with the desire of the Apache Spark community to
>> reduce the number of external dependencies in the Spark project.
>> Specifically, the Apache Spark community has previously expressed a desire
>> to keep job servers independent from the project.<<FootNote(See, for
>> example, discussion of the Ooyala Spark Job Server in SPARK-818)>>
>> Furthermore, while Livy common usage is closely tied to Spark deployments
>> right now, its core building blocks can be reused elsewhere.  Livy’s
>> Remote
>> REPL could be used as a library for interactive scenarios in non-Spark
>> projects. In the future, integrations with cluster managers like Apache
>> Mesos and others could also be added.
>>
>> The features provided by Livy have already been integrated with existing
>> projects like Jupyter and Apache Zeppelin for their interactive Spark use
>> cases. This validates the need for a project like Livy and provides an
>> active downstream user base that the Livy community can interact with to
>> seed future interest in the project.
>>
>> Livy serves a similar purpose to Apache Toree (incubating) but differs in
>> making session management, security and impersonation a focal design
>> point.
>>
>> == An Excessive Fascination with the Apache Brand ==
>>
>> The primary motivation for submitting Livy to the ASF is to grow a diverse
>> and strong community. We wish to encourage diverse organisations,
>> including
>> ISVs, to adopt Livy and contribute to Livy without any concerns about
>> ownership or licensing.
>>
>> = Documentation =
>>
>> Documentation can be found on the Livy website http://livy.io/
>>
>> The Livy web site is version controlled on the ‘gh-pages’ branch of
>> the
>> above repository.
>> Additional documentation is provided on the github wiki:
>> https://github.com/cloudera/livy/wiki
>> APIs are documented within the source code as JavaDoc style documentation
>> comments.
>>
>> = Initial Source =
>>
>> The initial source code for Livy is hosted at
>> https://github.com/cloudera/livy
>>
>> = Source and Intellectual Property submission plan =
>>
>> The Livy codebase and web site is currently hosted on GitHub and will be
>> transitioned to the ASF repositories during incubation. Livy is already
>> licensed under the Apache 2.0 license. Cloudera has collected ICLAs and
>> CCLAs from all committers. There are, however, some contributions recently
>> from authors that have not signed the CCLA and ICLA. If necessary for a
>> successful SGA, we’ll seek the necessary documentation or replace the
>> contributions.
>>
>> The “Livy” name is not a registered trademark. We will need to do a
>> trademark search and make sure it is available for the Apache Foundation
>> prior to graduation.
>>
>> Cloudera currently owns the domain name: http://livy.io/. Once all the
>> documentation has moved over to ASF infrastructure, the main landing page
>> will become livy.incubator.apache.org and the old domain will just act
>> as a
>> redirect.
>>
>> = External Dependencies =
>>
>> The list below covers the non-Apache dependencies of the project and their
>> licenses.
>>
>>  * Jetty: Apache 2.0
>>  * Dropwizard Metrics: Apache 2.0
>>  * FasterXML Jackson: Apache 2.0
>>  * Netty: Apache 2.0
>>  * Scala: BSD
>>  * Py4J: BSD
>>  * Scalatra: BSD
>>
>> Build/test-only dependencies:
>>
>>  * Mockito: MIT
>>  * JUnit: Eclipse
>>
>> = Required Resources =
>>
>> == Mailing Lists ==
>>
>>  * private@livy.incubator.apache.org (PPMC)
>>  * dev@livy.incubator.apache.org (dev mailing list)
>>  * user@livy.incubator.apache.org (User questions)
>>  * commits@livy.incubator.apache.org (subscribers shouldn’t be able to
>> post)
>>  * issues@livy.incubator.apache.org (subscribers shouldn’t be able to
>> post)
>>
>> == Git Repository ==
>>
>> git://git.apache.org/incubator-livy
>>
>> == Issue Tracking ==
>>
>> We would like to import our current JIRA project into the ASF JIRA, such
>> that our historical commit message and code comments continue to reference
>> the appropriate bug numbers.
>>
>> = Initial Committers =
>>
>>  * Marcelo Vanzin (vanzin@cloudera.com)
>>  * Alex Man (alex@alexman.space)
>>  * Jeff Zhang (zjffdu@gmail.com)
>>  * Saisai Shao (sshao@hortonworks.com)
>>  * Kostas Sakellis (kostas@cloudera.com)
>>
>> = Affiliations =
>>
>> The initial set of committers includes people employed by Cloudera and
>> Hortonworks as well as one currently independent contributor.
>>
>> = Additional Interested Contributors =
>>
>> Those interested in getting involved with the project as we enter
>> incubation
>> are encouraged to list themselves here.
>>
>>   * Ismaël Mejía (iemejia@apache.org)
>>
>> = Sponsors =
>>
>> == Champion ==
>>
>> Sean Busbey (busbey@apache.org)
>>
>> == Nominated Mentors ==
>>
>>  * Bikas Saha (bikas@apache.org)
>>  * Brock Noland (brock@phdata.io)
>>  * Luciano Resende (lresende@apache.org)
>>
>> == Sponsoring Entity ==
>>
>> We ask that the Incubator PMC sponsor this proposal.
>>
>>
>>
>>
>
> --
> My introduction https://youtu.be/Ln4vly5sxYU
>
>
>
>

Re: [VOTE] Livy to enter Apache Incubator

Posted by Raphael Bircher <rb...@gmail.com>.

+1 (binding)

On .05.2017 at 15:03, Sean Busbey <bu...@apache.org> wrote:

> Hi folks!
>
> I'm calling a vote to accept "Livy" into the Apache Incubator.
>
> The full proposal is available below, and is also available in the wiki:
>
> https://wiki.apache.org/incubator/LivyProposal
>
> For additional context, please see the discussion thread:
>
> https://s.apache.org/incubator-livy-proposal-thread
>
> Please cast your vote:
>
> [ ] +1, bring Livy into Incubator
> [ ] -1, do not bring Livy into Incubator, because...
>
> The vote will open at least for 72 hours and only votes from the  
> Incubator
> PMC are binding.
>
> I start with my vote:
> +1
>
> ----
>
> = Abstract =
>
> Livy is a web service that exposes a REST interface for managing long
> running
> Apache Spark contexts in your cluster. With Livy, new applications can be
> built on top of Apache Spark that require fine grained interaction with  
> many
> Spark contexts.
>
> = Proposal =
>
> Livy is an open-source REST service for Apache Spark. Livy enables
> applications to submit Spark applications and retrieve results without a
> co-location requirement on the Spark cluster.
>
> We propose to contribute the Livy codebase and associated artifacts (e.g.
> documentation, web-site context etc) to the Apache Software Foundation.
>
> = Background =
>
> Apache Spark is a fast and general purpose distributed compute engine,  
> with
> a versatile API. It enables processing of large quantities of static data
> distributed over a cluster of machines, as well as processing of  
> continuous
> streams of data. It is the preferred distributed data processing engine  
> for
> data engineering, stream processing and data science workloads. Each  
> Spark
> application uses a construct called the SparkContext, which is the
> application’s connection or entry point to the Spark engine. Each Spark
> application will have its own SparkContext.
>
> Livy enables clients to interact with one or more Spark sessions through  
> the
> Livy Server, which acts as a proxy layer. Livy Clients have fine grained
> control over the lifecycle of the Spark sessions, as well as the ability  
> to
> submit jobs and retrieve results, all over HTTP. Clients have two modes  
> of
> interaction: RPC Client API, available in Java and Python, which allows
> results to be retrieved as Java or Python objects. The serialization and
> deserialization of the results is handled by the Livy framework. HTTP  
> based
> API that allows submission of code snippets, and retrieval of the  
> results in
> different formats.
>
> Multi-tenant resource allocation and security: Livy enables multiple
> independent Spark sessions to be managed simultaneously. Multiple clients
> can also interact simultaneously with the same Spark session and share  
> the
> resources of that Spark session. Livy can also enforce secure,  
> authenticated
> communication between the clients and their respective Spark sessions.
>
> More information on Livy can be found at the existing open source  
> website:
> http://livy.io/
>
> = Rationale =
>
> Users want to use Spark’s powerful processing engine and API as the
> data
> processing backend for interactive applications. However, the job  
> submission
> and application interaction mechanisms built into Apache Spark are
> insufficient and cumbersome for multi-user interactive applications.
>
> The primary mechanism for applications to submit Spark jobs is via
> spark-submit
> (http://spark.apache.org/docs/latest/submitting-applications.html),  
> which is
> available as a command line tool as well as a programmatic API. However,
> spark-submit has the following limitations that make it difficult to  
> build
> interactive applications: It is slow: each invocation of spark-submit
> involves a setup phase where cluster resources are acquired, new  
> processes
> are forked, etc. This setup phase runs for many seconds, or even minutes,
> and hence is too slow for interactive applications. It is cumbersome and
> lacks flexibility: application code and dependencies have to be  
> pre-compiled
> and submitted as jars, and can not be submitted interactively.
>
> Apache Spark comes with an ODBC/JDBC server, which can be used to submit  
> SQL
> queries to Spark. However, this solution is limited to SQL and does not
> allow the client to leverage the rest of the Spark API, such as RDDs,  
> MLlib
> and Streaming.
>
> A third way of using Spark is via its command-line shell, which allows  
> the
> interactive submission of snippets of Spark code. However, the shell  
> entails
> running Spark code on the client machine and hence is not a viable  
> mechanism
> for remote clients to submit Spark jobs.
>
> Livy solves the limitations of the above three mechanisms, and provides  
> the
> full Spark API as a multi-tenant service to remote clients.
>
> Since the open source release of Livy in late 2015, we have seen  
> tremendous
> interest among a diverse set of application developers and ISVs that  
> want to
> build applications with Apache Spark. To make Livy a robust and flexible
> solution that will enable a broad and growing set of applications, it is
> important to grow a large and varied community of contributors.
>
> = Initial Goals =
>
>   * Move existing codebase, website, documentation and mailing lists to
>     Apache-hosted infrastructure
>   * Work with the infrastructure team to implement and approve our code
>     review, build, and testing workflows in the context of the ASF
>   * Incremental development and releases per Apache guidelines
>
> = Current Status =
>
> The Livy project began at Cloudera, as a part of the Hue project.  
> Cloudera
> soon realized the broad applicability of Livy, and separated it out into  
> an
> independent project in Nov 2015.
>
> == Releases ==
>
> Livy has undergone two public releases, tagged here:
>
>  * https://github.com/cloudera/livy/releases/tag/v0.2.0
>  * https://github.com/cloudera/livy/releases/tag/v0.3.0
>
> Tarballs and zip files were created for each release and hosted on  
> github.
> Upon joining the incubator, we will adopt a more typical ASF release
> process.
>
> == Source ==
>
> Livy’s source is currently hosted on Github at:
> https://github.com/cloudera/livy
>
> This repository will be transitioned to Apache’s git hosting during
> incubation.
>
> == Code review ==
>
> Livy’s code reviews are currently public and hosted on github as pull
> request reviews at: https://github.com/cloudera/livy/pulls
> The Livy developer community so far is happy with github pull request
> reviews and hopes to continue this after being admitted to the ASF.
>
> == Issue Tracking ==
>
> Livy’s bug and feature tracking is hosted on JIRA at:
> https://issues.cloudera.org/projects/LIVY/summary
> This JIRA instance contains bugs and development discussion dating back 1
> year and will provide an initial seed for the ASF JIRA
>
> == Community Discussion ==
>
> Livy has several public discussion forums:
>
>  * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-dev
>  * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-user
>
> == Development Practices ==
>
> The Livy project follows a review before commit philosophy. Every commit
> automatically runs through the unit tests and generates coverage reports
> presented as a pull request comment. Our experience with this process  
> leads
> us to believe that it helps ease new contributors into the project. They  
> get
> feedback quickly on common mistakes, lowering the burden on reviewers.  
> Those
> same reviewers get to lead by example, showing the new contributors that  
> we
> value feedback within our community even when changes are done by more
> experienced folks.
>
> == Meritocracy ==
>
> We believe strongly in meritocracy when electing committers and PMC  
> members.
> In the past few months, the project has added two new committers from two
> different organisations, in recognition of their significant  
> contributions
> to the project. We will encourage contributions and participation of all
> types, and ensure that contributors are appropriately recognized.
>
> == Community ==
>
> Though Livy is relatively new as a standalone open source project, it has
> already seen promising growth in its community across several  
> organizations:
>
>  * Cloudera is the original development sponsor for Livy
>  * Microsoft pushed the development of the interpreter fixing high
>    availability issues and adding additional features.
>  * Hortonworks has contributed the security features to Livy allowing
>    kerberos and impersonation to work with Spark
>  * IBM is starting to make contributions to the Livy project
>  * A number of other patches contributed by community members
>
> Livy currently relies on Google Groups for mailing lists. These lists  
> have
> been active since the end of 2015/start of 2016. Currently, Livy’s user
> mailing list has 173 subscribers and has hosted a total of 227 topic
> threads. Livy’s developer list has 49 subscribers and has hosted 79
> topic
> threads.
>
> == Core Developers ==
>
> The early contributions to Livy were made by Cloudera engineers. In 2016,
> engineers from Microsoft and Hortonworks joined the core developer
> community.
>
> == Alignment ==
>
> Livy is built upon Apache Spark, and other Apache projects like Apache
> Hadoop YARN. It’s used as a building block by Apache Zeppelin. These
> community connections combined with our focus on development practices  
> that
> emphasize community engagement with a path to meritocratic recognition
> naturally align us with the ASF.
>
> = Known Risks =
>
> == Orphaned Products ==
>
> The risk of Livy being abandoned is low because it is supported by three
> major big-data software vendors. Moreover, Livy is already used to power
> multiple releases of services and products used in production.
>
> == Inexperience with Open Source ==
>
> Several of the initial committers are experienced open source developers,
> several being committers and/or PMC members on other ASF projects (Spark,
> YARN).
>
> == Homogenous Developers ==
>
> The project already has a diverse developer base. It has contributions  
> from
> 3 major organisations (Cloudera, Microsoft and Hortonworks), and is used  
> in
> diverse applications, in diverse settings (On-Prem and Cloud).
>
> == Reliance on salaried Developers ==
>
> The contributions to the Livy project to date have been made by salaried
> engineers from Cloudera, Microsoft and Hortonworks. One of the  
> individuals
> on the initial committer list has since left Microsoft and is currently
> unaffiliated. The remaining contributors are from Cloudera and  
> Hortonworks.
> Since there are at least two major organizations involved, the risk of
> reliance on a single group of salaried developers is mitigated. The Livy
> user base is diverse, with users from across the globe, including users  
> from
> academic settings. We aim to further diversify the Livy user and  
> contributor
> base.
>
> == Relationships with other Apache projects ==
>
> Livy is closely tied to the Apache Spark project and currently addresses  
> the
> scenarios for a REST based batch and interactive gateway for Spark jobs  
> on
> YARN. Given the growing number of integrations with Livy, keeping it  
> outside
> of Apache Spark aligns with the desire of the Apache Spark community to
> reduce the number of external dependencies in the Spark project.
> Specifically, the Apache Spark community has previously expressed a  
> desire
> to keep job servers independent from the project.<<FootNote(See, for
> example, discussion of the Ooyala Spark Job Server in SPARK-818)>>
> Furthermore, while Livy common usage is closely tied to Spark deployments
> right now, its core building blocks can be reused elsewhere.  Livy’s
> Remote
> REPL could be used as a library for interactive scenarios in non-Spark
> projects. In the future, integrations with cluster managers like Apache
> Mesos and others could also be added.
>
> The features provided by Livy have already been integrated with existing
> projects like Jupyter and Apache Zeppelin for their interactive Spark use
> cases. This validates the need for a project like Livy and provides an
> active downstream user base that the Livy community can interact with to
> seed future interest in the project.
>
> Livy serves a similar purpose to Apache Toree (incubating) but differs in
> making session management, security and impersonation a focal design  
> point.
>
> == An Excessive Fascination with the Apache Brand ==
>
> The primary motivation for submitting Livy to the ASF is to grow a  
> diverse
> and strong community. We wish to encourage diverse organisations,  
> including
> ISVs, to adopt Livy and contribute to Livy without any concerns about
> ownership or licensing.
>
> = Documentation =
>
> Documentation can be found on the Livy website http://livy.io/
>
> The Livy web site is version controlled on the ‘gh-pages’ branch of
> the
> above repository.
> Additional documentation is provided on the github wiki:
> https://github.com/cloudera/livy/wiki
> APIs are documented within the source code as JavaDoc style documentation
> comments.
>
> = Initial Source =
>
> The initial source code for Livy is hosted at
> https://github.com/cloudera/livy
>
> = Source and Intellectual Property submission plan =
>
> The Livy codebase and web site is currently hosted on GitHub and will be
> transitioned to the ASF repositories during incubation. Livy is already
> licensed under the Apache 2.0 license. Cloudera has collected ICLAs and
> CCLAs from all committers. There are, however, some contributions  
> recently
> from authors that have not signed the CCLA and ICLA. If necessary for a
> successful SGA, we’ll seek the necessary documentation or replace the
> contributions.
>
> The “Livy” name is not a registered trademark. We will need to do a
> trademark search and make sure it is available for the Apache Foundation
> prior to graduation.
>
> Cloudera currently owns the domain name: http://livy.io/. Once all the
> documentation has moved over to ASF infrastructure, the main landing page
> will become livy.incubator.apache.org and the old domain will just act  
> as a
> redirect.
>
> = External Dependencies =
>
> The list below covers the non-Apache dependencies of the project and  
> their
> licenses.
>
>  * Jetty: Apache 2.0
>  * Dropwizard Metrics: Apache 2.0
>  * FasterXML Jackson: Apache 2.0
>  * Netty: Apache 2.0
>  * Scala: BSD
>  * Py4J: BSD
>  * Scalatra: BSD
>
> Build/test-only dependencies:
>
>  * Mockito: MIT
>  * JUnit: Eclipse
>
> = Required Resources =
>
> == Mailing Lists ==
>
>  * private@livy.incubator.apache.org (PPMC)
>  * dev@livy.incubator.apache.org (dev mailing list)
>  * user@livy.incubator.apache.org (User questions)
>  * commits@livy.incubator.apache.org (subscribers shouldn’t be able to post)
>  * issues@livy.incubator.apache.org (subscribers shouldn’t be able to post)
>
> == Git Repository ==
>
> git://git.apache.org/incubator-livy
>
> == Issue Tracking ==
>
> We would like to import our current JIRA project into the ASF JIRA, such
> that our historical commit message and code comments continue to reference
> the appropriate bug numbers.
>
> = Initial Committers =
>
>  * Marcelo Vanzin (vanzin@cloudera.com)
>  * Alex Man (alex@alexman.space)
>  * Jeff Zhang (zjffdu@gmail.com)
>  * Saisai Shao (sshao@hortonworks.com)
>  * Kostas Sakellis (kostas@cloudera.com)
>
> = Affiliations =
>
> The initial set of committers includes people employed by Cloudera and
> Hortonworks as well as one currently independent contributor.
>
> = Additional Interested Contributors =
>
> Those interested in getting involved with the project as we enter incubation
> are encouraged to list themselves here.
>
>   * Ismaël Mejía (iemejia@apache.org)
>
> = Sponsors =
>
> == Champion ==
>
> Sean Busbey (busbey@apache.org)
>
> == Nominated Mentors ==
>
>  * Bikas Saha (bikas@apache.org)
>  * Brock Noland (brock@phdata.io)
>  * Luciano Resende (lresende@apache.org)
>
> == Sponsoring Entity ==
>
> We ask that the Incubator PMC sponsor this proposal.
>
>
>


-- 
My introduction https://youtu.be/Ln4vly5sxYU


Re: [VOTE] Livy to enter Apache Incubator

Posted by Luciano Resende <lu...@gmail.com>.

+1 (binding)

On Wed, May 31, 2017 at 6:03 AM, Sean Busbey <bu...@apache.org> wrote:

> Hi folks!
>
> I'm calling a vote to accept "Livy" into the Apache Incubator.
>
> The full proposal is available below, and is also available in the wiki:
>
> https://wiki.apache.org/incubator/LivyProposal
>
> For additional context, please see the discussion thread:
>
> https://s.apache.org/incubator-livy-proposal-thread
>
> Please cast your vote:
>
> [ ] +1, bring Livy into Incubator
> [ ] -1, do not bring Livy into Incubator, because...
>
> The vote will open at least for 72 hours and only votes from the Incubator
> PMC are binding.
>
> I start with my vote:
> +1
>
> ----
>
> = Abstract =
>
> Livy is a web service that exposes a REST interface for managing long-running
> Apache Spark contexts in your cluster. With Livy, new applications can be
> built on top of Apache Spark that require fine grained interaction with
> many
> Spark contexts.
>
> = Proposal =
>
> Livy is an open-source REST service for Apache Spark. Livy enables
> applications to submit Spark applications and retrieve results without a
> co-location requirement on the Spark cluster.
>
> We propose to contribute the Livy codebase and associated artifacts (e.g.
> documentation, web-site context etc) to the Apache Software Foundation.
>
> = Background =
>
> Apache Spark is a fast and general purpose distributed compute engine, with
> a versatile API. It enables processing of large quantities of static data
> distributed over a cluster of machines, as well as processing of continuous
> streams of data. It is the preferred distributed data processing engine for
> data engineering, stream processing and data science workloads. Each Spark
> application uses a construct called the SparkContext, which is the
> application’s connection or entry point to the Spark engine. Each Spark
> application will have its own SparkContext.
>
> Livy enables clients to interact with one or more Spark sessions through
> the
> Livy Server, which acts as a proxy layer. Livy Clients have fine grained
> control over the lifecycle of the Spark sessions, as well as the ability to
> submit jobs and retrieve results, all over HTTP. Clients have two modes of
> interaction:
>
>  * RPC Client API, available in Java and Python, which allows results to be
>    retrieved as Java or Python objects. The serialization and
>    deserialization of the results is handled by the Livy framework.
>  * HTTP based API that allows submission of code snippets, and retrieval of
>    the results in different formats.
>
> Multi-tenant resource allocation and security: Livy enables multiple
> independent Spark sessions to be managed simultaneously. Multiple clients
> can also interact simultaneously with the same Spark session and share the
> resources of that Spark session. Livy can also enforce secure,
> authenticated
> communication between the clients and their respective Spark sessions.
>
> More information on Livy can be found at the existing open source website:
> http://livy.io/
>
> = Rationale =
>
> Users want to use Spark’s powerful processing engine and API as the data
> processing backend for interactive applications. However, the job
> submission
> and application interaction mechanisms built into Apache Spark are
> insufficient and cumbersome for multi-user interactive applications.
>
> The primary mechanism for applications to submit Spark jobs is via
> spark-submit
> (http://spark.apache.org/docs/latest/submitting-applications.html), which
> is
> available as a command line tool as well as a programmatic API. However,
> spark-submit has the following limitations that make it difficult to build
> interactive applications:
>
>  * It is slow: each invocation of spark-submit involves a setup phase where
>    cluster resources are acquired, new processes are forked, etc. This
>    setup phase runs for many seconds, or even minutes, and hence is too
>    slow for interactive applications.
>  * It is cumbersome and lacks flexibility: application code and
>    dependencies have to be pre-compiled and submitted as jars, and can not
>    be submitted interactively.
>
> Apache Spark comes with an ODBC/JDBC server, which can be used to submit
> SQL
> queries to Spark. However, this solution is limited to SQL and does not
> allow the client to leverage the rest of the Spark API, such as RDDs, MLlib
> and Streaming.
>
> A third way of using Spark is via its command-line shell, which allows the
> interactive submission of snippets of Spark code. However, the shell
> entails
> running Spark code on the client machine and hence is not a viable
> mechanism
> for remote clients to submit Spark jobs.
>
> Livy solves the limitations of the above three mechanisms, and provides the
> full Spark API as a multi-tenant service to remote clients.
>
> Since the open source release of Livy in late 2015, we have seen tremendous
> interest among a diverse set of application developers and ISVs that want
> to
> build applications with Apache Spark. To make Livy a robust and flexible
> solution that will enable a broad and growing set of applications, it is
> important to grow a large and varied community of contributors.
>
> = Initial Goals =
>
>   * Move existing codebase, website, documentation and mailing lists to
>     Apache-hosted infrastructure
>   * Work with the infrastructure team to implement and approve our code
>     review, build, and testing workflows in the context of the ASF
>   * Incremental development and releases per Apache guidelines
>
> = Current Status =
>
> The Livy project began at Cloudera, as a part of the Hue project. Cloudera
> soon realized the broad applicability of Livy, and separated it out into an
> independent project in Nov 2015.
>
> == Releases ==
>
> Livy has undergone two public releases, tagged here:
>
>  * https://github.com/cloudera/livy/releases/tag/v0.2.0
>  * https://github.com/cloudera/livy/releases/tag/v0.3.0
>
> Tarballs and zip files were created for each release and hosted on github.
> Upon joining the incubator, we will adopt a more typical ASF release
> process.
>
> == Source ==
>
> Livy’s source is currently hosted on Github at:
> https://github.com/cloudera/livy
>
> This repository will be transitioned to Apache’s git hosting during
> incubation.
>
> == Code review ==
>
> Livy’s code reviews are currently public and hosted on github as pull
> request reviews at: https://github.com/cloudera/livy/pulls
> The Livy developer community so far is happy with github pull request
> reviews and hopes to continue this after being admitted to the ASF.
>
> == Issue Tracking ==
>
> Livy’s bug and feature tracking is hosted on JIRA at:
> https://issues.cloudera.org/projects/LIVY/summary
> This JIRA instance contains bugs and development discussion dating back 1
> year and will provide an initial seed for the ASF JIRA
>
> == Community Discussion ==
>
> Livy has several public discussion forums:
>
>  * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-dev
>  * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-user
>
> == Development Practices ==
>
> The Livy project follows a review before commit philosophy. Every commit
> automatically runs through the unit tests and generates coverage reports
> presented as a pull request comment. Our experience with this process leads
> us to believe that it helps ease new contributors into the project. They
> get
> feedback quickly on common mistakes, lowering the burden on reviewers.
> Those
> same reviewers get to lead by example, showing the new contributors that we
> value feedback within our community even when changes are done by more
> experienced folks.
>
> == Meritocracy ==
>
> We believe strongly in meritocracy when electing committers and PMC
> members.
> In the past few months, the project has added two new committers from two
> different organisations, in recognition of their significant contributions
> to the project. We will encourage contributions and participation of all
> types, and ensure that contributors are appropriately recognized.
>
> == Community ==
>
> Though Livy is relatively new as a standalone open source project, it has
> already seen promising growth in its community across several
> organizations:
>
>  * Cloudera is the original development sponsor for Livy.
>  * Microsoft pushed the development of the interpreter, fixing high
>    availability issues and adding additional features.
>  * Hortonworks has contributed the security features to Livy, allowing
>    Kerberos and impersonation to work with Spark.
>  * IBM is starting to make contributions to the Livy project.
>  * A number of other patches have been contributed by community members.
>
> Livy currently relies on Google Groups for mailing lists. These lists have
> been active since the end of 2015/start of 2016. Currently, Livy’s user
> mailing list has 173 subscribers and has hosted a total of 227 topic
> threads. Livy’s developer list has 49 subscribers and has hosted 79 topic
> threads.
>
> == Core Developers ==
>
> The early contributions to Livy were made by Cloudera engineers. In 2016,
> engineers from Microsoft and Hortonworks joined the core developer
> community.
>
> == Alignment ==
>
> Livy is built upon Apache Spark, and other Apache projects like Apache
> Hadoop YARN. It’s used as a building block by Apache Zeppelin. These
> community connections combined with our focus on development practices that
> emphasize community engagement with a path to meritocratic recognition
> naturally align us with the ASF.
>
> = Known Risks =
>
> == Orphaned Products ==
>
> The risk of Livy being abandoned is low because it is supported by three
> major big-data software vendors. Moreover, Livy is already used to power
> multiple releases of services and products used in production.
>
> == Inexperience with Open Source ==
>
> Several of the initial committers are experienced open source developers,
> several being committers and/or PMC members on other ASF projects (Spark,
> YARN).
>
> == Homogenous Developers ==
>
> The project already has a diverse developer base. It has contributions from
> 3 major organisations (Cloudera, Microsoft and Hortonworks), and is used in
> diverse applications, in diverse settings (On-Prem and Cloud).
>
> == Reliance on salaried Developers ==
>
> The contributions to the Livy project to date have been made by salaried
> engineers from Cloudera, Microsoft and Hortonworks. One of the individuals
> on the initial committer list has since left Microsoft and is currently
> unaffiliated. The remaining contributors are from Cloudera and Hortonworks.
> Since there are at least two major organizations involved, the risk of
> reliance on a single group of salaried developers is mitigated. The Livy
> user base is diverse, with users from across the globe, including users
> from
> academic settings. We aim to further diversify the Livy user and
> contributor
> base.
>
> == Relationships with other Apache projects ==
>
> Livy is closely tied to the Apache Spark project and currently addresses
> the
> scenarios for a REST based batch and interactive gateway for Spark jobs on
> YARN. Given the growing number of integrations with Livy, keeping it
> outside
> of Apache Spark aligns with the desire of the Apache Spark community to
> reduce the number of external dependencies in the Spark project.
> Specifically, the Apache Spark community has previously expressed a desire
> to keep job servers independent from the project.<<FootNote(See, for
> example, discussion of the Ooyala Spark Job Server in SPARK-818)>>
> Furthermore, while Livy’s common usage is closely tied to Spark deployments
> right now, its core building blocks can be reused elsewhere.  Livy’s Remote
> REPL could be used as a library for interactive scenarios in non-Spark
> projects. In the future, integrations with cluster managers like Apache
> Mesos and others could also be added.
>
> The features provided by Livy have already been integrated with existing
> projects like Jupyter and Apache Zeppelin for their interactive Spark use
> cases. This validates the need for a project like Livy and provides an
> active downstream user base that the Livy community can interact with to
> seed future interest in the project.
>
> Livy serves a similar purpose to Apache Toree (incubating) but differs in
> making session management, security and impersonation a focal design point.
>
> == An Excessive Fascination with the Apache Brand ==
>
> The primary motivation for submitting Livy to the ASF is to grow a diverse
> and strong community. We wish to encourage diverse organisations, including
> ISVs, to adopt Livy and contribute to Livy without any concerns about
> ownership or licensing.
>
> = Documentation =
>
> Documentation can be found on the Livy website http://livy.io/
>
> The Livy web site is version controlled on the ‘gh-pages’ branch of the
> above repository.
> Additional documentation is provided on the github wiki:
> https://github.com/cloudera/livy/wiki
> APIs are documented within the source code as JavaDoc style documentation
> comments.
>
> = Initial Source =
>
> The initial source code for Livy is hosted at
> https://github.com/cloudera/livy
>
> = Source and Intellectual Property submission plan =
>
> The Livy codebase and web site is currently hosted on GitHub and will be
> transitioned to the ASF repositories during incubation. Livy is already
> licensed under the Apache 2.0 license. Cloudera has collected ICLAs and
> CCLAs from all committers. There are, however, some contributions recently
> from authors that have not signed the CCLA and ICLA. If necessary for a
> successful SGA, we’ll seek the necessary documentation or replace the
> contributions.
>
> The “Livy” name is not a registered trademark. We will need to do a
> trademark search and make sure it is available for the Apache Foundation
> prior to graduation.
>
> Cloudera currently owns the domain name: http://livy.io/. Once all the
> documentation has moved over to ASF infrastructure, the main landing page
> will become livy.incubator.apache.org and the old domain will just act as
> a
> redirect.
>
> = External Dependencies =
>
> The list below covers the non-Apache dependencies of the project and their
> licenses.
>
>  * Jetty: Apache 2.0
>  * Dropwizard Metrics: Apache 2.0
>  * FasterXML Jackson: Apache 2.0
>  * Netty: Apache 2.0
>  * Scala: BSD
>  * Py4J: BSD
>  * Scalatra: BSD
>
> Build/test-only dependencies:
>
>  * Mockito: MIT
>  * JUnit: Eclipse
>
> = Required Resources =
>
> == Mailing Lists ==
>
>  * private@livy.incubator.apache.org (PPMC)
>  * dev@livy.incubator.apache.org (dev mailing list)
>  * user@livy.incubator.apache.org (User questions)
>  * commits@livy.incubator.apache.org (subscribers shouldn’t be able to
> post)
>  * issues@livy.incubator.apache.org (subscribers shouldn’t be able to
> post)
>
> == Git Repository ==
>
> git://git.apache.org/incubator-livy
>
> == Issue Tracking ==
>
> We would like to import our current JIRA project into the ASF JIRA, such
> that our historical commit message and code comments continue to reference
> the appropriate bug numbers.
>
> = Initial Committers =
>
>  * Marcelo Vanzin (vanzin@cloudera.com)
>  * Alex Man (alex@alexman.space)
>  * Jeff Zhang (zjffdu@gmail.com)
>  * Saisai Shao (sshao@hortonworks.com)
>  * Kostas Sakellis (kostas@cloudera.com)
>
> = Affiliations =
>
> The initial set of committers includes people employed by Cloudera and
> Hortonworks as well as one currently independent contributor.
>
> = Additional Interested Contributors =
>
> Those interested in getting involved with the project as we enter
> incubation
> are encouraged to list themselves here.
>
>   * Ismaël Mejía (iemejia@apache.org)
>
> = Sponsors =
>
> == Champion ==
>
> Sean Busbey (busbey@apache.org)
>
> == Nominated Mentors ==
>
>  * Bikas Saha (bikas@apache.org)
>  * Brock Noland (brock@phdata.io)
>  * Luciano Resende (lresende@apache.org)
>
> == Sponsoring Entity ==
>
> We ask that the Incubator PMC sponsor this proposal.
>
>
>
>


-- 
Luciano Resende
http://twitter.com/lresende1975
http://lresende.blogspot.com/

Re: [VOTE] Livy to enter Apache Incubator

Posted by Bruno Mahé <bm...@apache.org>.

+1 (non binding)


Thanks,
Bruno

On 05/31/2017 06:03 AM, Sean Busbey wrote:
> Hi folks!
>
> I'm calling a vote to accept "Livy" into the Apache Incubator.
>
> The full proposal is available below, and is also available in the wiki:
>
> https://wiki.apache.org/incubator/LivyProposal
>
> For additional context, please see the discussion thread:
>
> https://s.apache.org/incubator-livy-proposal-thread
>
> Please cast your vote:
>
> [ ] +1, bring Livy into Incubator
> [ ] -1, do not bring Livy into Incubator, because...
>
> The vote will open at least for 72 hours and only votes from the Incubator
> PMC are binding.
>
> I start with my vote:
> +1
>
> ----
>
> = Abstract =
>
> Livy is a web service that exposes a REST interface for managing long-running
> Apache Spark contexts in your cluster. With Livy, new applications can be
> built on top of Apache Spark that require fine grained interaction with many
> Spark contexts.
>
> = Proposal =
>
> Livy is an open-source REST service for Apache Spark. Livy enables
> applications to submit Spark applications and retrieve results without a
> co-location requirement on the Spark cluster.
>
> We propose to contribute the Livy codebase and associated artifacts (e.g.
> documentation, web-site context etc) to the Apache Software Foundation.
>
> = Background =
>
> Apache Spark is a fast and general purpose distributed compute engine, with
> a versatile API. It enables processing of large quantities of static data
> distributed over a cluster of machines, as well as processing of continuous
> streams of data. It is the preferred distributed data processing engine for
> data engineering, stream processing and data science workloads. Each Spark
> application uses a construct called the SparkContext, which is the
> application’s connection or entry point to the Spark engine. Each Spark
> application will have its own SparkContext.
>
> Livy enables clients to interact with one or more Spark sessions through the
> Livy Server, which acts as a proxy layer. Livy Clients have fine grained
> control over the lifecycle of the Spark sessions, as well as the ability to
> submit jobs and retrieve results, all over HTTP. Clients have two modes of
> interaction:
>
>   * RPC Client API, available in Java and Python, which allows results to be
>     retrieved as Java or Python objects. The serialization and
>     deserialization of the results is handled by the Livy framework.
>   * HTTP based API that allows submission of code snippets, and retrieval of
>     the results in different formats.
>
> Multi-tenant resource allocation and security: Livy enables multiple
> independent Spark sessions to be managed simultaneously. Multiple clients
> can also interact simultaneously with the same Spark session and share the
> resources of that Spark session. Livy can also enforce secure, authenticated
> communication between the clients and their respective Spark sessions.
>
> More information on Livy can be found at the existing open source website:
> http://livy.io/
>
> = Rationale =
>
> Users want to use Spark’s powerful processing engine and API as the data
> processing backend for interactive applications. However, the job submission
> and application interaction mechanisms built into Apache Spark are
> insufficient and cumbersome for multi-user interactive applications.
>
> The primary mechanism for applications to submit Spark jobs is via
> spark-submit
> (http://spark.apache.org/docs/latest/submitting-applications.html), which is
> available as a command line tool as well as a programmatic API. However,
> spark-submit has the following limitations that make it difficult to build
> interactive applications:
>
>   * It is slow: each invocation of spark-submit involves a setup phase where
>     cluster resources are acquired, new processes are forked, etc. This
>     setup phase runs for many seconds, or even minutes, and hence is too
>     slow for interactive applications.
>   * It is cumbersome and lacks flexibility: application code and
>     dependencies have to be pre-compiled and submitted as jars, and can not
>     be submitted interactively.
>
> Apache Spark comes with an ODBC/JDBC server, which can be used to submit SQL
> queries to Spark. However, this solution is limited to SQL and does not
> allow the client to leverage the rest of the Spark API, such as RDDs, MLlib
> and Streaming.
>
> A third way of using Spark is via its command-line shell, which allows the
> interactive submission of snippets of Spark code. However, the shell entails
> running Spark code on the client machine and hence is not a viable mechanism
> for remote clients to submit Spark jobs.
>
> Livy solves the limitations of the above three mechanisms, and provides the
> full Spark API as a multi-tenant service to remote clients.
>
> Since the open source release of Livy in late 2015, we have seen tremendous
> interest among a diverse set of application developers and ISVs that want to
> build applications with Apache Spark. To make Livy a robust and flexible
> solution that will enable a broad and growing set of applications, it is
> important to grow a large and varied community of contributors.
>
> = Initial Goals =
>
>    * Move existing codebase, website, documentation and mailing lists to
>      Apache-hosted infrastructure
>    * Work with the infrastructure team to implement and approve our code
>      review, build, and testing workflows in the context of the ASF
>    * Incremental development and releases per Apache guidelines
>
> = Current Status =
>
> The Livy project began at Cloudera, as a part of the Hue project. Cloudera
> soon realized the broad applicability of Livy, and separated it out into an
> independent project in Nov 2015.
>
> == Releases ==
>
> Livy has undergone two public releases, tagged here:
>
>   * https://github.com/cloudera/livy/releases/tag/v0.2.0
>   * https://github.com/cloudera/livy/releases/tag/v0.3.0
>
> Tarballs and zip files were created for each release and hosted on github.
> Upon joining the incubator, we will adopt a more typical ASF release
> process.
>
> == Source ==
>
> Livy’s source is currently hosted on Github at:
> https://github.com/cloudera/livy
>
> This repository will be transitioned to Apache’s git hosting during
> incubation.
>
> == Code review ==
>
> Livy’s code reviews are currently public and hosted on github as pull
> request reviews at: https://github.com/cloudera/livy/pulls
> The Livy developer community so far is happy with github pull request
> reviews and hopes to continue this after being admitted to the ASF.
>
> == Issue Tracking ==
>
> Livy’s bug and feature tracking is hosted on JIRA at:
> https://issues.cloudera.org/projects/LIVY/summary
> This JIRA instance contains bugs and development discussion dating back 1
> year and will provide an initial seed for the ASF JIRA
>
> == Community Discussion ==
>
> Livy has several public discussion forums:
>
>   * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-dev
>   * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-user
>
> == Development Practices ==
>
> The Livy project follows a review before commit philosophy. Every commit
> automatically runs through the unit tests and generates coverage reports
> presented as a pull request comment. Our experience with this process leads
> us to believe that it helps ease new contributors into the project. They get
> feedback quickly on common mistakes, lowering the burden on reviewers. Those
> same reviewers get to lead by example, showing the new contributors that we
> value feedback within our community even when changes are done by more
> experienced folks.
>
> == Meritocracy ==
>
> We believe strongly in meritocracy when electing committers and PMC members.
> In the past few months, the project has added two new committers from two
> different organisations, in recognition of their significant contributions
> to the project. We will encourage contributions and participation of all
> types, and ensure that contributors are appropriately recognized.
>
> == Community ==
>
> Though Livy is relatively new as a standalone open source project, it has
> already seen promising growth in its community across several organizations:
>
>   * Cloudera is the original development sponsor for Livy.
>   * Microsoft pushed the development of the interpreter, fixing high
>     availability issues and adding additional features.
>   * Hortonworks has contributed the security features to Livy, allowing
>     Kerberos and impersonation to work with Spark.
>   * IBM is starting to make contributions to the Livy project.
>   * A number of other patches have been contributed by community members.
>
> Livy currently relies on Google Groups for mailing lists. These lists have
> been active since the end of 2015/start of 2016. Currently, Livy’s user
> mailing list has 173 subscribers and has hosted a total of 227 topic
> threads. Livy’s developer list has 49 subscribers and has hosted 79 topic
> threads.
>
> == Core Developers ==
>
> The early contributions to Livy were made by Cloudera engineers. In 2016,
> engineers from Microsoft and Hortonworks joined the core developer
> community.
>
> == Alignment ==
>
> Livy is built upon Apache Spark, and other Apache projects like Apache
> Hadoop YARN. It’s used as a building block by Apache Zeppelin. These
> community connections combined with our focus on development practices that
> emphasize community engagement with a path to meritocratic recognition
> naturally align us with the ASF.
>
> = Known Risks =
>
> == Orphaned Products ==
>
> The risk of Livy being abandoned is low because it is supported by three
> major big-data software vendors. Moreover, Livy is already used to power
> multiple releases of services and products used in production.
>
> == Inexperience with Open Source ==
>
> Several of the initial committers are experienced open source developers,
> several being committers and/or PMC members on other ASF projects (Spark,
> YARN).
>
> == Homogenous Developers ==
>
> The project already has a diverse developer base. It has contributions from
> 3 major organisations (Cloudera, Microsoft and Hortonworks), and is used in
> diverse applications, in diverse settings (On-Prem and Cloud).
>
> == Reliance on salaried Developers ==
>
> The contributions to the Livy project to date have been made by salaried
> engineers from Cloudera, Microsoft and Hortonworks. One of the individuals
> on the initial committer list has since left Microsoft and is currently
> unaffiliated. The remaining contributors are from Cloudera and Hortonworks.
> Since there are at least two major organizations involved, the risk of
> reliance on a single group of salaried developers is mitigated. The Livy
> user base is diverse, with users from across the globe, including users from
> academic settings. We aim to further diversify the Livy user and contributor
> base.
>
> == Relationships with other Apache projects ==
>
> Livy is closely tied to the Apache Spark project and currently addresses the
> scenarios for a REST based batch and interactive gateway for Spark jobs on
> YARN. Given the growing number of integrations with Livy, keeping it outside
> of Apache Spark aligns with the desire of the Apache Spark community to
> reduce the number of external dependencies in the Spark project.
> Specifically, the Apache Spark community has previously expressed a desire
> to keep job servers independent from the project.<<FootNote(See, for
> example, discussion of the Ooyala Spark Job Server in SPARK-818)>>
> Furthermore, while Livy’s common usage is closely tied to Spark deployments
> right now, its core building blocks can be reused elsewhere.  Livy’s Remote
> REPL could be used as a library for interactive scenarios in non-Spark
> projects. In the future, integrations with cluster managers like Apache
> Mesos and others could also be added.
>
> The features provided by Livy have already been integrated with existing
> projects like Jupyter and Apache Zeppelin for their interactive Spark use
> cases. This validates the need for a project like Livy and provides an
> active downstream user base that the Livy community can interact with to
> seed future interest in the project.
>
> Livy serves a similar purpose to Apache Toree (incubating) but differs in
> making session management, security and impersonation a focal design point.
>
> == An Excessive Fascination with the Apache Brand ==
>
> The primary motivation for submitting Livy to the ASF is to grow a diverse
> and strong community. We wish to encourage diverse organisations, including
> ISVs, to adopt Livy and contribute to Livy without any concerns about
> ownership or licensing.
>
> = Documentation =
>
> Documentation can be found on the Livy website http://livy.io/
>
> The Livy web site is version controlled on the ‘gh-pages’ branch of the
> above repository.
> Additional documentation is provided on the github wiki:
> https://github.com/cloudera/livy/wiki
> APIs are documented within the source code as JavaDoc style documentation
> comments.
>
> = Initial Source =
>
> The initial source code for Livy is hosted at
> https://github.com/cloudera/livy
>
> = Source and Intellectual Property submission plan =
>
> The Livy codebase and web site are currently hosted on GitHub and will be
> transitioned to the ASF repositories during incubation. Livy is already
> licensed under the Apache 2.0 license. Cloudera has collected ICLAs and
> CCLAs from all committers. There are, however, some recent contributions
> from authors who have not signed the CCLA and ICLA. If necessary for a
> successful SGA, we'll seek the necessary documentation or replace the
> contributions.
>
> The "Livy" name is not a registered trademark. We will need to do a
> trademark search and make sure it is available for the Apache Foundation
> prior to graduation.
>
> Cloudera currently owns the domain name: http://livy.io/. Once all the
> documentation has moved over to ASF infrastructure, the main landing page
> will become livy.incubator.apache.org and the old domain will just act as a
> redirect.
>
> = External Dependencies =
>
> The list below covers the non-Apache dependencies of the project and their
> licenses.
>
>   * Jetty: Apache 2.0
>   * Dropwizard Metrics: Apache 2.0
>   * FasterXML Jackson: Apache 2.0
>   * Netty: Apache 2.0
>   * Scala: BSD
>   * Py4J: BSD
>   * Scalatra: BSD
>
> Build/test-only dependencies:
>
>   * Mockito: MIT
>   * JUnit: Eclipse
>
> = Required Resources =
>
> == Mailing Lists ==
>
>   * private@livy.incubator.apache.org (PPMC)
>   * dev@livy.incubator.apache.org (dev mailing list)
>   * user@livy.incubator.apache.org (User questions)
>   * commits@livy.incubator.apache.org (subscribers shouldn't be able to post)
>   * issues@livy.incubator.apache.org (subscribers shouldn't be able to post)
>
> == Git Repository ==
>
> git://git.apache.org/incubator-livy
>
> == Issue Tracking ==
>
> We would like to import our current JIRA project into the ASF JIRA, such
> that our historical commit messages and code comments continue to reference
> the appropriate bug numbers.
>
> = Initial Committers =
>
>   * Marcelo Vanzin (vanzin@cloudera.com)
>   * Alex Man (alex@alexman.space)
>   * Jeff Zhang (zjffdu@gmail.com)
>   * Saisai Shao (sshao@hortonworks.com)
>   * Kostas Sakellis (kostas@cloudera.com)
>
> = Affiliations =
>
> The initial set of committers includes people employed by Cloudera and
> Hortonworks as well as one currently independent contributor.
>
> = Additional Interested Contributors =
>
> Those interested in getting involved with the project as we enter incubation
> are encouraged to list themselves here.
>
>    * Ismaël Mejía (iemejia@apache.org)
>
> = Sponsors =
>
> == Champion ==
>
> Sean Busbey (busbey@apache.org)
>
> == Nominated Mentors ==
>
>   * Bikas Saha (bikas@apache.org)
>   * Brock Noland (brock@phdata.io)
>   * Luciano Resende (lresende@apache.org)
>
> == Sponsoring Entity ==
>
> We ask that the Incubator PMC sponsor this proposal.
>
>



Re: [VOTE] Livy to enter Apache Incubator

Posted by Arpit Agarwal <aa...@hortonworks.com>.

+1 (non-binding)





Re: [VOTE] Livy to enter Apache Incubator

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.

+1 (binding)

If you need an additional mentor, please let me know, I'm interested by the 
project !

Regards
JB

On 05/31/2017 03:03 PM, Sean Busbey wrote:
> Hi folks!
> 
> I'm calling a vote to accept "Livy" into the Apache Incubator.
> 
> The full proposal is available below, and is also available in the wiki:
> 
> https://wiki.apache.org/incubator/LivyProposal
> 
> For additional context, please see the discussion thread:
> 
> https://s.apache.org/incubator-livy-proposal-thread
> 
> Please cast your vote:
> 
> [ ] +1, bring Livy into Incubator
> [ ] -1, do not bring Livy into Incubator, because...
> 
> The vote will be open for at least 72 hours and only votes from the
> Incubator PMC are binding.
> 
> I start with my vote:
> +1
> 
> ----
> 
> = Abstract =
> 
> Livy is a web service that exposes a REST interface for managing long running
> Apache Spark contexts in your cluster. With Livy, new applications can be
> built on top of Apache Spark that require fine grained interaction with many
> Spark contexts.
> 
> = Proposal =
> 
> Livy is an open-source REST service for Apache Spark. Livy enables
> applications to submit Spark applications and retrieve results without a
> co-location requirement on the Spark cluster.
> 
> We propose to contribute the Livy codebase and associated artifacts (e.g.
> documentation, web-site context etc) to the Apache Software Foundation.
> 
> = Background =
> 
> Apache Spark is a fast and general purpose distributed compute engine, with
> a versatile API. It enables processing of large quantities of static data
> distributed over a cluster of machines, as well as processing of continuous
> streams of data. It is the preferred distributed data processing engine for
> data engineering, stream processing and data science workloads. Each Spark
> application uses a construct called the SparkContext, which is the
> application's connection or entry point to the Spark engine. Each Spark
> application will have its own SparkContext.
> 
> Livy enables clients to interact with one or more Spark sessions through the
> Livy Server, which acts as a proxy layer. Livy Clients have fine grained
> control over the lifecycle of the Spark sessions, as well as the ability to
> submit jobs and retrieve results, all over HTTP. Clients have two modes of
> interaction:
>
>   * RPC Client API, available in Java and Python, which allows results to be
>     retrieved as Java or Python objects. The serialization and
>     deserialization of the results is handled by the Livy framework.
>   * HTTP based API that allows submission of code snippets, and retrieval of
>     the results in different formats.
> 
> Multi-tenant resource allocation and security: Livy enables multiple
> independent Spark sessions to be managed simultaneously. Multiple clients
> can also interact simultaneously with the same Spark session and share the
> resources of that Spark session. Livy can also enforce secure, authenticated
> communication between the clients and their respective Spark sessions.
> 
> More information on Livy can be found at the existing open source website:
> http://livy.io/
> 
> = Rationale =
> 
> Users want to use Spark's powerful processing engine and API as the data
> processing backend for interactive applications. However, the job submission
> and application interaction mechanisms built into Apache Spark are
> insufficient and cumbersome for multi-user interactive applications.
> 
> The primary mechanism for applications to submit Spark jobs is via
> spark-submit
> (http://spark.apache.org/docs/latest/submitting-applications.html), which is
> available as a command line tool as well as a programmatic API. However,
> spark-submit has the following limitations that make it difficult to build
> interactive applications:
>
>   * It is slow: each invocation of spark-submit involves a setup phase where
>     cluster resources are acquired, new processes are forked, etc. This
>     setup phase runs for many seconds, or even minutes, and hence is too
>     slow for interactive applications.
>   * It is cumbersome and lacks flexibility: application code and
>     dependencies have to be pre-compiled and submitted as jars, and cannot
>     be submitted interactively.
> 
> Apache Spark comes with an ODBC/JDBC server, which can be used to submit SQL
> queries to Spark. However, this solution is limited to SQL and does not
> allow the client to leverage the rest of the Spark API, such as RDDs, MLlib
> and Streaming.
> 
> A third way of using Spark is via its command-line shell, which allows the
> interactive submission of snippets of Spark code. However, the shell entails
> running Spark code on the client machine and hence is not a viable mechanism
> for remote clients to submit Spark jobs.
> 
> Livy solves the limitations of the above three mechanisms, and provides the
> full Spark API as a multi-tenant service to remote clients.
> 
> Since the open source release of Livy in late 2015, we have seen tremendous
> interest among a diverse set of application developers and ISVs that want to
> build applications with Apache Spark. To make Livy a robust and flexible
> solution that will enable a broad and growing set of applications, it is
> important to grow a large and varied community of contributors.
> 
> = Initial Goals =
> 
>    * Move existing codebase, website, documentation and mailing lists to
>      Apache-hosted infrastructure
>    * Work with the infrastructure team to implement and approve our code
>      review, build, and testing workflows in the context of the ASF
>    * Incremental development and releases per Apache guidelines
> 
> = Current Status =
> 
> The Livy project began at Cloudera, as a part of the Hue project. Cloudera
> soon realized the broad applicability of Livy, and separated it out into an
> independent project in Nov 2015.
> 
> == Releases ==
> 
> Livy has undergone two public releases, tagged here:
> 
>   * https://github.com/cloudera/livy/releases/tag/v0.2.0
>   * https://github.com/cloudera/livy/releases/tag/v0.3.0
> 
> Tarballs and zip files were created for each release and hosted on GitHub.
> Upon joining the incubator, we will adopt a more typical ASF release
> process.
> 
> == Source ==
> 
> Livy's source is currently hosted on GitHub at:
> https://github.com/cloudera/livy
> 
> This repository will be transitioned to Apache's git hosting during
> incubation.
> 
> == Code review ==
> 
> Livy's code reviews are currently public and hosted on GitHub as pull
> request reviews at: https://github.com/cloudera/livy/pulls
> The Livy developer community so far is happy with GitHub pull request
> reviews and hopes to continue this after being admitted to the ASF.
> 
> == Issue Tracking ==
> 
> Livy's bug and feature tracking is hosted on JIRA at:
> https://issues.cloudera.org/projects/LIVY/summary
> This JIRA instance contains bugs and development discussion dating back 1
> year and will provide an initial seed for the ASF JIRA.
> 
> == Community Discussion ==
> 
> Livy has several public discussion forums:
> 
>   * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-dev
>   * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-user
> 
> == Development Practices ==
> 
> The Livy project follows a review before commit philosophy. Every commit
> automatically runs through the unit tests and generates coverage reports
> presented as a pull request comment. Our experience with this process leads
> us to believe that it helps ease new contributors into the project. They get
> feedback quickly on common mistakes, lowering the burden on reviewers. Those
> same reviewers get to lead by example, showing the new contributors that we
> value feedback within our community even when changes are done by more
> experienced folks.
> 
> == Meritocracy ==
> 
> We believe strongly in meritocracy when electing committers and PMC members.
> In the past few months, the project has added two new committers from two
> different organisations, in recognition of their significant contributions
> to the project. We will encourage contributions and participation of all
> types, and ensure that contributors are appropriately recognized.
> 
> == Community ==
> 
> Though Livy is relatively new as a standalone open source project, it has
> already seen promising growth in its community across several organizations:
>
>   * Cloudera is the original development sponsor for Livy.
>   * Microsoft pushed the development of the interpreter, fixing high
>     availability issues and adding additional features.
>   * Hortonworks has contributed the security features to Livy, allowing
>     Kerberos and impersonation to work with Spark.
>   * IBM is starting to make contributions to the Livy project.
>   * A number of other patches have been contributed by community members.
> 
> Livy currently relies on Google Groups for mailing lists. These lists have
> been active since the end of 2015/start of 2016. Currently, Livy's user
> mailing list has 173 subscribers and has hosted a total of 227 topic
> threads. Livy's developer list has 49 subscribers and has hosted 79 topic
> threads.
> 
> == Core Developers ==
> 
> The early contributions to Livy were made by Cloudera engineers. In 2016,
> engineers from Microsoft and Hortonworks joined the core developer
> community.
> 
> == Alignment ==
> 
> Livy is built upon Apache Spark, and other Apache projects like Apache
> Hadoop YARN. It's used as a building block by Apache Zeppelin. These
> community connections combined with our focus on development practices that
> emphasize community engagement with a path to meritocratic recognition
> naturally align us with the ASF.
> 
> = Known Risks =
> 
> == Orphaned Products ==
> 
> The risk of Livy being abandoned is low because it is supported by three
> major big-data software vendors. Moreover, Livy is already used to power
> multiple releases of services and products used in production.
> 
> == Inexperience with Open Source ==
> 
> Several of the initial committers are experienced open source developers,
> several being committers and/or PMC members on other ASF projects (Spark,
> YARN).
> 
> == Homogenous Developers ==
> 
> The project already has a diverse developer base. It has contributions from
> 3 major organisations (Cloudera, Microsoft and Hortonworks), and is used in
> diverse applications, in diverse settings (On-Prem and Cloud).
> 
> == Reliance on salaried Developers ==
> 
> The contributions to the Livy project to date have been made by salaried
> engineers from Cloudera, Microsoft and Hortonworks. One of the individuals
> on the initial committer list has since left Microsoft and is currently
> unaffiliated. The remaining contributors are from Cloudera and Hortonworks.
> Since there are at least two major organizations involved, the risk of
> reliance on a single group of salaried developers is mitigated. The Livy
> user base is diverse, with users from across the globe, including users from
> academic settings. We aim to further diversify the Livy user and contributor
> base.
> 
> == Relationships with other Apache projects ==
> 
> Livy is closely tied to the Apache Spark project and currently addresses the
> scenarios for a REST based batch and interactive gateway for Spark jobs on
> YARN. Given the growing number of integrations with Livy, keeping it outside
> of Apache Spark aligns with the desire of the Apache Spark community to
> reduce the number of external dependencies in the Spark project.
> Specifically, the Apache Spark community has previously expressed a desire
> to keep job servers independent from the project.<<FootNote(See, for
> example, discussion of the Ooyala Spark Job Server in SPARK-818)>>
> Furthermore, while Livy’s common usage is closely tied to Spark deployments
> right now, its core building blocks can be reused elsewhere. Livy’s Remote
> REPL could be used as a library for interactive scenarios in non-Spark
> projects. In the future, integrations with cluster managers like Apache
> Mesos and others could also be added.
> 
> The features provided by Livy have already been integrated with existing
> projects like Jupyter and Apache Zeppelin for their interactive Spark use
> cases. This validates the need for a project like Livy and provides an
> active downstream user base that the Livy community can interact with to
> seed future interest in the project.
> 
> Livy serves a similar purpose to Apache Toree (incubating) but differs in
> making session management, security and impersonation a focal design point.
> 
> == An Excessive Fascination with the Apache Brand ==
> 
> The primary motivation for submitting Livy to the ASF is to grow a diverse
> and strong community. We wish to encourage diverse organisations, including
> ISVs, to adopt Livy and contribute to Livy without any concerns about
> ownership or licensing.
> 
> = Documentation =
> 
> Documentation can be found on the Livy website http://livy.io/
> 
> The Livy web site is version controlled on the ‘gh-pages’ branch of the
> above repository.
> Additional documentation is provided on the GitHub wiki:
> https://github.com/cloudera/livy/wiki
> APIs are documented within the source code as JavaDoc style documentation
> comments.
> 
> = Initial Source =
> 
> The initial source code for Livy is hosted at
> https://github.com/cloudera/livy
> 
> = Source and Intellectual Property submission plan =
> 
> The Livy codebase and web site are currently hosted on GitHub and will be
> transitioned to the ASF repositories during incubation. Livy is already
> licensed under the Apache 2.0 license. Cloudera has collected ICLAs and
> CCLAs from all committers. There are, however, some recent contributions
> from authors who have not signed the CCLA and ICLA. If necessary for a
> successful SGA, we’ll seek the necessary documentation or replace the
> contributions.
> 
> The “Livy” name is not a registered trademark. We will need to do a
> trademark search and make sure it is available for the Apache Foundation
> prior to graduation.
> 
> Cloudera currently owns the domain name: http://livy.io/. Once all the
> documentation has moved over to ASF infrastructure, the main landing page
> will become livy.incubator.apache.org and the old domain will just act as a
> redirect.
> 
> = External Dependencies =
> 
> The list below covers the non-Apache dependencies of the project and their
> licenses.
> 
>   * Jetty: Apache 2.0
>   * Dropwizard Metrics: Apache 2.0
>   * FasterXML Jackson: Apache 2.0
>   * Netty: Apache 2.0
>   * Scala: BSD
>   * Py4J: BSD
>   * Scalatra: BSD
> 
> Build/test-only dependencies:
> 
>   * Mockito: MIT
>   * JUnit: Eclipse
> 
> = Required Resources =
> 
> == Mailing Lists ==
> 
>   * private@livy.incubator.apache.org (PPMC)
>   * dev@livy.incubator.apache.org (dev mailing list)
>   * user@livy.incubator.apache.org (User questions)
>   * commits@livy.incubator.apache.org (subscribers shouldn’t be able to post)
>   * issues@livy.incubator.apache.org (subscribers shouldn’t be able to post)
> 
> == Git Repository ==
> 
> git://git.apache.org/incubator-livy
> 
> == Issue Tracking ==
> 
> We would like to import our current JIRA project into the ASF JIRA, so
> that our historical commit messages and code comments continue to reference
> the appropriate bug numbers.
> 
> = Initial Committers =
> 
>   * Marcelo Vanzin (vanzin@cloudera.com)
>   * Alex Man (alex@alexman.space)
>   * Jeff Zhang (zjffdu@gmail.com)
>   * Saisai Shao (sshao@hortonworks.com)
>   * Kostas Sakellis (kostas@cloudera.com)
> 
> = Affiliations =
> 
> The initial set of committers includes people employed by Cloudera and
> Hortonworks as well as one currently independent contributor.
> 
> = Additional Interested Contributors =
> 
> Those interested in getting involved with the project as we enter incubation
> are encouraged to list themselves here.
> 
>    * Ismaël Mejía (iemejia@apache.org)
> 
> = Sponsors =
> 
> == Champion ==
> 
> Sean Busbey (busbey@apache.org)
> 
> == Nominated Mentors ==
> 
>   * Bikas Saha (bikas@apache.org)
>   * Brock Noland (brock@phdata.io)
>   * Luciano Resende (lresende@apache.org)
> 
> == Sponsoring Entity ==
> 
> We ask that the Incubator PMC sponsor this proposal.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> For additional commands, e-mail: general-help@incubator.apache.org
> 

-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

Re: [VOTE] Livy to enter Apache Incubator

Posted by Phillip Rhodes <mo...@gmail.com>.

+1

Looking forward to this...


Phil

This message optimized for indexing by NSA PRISM


On Wed, May 31, 2017 at 12:54 PM, Neelesh Salian
<ne...@gmail.com> wrote:
> +1 (non-binding)
> Thanks for putting this together.
>
> On May 31, 2017 9:46 AM, "Marcelo Vanzin" <va...@cloudera.com> wrote:
>
>> +1 (non-binding)
>>

Re: [VOTE] Livy to enter Apache Incubator

Posted by Neelesh Salian <ne...@gmail.com>.

+1 (non-binding)
Thanks for putting this together.

On May 31, 2017 9:46 AM, "Marcelo Vanzin" <va...@cloudera.com> wrote:

> +1 (non-binding)
>
> On Wed, May 31, 2017 at 6:03 AM, Sean Busbey <bu...@apache.org> wrote:
> > Hi folks!
> >
> > I'm calling a vote to accept "Livy" into the Apache Incubator.
> >
> > The full proposal is available below, and is also available in the wiki:
> >
> > https://wiki.apache.org/incubator/LivyProposal
> >
> > For additional context, please see the discussion thread:
> >
> > https://s.apache.org/incubator-livy-proposal-thread
> >
> > Please cast your vote:
> >
> > [ ] +1, bring Livy into Incubator
> > [ ] -1, do not bring Livy into Incubator, because...
> >
> > The vote will open at least for 72 hours and only votes from the
> Incubator
> > PMC are binding.
> >
> > I start with my vote:
> > +1
> >
> > ----
> >
> > = Abstract =
> >
> > Livy is a web service that exposes a REST interface for managing long
> > running Apache Spark contexts in your cluster. With Livy, new applications
> > can be built on top of Apache Spark that require fine grained interaction
> > with many Spark contexts.
> >
> > = Proposal =
> >
> > Livy is an open-source REST service for Apache Spark. Livy enables
> > applications to submit Spark applications and retrieve results without a
> > co-location requirement on the Spark cluster.
> >
> > We propose to contribute the Livy codebase and associated artifacts (e.g.
> > documentation, web-site context etc) to the Apache Software Foundation.
> >
> > = Background =
> >
> > Apache Spark is a fast and general purpose distributed compute engine,
> with
> > a versatile API. It enables processing of large quantities of static data
> > distributed over a cluster of machines, as well as processing of
> continuous
> > streams of data. It is the preferred distributed data processing engine
> for
> > data engineering, stream processing and data science workloads. Each
> Spark
> > application uses a construct called the SparkContext, which is the
> > application’s connection or entry point to the Spark engine. Each Spark
> > application will have its own SparkContext.
> >
> > Livy enables clients to interact with one or more Spark sessions through
> > the Livy Server, which acts as a proxy layer. Livy Clients have fine grained
> > control over the lifecycle of the Spark sessions, as well as the ability to
> > submit jobs and retrieve results, all over HTTP. Clients have two modes of
> > interaction:
> >
> >  * RPC Client API, available in Java and Python, which allows results to be
> >    retrieved as Java or Python objects. The serialization and
> >    deserialization of the results is handled by the Livy framework.
> >  * HTTP based API that allows submission of code snippets, and retrieval of
> >    the results in different formats.
> >
> > Multi-tenant resource allocation and security: Livy enables multiple
> > independent Spark sessions to be managed simultaneously. Multiple clients
> > can also interact simultaneously with the same Spark session and share
> the
> > resources of that Spark session. Livy can also enforce secure,
> authenticated
> > communication between the clients and their respective Spark sessions.
> >
> > More information on Livy can be found at the existing open source
> website:
> > http://livy.io/
> >
> > = Rationale =
> >
> > Users want to use Spark’s powerful processing engine and API as the data
> > processing backend for interactive applications. However, the job
> submission
> > and application interaction mechanisms built into Apache Spark are
> > insufficient and cumbersome for multi-user interactive applications.
> >
> > The primary mechanism for applications to submit Spark jobs is via
> > spark-submit
> > (http://spark.apache.org/docs/latest/submitting-applications.html), which
> > is available as a command line tool as well as a programmatic API. However,
> > spark-submit has the following limitations that make it difficult to build
> > interactive applications:
> >
> >  * It is slow: each invocation of spark-submit involves a setup phase where
> >    cluster resources are acquired, new processes are forked, etc. This setup
> >    phase runs for many seconds, or even minutes, and hence is too slow for
> >    interactive applications.
> >  * It is cumbersome and lacks flexibility: application code and dependencies
> >    have to be pre-compiled and submitted as jars, and cannot be submitted
> >    interactively.
> >
> > Apache Spark comes with an ODBC/JDBC server, which can be used to submit
> SQL
> > queries to Spark. However, this solution is limited to SQL and does not
> > allow the client to leverage the rest of the Spark API, such as RDDs,
> MLlib
> > and Streaming.
> >
> > A third way of using Spark is via its command-line shell, which allows
> the
> > interactive submission of snippets of Spark code. However, the shell
> entails
> > running Spark code on the client machine and hence is not a viable
> mechanism
> > for remote clients to submit Spark jobs.
> >
> > Livy solves the limitations of the above three mechanisms, and provides
> the
> > full Spark API as a multi-tenant service to remote clients.
> >
> > Since the open source release of Livy in late 2015, we have seen
> tremendous
> > interest among a diverse set of application developers and ISVs that
> want to
> > build applications with Apache Spark. To make Livy a robust and flexible
> > solution that will enable a broad and growing set of applications, it is
> > important to grow a large and varied community of contributors.
> >
> > = Initial Goals =
> >
> >   * Move existing codebase, website, documentation and mailing lists to
> >     Apache-hosted infrastructure
> >   * Work with the infrastructure team to implement and approve our code
> >     review, build, and testing workflows in the context of the ASF
> >   * Incremental development and releases per Apache guidelines
> >
> > = Current Status =
> >
> > The Livy project began at Cloudera, as a part of the Hue project.
> Cloudera
> > soon realized the broad applicability of Livy, and separated it out into
> an
> > independent project in Nov 2015.
> >
> > == Releases ==
> >
> > Livy has undergone two public releases, tagged here:
> >
> >  * https://github.com/cloudera/livy/releases/tag/v0.2.0
> >  * https://github.com/cloudera/livy/releases/tag/v0.3.0
> >
> > Tarballs and zip files were created for each release and hosted on
> github.
> > Upon joining the incubator, we will adopt a more typical ASF release
> > process.
> >
> > == Source ==
> >
> > Livy’s source is currently hosted on Github at:
> > https://github.com/cloudera/livy
> >
> > This repository will be transitioned to Apache’s git hosting during
> > incubation.
> >
> > == Code review ==
> >
> > Livy’s code reviews are currently public and hosted on github as pull
> > request reviews at: https://github.com/cloudera/livy/pulls
> > The Livy developer community so far is happy with github pull request
> > reviews and hopes to continue this after being admitted to the ASF.
> >
> > == Issue Tracking ==
> >
> > Livy’s bug and feature tracking is hosted on JIRA at:
> > https://issues.cloudera.org/projects/LIVY/summary
> > This JIRA instance contains bugs and development discussion dating back 1
> > year and will provide an initial seed for the ASF JIRA.
> >
> > == Community Discussion ==
> >
> > Livy has several public discussion forums:
> >
> >  * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-dev
> >  * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-user
> >
> > == Development Practices ==
> >
> > The Livy project follows a review before commit philosophy. Every commit
> > automatically runs through the unit tests and generates coverage reports
> > presented as a pull request comment. Our experience with this process
> leads
> > us to believe that it helps ease new contributors into the project. They
> get
> > feedback quickly on common mistakes, lowering the burden on reviewers.
> Those
> > same reviewers get to lead by example, showing the new contributors that
> we
> > value feedback within our community even when changes are done by more
> > experienced folks.
> >
> > == Meritocracy ==
> >
> > We believe strongly in meritocracy when electing committers and PMC
> members.
> > In the past few months, the project has added two new committers from two
> > different organisations, in recognition of their significant
> contributions
> > to the project. We will encourage contributions and participation of all
> > types, and ensure that contributors are appropriately recognized.
> >
> > == Community ==
> >
> > Though Livy is relatively new as a standalone open source project, it has
> > already seen promising growth in its community across several organizations:
> >
> >  * Cloudera is the original development sponsor for Livy
> >  * Microsoft pushed the development of the interpreter fixing high
> >    availability issues and adding additional features.
> >  * Hortonworks has contributed the security features to Livy allowing
> >    kerberos and impersonation to work with Spark
> >  * IBM is starting to make contributions to the Livy project
> >  * A number of other patches contributed by community members
> >
> > Livy currently relies on Google Groups for mailing lists. These lists have
> > been active since the end of 2015/start of 2016. Currently, Livy’s user
> > mailing list has 173 subscribers and has hosted a total of 227 topic
> > threads. Livy’s developer list has 49 subscribers and has hosted 79 topic
> > threads.
> >
> > == Core Developers ==
> >
> > The early contributions to Livy were made by Cloudera engineers. In 2016,
> > engineers from Microsoft and Hortonworks joined the core developer
> > community.
> >
> > == Alignment ==
> >
> > Livy is built upon Apache Spark, and other Apache projects like Apache
> > Hadoop YARN. It’s used as a building block by Apache Zeppelin. These
> > community connections combined with our focus on development practices that
> > emphasize community engagement with a path to meritocratic recognition
> > naturally align us with the ASF.
> >
> > = Known Risks =
> >
> > == Orphaned Products ==
> >
> > The risk of Livy being abandoned is low because it is supported by three
> > major big-data software vendors. Moreover, Livy is already used to power
> > multiple releases of services and products used in production.
> >
> > == Inexperience with Open Source ==
> >
> > Several of the initial committers are experienced open source developers,
> > several being committers and/or PMC members on other ASF projects (Spark,
> > YARN).
> >
> > == Homogenous Developers ==
> >
> > The project already has a diverse developer base. It has contributions from
> > 3 major organisations (Cloudera, Microsoft and Hortonworks), and is used in
> > diverse applications, in diverse settings (On-Prem and Cloud).
> >
> > == Reliance on salaried Developers ==
> >
> > The contributions to the Livy project to date have been made by salaried
> > engineers from Cloudera, Microsoft and Hortonworks. One of the individuals
> > on the initial committer list has since left Microsoft and is currently
> > unaffiliated. The remaining contributors are from Cloudera and Hortonworks.
> > Since there are at least two major organizations involved, the risk of
> > reliance on a single group of salaried developers is mitigated. The Livy
> > user base is diverse, with users from across the globe, including users from
> > academic settings. We aim to further diversify the Livy user and contributor
> > base.
> >
> > == Relationships with other Apache projects ==
> >
> > Livy is closely tied to the Apache Spark project and currently addresses the
> > scenarios for a REST based batch and interactive gateway for Spark jobs on
> > YARN. Given the growing number of integrations with Livy, keeping it outside
> > of Apache Spark aligns with the desire of the Apache Spark community to
> > reduce the number of external dependencies in the Spark project.
> > Specifically, the Apache Spark community has previously expressed a desire
> > to keep job servers independent from the project.<<FootNote(See, for
> > example, discussion of the Ooyala Spark Job Server in SPARK-818)>>
> > Furthermore, while Livy common usage is closely tied to Spark deployments
> > right now, its core building blocks can be reused elsewhere.  Livy’s Remote
> > REPL could be used as a library for interactive scenarios in non-Spark
> > projects. In the future, integrations with cluster managers like Apache
> > Mesos and others could also be added.
> >
> > The features provided by Livy have already been integrated with existing
> > projects like Jupyter and Apache Zeppelin for their interactive Spark use
> > cases. This validates the need for a project like Livy and provides an
> > active downstream user base that the Livy community can interact with to
> > seed future interest in the project.
> >
> > Livy serves a similar purpose to Apache Toree (incubating) but differs in
> > making session management, security and impersonation a focal design point.
> >
> > == An Excessive Fascination with the Apache Brand ==
> >
> > The primary motivation for submitting Livy to the ASF is to grow a diverse
> > and strong community. We wish to encourage diverse organisations, including
> > ISVs, to adopt Livy and contribute to Livy without any concerns about
> > ownership or licensing.
> >
> > = Documentation =
> >
> > Documentation can be found on the Livy website http://livy.io/
> >
> > The Livy web site is version controlled on the ‘gh-pages’ branch of the
> > above repository.
> > Additional documentation is provided on the github wiki:
> > https://github.com/cloudera/livy/wiki
> > APIs are documented within the source code as JavaDoc style documentation
> > comments.
> >
> > = Initial Source =
> >
> > The initial source code for Livy is hosted at
> > https://github.com/cloudera/livy
> >
> > = Source and Intellectual Property submission plan =
> >
> > The Livy codebase and web site is currently hosted on GitHub and will be
> > transitioned to the ASF repositories during incubation. Livy is already
> > licensed under the Apache 2.0 license. Cloudera has collected ICLAs and
> > CCLAs from all committers. There are, however, some contributions recently
> > from authors that have not signed the CCLA and ICLA. If necessary for a
> > successful SGA, we’ll seek the necessary documentation or replace the
> > contributions.
> >
> > The “Livy” name is not a registered trademark. We will need to do a
> > trademark search and make sure it is available for the Apache Foundation
> > prior to graduation.
> >
> > Cloudera currently owns the domain name: http://livy.io/. Once all the
> > documentation has moved over to ASF infrastructure, the main landing page
> > will become livy.incubator.apache.org and the old domain will just act as a
> > redirect.
> >
> > = External Dependencies =
> >
> > The list below covers the non-Apache dependencies of the project and their
> > licenses.
> >
> >  * Jetty: Apache 2.0
> >  * Dropwizard Metrics: Apache 2.0
> >  * FasterXML Jackson: Apache 2.0
> >  * Netty: Apache 2.0
> >  * Scala: BSD
> >  * Py4J: BSD
> >  * Scalatra: BSD
> >
> > Build/test-only dependencies:
> >
> >  * Mockito: MIT
> >  * JUnit: Eclipse
> >
> > = Required Resources =
> >
> > == Mailing Lists ==
> >
> >  * private@livy.incubator.apache.org (PPMC)
> >  * dev@livy.incubator.apache.org (dev mailing list)
> >  * user@livy.incubator.apache.org (User questions)
> >  * commits@livy.incubator.apache.org (subscribers shouldn’t be able to post)
> >  * issues@livy.incubator.apache.org (subscribers shouldn’t be able to post)
> >
> > == Git Repository ==
> >
> > git://git.apache.org/incubator-livy
> >
> > == Issue Tracking ==
> >
> > We would like to import our current JIRA project into the ASF JIRA, such
> > that our historical commit message and code comments continue to reference
> > the appropriate bug numbers.
> >
> > = Initial Committers =
> >
> >  * Marcelo Vanzin (vanzin@cloudera.com)
> >  * Alex Man (alex@alexman.space)
> >  * Jeff Zhang (zjffdu@gmail.com)
> >  * Saisai Shao (sshao@hortonworks.com)
> >  * Kostas Sakellis (kostas@cloudera.com)
> >
> > = Affiliations =
> >
> > The initial set of committers includes people employed by Cloudera and
> > Hortonworks as well as one currently independent contributor.
> >
> > = Additional Interested Contributors =
> >
> > Those interested in getting involved with the project as we enter incubation
> > are encouraged to list themselves here.
> >
> >   * Ismaël Mejía (iemejia@apache.org)
> >
> > = Sponsors =
> >
> > == Champion ==
> >
> > Sean Busbey (busbey@apache.org)
> >
> > == Nominated Mentors ==
> >
> >  * Bikas Saha (bikas@apache.org)
> >  * Brock Noland (brock@phdata.io)
> >  * Luciano Resende (lresende@apache.org)
> >
> > == Sponsoring Entity ==
> >
> > We ask that the Incubator PMC sponsor this proposal.
> >
> >
> >
>
>
>
> --
> Marcelo
>
>
>

Re: [VOTE] Livy to enter Apache Incubator

Posted by Marcelo Vanzin <va...@cloudera.com>.

+1 (non-binding)

On Wed, May 31, 2017 at 6:03 AM, Sean Busbey <bu...@apache.org> wrote:
> Hi folks!
>
> I'm calling a vote to accept "Livy" into the Apache Incubator.
>
> The full proposal is available below, and is also available in the wiki:
>
> https://wiki.apache.org/incubator/LivyProposal
>
> For additional context, please see the discussion thread:
>
> https://s.apache.org/incubator-livy-proposal-thread
>
> Please cast your vote:
>
> [ ] +1, bring Livy into Incubator
> [ ] -1, do not bring Livy into Incubator, because...
>
> The vote will open at least for 72 hours and only votes from the Incubator
> PMC are binding.
>
> I start with my vote:
> +1
>
> ----
>
> = Abstract =
>
> Livy is web service that exposes a REST interface for managing long running
> Apache Spark contexts in your cluster. With Livy, new applications can be
> built on top of Apache Spark that require fine grained interaction with many
> Spark contexts.
>
> = Proposal =
>
> Livy is an open-source REST service for Apache Spark. Livy enables
> applications to submit Spark applications and retrieve results without a
> co-location requirement on the Spark cluster.
>
> We propose to contribute the Livy codebase and associated artifacts (e.g.
> documentation, web-site context etc) to the Apache Software Foundation.
>
> = Background =
>
> Apache Spark is a fast and general purpose distributed compute engine, with
> a versatile API. It enables processing of large quantities of static data
> distributed over a cluster of machines, as well as processing of continuous
> streams of data. It is the preferred distributed data processing engine for
> data engineering, stream processing and data science workloads. Each Spark
> application uses a construct called the SparkContext, which is the
> application’s connection or entry point to the Spark engine. Each Spark
> application will have its own SparkContext.
>
> Livy enables clients to interact with one or more Spark sessions through the
> Livy Server, which acts as a proxy layer. Livy Clients have fine grained
> control over the lifecycle of the Spark sessions, as well as the ability to
> submit jobs and retrieve results, all over HTTP. Clients have two modes of
> interaction:
>
>  * RPC Client API, available in Java and Python, which allows results to be
>    retrieved as Java or Python objects. The serialization and deserialization
>    of the results is handled by the Livy framework.
>  * HTTP based API that allows submission of code snippets, and retrieval of
>    the results in different formats.
>
> Multi-tenant resource allocation and security: Livy enables multiple
> independent Spark sessions to be managed simultaneously. Multiple clients
> can also interact simultaneously with the same Spark session and share the
> resources of that Spark session. Livy can also enforce secure, authenticated
> communication between the clients and their respective Spark sessions.
>
> More information on Livy can be found at the existing open source website:
> http://livy.io/
>
> = Rationale =
>
> Users want to use Spark’s powerful processing engine and API as the data
> processing backend for interactive applications. However, the job submission
> and application interaction mechanisms built into Apache Spark are
> insufficient and cumbersome for multi-user interactive applications.
>
> The primary mechanism for applications to submit Spark jobs is via
> spark-submit
> (http://spark.apache.org/docs/latest/submitting-applications.html), which is
> available as a command line tool as well as a programmatic API. However,
> spark-submit has the following limitations that make it difficult to build
> interactive applications:
>
>  * It is slow: each invocation of spark-submit involves a setup phase where
>    cluster resources are acquired, new processes are forked, etc. This setup
>    phase runs for many seconds, or even minutes, and hence is too slow for
>    interactive applications.
>  * It is cumbersome and lacks flexibility: application code and dependencies
>    have to be pre-compiled and submitted as jars, and can not be submitted
>    interactively.
>
> Apache Spark comes with an ODBC/JDBC server, which can be used to submit SQL
> queries to Spark. However, this solution is limited to SQL and does not
> allow the client to leverage the rest of the Spark API, such as RDDs, MLlib
> and Streaming.
>
> A third way of using Spark is via its command-line shell, which allows the
> interactive submission of snippets of Spark code. However, the shell entails
> running Spark code on the client machine and hence is not a viable mechanism
> for remote clients to submit Spark jobs.
>
> Livy solves the limitations of the above three mechanisms, and provides the
> full Spark API as a multi-tenant service to remote clients.
>
> Since the open source release of Livy in late 2015, we have seen tremendous
> interest among a diverse set of application developers and ISVs that want to
> build applications with Apache Spark. To make Livy a robust and flexible
> solution that will enable a broad and growing set of applications, it is
> important to grow a large and varied community of contributors.
>
> = Initial Goals =
>
>   * Move existing codebase, website, documentation and mailing lists to
>     Apache-hosted infrastructure
>   * Work with the infrastructure team to implement and approve our code
>     review, build, and testing workflows in the context of the ASF
>   * Incremental development and releases per Apache guidelines
>
> = Current Status =
>
> The Livy project began at Cloudera, as a part of the Hue project. Cloudera
> soon realized the broad applicability of Livy, and separated it out into an
> independent project in Nov 2015.
>
> == Releases ==
>
> Livy has undergone two public releases, tagged here:
>
>  * https://github.com/cloudera/livy/releases/tag/v0.2.0
>  * https://github.com/cloudera/livy/releases/tag/v0.3.0
>
> Tarballs and zip files were created for each release and hosted on github.
> Upon joining the incubator, we will adopt a more typical ASF release
> process.
>
> == Source ==
>
> Livy’s source is currently hosted on Github at:
> https://github.com/cloudera/livy
>
> This repository will be transitioned to Apache’s git hosting during
> incubation.
>
> == Code review ==
>
> Livy’s code reviews are currently public and hosted on github as pull
> request reviews at: https://github.com/cloudera/livy/pulls
> The Livy developer community so far is happy with github pull request
> reviews and hopes to continue this after being admitted to the ASF.
>
> == Issue Tracking ==
>
> Livy’s bug and feature tracking is hosted on JIRA at:
> https://issues.cloudera.org/projects/LIVY/summary
> This JIRA instance contains bugs and development discussion dating back 1
> year and will provide an initial seed for the ASF JIRA
>
> == Community Discussion ==
>
> Livy has several public discussion forums:
>
>  * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-dev
>  * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-user
>
> == Development Practices ==
>
> The Livy project follows a review before commit philosophy. Every commit
> automatically runs through the unit tests and generates coverage reports
> presented as a pull request comment. Our experience with this process leads
> us to believe that it helps ease new contributors into the project. They get
> feedback quickly on common mistakes, lowering the burden on reviewers. Those
> same reviewers get to lead by example, showing the new contributors that we
> value feedback within our community even when changes are done by more
> experienced folks.
>
> == Meritocracy ==
>
> We believe strongly in meritocracy when electing committers and PMC members.
> In the past few months, the project has added two new committers from two
> different organisations, in recognition of their significant contributions
> to the project. We will encourage contributions and participation of all
> types, and ensure that contributors are appropriately recognized.
>
> == Community ==
>
> Though Livy is relatively new as a standalone open source project, it has
> already seen promising growth in its community across several organizations:
>
>  * Cloudera is the original development sponsor for Livy
>  * Microsoft pushed the development of the interpreter fixing high
>    availability issues and adding additional features.
>  * Hortonworks has contributed the security features to Livy allowing
>    kerberos and impersonation to work with Spark
>  * IBM is starting to make contributions to the Livy project
>  * A number of other patches contributed by community members
>
> Livy currently relies on Google Groups for mailing lists. These lists have
> been active since the end of 2015/start of 2016. Currently, Livy’s user
> mailing list has 173 subscribers and has hosted a total of 227 topic
> threads. Livy’s developer list has 49 subscribers and has hosted 79 topic
> threads.
>
> == Core Developers ==
>
> The early contributions to Livy were made by Cloudera engineers. In 2016,
> engineers from Microsoft and Hortonworks joined the core developer
> community.
>
> == Alignment ==
>
> Livy is built upon Apache Spark, and other Apache projects like Apache
> Hadoop YARN. It’s used as a building block by Apache Zeppelin. These
> community connections combined with our focus on development practices that
> emphasize community engagement with a path to meritocratic recognition
> naturally align us with the ASF.
>
> = Known Risks =
>
> == Orphaned Products ==
>
> The risk of Livy being abandoned is low because it is supported by three
> major big-data software vendors. Moreover, Livy is already used to power
> multiple releases of services and products used in production.
>
> == Inexperience with Open Source ==
>
> Several of the initial committers are experienced open source developers,
> several being committers and/or PMC members on other ASF projects (Spark,
> YARN).
>
> == Homogenous Developers ==
>
> The project already has a diverse developer base. It has contributions from
> 3 major organisations (Cloudera, Microsoft and Hortonworks), and is used in
> diverse applications, in diverse settings (On-Prem and Cloud).
>
> == Reliance on salaried Developers ==
>
> The contributions to the Livy project to date have been made by salaried
> engineers from Cloudera, Microsoft and Hortonworks. One of the individuals
> on the initial committer list has since left Microsoft and is currently
> unaffiliated. The remaining contributors are from Cloudera and Hortonworks.
> Since there are at least two major organizations involved, the risk of
> reliance on a single group of salaried developers is mitigated. The Livy
> user base is diverse, with users from across the globe, including users from
> academic settings. We aim to further diversify the Livy user and contributor
> base.
>
> == Relationships with other Apache projects ==
>
> Livy is closely tied to the Apache Spark project and currently addresses the
> scenarios for a REST based batch and interactive gateway for Spark jobs on
> YARN. Given the growing number of integrations with Livy, keeping it outside
> of Apache Spark aligns with the desire of the Apache Spark community to
> reduce the number of external dependencies in the Spark project.
> Specifically, the Apache Spark community has previously expressed a desire
> to keep job servers independent from the project.<<FootNote(See, for
> example, discussion of the Ooyala Spark Job Server in SPARK-818)>>
> Furthermore, while Livy common usage is closely tied to Spark deployments
> right now, its core building blocks can be reused elsewhere.  Livy’s Remote
> REPL could be used as a library for interactive scenarios in non-Spark
> projects. In the future, integrations with cluster managers like Apache
> Mesos and others could also be added.
>
> The features provided by Livy have already been integrated with existing
> projects like Jupyter and Apache Zeppelin for their interactive Spark use
> cases. This validates the need for a project like Livy and provides an
> active downstream user base that the Livy community can interact with to
> seed future interest in the project.
>
> Livy serves a similar purpose to Apache Toree (incubating) but differs in
> making session management, security and impersonation a focal design point.
>
> == An Excessive Fascination with the Apache Brand ==
>
> The primary motivation for submitting Livy to the ASF is to grow a diverse
> and strong community. We wish to encourage diverse organisations, including
> ISVs, to adopt Livy and contribute to Livy without any concerns about
> ownership or licensing.
>
> = Documentation =
>
> Documentation can be found on the Livy website http://livy.io/
>
> The Livy web site is version controlled on the ‘gh-pages’ branch of the
> above repository.
> Additional documentation is provided on the github wiki:
> https://github.com/cloudera/livy/wiki
> APIs are documented within the source code as JavaDoc style documentation
> comments.
>
> = Initial Source =
>
> The initial source code for Livy is hosted at
> https://github.com/cloudera/livy
>
> = Source and Intellectual Property submission plan =
>
> The Livy codebase and web site is currently hosted on GitHub and will be
> transitioned to the ASF repositories during incubation. Livy is already
> licensed under the Apache 2.0 license. Cloudera has collected ICLAs and
> CCLAs from all committers. There are, however, some contributions recently
> from authors that have not signed the CCLA and ICLA. If necessary for a
> successful SGA, we’ll seek the necessary documentation or replace the
> contributions.
>
> The “Livy” name is not a registered trademark. We will need to do a
> trademark search and make sure it is available for the Apache Foundation
> prior to graduation.
>
> Cloudera currently owns the domain name: http://livy.io/. Once all the
> documentation has moved over to ASF infrastructure, the main landing page
> will become livy.incubator.apache.org and the old domain will just act as a
> redirect.
>
> = External Dependencies =
>
> The list below covers the non-Apache dependencies of the project and their
> licenses.
>
>  * Jetty: Apache 2.0
>  * Dropwizard Metrics: Apache 2.0
>  * FasterXML Jackson: Apache 2.0
>  * Netty: Apache 2.0
>  * Scala: BSD
>  * Py4J: BSD
>  * Scalatra: BSD
>
> Build/test-only dependencies:
>
>  * Mockito: MIT
>  * JUnit: Eclipse
>
> = Required Resources =
>
> == Mailing Lists ==
>
>  * private@livy.incubator.apache.org (PPMC)
>  * dev@livy.incubator.apache.org (dev mailing list)
>  * user@livy.incubator.apache.org (User questions)
>  * commits@livy.incubator.apache.org (subscribers shouldn’t be able to post)
>  * issues@livy.incubator.apache.org (subscribers shouldn’t be able to post)
>
> == Git Repository ==
>
> git://git.apache.org/incubator-livy
>
> == Issue Tracking ==
>
> We would like to import our current JIRA project into the ASF JIRA, such
> that our historical commit message and code comments continue to reference
> the appropriate bug numbers.
>
> = Initial Committers =
>
>  * Marcelo Vanzin (vanzin@cloudera.com)
>  * Alex Man (alex@alexman.space)
>  * Jeff Zhang (zjffdu@gmail.com)
>  * Saisai Shao (sshao@hortonworks.com)
>  * Kostas Sakellis (kostas@cloudera.com)
>
> = Affiliations =
>
> The initial set of committers includes people employed by Cloudera and
> Hortonworks as well as one currently independent contributor.
>
> = Additional Interested Contributors =
>
> Those interested in getting involved with the project as we enter incubation
> are encouraged to list themselves here.
>
>   * Ismaël Mejía (iemejia@apache.org)
>
> = Sponsors =
>
> == Champion ==
>
> Sean Busbey (busbey@apache.org)
>
> == Nominated Mentors ==
>
>  * Bikas Saha (bikas@apache.org)
>  * Brock Noland (brock@phdata.io)
>  * Luciano Resende (lresende@apache.org)
>
> == Sponsoring Entity ==
>
> We ask that the Incubator PMC sponsor this proposal.
>
>
>



-- 
Marcelo


Re: [RESULT] Re: [VOTE] Livy to enter Apache Incubator

Posted by Bikas Saha <bi...@apache.org>.

Hi,

I am sorry I was off email when all of this happened. I would like to add my post-result +1 to join everyone in supporting this effort!

Thanks!
Bikas

________________________________
From: Sean Busbey <bu...@apache.org>
Sent: Monday, June 5, 2017 10:34:45 AM
To: general@incubator.apache.org
Subject: [RESULT] Re: [VOTE] Livy to enter Apache Incubator

With 7 binding +1 votes (and 14 non-binding +1 votes), this vote passes.

Thanks to everyone who took the time to vote!

I'll coordinate with the mentors to start the initial paperwork today. (And we'd be thrilled for the 4th mentor JB!)

binding:
Larry McCay
Jean-Baptiste Onofré
Luciano Resende
Andrew Purtell
Brock Noland
Raphael Bircher
Hitesh Shah

non-binding:
Sean Busbey
Ismaël Mejía
Marcelo Vanzin
Neelesh Salian
Phillip Rhodes
Kostas Sakellis
Jeff Zhang
Saisai Shao
tim shea
Pierre Smits
Arpit Agarwal
Madhawa Kasun Gunasekara
Bruno Mahé
Felix Cheung

-busbey

On 2017-05-31 08:03 (-0500), "Sean Busbey"<bu...@apache.org> wrote:
> Hi folks!
>
> I'm calling a vote to accept "Livy" into the Apache Incubator.
>
> The full proposal is available below, and is also available in the wiki:
>
> https://wiki.apache.org/incubator/LivyProposal
>
> For additional context, please see the discussion thread:
>
> https://s.apache.org/incubator-livy-proposal-thread
>
> Please cast your vote:
>
> [ ] +1, bring Livy into Incubator
> [ ] -1, do not bring Livy into Incubator, because...
>
> The vote will open at least for 72 hours and only votes from the Incubator
> PMC are binding.
>
> I start with my vote:
> +1
>



[RESULT] Re: [VOTE] Livy to enter Apache Incubator

Posted by Sean Busbey <bu...@apache.org>.

With 7 binding +1 votes (and 14 non-binding +1 votes), this vote passes.

Thanks to everyone who took the time to vote!

I'll coordinate with the mentors to start the initial paperwork today. (And we'd be thrilled for the 4th mentor JB!)

binding:
Larry McCay
Jean-Baptiste Onofré
Luciano Resende
Andrew Purtell
Brock Noland
Raphael Bircher
Hitesh Shah

non-binding:
Sean Busbey
Ismaël Mejía
Marcelo Vanzin
Neelesh Salian
Phillip Rhodes
Kostas Sakellis
Jeff Zhang
Saisai Shao
tim shea
Pierre Smits
Arpit Agarwal
Madhawa Kasun Gunasekara
Bruno Mahé
Felix Cheung

-busbey

On 2017-05-31 08:03 (-0500), "Sean Busbey"<bu...@apache.org> wrote: 
> Hi folks!
> 
> I'm calling a vote to accept "Livy" into the Apache Incubator.
> 
> The full proposal is available below, and is also available in the wiki:
> 
> https://wiki.apache.org/incubator/LivyProposal
> 
> For additional context, please see the discussion thread:
> 
> https://s.apache.org/incubator-livy-proposal-thread
> 
> Please cast your vote:
> 
> [ ] +1, bring Livy into Incubator
> [ ] -1, do not bring Livy into Incubator, because...
> 
> The vote will open at least for 72 hours and only votes from the Incubator
> PMC are binding.
> 
> I start with my vote:
> +1
> 

