
Posted to general@incubator.apache.org by Luciano Resende <lu...@gmail.com> on 2015/11/26 16:33:42 UTC

[VOTE] Accept Torii into Apache Incubator

After initial discussion (under the name Spark-Kernel), please vote on the
acceptance of the Torii project for incubation at the Apache Incubator. The
full proposal is available at the end of this message and on the wiki at:

https://wiki.apache.org/incubator/ToriiProposal

Please cast your votes:

[ ] +1, bring Torii into Incubator
[ ] +0, I don't care either way
[ ] -1, do not bring Torii into Incubator, because...

Due to the long holiday weekend in the US, I will leave the vote open until
December 1st.


= Torii =

== Abstract ==
Torii provides applications with a mechanism to interactively and remotely
access Apache Spark.

== Proposal ==
Torii enables interactive applications to access Apache Spark clusters.
More specifically:
 * Applications can send code-snippets and libraries for execution by Spark
 * Applications can be deployed separately from Spark clusters and
communicate with the Torii using the provided Torii client
 * Execution results and streaming data can be sent back to calling
applications
 * Applications no longer have to be network connected to the workers on a
Spark cluster because the Torii acts as each application’s proxy
 * Work has started on enabling Torii to support languages in addition to
Scala, namely Python (with PySpark), R (with SparkR), and SQL (with
SparkSQL)

== Background & Rationale ==
Apache Spark provides applications with a fast and general-purpose
distributed computing engine that supports static and streaming data,
tabular and graph representations of data, and an extensive set of
machine learning libraries. Consequently, a wide variety of applications
will be written for Spark: interactive applications that require
relatively frequent function evaluations, and batch-oriented applications
that require one-shot or only occasional evaluation.

Apache Spark provides two mechanisms for applications to connect with
Spark. The primary mechanism launches applications on Spark clusters using
spark-submit (
http://spark.apache.org/docs/latest/submitting-applications.html); this
requires developers to bundle their application code plus any dependencies
into JAR files, and then submit them to Spark. A second mechanism is an
ODBC/JDBC API (
http://spark.apache.org/docs/latest/sql-programming-guide.html#distributed-sql-engine)
which enables applications to issue SQL queries against SparkSQL.

Our experience when developing interactive applications, such as analytic
applications integrated with Notebooks, to run against Spark was that the
spark-submit mechanism was overly cumbersome and slow (requiring JAR
creation and forking processes to run spark-submit), and the SQL interface
was too limiting and did not offer easy access to components other than
SparkSQL, such as streaming. The most promising mechanism provided by
Apache Spark was the command-line shell (
http://spark.apache.org/docs/latest/programming-guide.html#using-the-shell)
which enabled us to execute code snippets and dynamically control the tasks
submitted to a Spark cluster. Spark does not provide the command-line
shell as a consumable service but it provided us with the starting point
from which we developed Torii.

== Current Status ==
Torii was first developed by a small team working on an internal-IBM
Spark-related project in July 2014. In recognition of its likely general
utility to Spark users and developers, in November 2014 the Torii project
was moved to GitHub and made available under the Apache License V2.

== Meritocracy ==
The current developers are familiar with the meritocratic open source
development process at Apache. As the project has gathered interest at
GitHub the developers have actively started a process to invite additional
developers into the project, and we have at least one new developer who is
ready to contribute code to the project.

== Community ==
We started building a community around the Torii project when we moved it
to GitHub about one year ago. Since then we have grown to about 70 people,
and there are regular requests and suggestions from the community. We
believe that providing Apache Spark application developers with a
general-purpose and interactive API holds a lot of community potential,
especially considering possible tie-ins with Notebooks and the data science
community.

== Core Developers ==
The core developers of the project are currently all from IBM, from the IBM
Emerging Technology team and from IBM’s recently formed Spark Technology
Center.

== Alignment ==
Apache, as the home of Apache Spark, is the most natural home for the Torii
project because it was designed to work with Apache Spark and to provide
capabilities for interactive applications and data science tools not
provided by Spark itself.

The Torii also has an affinity with Jupyter (jupyter.org) because it uses
the Jupyter protocol for communications, and so Jupyter Notebooks can
directly use the Torii as a kernel for communicating with Apache Spark.
However, we believe that the Torii provides a general-purpose mechanism
enabling a wider variety of applications than just Notebooks to access
Spark, and so the Torii’s greatest affinity is with Apache and Apache
Spark.

== Known Risks ==

=== Orphaned products ===
We believe the Torii project has a low risk of abandonment due to interest
in its continuing existence from several parties. More specifically, the
Torii provides a capability that is not provided by Apache Spark today and
that enables a wider range of applications to leverage Spark. For example,
IBM uses (or is considering using) the Torii in several offerings, including
its IBM Analytics for Apache Spark product in the Bluemix Cloud. There are
also a couple of other commercial users who are using it or considering its
use in their offerings. Furthermore, Jupyter Notebooks are used by data
scientists, and Spark is gaining popularity as an analytic engine for them.
Jupyter Notebooks are very easily enabled with the Torii, and so there is
another constituency for it.

=== Inexperience with Open Source ===
The Torii project has been running as an open-source project (albeit with
only IBM committers) for the past several months. The project has an active
issue tracker and due to the interest indicated by the nature and volume of
requests and comments, the team has publicly stated it is beginning to
build a process so they can accept third-party contributions to the project.

=== Relationships with Other Apache Products ===
The Torii has a clear affinity with the Apache Spark project because it is
designed to provide capabilities for interactive applications and data
science tools not provided by Spark itself. The Torii can be a back-end for
the Zeppelin project currently incubating at Apache. There is interest from
the Torii community to develop this capability and an experimental branch
has been started.

=== Homogeneous Developers ===
The current developers working on Torii are all from IBM, although the
group is in the process of expanding its membership to include members of
the GitHub community who are not from IBM and who have been active in the
Torii community on GitHub.

=== Reliance on Salaried Developers ===
The initial committers are full-time employees at IBM although not all work
on the project full-time.

=== Excessive Fascination with the Apache Brand ===
We believe the Torii benefits Apache Spark application developers, and we
are interested in an Apache Torii project to benefit these developers by
engaging a larger community, facilitating closer ties with the existing
Spark project, and yes, gaining more visibility for the Torii as a solution.

=== Documentation ===
Comprehensive documentation, including “Getting Started”, API
specifications, and a Roadmap, is available from the GitHub project; see
https://github.com/ibm-et/Torii/wiki.

=== Initial Source ===
The source code resides at https://github.com/ibm-et/Torii.

=== External Dependencies ===
The Torii depends upon a number of Apache projects:
 * Spark
 * Hadoop
 * Ivy
 * Commons

The Torii also depends upon a number of other open source projects:
 * ZeroMQ (LGPL with Static Linking Exception,
http://zeromq.org/area:licensing)
 * Akka (MIT)
 * JOpt Simple (MIT)
 * Spring Framework Core (Apache v2)
 * Play (Apache v2)
 * SLF4J (MIT)
 * Scala
 * Scalatest (Apache v2)
 * Scalactic (Apache v2)
 * Mockito (MIT)

== Required Resources ==

=== Mailing lists ===

 * private@torii.incubator.apache.org (with moderated subscriptions)
 * commits@torii.incubator.apache.org
 * dev@torii.incubator.apache.org

=== Git Repository ===

 * https://git-wip-us.apache.org/repos/asf/incubator-torii.git

=== Issue Tracking ===

 * A JIRA issue tracker: https://issues.apache.org/jira/browse/TORII

== Initial Committers ==

 * Leugim Bustelo (lbustelo AT us DOT ibm DOT com)
 * Jakob Odersky (odersky AT us DOT ibm DOT com)
 * Luciano Resende (lresende AT apache DOT org)
 * Robert Senkbeil (rcsenkbe AT us DOT ibm DOT com)
 * Corey Stubbs (cstubbs AT us DOT ibm DOT com)
 * Miao Wang (wangmiao AT us DOT ibm DOT com)
 * Sean Welleck (swelleck AT us DOT ibm DOT com)

=== Affiliations ===
All of the initial committers are employed by IBM.

== Sponsors ==

=== Champion ===
 * Sam Ruby (rubys AT apache DOT org)

=== Nominated Mentors ===
 * Luciano Resende (lresende AT apache DOT org)
 * Reynold Xin (rxin AT apache DOT org)
 * Hitesh Shah (hitesh AT apache DOT org)
 * Julien Le Dem (julien AT apache DOT org)

=== Sponsoring Entity ===

We propose that the Apache Incubator sponsor this project.


-- 
Luciano Resende
http://people.apache.org/~lresende
http://twitter.com/lresende1975
http://lresende.blogspot.com/

Re: [VOTE] Accept Torii into Apache Incubator

Posted by Hitesh Shah <hi...@apache.org>.

+1 (binding)

— Hitesh



---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org

Re: [VOTE] Accept Torii into Apache Incubator

Posted by Jim Jagielski <ji...@jaguNET.com>.

+1 (binding)
> On Nov 30, 2015, at 1:08 PM, Luciano Resende <lu...@gmail.com> wrote:
> 
> And of course, here is my +1 (binding).



Re: [VOTE] Accept Torii into Apache Incubator

Posted by Luke Han <lu...@gmail.com>.

+1 (non-binding)


Best Regards!
---------------------

Luke Han

On Tue, Dec 1, 2015 at 3:39 PM, Sree V <sr...@yahoo.com.invalid>
wrote:

> +1 (non-binding)
>
> Thanking you.
> With Regards
> Sree
>
>
>     On Monday, November 30, 2015 3:21 PM, Reynold Xin <rx...@databricks.com>
> wrote:
>
>
>  +1
>
> > On Dec 1, 2015, at 2:08 AM, Luciano Resende <lu...@gmail.com>
> wrote:
> >
> > And of course, here is my +1 (binding).

Re: [VOTE] Accept Torii into Apache Incubator

Posted by Sree V <sr...@yahoo.com.INVALID>.

+1 (non-binding)

Thanking you.
With Regards
Sree


    On Monday, November 30, 2015 3:21 PM, Reynold Xin <rx...@databricks.com> wrote:
 

 +1

> On Dec 1, 2015, at 2:08 AM, Luciano Resende <lu...@gmail.com> wrote:
> 
> And of course, here is my +1 (binding).

Re: [VOTE] Accept Torii into Apache Incubator

Posted by Reynold Xin <rx...@databricks.com>.

+1

> On Dec 1, 2015, at 2:08 AM, Luciano Resende <lu...@gmail.com> wrote:
> 
> And of course, here is my +1 (binding).

Re: [VOTE] Accept Torii into Apache Incubator

Posted by Luciano Resende <lu...@gmail.com>.

And of course, here is my +1 (binding).




-- 
Luciano Resende
http://people.apache.org/~lresende
http://twitter.com/lresende1975
http://lresende.blogspot.com/

Re: [VOTE] Accept Torii into Apache Incubator

Posted by Sam Ruby <ru...@intertwingly.net>.

On Tue, Dec 1, 2015 at 10:24 AM, Steve Loughran <st...@hortonworks.com> wrote:
> Think I've missed the vote window, but
>
> +1 binding
>
> I will repeat what I raised when the proposal first came up, something that wasn't addressed at all: ZeroMQ is LGPL, which is forbidden as a mandatory dependency in ASF projects.
>
> Step 1 of the project is going to have to confirm that ZeroMQ's LGPL + Static Linking Exception is sufficient for it to be allowed as a dependency of the project.

I'd like to encourage ZeroMQ to move to MPL (and I'm willing to help
make that case).

Given that LGPL is essentially GPL+a static linking exception, I don't
know how LGPL+Static Linking Exception helps; the ZeroMQ licensing
page[1] suggests that it is a problem for corporate lawyers to accept;
Jim has repeatedly said in various ways that our goal is to be a
no-brainer.

> If it's not, then that's going to be a fundamental barrier to releasing Torii as ASF signed-off artifacts.

- Sam Ruby

[1] http://zeromq.org/area:licensing



Re: [VOTE] Accept Torii into Apache Incubator

Posted by Steve Loughran <st...@hortonworks.com>.

Think I've missed the vote window, but

+1 binding

I will repeat what I raised when the proposal first came up, something that wasn't addressed at all: ZeroMQ is LGPL, which is forbidden as a mandatory dependency in ASF projects.

Step 1 of the project is going to have to confirm that ZeroMQ's LGPL + Static Linking Exception is sufficient for it to be allowed as a dependency of the project.

If it's not, then that's going to be a fundamental barrier to releasing Torii as ASF signed-off artifacts.

>> 
>> On Thu, Nov 26, 2015 at 10:33 AM, Luciano Resende <lu...@gmail.com>
>> wrote:
>>> After initial discussion (under the name Spark-Kernel), please vote on
>>> the
>>> acceptance of Torii Project for incubation at the Apache Incubator. The
>>> full proposal is
>>> available at the end of this message and on the wiki at :
>>> 
>>> https://wiki.apache.org/incubator/ToriiProposal
>>> 
>>> Please cast your votes:
>>> 
>>> [ ] +1, bring Torii into Incubator
>>> [ ] +0, I don't care either way
>>> [ ] -1, do not bring Torii into Incubator, because...
>>> 
>>> Due to long weekend holiday in US, I will leave the vote open until
>>> December 1st.
>>> 
>>> 
>>> = Torii =
>>> 
>>> == Abstract ==
>>> Torii provides applications with a mechanism to interactively and
>>> remotely
>>> access Apache Spark.
>>> 
>>> == Proposal ==
>>> Torii enables interactive applications to access Apache Spark clusters.
>>> More specifically:
>>> * Applications can send code-snippets and libraries for execution by
>>> Spark
>>> * Applications can be deployed separately from Spark clusters and
>>> communicate with the Torii using the provided Torii client
>>> * Execution results and streaming data can be sent back to calling
>>> applications
>>> * Applications no longer have to be network connected to the workers
>>> on a
>>> Spark cluster because the Torii acts as each application’s proxy
>>> * Work has started on enabling Torii to support languages in addition
>>> to
>>> Scala, namely Python (with PySpark), R (with SparkR), and SQL (with
>>> SparkSQL)
>>> 
>>> == Background & Rationale ==
>>> Apache Spark provides applications with a fast and general purpose
>>> distributed computing engine that supports static and streaming data,
>>> tabular and graph representations of data, and an extensive library of
>>> machine learning libraries. Consequently, a wide variety of applications
>>> will be written for Spark and there will be interactive applications
>>> that
>>> require relatively frequent function evaluations, and batch-oriented
>>> applications that require one-shot or only occasional evaluation.
>>> 
>>> Apache Spark provides two mechanisms for applications to connect with
>>> Spark. The primary mechanism launches applications on Spark clusters
>>> using
>>> spark-submit (
>>> http://spark.apache.org/docs/latest/submitting-applications.html); this
>>> requires developers to bundle their application code plus any
>>> dependencies
>>> into JAR files, and then submit them to Spark. A second mechanism is an
>>> ODBC/JDBC API (
>>> http://spark.apache.org/docs/latest/sql-programming-guide.html#distributed-sql-engine)
>>> which enables applications to issue SQL queries against SparkSQL.
>>> 
>>> Our experience when developing interactive applications, such as
>>> analytic
>>> applications integrated with Notebooks, to run against Spark was that
>>> the
>>> spark-submit mechanism was overly cumbersome and slow (requiring JAR
>>> creation and forking processes to run spark-submit), and the SQL
>>> interface
>>> was too limiting and did not offer easy access to components other than
>>> SparkSQL, such as streaming. The most promising mechanism provided by
>>> Apache Spark was the command-line shell (
>>> http://spark.apache.org/docs/latest/programming-guide.html#using-the-shell)
>>> which enabled us to execute code snippets and dynamically control the
>>> tasks
>>> submitted to a Spark cluster. Spark does not provide the command-line
>>> shell as a consumable service but it provided us with the starting point
>>> from which we developed Torii.
>>> 
>>> == Current Status ==
>>> Torii was first developed by a small team working on an internal-IBM
>>> Spark-related project in July 2014. In recognition of its likely general
>>> utility to Spark users and developers, in November 2014 the Torii
>>> project
>>> was moved to GitHub and made available under the Apache License V2.
>>> 
>>> == Meritocracy ==
>>> The current developers are familiar with the meritocratic open source
>>> development process at Apache. As the project has gathered interest at
>>> GitHub the developers have actively started a process to invite
>>> additional
>>> developers into the project, and we have at least one new developer who
>>> is
>>> ready to contribute code to the project.
>>> 
>>> == Community ==
>>> We started building a community around Torii project when we moved it to
>>> GitHub about one year ago. Since then we have grown to about 70 people,
>>> and
>>> there are regular requests and suggestions from the community. We
>>> believe
>>> that providing Apache Spark application developers with a
>>> general-purpose
>>> and interactive API holds a lot of community potential, especially
>>> considering possible tie-ins with Notebooks and the data science community.
>>> 
>>> == Core Developers ==
>>> The core developers of the project are currently all from IBM, from the
>>> IBM
>>> Emerging Technology team and from IBM’s recently formed Spark Technology
>>> Center.
>>> 
>>> == Alignment ==
>>> Apache, as the home of Apache Spark, is the most natural home for the
>>> Torii
>>> project because it was designed to work with Apache Spark and to provide
>>> capabilities for interactive applications and data science tools not
>>> provided by Spark itself.
>>> 
>>> The Torii also has an affinity with Jupyter (jupyter.org) because it
>>> uses
>>> the Jupyter protocol for communications, and so Jupyter Notebooks can
>>> directly use the Torii as a kernel for communicating with Apache Spark.
>>> However, we believe that the Torii provides a general-purpose mechanism
>>> enabling a wider variety of applications than just Notebooks to access
>>> Spark, and so the Torii’s greatest affinity is with Apache and Apache
>>> Spark.
>>> 
>>> == Known Risks ==
>>> 
>>> === Orphaned products ===
>>> We believe the Torii project has a low-risk of abandonment due to
>>> interest
>>> in its continuing existence from several parties. More specifically, the
>>> Torii provides a capability that is not provided by Apache Spark today
>>> but
>>> it enables a wider range of applications to leverage Spark. For example,
>>> IBM uses (and is considering) the Torii in several offerings including
>>> its
>>> IBM Analytics for Apache Spark product in the Bluemix Cloud. There are
>>> also
>>> a couple of other commercial users who are using or considering its use
>>> in
>>> their offerings. Furthermore, Jupyter Notebooks are used by data
>>> scientists
>>> and Spark is gaining popularity as an analytic engine for them. Jupyter
>>> Notebooks are very easily enabled with the Torii and so there is another
>>> constituency for it.
>>> 
>>> === Inexperience with Open Source ===
>>> The Torii project has been running as an open-source project (albeit
>>> with only IBM committers) for the past several months. The project has
>>> an active issue tracker and, due to the interest indicated by the nature
>>> and volume of requests and comments, the team has publicly stated it is
>>> beginning to build a process so they can accept third-party
>>> contributions to the project.
>>> 
>>> === Relationships with Other Apache Products ===
>>> The Torii has a clear affinity with the Apache Spark project because it
>>> is designed to provide capabilities for interactive applications and
>>> data science tools not provided by Spark itself. The Torii can be a
>>> back-end for the Zeppelin project currently incubating at Apache. The
>>> Torii community is interested in developing this capability, and an
>>> experimental branch has been started.
>>> 
>>> === Homogeneous Developers ===
>>> The current group of developers working on Torii are all from IBM,
>>> although the group is in the process of expanding its membership to
>>> include members of the GitHub community who are not from IBM and who
>>> have been active in the Torii community on GitHub.
>>> 
>>> === Reliance on Salaried Developers ===
>>> The initial committers are full-time employees at IBM, although not all
>>> work on the project full-time.
>>> 
>>> === Excessive Fascination with the Apache Brand ===
>>> We believe the Torii benefits Apache Spark application developers, and
>>> we are interested in an Apache Torii project to benefit these developers
>>> by engaging a larger community, facilitating closer ties with the
>>> existing Spark project, and yes, gaining more visibility for the Torii
>>> as a solution.
>>> 
>>> === Documentation ===
>>> Comprehensive documentation, including “Getting Started”, API
>>> specifications, and a Roadmap, is available from the GitHub project; see
>>> https://github.com/ibm-et/Torii/wiki.
>>> 
>>> === Initial Source ===
>>> The source code resides at https://github.com/ibm-et/Torii.
>>> 
>>> === External Dependencies ===
>>> The Torii depends upon a number of Apache projects:
>>> * Spark
>>> * Hadoop
>>> * Ivy
>>> * Commons
>>> 
>>> The Torii also depends upon a number of other open source projects:
>>> * ZeroMQ (LGPL with Static Linking Exception,
>>> http://zeromq.org/area:licensing)
>>> * Akka (MIT)
>>> * JOpt Simple (MIT)
>>> * Spring Framework Core (Apache v2)
>>> * Play (Apache v2)
>>> * SLF4J (MIT)
>>> * Scala
>>> * Scalatest (Apache v2)
>>> * Scalactic (Apache v2)
>>> * Mockito (MIT)
>>> 
>>> == Required Resources ==
>>> 
>>> === Mailing lists ===
>>> 
>>> * private@torii.incubator.apache.org (with moderated subscriptions)
>>> * commits@torii.incubator.apache.org
>>> * dev@torii.incubator.apache.org
>>> 
>>> === Git Repository ===
>>> 
>>> * https://git-wip-us.apache.org/repos/asf/incubator-torii.git
>>> 
>>> === Issue Tracking ===
>>> 
>>> * A JIRA issue tracker: https://issues.apache.org/jira/browse/TORII
>>> 
>>> == Initial Committers ==
>>> 
>>> * Leugim Bustelo (lbustelo AT us DOT ibm DOT com)
>>> * Jakob Odersky (odersky AT us DOT ibm DOT com)
>>> * Luciano Resende (lresende AT apache DOT org)
>>> * Robert Senkbeil (rcsenkbe AT us DOT ibm DOT com)
>>> * Corey Stubbs (cstubbs AT us DOT ibm DOT com)
>>> * Miao Wang (wangmiao AT us DOT ibm DOT com)
>>> * Sean Welleck (swelleck AT us DOT ibm DOT com)
>>> 
>>> === Affiliations ===
>>> All of the initial committers are employed by IBM.
>>> 
>>> == Sponsors ==
>>> 
>>> === Champion ===
>>> * Sam Ruby (rubys AT apache DOT org)
>>> 
>>> === Nominated Mentors ===
>>> * Luciano Resende (lresende AT apache DOT org)
>>> * Reynold Xin (rxin AT apache DOT org)
>>> * Hitesh Shah (hitesh AT apache DOT org)
>>> * Julien Le Dem (julien AT apache DOT org)
>>> 
>>> === Sponsoring Entity ===
>>> 
>>> We propose that the Apache Incubator sponsor this project.
>>> 
>>> 
>>> --
>>> Luciano Resende
>>> http://people.apache.org/~lresende
>>> http://twitter.com/lresende1975
>>> http://lresende.blogspot.com/


---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org

Re: [VOTE] Accept Torii into Apache Incubator

Posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>.

+1 from me.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++





-----Original Message-----
From: <sa...@gmail.com> on behalf of Sam Ruby <ru...@intertwingly.net>
Reply-To: "general@incubator.apache.org" <ge...@incubator.apache.org>
Date: Monday, November 30, 2015 at 10:58 AM
To: "general@incubator.apache.org" <ge...@incubator.apache.org>
Subject: Re: [VOTE] Accept Torii into Apache Incubator

>+1 (binding)
>
>- Sam Ruby
>



Re: [VOTE] Accept Torii into Apache Incubator

Posted by Sam Ruby <ru...@intertwingly.net>.

+1 (binding)

- Sam Ruby

