Posted to general@incubator.apache.org by Andrew Purtell <ap...@apache.org> on 2016/05/13 20:41:38 UTC

[DISCUSS] PredictionIO incubation proposal

Greetings,

It is my pleasure to propose the PredictionIO project for incubation at the
Apache Software Foundation. PredictionIO is a popular open source Machine
Learning Server built on top of a state-of-the-art open source stack,
including several Apache technologies, that enables developers to manage and
deploy production-ready predictive services for various kinds of machine
learning tasks, with more than 400 production deployments around the world
and a growing contributor community.


The text of the proposal is included below and is also available at
https://wiki.apache.org/incubator/PredictionIO

Best regards,
Andrew Purtell


= PredictionIO Proposal =

=== Abstract ===
PredictionIO is an open source Machine Learning Server built on top of a
state-of-the-art open source stack that enables developers to manage and
deploy production-ready predictive services for various kinds of machine
learning tasks.

=== Proposal ===
The PredictionIO platform consists of the following components:

 * PredictionIO framework - provides the machine learning stack for
 building, evaluating and deploying engines with machine learning
 algorithms. It uses Apache Spark for processing.

 * Event Server - the machine learning analytics layer for unifying events
 from multiple platforms. It can use Apache HBase or any JDBC backend
 as its data store.
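
To make the Event Server's role of unifying events more concrete, here is a
minimal sketch of how events from two platforms could be put into one uniform
shape before being posted. It is modeled loosely on the PredictionIO Event
API, but the endpoint path, the port, the access key, and the field names are
assumptions made for illustration and are not taken from this proposal. The
helper build_event_request is hypothetical and sends nothing over the network.

```python
import json
from urllib.parse import urlencode


def build_event_request(base_url, access_key, event, entity_type, entity_id,
                        properties=None):
    """Return (url, body) for posting one event to an event-collection endpoint.

    Nothing is sent; the caller decides how (and whether) to POST the body.
    """
    # Assumed endpoint shape: <base_url>/events.json?accessKey=<key>
    query = urlencode({"accessKey": access_key})
    url = f"{base_url.rstrip('/')}/events.json?{query}"
    # Assumed field names for a single event record.
    payload = {
        "event": event,
        "entityType": entity_type,
        "entityId": entity_id,
    }
    if properties:
        payload["properties"] = properties
    return url, json.dumps(payload, sort_keys=True)


# Events from two different platforms end up in the same uniform shape.
web_url, web_body = build_event_request(
    "http://localhost:7070", "HYPOTHETICAL_KEY", "view", "user", "u1",
    {"platform": "web"})
app_url, app_body = build_event_request(
    "http://localhost:7070/", "HYPOTHETICAL_KEY", "buy", "user", "u1",
    {"platform": "ios"})
print(web_url)
print(web_body)
print(app_body)
```

Keeping the request construction separate from the actual HTTP call makes the
event shape easy to inspect and test without a running server.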

The PredictionIO community also maintains a Template Gallery, a place to
publish and download (free or proprietary) engine templates for different
types of machine learning applications, which is a complementary part of the
project. At this point we exclude the Template Gallery from the proposal,
as it has a separate set of contributors and we’re not familiar with an
Apache-approved mechanism to maintain such a gallery.

You can find the Template Gallery at https://templates.prediction.io/

=== Background ===
PredictionIO was started with a mission to democratize and bring machine
learning to the masses.

Machine learning has traditionally been a luxury for big companies like
Google, Facebook, and Netflix. There are ML libraries and tools lying
around the internet but the effort of putting them all together as a
production-ready infrastructure is a very resource-intensive task that is
barely within reach of individuals or small businesses.

PredictionIO is a production-ready, full stack machine learning system that
allows organizations of any scale to quickly deploy machine learning
capabilities. It comes with official and community-contributed machine
learning engine templates that are easy to customize.

=== Rationale ===
As the usage of PredictionIO and the number of its contributors have grown
larger and more diverse, we have sought an independent framework for the
project to keep thriving. We believe the Apache foundation is a great fit. Joining
Apache would ensure that tried and true processes and procedures are in
place for the growing number of organizations interested in contributing
to PredictionIO. PredictionIO is also a good fit for the Apache foundation.
PredictionIO was built on top of several Apache projects (HBase, Spark,
Hadoop). We are familiar with the Apache process and believe that the
democratic and meritocratic nature of the foundation aligns with the
project goals.

=== Initial Goals ===
The initial milestones will be to move the existing codebase to Apache and
integrate with the Apache development process. Once this is accomplished,
we plan for incremental development and releases that follow the Apache
guidelines, as well as growing our developer and user communities.

=== Current Status ===
PredictionIO has undergone nine minor releases and many patches.
PredictionIO is being used in production by Salesforce.com as well as many
other organizations and apps. The PredictionIO codebase is currently
hosted at GitHub, which will form the basis of the Apache git repository.

==== Meritocracy ====
We plan to invest in supporting a meritocracy. We will discuss the
requirements in an open forum. We intend to invite additional developers
to participate. We will encourage and monitor community participation so
that privileges can be extended to those who contribute.

==== Community ====
Acceptance into the Apache foundation would bolster the already strong
user and developer community around PredictionIO. That community includes
many contributors from various other companies, and an active mailing list
composed of hundreds of users.

==== Core Developers ====
The core developers of our project are listed in our contributors and
initial PPMC below. Though many are employed at Salesforce.com, there are
also engineers from ActionML, and independent developers.

=== Alignment ===
The ASF is the natural choice to host the PredictionIO project, as the
project's goal is to democratize machine learning by making it more easily
accessible to every user and developer. PredictionIO is built on top of
several top-level Apache projects, as outlined above.

=== Known Risks ===

==== Orphaned products ====
PredictionIO has a solid and growing community. It is deployed on
production environments by companies of all sizes to run various kinds of
predictive engines.

In addition to the community's contributions to the PredictionIO framework,
the community is also actively contributing new engines to the Template
Gallery as well as SDKs and documentation for the project. Salesforce is
committed to using and advancing the PredictionIO code base and supporting
its user community.

==== Inexperience with Open Source ====
PredictionIO has existed as a healthy open source project for almost two
years and is the most starred Scala project on GitHub. All of the proposed
committers have contributed to ASF and Linux Foundation open source
projects. Several current committers on Apache projects and Apache Members
are involved in this proposal and intend to provide mentorship.

==== Homogeneous Developers ====
The initial list of committers includes developers from several
institutions, including Salesforce, ActionML, Channel4, and USC, as well as
unaffiliated developers.

==== Reliance on Salaried Developers ====
Like most open source projects, PredictionIO receives substantial support
from salaried developers. PredictionIO development is partially supported
by Salesforce.com, but there are many contributors from various other
companies, and an active mailing list composed of hundreds of users. We
will continue our efforts to ensure that stewardship of the project is
independent of salaried developers by meritocratically promoting those
contributors to committers.

==== Relationships with Other Apache Products ====
PredictionIO relies heavily on top-level Apache projects such as Apache
Spark, HBase and Hadoop. However, it brings distinct functionality rather
than just an abstraction: machine learning in a plug-and-play fashion.

Compared to Apache Mahout, which focuses on the development of a wide
variety of algorithms, PredictionIO offers a platform to manage the whole
machine learning workflow, including data collection, data preparation,
modeling, deployment and management of predictive services in production
environments.

==== An Excessive Fascination with the Apache Brand ====
PredictionIO is already a widely known open source project. This proposal
is not for the purpose of generating publicity. Rather, the primary
benefits to joining Apache are those outlined in the Rationale section.

=== Documentation ===
PredictionIO boasts rich and live documentation, which is included in the
code repo (docs/manual directory), built with Middleman, and publicly hosted at
https://docs.prediction.io

=== Initial Source and Intellectual Property Submission Plan ===
Currently, the PredictionIO codebase is distributed under the Apache 2.0
License and hosted on GitHub: https://github.com/PredictionIO/PredictionIO

=== External Dependencies ===
PredictionIO has the following external dependencies:
 * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS are needed)
 * Apache Spark 1.3.0 for Hadoop 2.4
 * Java SE Development Kit 8
 * and one of the following sets:
   * PostgreSQL 9.1
   or
   * MySQL 5.1
   or
   * Apache HBase 0.98.6
   * Elasticsearch 1.4.0

Upon acceptance to the incubator, we would begin a thorough analysis of
all transitive dependencies to verify this information and introduce
license checking into the build and release process by integrating with
Apache RAT.
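
The "one of the following sets" rule in the dependency list above can be
sketched as a small check. This is only an illustration: the grouping of the
storage options (PostgreSQL alone, MySQL alone, or HBase together with
Elasticsearch) is one reading of the list, version numbers are ignored, and
the names STORAGE_SETS, REQUIRED and missing_dependencies are hypothetical.

```python
# Each inner set is one acceptable storage combination (assumed grouping).
STORAGE_SETS = [
    {"PostgreSQL"},
    {"MySQL"},
    {"Apache HBase", "Elasticsearch"},
]

# Dependencies the list marks as always needed (Hadoop is optional there).
REQUIRED = {"Apache Spark", "Java SE Development Kit"}


def missing_dependencies(installed):
    """Return a list of problems; an empty list means the rule is satisfied."""
    installed = set(installed)
    problems = [f"missing required: {name}"
                for name in sorted(REQUIRED - installed)]
    # At least one storage set must be fully present.
    if not any(option <= installed for option in STORAGE_SETS):
        problems.append("no complete storage set installed")
    return problems


print(missing_dependencies(
    {"Apache Spark", "Java SE Development Kit", "PostgreSQL"}))
print(missing_dependencies(
    {"Apache Spark", "Java SE Development Kit", "Apache HBase"}))
```

Under this reading, HBase without Elasticsearch does not count as a complete
storage set, while PostgreSQL or MySQL alone does.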

=== Cryptography ===
PredictionIO does not include cryptographic code. We utilize standard
JCE and JSSE APIs provided by the Java Runtime Environment.

=== Required Resources ===
We request that the following resources be created for the project to use:

==== Mailing lists ====

predictionio-private@incubator.apache.org (with moderated subscriptions)

predictionio-dev

predictionio-user

predictionio-commits

We will migrate the existing PredictionIO mailing lists.

==== Git repository ====
The PredictionIO team would like to use Git for source control, due to our
current use of GitHub.

git://git.apache.org/incubator-predictionio

==== Documentation ====
https://predictionio.incubator.apache.org/docs/

==== JIRA instance ====
PredictionIO currently uses the GitHub issue tracking system associated
with its repository: https://github.com/PredictionIO/PredictionIO/issues.
We will migrate to Apache JIRA.

JIRA PREDICTIONIO
https://issues.apache.org/jira/browse/PREDICTIONIO

==== Other Resources ====
* TravisCI for builds and test running.

* PredictionIO's documentation, included in the code repo (docs/manual
directory), is built with Middleman and publicly hosted at
https://docs.prediction.io

* A blog to drive adoption and excitement at https://blog.prediction.io

=== Initial Committers ===

* Pat Ferrell

* Tamas Jambor

* Justin Yip

* Xusen Yin

* Lee Moon Soo

* Donald Szeto

* Kenneth Chan

* Tom Chan

* Simon Chan

* Marco Vivero

* Matthew Tovbin

* Yevgeny Khodorkovsky

* Felipe Oliveira

* Vitaly Gordon

=== Affiliations ===

* Pat Ferrell - ActionML

* Tamas Jambor - Channel4

* Justin Yip - independent

* Xusen Yin - USC

* Lee Moon Soo - NFLabs

* Donald Szeto - Salesforce

* Kenneth Chan - Salesforce

* Tom Chan - Salesforce

* Simon Chan - Salesforce

* Marco Vivero - Salesforce

* Matthew Tovbin - Salesforce

* Yevgeny Khodorkovsky - Salesforce

* Felipe Oliveira - Salesforce

* Vitaly Gordon - Salesforce

=== Sponsors ===

==== Champion ====

Andrew Purtell <apurtell at apache dot org>

==== Nominated Mentors ====

* Andrew Purtell <apurtell at apache dot org>

* James Taylor <jtaylor at apache dot org>

* Lars Hofhansl <larsh at apache dot org>

* Suneel Marthi <smarthi at apache dot org>

* Xiangrui Meng <meng at apache dot org>

* Luciano Resende <lresende at apache dot org>

==== Sponsoring Entity ====

Apache Incubator PMC

Re: [DISCUSS] PredictionIO incubation proposal

Posted by Andrew Purtell <ap...@apache.org>.
The process for transferring the rights to the name PredictionIO has
started at Salesforce. I'm optimistic but can't guarantee an outcome as I
am not empowered to make such a decision wearing any hat. I think we can
proceed with the proposal using the PredictionIO mark conditionally as the
desired podling name. Completing the transfer or finding another mark would
be the earliest activity the podling would undertake working through their
PODLINGNAMESEARCH ticket. Does that sound reasonable?


On Sun, May 15, 2016 at 6:29 PM, John D. Ament <jo...@apache.org>
wrote:

> I just want to confirm, Salesforce plans to transfer the rights to the name
> "PredictionIO" to the ASF? Or is the podling expected to take a new name?
>
> John



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: [DISCUSS] PredictionIO incubation proposal

Posted by "John D. Ament" <jo...@apache.org>.
I just want to confirm, Salesforce plans to transfer the rights to the name
"PredictionIO" to the ASF? Or is the podling expected to take a new name?

John

On Fri, May 13, 2016 at 4:42 PM Andrew Purtell <ap...@apache.org> wrote:

> Greetings,
>
> It is my pleasure to
> ​ ​
> propose the PredictionIO project for incubation at the Apache Software
> Foundation.
> ​ ​
> PredictionIO is a
> ​ popular​
> open
> ​ ​
> source Machine Learning Server built on top of a state-of-the-art open
> source stack, including several Apache technologies, that
> ​ ​
> enables developers to manage and deploy production-ready predictive
> services for various kinds of machine learning tasks
> ​, with more than 400 production deployments around the world and a growing
> contributor community. ​
>
>
> The text of the proposal is included below and is also available at
> https://wiki.apache.org/incubator/PredictionIO
>
> Best regards,
> Andrew Purtell
>
>
> = PredictionIO Proposal =
>
> === Abstract ===
> PredictionIO is an open source Machine Learning Server built on top of
> state-of-the-art open source stack, that enables developers to manage and
> deploy production-ready predictive services for various kinds of machine
> learning tasks.
>
> === Proposal ===
> The PredictionIO platform consists of the following components:
>
>  * PredictionIO framework - provides the machine learning stack for
>  building, evaluating and deploying engines with machine learning
>  algorithms. It uses Apache Spark for processing.
>
>  * Event Server - the machine learning analytics layer for unifying events
>  from multiple platforms. It can use Apache HBase or any JDBC backends
>  as its data store.
>
> The PredictionIO community also maintains a
> ​ ​
> Template Gallery, a place to
> publish and download (free or proprietary) engine templates for different
> types of machine learning applications, and is a complemental part of the
> project. At this point we exclude the Template Gallery from the proposal,
> as it has a separate set of contributors and we’re not familiar with an
> Apache approved mechanism to maintain such a gallery.
>
> You can find the Template Gallery at https://templates.prediction.io/
>
> === Background ===
> PredictionIO was started with a mission to democratize and bring machine
> learning to the masses.
>
> Machine learning has traditionally been a luxury for big companies like
> Google, Facebook, and Netflix. There are ML libraries and tools lying
> around the internet but the effort of putting them all together as a
> production-ready infrastructure is a very resource-intensive task that is
> remotely reachable by individuals or small businesses.
>
> PredictionIO is a production-ready, full stack machine learning system that
> allows organizations of any scale to quickly deploy machine learning
> capabilities. It comes with official and community-contributed machine
> learning engine templates that are easy to customize.
>
> === Rationale ===
> As usage and number of contributors to PredictionIO has grown bigger and
> more diverse, we have sought for an independent framework for the project
> to keep thriving. We believe the Apache foundation is a great fit. Joining
> Apache would ensure that tried and true processes and procedures are in
> place for the growing number of organizations interested in contributing
> to PredictionIO. PredictionIO is also a good fit for the Apache foundation.
> PredictionIO was built on top of several Apache projects (HBase, Spark,
> Hadoop). We are familiar with the Apache process and believe that the
> democratic and meritocratic nature of the foundation aligns with the
> project goals.
>
> === Initial Goals ===
> The initial milestones will be to move the existing codebase to Apache and
> integrate with the Apache development process. Once this is accomplished,
> we plan for incremental development and releases that follow the Apache
> guidelines, as well as growing our developer and user communities.
>
> === Current Status ===
> PredictionIO has undergone nine minor releases and many patches.
> PredictionIO is being used in production by Salesforce.com as well as many
> other organizations and apps. The PredictionIO codebase is currently
> hosted at GitHub, which will form the basis of the Apache git repository.
>
> ==== Meritocracy ====
> We plan to invest in supporting a meritocracy. We will discuss the
> requirements in an open forum. We intend to invite additional developers
> to participate. We will encourage and monitor community participation so
> that privileges can be extended to those that contribute.
>
> ==== Community ====
> Acceptance into the Apache foundation would bolster the already strong
> user and developer community around PredictionIO. That community includes
> many contributors from various other companies, and an active mailing list
> composed of hundreds of users.
>
> ==== Core Developers ====
> The core developers of our project are listed in our contributors and
> initial PPMC below. Though many are employed at Salesforce.com, there are
> also engineers from ActionML, and independent developers.
>
> === Alignment ===
> The ASF is the natural choice to host the PredictionIO project as its goal
> is democratizing Machine Learning by making it more easily accessible to
> every user/developer. PredictionIO is built on top of several top level
> Apache projects as outlined above.
>
> === Known Risks ===
>
> ==== Orphaned products ====
> PredictionIO has a solid and growing community. It is deployed on
> production environments by companies of all sizes to run various kinds of
> predictive engines.
>
> In addition to the community contribution to PredictionIO framework, the
> community is also actively contributing new engines to the Template
> Gallery as well as SDKs and documentation for the project. Salesforce is
> committed to utilize and advance the PredictionIO code base and support
> its user community.
>
> ==== Inexperience with Open Source ====
> PredictionIO has existed as a healthy open source project for almost two
> years and is the most starred Scala project on GitHub. All of the proposed
> committers have contributed to ASF and Linux Foundation open source
> projects. Several current committers on Apache projects and Apache Members
> are involved in this proposal and intend to provide mentorship.
>
> ==== Homogeneous Developers ====
> The initial list of committers includes developers from several
> institutions, including Salesforce, ActionML, Channel4, USC as well as
> unaffiliated developers.
>
> ==== Reliance on Salaried Developers ====
> Like most open source projects, PredictionIO receives substantial support
> from salaried developers. PredictionIO development is partially supported
> by Salesforce.com, but there are many contributors from various other
> companies, and an active mailing list composed of hundreds of users. We
> will continue our efforts to ensure stewardship of the project to be
> independent of salaried developers by meritocratically promoting those
> contributors to committers.
>
> ==== Relationships with Other Apache Product ====
> PredictionIO relies heavily on top level apache projects such as Apache
> Spark, HBase and Hadoop. However it brings a distinguished functionality,
> rather than just an abstraction - Machine Learning in a plug-and-play
> fashion.
>
> Compared to Apache Mahout, which focuses on the development of a wide
> variety of algorithms, PredictionIO offers a platform to manage the whole
> machine learning workflow, including data collection, data preparation,
> modeling, deployment and management of predictive services in production
> environments.
>
> ==== An Excessive Fascination with the Apache Brand ====
> PredictionIO is already a widely known open source project. This proposal
> is not for the purpose of generating publicity. Rather, the primary
> benefits to joining Apache are those outlined in the Rationale section.
>
> === Documentation ===
> PredictionIO boasts rich and live documentation. It is included in the
> code repo (docs/manual directory), built with Middleman, and publicly
> hosted at https://docs.prediction.io
>
> === Initial Source and Intellectual Property Submission Plan ===
> Currently, the PredictionIO codebase is distributed under the Apache 2.0
> License and hosted on GitHub: https://github.com/PredictionIO/PredictionIO
>
> === External Dependencies ===
> PredictionIO has the following external dependencies:
>  * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS are
>    needed)
>  * Apache Spark 1.3.0 for Hadoop 2.4
>  * Java SE Development Kit 8
>  * and one of the following sets:
>    * PostgreSQL 9.1
>
>    or
>
>    * MySQL 5.1
>
>    or
>
>    * Apache HBase 0.98.6
>    * Elasticsearch 1.4.0
>
> Upon acceptance to the incubator, we would begin a thorough analysis of
> all transitive dependencies to verify this information and introduce
> license checking into the build and release process by integrating with
> Apache RAT.
>
> === Cryptography ===
> PredictionIO does not include cryptographic code. We utilize standard
> JCE and JSSE APIs provided by the Java Runtime Environment.
>
> === Required Resources ===
> We request that the following resources be created for the project to use:
>
> ==== Mailing lists ====
>
> predictionio-private@incubator.apache.org (with moderated subscriptions)
>
> predictionio-dev
>
> predictionio-user
>
> predictionio-commits
>
> We will migrate the existing PredictionIO mailing lists.
>
> ==== Git repository ====
> The PredictionIO team would like to use Git for source control, due to our
> current use of GitHub.
>
> git://git.apache.org/incubator-predictionio
>
> ==== Documentation ====
> https://predictionio.incubator.apache.org/docs/
>
> ==== JIRA instance ====
> PredictionIO currently uses the GitHub issue tracking system associated
> with its repository: https://github.com/PredictionIO/PredictionIO/issues.
> We will migrate to Apache JIRA.
>
> JIRA PREDICTIONIO
> https://issues.apache.org/jira/browse/PREDICTIONIO
>
> ==== Other Resources ====
> * TravisCI for builds and test running.
>
> * PredictionIO's documentation, included in the code repo (docs/manual
> directory), is built with Middleman and publicly hosted at
> https://docs.prediction.io
>
> * A blog to drive adoption and excitement at https://blog.prediction.io
>
> === Initial Committers ===
>
> * Pat Ferrell
>
> * Tamas Jambor
>
> * Justin Yip
>
> * Xusen Yin
>
> * Lee Moon Soo
>
> * Donald Szeto
>
> * Kenneth Chan
>
> * Tom Chan
>
> * Simon Chan
>
> * Marco Vivero
>
> * Matthew Tovbin
>
> * Yevgeny Khodorkovsky
>
> * Felipe Oliveira
>
> * Vitaly Gordon
>
> === Affiliations ===
>
> * Pat Ferrell - ActionML
>
> * Tamas Jambor - Channel4
>
> * Justin Yip - independent
>
> * Xusen Yin - USC
>
> * Lee Moon Soo - NFLabs
>
> * Donald Szeto - Salesforce
>
> * Kenneth Chan - Salesforce
>
> * Tom Chan - Salesforce
>
> * Simon Chan - Salesforce
>
> * Marco Vivero - Salesforce
>
> * Matthew Tovbin - Salesforce
>
> * Yevgeny Khodorkovsky - Salesforce
>
> * Felipe Oliveira - Salesforce
>
> * Vitaly Gordon - Salesforce
>
> === Sponsors ===
>
> ==== Champion ====
>
> Andrew Purtell <apurtell at apache dot org>
>
> ==== Nominated Mentors ====
>
> * Andrew Purtell <apurtell at apache dot org>
>
> * James Taylor <jtaylor at apache dot org>
>
> * Lars Hofhansl <larsh at apache dot org>
>
> * Suneel Marthi <smarthi at apache dot org>
>
> * Xiangrui Meng <meng at apache dot org>
>
> * Luciano Resende <lresende at apache dot org>
>
> ==== Sponsoring Entity ====
>
> Apache Incubator PMC
>

Re: [DISCUSS] PredictionIO incubation proposal

Posted by Henry Saputra <he...@gmail.com>.
Ah sorry, I missed that statement from the proposal. Thanks for the reply,
Simon.

I think it would be better to make it separate from the "core" PredictionIO.

I have seen this become a bit of a problem in terms of contributions for
similar projects that have such libraries, such as Zeppelin, NiFi, and even
Flink with its ML libraries.

- Henry

On Sun, May 15, 2016 at 7:02 PM, Simon Chan <si...@salesforce.com> wrote:

> Great question, Henry. This is the main issue we are not 100% sure how to
> handle yet. We put this in the proposal:
>
> "The PredictionIO community also maintains a Template Gallery, a place to
> publish and download (free or proprietary) engine templates for different
> types of machine learning applications, and is a complemental part of the
> project. At this point we exclude the Template Gallery from the proposal,
> as it has a separate set of contributors and we’re not familiar with an
> Apache approved mechanism to maintain such a gallery."
>
> Any suggestion?
>
> Regards,
> Simon
>
> On Sun, May 15, 2016 at 5:26 PM, Henry Saputra <he...@gmail.com>
> wrote:
>
> > This is great news!
> >
> > One question, what would happen with the template gallery repository?
> >
> > Will it be moved under ASF too or will it be maintained as separate repo?
> >
> > - Henry
> >

Re: [DISCUSS] PredictionIO incubation proposal

Posted by Simon Chan <si...@salesforce.com>.
Great question, Henry. This is the main issue we are not 100% sure how to
handle yet. We put this in the proposal:

"The PredictionIO community also maintains a Template Gallery, a place to
publish and download (free or proprietary) engine templates for different
types of machine learning applications, and is a complemental part of the
project. At this point we exclude the Template Gallery from the proposal,
as it has a separate set of contributors and we’re not familiar with an
Apache approved mechanism to maintain such a gallery."

Any suggestions?

Regards,
Simon

On Sun, May 15, 2016 at 5:26 PM, Henry Saputra <he...@gmail.com>
wrote:

> This is great news!
>
> One question, what would happen with the template gallery repository?
>
> Will it be moved under ASF too or will it be maintained as separate repo?
>
> - Henry
>

Re: [DISCUSS] PredictionIO incubation proposal

Posted by Henry Saputra <he...@gmail.com>.
This is great news!

One question: what would happen to the template gallery repository?

Will it be moved under the ASF too, or will it be maintained as a separate repo?

- Henry

On Friday, May 13, 2016, Andrew Purtell <ap...@apache.org> wrote:

> Greetings,
>
> It is my pleasure to propose the PredictionIO project for incubation at the
> Apache Software Foundation. PredictionIO is a popular open source Machine
> Learning Server built on top of a state-of-the-art open source stack,
> including several Apache technologies, that enables developers to manage
> and deploy production-ready predictive services for various kinds of
> machine learning tasks, with more than 400 production deployments around
> the world and a growing contributor community.
>
>
> The text of the proposal is included below and is also available at
> https://wiki.apache.org/incubator/PredictionIO
>
> Best regards,
> Andrew Purtell
>
>
> = PredictionIO Proposal =
>
> === Abstract ===
> PredictionIO is an open source Machine Learning Server built on top of a
> state-of-the-art open source stack that enables developers to manage and
> deploy production-ready predictive services for various kinds of machine
> learning tasks.
>
> === Proposal ===
> The PredictionIO platform consists of the following components:
>
>  * PredictionIO framework - provides the machine learning stack for
>  building, evaluating and deploying engines with machine learning
>  algorithms. It uses Apache Spark for processing.
>
>  * Event Server - the machine learning analytics layer for unifying events
>  from multiple platforms. It can use Apache HBase or any JDBC backends
>  as its data store.
>
> The PredictionIO community also maintains a Template Gallery, a place to
> publish and download (free or proprietary) engine templates for different
> types of machine learning applications, which is a complementary part of the
> project. At this point we exclude the Template Gallery from the proposal,
> as it has a separate set of contributors and we’re not familiar with an
> Apache-approved mechanism to maintain such a gallery.
>
> You can find the Template Gallery at https://templates.prediction.io/
>
> === Background ===
> PredictionIO was started with a mission to democratize and bring machine
> learning to the masses.
>
> Machine learning has traditionally been a luxury for big companies like
> Google, Facebook, and Netflix. There are ML libraries and tools lying
> around the internet but the effort of putting them all together as a
> production-ready infrastructure is a very resource-intensive task that is
> barely within reach of individuals or small businesses.
>
> PredictionIO is a production-ready, full stack machine learning system that
> allows organizations of any scale to quickly deploy machine learning
> capabilities. It comes with official and community-contributed machine
> learning engine templates that are easy to customize.
>
> === Rationale ===
> As the usage of PredictionIO and the number of its contributors have grown
> and become more diverse, we have sought an independent framework in which the
> project can keep thriving. We believe the Apache foundation is a great fit. Joining
> Apache would ensure that tried and true processes and procedures are in
> place for the growing number of organizations interested in contributing
> to PredictionIO. PredictionIO is also a good fit for the Apache foundation.
> PredictionIO was built on top of several Apache projects (HBase, Spark,
> Hadoop). We are familiar with the Apache process and believe that the
> democratic and meritocratic nature of the foundation aligns with the
> project goals.
>
> === Initial Goals ===
> The initial milestones will be to move the existing codebase to Apache and
> integrate with the Apache development process. Once this is accomplished,
> we plan for incremental development and releases that follow the Apache
> guidelines, as well as growing our developer and user communities.
>
> === Current Status ===
> PredictionIO has undergone nine minor releases and many patches.
> PredictionIO is being used in production by Salesforce.com as well as many
> other organizations and apps. The PredictionIO codebase is currently
> hosted at GitHub, which will form the basis of the Apache git repository.
>
> ==== Meritocracy ====
> We plan to invest in supporting a meritocracy. We will discuss the
> requirements in an open forum. We intend to invite additional developers
> to participate. We will encourage and monitor community participation so
> that privileges can be extended to those that contribute.
>
> ==== Community ====
> Acceptance into the Apache foundation would bolster the already strong
> user and developer community around PredictionIO. That community includes
> many contributors from various other companies, and an active mailing list
> composed of hundreds of users.
>
> ==== Core Developers ====
> The core developers of our project are listed in our contributors and
> initial PPMC below. Though many are employed at Salesforce.com, there are
> also engineers from ActionML, and independent developers.
>
> === Alignment ===
> The ASF is the natural choice to host the PredictionIO project as its goal
> is democratizing Machine Learning by making it more easily accessible to
> every user/developer. PredictionIO is built on top of several top level
> Apache projects as outlined above.
>
> === Known Risks ===
>
> ==== Orphaned products ====
> PredictionIO has a solid and growing community. It is deployed on
> production environments by companies of all sizes to run various kinds of
> predictive engines.
>
> In addition to the community contribution to PredictionIO framework, the
> community is also actively contributing new engines to the Template
> Gallery as well as SDKs and documentation for the project. Salesforce is
> committed to utilize and advance the PredictionIO code base and support
> its user community.
>
> ==== Inexperience with Open Source ====
> PredictionIO has existed as a healthy open source project for almost two
> years and is the most starred Scala project on GitHub. All of the proposed
> committers have contributed to ASF and Linux Foundation open source
> projects. Several current committers on Apache projects and Apache Members
> are involved in this proposal and intend to provide mentorship.
>
> ==== Homogeneous Developers ====
> The initial list of committers includes developers from several
> institutions, including Salesforce, ActionML, Channel4, USC as well as
> unaffiliated developers.
>
> ==== Reliance on Salaried Developers ====
> Like most open source projects, PredictionIO receives substantial support
> from salaried developers. PredictionIO development is partially supported
> by Salesforce.com, but there are many contributors from various other
> companies, and an active mailing list composed of hundreds of users. We
> will continue our efforts to ensure stewardship of the project to be
> independent of salaried developers by meritocratically promoting those
> contributors to committers.
>
> ==== Relationships with Other Apache Products ====
> PredictionIO relies heavily on top-level Apache projects such as Apache
> Spark, HBase and Hadoop. However, it brings distinct functionality
> rather than just an abstraction: machine learning in a plug-and-play
> fashion.
>
> Compared to Apache Mahout, which focuses on the development of a wide
> variety of algorithms, PredictionIO offers a platform to manage the whole
> machine learning workflow, including data collection, data preparation,
> modeling, deployment and management of predictive services in production
> environments.
>
> ==== An Excessive Fascination with the Apache Brand ====
> PredictionIO is already a widely known open source project. This proposal
> is not for the purpose of generating publicity. Rather, the primary
> benefits to joining Apache are those outlined in the Rationale section.
>
> === Documentation ===
> PredictionIO boasts rich, live documentation, which is included in the code
> repo (docs/manual directory), built with Middleman, and publicly hosted at
> https://docs.prediction.io
>
> === Initial Source and Intellectual Property Submission Plan ===
> Currently, the PredictionIO codebase is distributed under the Apache 2.0
> License and hosted on GitHub: https://github.com/PredictionIO/PredictionIO
>
> === External Dependencies ===
> PredictionIO has the following external dependencies:
>  * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS are
> needed)
>  * Apache Spark 1.3.0 for Hadoop 2.4
>  * Java SE Development Kit 8
>  * and one of the following sets:
>    * PostgreSQL 9.1
>    or
>    * MySQL 5.1
>    or
>    * Apache HBase 0.98.6
>    * Elasticsearch 1.4.0
>
> Upon acceptance to the incubator, we would begin a thorough analysis of
> all transitive dependencies to verify this information and introduce
> license checking into the build and release process by integrating with
> Apache RAT.
>
> === Cryptography ===
> PredictionIO does not include cryptographic code. We utilize standard
> JCE and JSSE APIs provided by the Java Runtime Environment.
>
> === Required Resources ===
> We request that the following resources be created for the project to use:
>
> ==== Mailing lists ====
>
> predictionio-private@incubator.apache.org (with moderated subscriptions)
>
> predictionio-dev
>
> predictionio-user
>
> predictionio-commits
>
> We will migrate the existing PredictionIO mailing lists.
>
> ==== Git repository ====
> The PredictionIO team would like to use Git for source control, due to our
> current use of GitHub.
>
> git://git.apache.org/incubator-predictionio
>
> ==== Documentation ====
> https://predictionio.incubator.apache.org/docs/
>
> ==== JIRA instance ====
> PredictionIO currently uses the GitHub issue tracking system associated
> with its repository: https://github.com/PredictionIO/PredictionIO/issues.
> We will migrate to Apache JIRA.
>
> JIRA PREDICTIONIO
> https://issues.apache.org/jira/browse/PREDICTIONIO
>
> ==== Other Resources ====
> * TravisCI for builds and test running.
>
> * PredictionIO's documentation, included in the code repo (docs/manual
> directory), is built with Middleman and publicly hosted at
> https://docs.prediction.io
>
> * A blog to drive adoption and excitement at https://blog.prediction.io
>
> === Initial Committers ===
>
> * Pat Ferrell
>
> * Tamas Jambor
>
> * Justin Yip
>
> * Xusen Yin
>
> * Lee Moon Soo
>
> * Donald Szeto
>
> * Kenneth Chan
>
> * Tom Chan
>
> * Simon Chan
>
> * Marco Vivero
>
> * Matthew Tovbin
>
> * Yevgeny Khodorkovsky
>
> * Felipe Oliveira
>
> * Vitaly Gordon
>
> === Affiliations ===
>
> * Pat Ferrell - ActionML
>
> * Tamas Jambor - Channel4
>
> * Justin Yip - independent
>
> * Xusen Yin - USC
>
> * Lee Moon Soo - NFLabs
>
> * Donald Szeto - Salesforce
>
> * Kenneth Chan - Salesforce
>
> * Tom Chan - Salesforce
>
> * Simon Chan - Salesforce
>
> * Marco Vivero - Salesforce
>
> * Matthew Tovbin - Salesforce
>
> * Yevgeny Khodorkovsky - Salesforce
>
> * Felipe Oliveira - Salesforce
>
> * Vitaly Gordon - Salesforce
>
> === Sponsors ===
>
> ==== Champion ====
>
> Andrew Purtell <apurtell at apache dot org>
>
> ==== Nominated Mentors ====
>
> * Andrew Purtell <apurtell at apache dot org>
>
> * James Taylor <jtaylor at apache dot org>
>
> * Lars Hofhansl <larsh at apache dot org>
>
> * Suneel Marthi <smarthi at apache dot org>
>
> * Xiangrui Meng <meng at apache dot org>
>
> * Luciano Resende <lresende at apache dot org>
>
> ==== Sponsoring Entity ====
>
> Apache Incubator PMC
>

Re: [DISCUSS] PredictionIO incubation proposal

Posted by Suneel Marthi <sm...@apache.org>.
+1 to integrating with Beam



On Mon, May 16, 2016 at 1:13 PM, Jean-Baptiste Onofré <jb...@nanthrax.net>
wrote:

> Hi,
>
> I second Roman here.
>
> Using Beam to abstract the execution environment would provide a very
> flexible architecture for PredictionIO.
>
> It would benefit both projects.
>
> Regards
> JB
>
> On 05/15/2016 02:32 AM, Roman Shaposhnik wrote:
>
>> Super excited to see this proposal! This will finally allow us to have
>> an ASF managed
>> backend for next generation data-driven apps that I see emerging quite
>> rapidly.
>>
>> The proposal looks great to me (although I'd recommend calling Scala
>> as an implementation
>> language more prominently since it may attract additional developers
>> with affinity to it).
>>
>> I do have two questions about technology:
>>     1. Do you think it would be possible to leverage Apache Beam (incubating)
>>         for abstracting away the dependency on execution frameworks? My
>>         understanding is that PredictionIO currently only runs on Spark.
>>     2. Is a potential integration with Apache Zeppelin possible?
>>
>> Thanks,
>> Roman.
>>
>>
>> On Fri, May 13, 2016 at 1:41 PM, Andrew Purtell <ap...@apache.org>
>> wrote:
>>
>>> Greetings,
>>>
>>> It is my pleasure to propose the PredictionIO project for incubation at the
>>> Apache Software Foundation. PredictionIO is a popular open source Machine
>>> Learning Server built on top of a state-of-the-art open source stack,
>>> including several Apache technologies, that enables developers to manage and
>>> deploy production-ready predictive services for various kinds of machine
>>> learning tasks, with more than 400 production deployments around the world
>>> and a growing contributor community.
>>>
>>>
>>> The text of the proposal is included below and is also available at
>>> https://wiki.apache.org/incubator/PredictionIO
>>>
>>> Best regards,
>>> Andrew Purtell
>>>
>>>
>>> = PredictionIO Proposal =
>>>
>>> === Abstract ===
>>> PredictionIO is an open source Machine Learning Server built on top of a
>>> state-of-the-art open source stack that enables developers to manage and
>>> deploy production-ready predictive services for various kinds of machine
>>> learning tasks.
>>>
>>> === Proposal ===
>>> The PredictionIO platform consists of the following components:
>>>
>>>   * PredictionIO framework - provides the machine learning stack for
>>>   building, evaluating and deploying engines with machine learning
>>>   algorithms. It uses Apache Spark for processing.
>>>
>>>   * Event Server - the machine learning analytics layer for unifying
>>> events
>>>   from multiple platforms. It can use Apache HBase or any JDBC backends
>>>   as its data store.
>>>
>>> The PredictionIO community also maintains a Template Gallery, a place to
>>> publish and download (free or proprietary) engine templates for different
>>> types of machine learning applications, which is a complementary part of the
>>> project. At this point we exclude the Template Gallery from the proposal,
>>> as it has a separate set of contributors and we’re not familiar with an
>>> Apache-approved mechanism to maintain such a gallery.
>>>
>>> You can find the Template Gallery at https://templates.prediction.io/
>>>
>>> === Background ===
>>> PredictionIO was started with a mission to democratize and bring machine
>>> learning to the masses.
>>>
>>> Machine learning has traditionally been a luxury for big companies like
>>> Google, Facebook, and Netflix. There are ML libraries and tools lying
>>> around the internet but the effort of putting them all together as a
>>> production-ready infrastructure is a very resource-intensive task that is
>>> barely within reach of individuals or small businesses.
>>>
>>> PredictionIO is a production-ready, full stack machine learning system
>>> that
>>> allows organizations of any scale to quickly deploy machine learning
>>> capabilities. It comes with official and community-contributed machine
>>> learning engine templates that are easy to customize.
>>>
>>> === Rationale ===
>>> As the usage of PredictionIO and the number of its contributors have grown
>>> and become more diverse, we have sought an independent framework in which the
>>> project can keep thriving. We believe the Apache foundation is a great fit.
>>> Joining
>>> Apache would ensure that tried and true processes and procedures are in
>>> place for the growing number of organizations interested in contributing
>>> to PredictionIO. PredictionIO is also a good fit for the Apache
>>> foundation.
>>> PredictionIO was built on top of several Apache projects (HBase, Spark,
>>> Hadoop). We are familiar with the Apache process and believe that the
>>> democratic and meritocratic nature of the foundation aligns with the
>>> project goals.
>>>
>>> === Initial Goals ===
>>> The initial milestones will be to move the existing codebase to Apache
>>> and
>>> integrate with the Apache development process. Once this is accomplished,
>>> we plan for incremental development and releases that follow the Apache
>>> guidelines, as well as growing our developer and user communities.
>>>
>>> === Current Status ===
>>> PredictionIO has undergone nine minor releases and many patches.
>>> PredictionIO is being used in production by Salesforce.com as well as
>>> many
>>> other organizations and apps. The PredictionIO codebase is currently
>>> hosted at GitHub, which will form the basis of the Apache git repository.
>>>
>>> ==== Meritocracy ====
>>> We plan to invest in supporting a meritocracy. We will discuss the
>>> requirements in an open forum. We intend to invite additional developers
>>> to participate. We will encourage and monitor community participation so
>>> that privileges can be extended to those that contribute.
>>>
>>> ==== Community ====
>>> Acceptance into the Apache foundation would bolster the already strong
>>> user and developer community around PredictionIO. That community includes
>>> many contributors from various other companies, and an active mailing
>>> list
>>> composed of hundreds of users.
>>>
>>> ==== Core Developers ====
>>> The core developers of our project are listed in our contributors and
>>> initial PPMC below. Though many are employed at Salesforce.com, there are
>>> also engineers from ActionML, and independent developers.
>>>
>>> === Alignment ===
>>> The ASF is the natural choice to host the PredictionIO project as its
>>> goal
>>> is democratizing Machine Learning by making it more easily accessible to
>>> every user/developer. PredictionIO is built on top of several top level
>>> Apache projects as outlined above.
>>>
>>> === Known Risks ===
>>>
>>> ==== Orphaned products ====
>>> PredictionIO has a solid and growing community. It is deployed on
>>> production environments by companies of all sizes to run various kinds of
>>> predictive engines.
>>>
>>> In addition to the community contribution to PredictionIO framework, the
>>> community is also actively contributing new engines to the Template
>>> Gallery as well as SDKs and documentation for the project. Salesforce is
>>> committed to utilize and advance the PredictionIO code base and support
>>> its user community.
>>>
>>> ==== Inexperience with Open Source ====
>>> PredictionIO has existed as a healthy open source project for almost two
>>> years and is the most starred Scala project on GitHub. All of the
>>> proposed
>>> committers have contributed to ASF and Linux Foundation open source
>>> projects. Several current committers on Apache projects and Apache
>>> Members
>>> are involved in this proposal and intend to provide mentorship.
>>>
>>> ==== Homogeneous Developers ====
>>> The initial list of committers includes developers from several
>>> institutions, including Salesforce, ActionML, Channel4, USC as well as
>>> unaffiliated developers.
>>>
>>> ==== Reliance on Salaried Developers ====
>>> Like most open source projects, PredictionIO receives substantial support
>>> from salaried developers. PredictionIO development is partially supported
>>> by Salesforce.com, but there are many contributors from various other
>>> companies, and an active mailing list composed of hundreds of users. We
>>> will continue our efforts to ensure stewardship of the project to be
>>> independent of salaried developers by meritocratically promoting those
>>> contributors to committers.
>>>
>>> ==== Relationships with Other Apache Products ====
>>> PredictionIO relies heavily on top-level Apache projects such as Apache
>>> Spark, HBase and Hadoop. However, it brings distinct functionality
>>> rather than just an abstraction: machine learning in a plug-and-play
>>> fashion.
>>>
>>> Compared to Apache Mahout, which focuses on the development of a wide
>>> variety of algorithms, PredictionIO offers a platform to manage the whole
>>> machine learning workflow, including data collection, data preparation,
>>> modeling, deployment and management of predictive services in production
>>> environments.
>>>
>>> ==== An Excessive Fascination with the Apache Brand ====
>>> PredictionIO is already a widely known open source project. This proposal
>>> is not for the purpose of generating publicity. Rather, the primary
>>> benefits to joining Apache are those outlined in the Rationale section.
>>>
>>> === Documentation ===
>>> PredictionIO boasts rich, live documentation, which is included in the code
>>> repo (docs/manual directory), built with Middleman, and publicly hosted at
>>> https://docs.prediction.io
>>>
>>> === Initial Source and Intellectual Property Submission Plan ===
>>> Currently, the PredictionIO codebase is distributed under the Apache 2.0
>>> License and hosted on GitHub:
>>> https://github.com/PredictionIO/PredictionIO
>>>
>>> === External Dependencies ===
>>> PredictionIO has the following external dependencies:
>>>   * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS are
>>> needed)
>>>   * Apache Spark 1.3.0 for Hadoop 2.4
>>>   * Java SE Development Kit 8
>>>   * and one of the following sets:
>>>     * PostgreSQL 9.1
>>>     or
>>>     * MySQL 5.1
>>>     or
>>>     * Apache HBase 0.98.6
>>>     * Elasticsearch 1.4.0
>>>
>>> Upon acceptance to the incubator, we would begin a thorough analysis of
>>> all transitive dependencies to verify this information and introduce
>>> license checking into the build and release process by integrating with
>>> Apache RAT.
>>>
>>> === Cryptography ===
>>> PredictionIO does not include cryptographic code. We utilize standard
>>> JCE and JSSE APIs provided by the Java Runtime Environment.
>>>
>>> === Required Resources ===
>>> We request that the following resources be created for the project to use:
>>>
>>> ==== Mailing lists ====
>>>
>>> predictionio-private@incubator.apache.org (with moderated subscriptions)
>>>
>>>
>>> predictionio-dev
>>>
>>> predictionio-user
>>>
>>> predictionio-commits
>>>
>>> We will migrate the existing PredictionIO mailing lists.
>>>
>>> ==== Git repository ====
>>> The PredictionIO team would like to use Git for source control, due to
>>> our
>>> current use of GitHub.
>>>
>>> git://git.apache.org/incubator-predictionio
>>>
>>> ==== Documentation ====
>>> https://predictionio.incubator.apache.org/docs/
>>>
>>> ==== JIRA instance ====
>>> PredictionIO currently uses the GitHub issue tracking system associated
>>> with its repository: https://github.com/PredictionIO/PredictionIO/issues.
>>> We will migrate to Apache JIRA.
>>>
>>> JIRA PREDICTIONIO
>>> https://issues.apache.org/jira/browse/PREDICTIONIO
>>>
>>> ==== Other Resources ====
>>> * TravisCI for builds and test running.
>>>
>>> * PredictionIO's documentation, included in the code repo (docs/manual
>>> directory), is built with Middleman and publicly hosted at
>>> https://docs.prediction.io
>>>
>>> * A blog to drive adoption and excitement at https://blog.prediction.io
>>>
>>> === Initial Committers ===
>>>
>>> * Pat Ferrell
>>>
>>> * Tamas Jambor
>>>
>>> * Justin Yip
>>>
>>> * Xusen Yin
>>>
>>> * Lee Moon Soo
>>>
>>> * Donald Szeto
>>>
>>> * Kenneth Chan
>>>
>>> * Tom Chan
>>>
>>> * Simon Chan
>>>
>>> * Marco Vivero
>>>
>>> * Matthew Tovbin
>>>
>>> * Yevgeny Khodorkovsky
>>>
>>> * Felipe Oliveira
>>>
>>> * Vitaly Gordon
>>>
>>> === Affiliations ===
>>>
>>> * Pat Ferrell - ActionML
>>>
>>> * Tamas Jambor - Channel4
>>>
>>> * Justin Yip - independent
>>>
>>> * Xusen Yin - USC
>>>
>>> * Lee Moon Soo - NFLabs
>>>
>>> * Donald Szeto - Salesforce
>>>
>>> * Kenneth Chan - Salesforce
>>>
>>> * Tom Chan - Salesforce
>>>
>>> * Simon Chan - Salesforce
>>>
>>> * Marco Vivero - Salesforce
>>>
>>> * Matthew Tovbin - Salesforce
>>>
>>> * Yevgeny Khodorkovsky - Salesforce
>>>
>>> * Felipe Oliveira - Salesforce
>>>
>>> * Vitaly Gordon - Salesforce
>>>
>>> === Sponsors ===
>>>
>>> ==== Champion ====
>>>
>>> Andrew Purtell <apurtell at apache dot org>
>>>
>>> ==== Nominated Mentors ====
>>>
>>> * Andrew Purtell <apurtell at apache dot org>
>>>
>>> * James Taylor <jtaylor at apache dot org>
>>>
>>> * Lars Hofhansl <larsh at apache dot org>
>>>
>>> * Suneel Marthi <smarthi at apache dot org>
>>>
>>> * Xiangrui Meng <meng at apache dot org>
>>>
>>> * Luciano Resende <lresende at apache dot org>
>>>
>>> ==== Sponsoring Entity ====
>>>
>>> Apache Incubator PMC
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>> For additional commands, e-mail: general-help@incubator.apache.org
>>
>>
> --
> Jean-Baptiste Onofré
> jbonofre@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
>

Re: [DISCUSS] PredictionIO incubation proposal

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
Hi,

I second Roman here.

Using Beam to abstract the execution environment would provide a very 
flexible architecture for PredictionIO.

It would benefit both projects.

Regards
JB

On 05/15/2016 02:32 AM, Roman Shaposhnik wrote:
> Super excited to see this proposal! This will finally allow us to have
> an ASF managed
> backend for next generation data-driven apps that I see emerging quite rapidly.
>
> The proposal looks great to me (although I'd recommend calling Scala
> as an implementation
> language more prominently since it may attract additional developers
> with affinity to it).
>
> I do have two questions about technology:
>     1. Do you think it would be possible to leverage Apache Beam (incubating)
>         for abstracting away the dependency on execution frameworks? My understanding
>         is that PredictionIO currently only runs on Spark.
>     2. Is a potential integration with Apache Zeppelin possible?
>
> Thanks,
> Roman.
>
> On Fri, May 13, 2016 at 1:41 PM, Andrew Purtell <ap...@apache.org> wrote:
>> Greetings,
>>
>> It is my pleasure to propose the PredictionIO project for incubation at the
>> Apache Software Foundation. PredictionIO is a popular open source Machine
>> Learning Server built on top of a state-of-the-art open source stack,
>> including several Apache technologies, that enables developers to manage and
>> deploy production-ready predictive services for various kinds of machine
>> learning tasks, with more than 400 production deployments around the world
>> and a growing contributor community.
>>
>>
>> The text of the proposal is included below and is also available at
>> https://wiki.apache.org/incubator/PredictionIO
>>
>> Best regards,
>> Andrew Purtell
>>
>>
>> = PredictionIO Proposal =
>>
>> === Abstract ===
>> PredictionIO is an open source Machine Learning Server built on top of a
>> state-of-the-art open source stack that enables developers to manage and
>> deploy production-ready predictive services for various kinds of machine
>> learning tasks.
>>
>> === Proposal ===
>> The PredictionIO platform consists of the following components:
>>
>>   * PredictionIO framework - provides the machine learning stack for
>>   building, evaluating and deploying engines with machine learning
>>   algorithms. It uses Apache Spark for processing.
>>
>>   * Event Server - the machine learning analytics layer for unifying events
>>   from multiple platforms. It can use Apache HBase or any JDBC backends
>>   as its data store.
>>
>> The PredictionIO community also maintains a Template Gallery, a place to
>> publish and download (free or proprietary) engine templates for different
>> types of machine learning applications, which is a complementary part of the
>> project. At this point we exclude the Template Gallery from the proposal,
>> as it has a separate set of contributors and we’re not familiar with an
>> Apache-approved mechanism to maintain such a gallery.
>>
>> You can find the Template Gallery at https://templates.prediction.io/
>>
>> === Background ===
>> PredictionIO was started with a mission to democratize and bring machine
>> learning to the masses.
>>
>> Machine learning has traditionally been a luxury for big companies like
>> Google, Facebook, and Netflix. There are ML libraries and tools lying
>> around the internet but the effort of putting them all together as a
>> production-ready infrastructure is a very resource-intensive task that is
>> barely within reach of individuals or small businesses.
>>
>> PredictionIO is a production-ready, full stack machine learning system that
>> allows organizations of any scale to quickly deploy machine learning
>> capabilities. It comes with official and community-contributed machine
>> learning engine templates that are easy to customize.
>>
>> === Rationale ===
>> As the usage of PredictionIO and the number of its contributors have grown
>> and become more diverse, we have sought an independent framework in which the
>> project can keep thriving. We believe the Apache foundation is a great fit. Joining
>> Apache would ensure that tried and true processes and procedures are in
>> place for the growing number of organizations interested in contributing
>> to PredictionIO. PredictionIO is also a good fit for the Apache foundation.
>> PredictionIO was built on top of several Apache projects (HBase, Spark,
>> Hadoop). We are familiar with the Apache process and believe that the
>> democratic and meritocratic nature of the foundation aligns with the
>> project goals.
>>
>> === Initial Goals ===
>> The initial milestones will be to move the existing codebase to Apache and
>> integrate with the Apache development process. Once this is accomplished,
>> we plan for incremental development and releases that follow the Apache
>> guidelines, as well as growing our developer and user communities.
>>
>> === Current Status ===
>> PredictionIO has undergone nine minor releases and many patches.
>> PredictionIO is being used in production by Salesforce.com as well as many
>> other organizations and apps. The PredictionIO codebase is currently
>> hosted at GitHub, which will form the basis of the Apache git repository.
>>
>> ==== Meritocracy ====
>> We plan to invest in supporting a meritocracy. We will discuss the
>> requirements in an open forum. We intend to invite additional developers
>> to participate. We will encourage and monitor community participation so
>> that privileges can be extended to those that contribute.
>>
>> ==== Community ====
>> Acceptance into the Apache foundation would bolster the already strong
>> user and developer community around PredictionIO. That community includes
>> many contributors from various other companies, and an active mailing list
>> composed of hundreds of users.
>>
>> ==== Core Developers ====
>> The core developers of our project are listed in our contributors and
>> initial PPMC below. Though many are employed at Salesforce.com, there are
>> also engineers from ActionML, and independent developers.
>>
>> === Alignment ===
>> The ASF is the natural choice to host the PredictionIO project as its goal
>> is democratizing Machine Learning by making it more easily accessible to
>> every user/developer. PredictionIO is built on top of several top level
>> Apache projects as outlined above.
>>
>> === Known Risks ===
>>
>> ==== Orphaned products ====
>> PredictionIO has a solid and growing community. It is deployed on
>> production environments by companies of all sizes to run various kinds of
>> predictive engines.
>>
>> In addition to the community contribution to PredictionIO framework, the
>> community is also actively contributing new engines to the Template
>> Gallery as well as SDKs and documentation for the project. Salesforce is
>> committed to utilizing and advancing the PredictionIO code base and
>> supporting its user community.
>>
>> ==== Inexperience with Open Source ====
>> PredictionIO has existed as a healthy open source project for almost two
>> years and is the most starred Scala project on GitHub. All of the proposed
>> committers have contributed to ASF and Linux Foundation open source
>> projects. Several current committers on Apache projects and Apache Members
>> are involved in this proposal and intend to provide mentorship.
>>
>> ==== Homogeneous Developers ====
>> The initial list of committers includes developers from several
>> institutions, including Salesforce, ActionML, Channel4, USC as well as
>> unaffiliated developers.
>>
>> ==== Reliance on Salaried Developers ====
>> Like most open source projects, PredictionIO receives substantial support
>> from salaried developers. PredictionIO development is partially supported
>> by Salesforce.com, but there are many contributors from various other
>> companies, and an active mailing list composed of hundreds of users. We
>> will continue our efforts to keep stewardship of the project independent
>> of salaried developers by meritocratically promoting those contributors
>> to committers.
>>
>> ==== Relationships with Other Apache Products ====
>> PredictionIO relies heavily on top-level Apache projects such as Apache
>> Spark, HBase and Hadoop. However, it brings distinct functionality
>> rather than just an abstraction: Machine Learning in a plug-and-play
>> fashion.
>>
>> Compared to Apache Mahout, which focuses on the development of a wide
>> variety of algorithms, PredictionIO offers a platform to manage the whole
>> machine learning workflow, including data collection, data preparation,
>> modeling, deployment and management of predictive services in production
>> environments.
>>
>> ==== An Excessive Fascination with the Apache Brand ====
>> PredictionIO is already a widely known open source project. This proposal
>> is not for the purpose of generating publicity. Rather, the primary
>> benefits to joining Apache are those outlined in the Rationale section.
>>
>> === Documentation ===
>> PredictionIO boasts rich and live documentation. It is included in the code
>> repo (docs/manual directory), built with Middleman, and publicly hosted at
>> https://docs.prediction.io
>>
>> === Initial Source and Intellectual Property Submission Plan ===
>> Currently, the PredictionIO codebase is distributed under the Apache 2.0
>> License and hosted on GitHub: https://github.com/PredictionIO/PredictionIO
>>
>> === External Dependencies ===
>> PredictionIO has the following external dependencies:
>>   * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS are needed)
>>   * Apache Spark 1.3.0 for Hadoop 2.4
>>   * Java SE Development Kit 8
>>   * and one of the following sets:
>>     * PostgreSQL 9.1
>>     or
>>     * MySQL 5.1
>>     or
>>     * Apache HBase 0.98.6
>>     * Elasticsearch 1.4.0
>>
>> Upon acceptance to the incubator, we would begin a thorough analysis of
>> all transitive dependencies to verify this information and introduce
>> license checking into the build and release process by integrating with
>> Apache RAT.
>>
>> === Cryptography ===
>> PredictionIO does not include cryptographic code. We utilize standard
>> JCE and JSSE APIs provided by the Java Runtime Environment.
>>
>> === Required Resources ===
>> We request that the following resources be created for the project to use:
>>
>> ==== Mailing lists ====
>>
>> predictionio-private@incubator.apache.org (with moderated subscriptions)
>>
>> predictionio-dev
>>
>> predictionio-user
>>
>> predictionio-commits
>>
>> We will migrate the existing PredictionIO mailing lists.
>>
>> ==== Git repository ====
>> The PredictionIO team would like to use Git for source control, due to our
>> current use of GitHub.
>>
>> git://git.apache.org/incubator-predictionio
>>
>> ==== Documentation ====
>> https://predictionio.incubator.apache.org/docs/
>>
>> ==== JIRA instance ====
>> PredictionIO currently uses the GitHub issue tracking system associated
>> with its repository: https://github.com/PredictionIO/PredictionIO/issues.
>> We will migrate to Apache JIRA.
>>
>> JIRA PREDICTIONIO
>> https://issues.apache.org/jira/browse/PREDICTIONIO
>>
>> ==== Other Resources ====
>> * TravisCI for builds and test running.
>>
>> * PredictionIO's documentation, included in the code repo (docs/manual
>> directory), is built with Middleman and publicly hosted at
>> https://docs.prediction.io
>>
>> * A blog to drive adoption and excitement at https://blog.prediction.io
>>
>> === Initial Committers ===
>>
>> * Pat Ferrell
>>
>> * Tamas Jambor
>>
>> * Justin Yip
>>
>> * Xusen Yin
>>
>> * Lee Moon Soo
>>
>> * Donald Szeto
>>
>> * Kenneth Chan
>>
>> * Tom Chan
>>
>> * Simon Chan
>>
>> * Marco Vivero
>>
>> * Matthew Tovbin
>>
>> * Yevgeny Khodorkovsky
>>
>> * Felipe Oliveira
>>
>> * Vitaly Gordon
>>
>> === Affiliations ===
>>
>> * Pat Ferrell - ActionML
>>
>> * Tamas Jambor - Channel4
>>
>> * Justin Yip - independent
>>
>> * Xusen Yin - USC
>>
>> * Lee Moon Soo - NFLabs
>>
>> * Donald Szeto - Salesforce
>>
>> * Kenneth Chan - Salesforce
>>
>> * Tom Chan - Salesforce
>>
>> * Simon Chan - Salesforce
>>
>> * Marco Vivero - Salesforce
>>
>> * Matthew Tovbin - Salesforce
>>
>> * Yevgeny Khodorkovsky - Salesforce
>>
>> * Felipe Oliveira - Salesforce
>>
>> * Vitaly Gordon - Salesforce
>>
>> === Sponsors ===
>>
>> ==== Champion ====
>>
>> Andrew Purtell <apurtell at apache dot org>
>>
>> ==== Nominated Mentors ====
>>
>> * Andrew Purtell <apurtell at apache dot org>
>>
>> * James Taylor <jtaylor at apache dot org>
>>
>> * Lars Hofhansl <larsh at apache dot org>
>>
>> * Suneel Marthi <smarthi at apache dot org>
>>
>> * Xiangrui Meng <meng at apache dot org>
>>
>> * Luciano Resende <lresende at apache dot org>
>>
>> ==== Sponsoring Entity ====
>>
>> Apache Incubator PMC
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> For additional commands, e-mail: general-help@incubator.apache.org
>

-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org


Re: [DISCUSS] PredictionIO incubation proposal

Posted by "Debo Dutta (dedutta)" <de...@cisco.com>.
Thx a lot Henry. Would love to. 

Sent from my iPhone

> On May 17, 2016, at 2:19 PM, Henry Saputra <he...@gmail.com> wrote:
> 
> You are welcome, and great to have you as one of the mentors for the
> PredictionIO podling.
> 
> Should be a fun project to be part of =)
> 
> - Henry
> 
>> On Tue, May 17, 2016 at 2:14 PM, Suneel Marthi <sm...@apache.org> wrote:
>> 
>> Thanks Henry
>> 
>> On Tue, May 17, 2016 at 5:11 PM, Henry Saputra <he...@gmail.com>
>> wrote:
>> 
>>> As mentor, you will have karma to commit to the source repository.
>>> 
>>> As you probably know, the initial committers and mentors will form the
>>> initial PPMCs for the podling.
>>> Hopefully for day to day operations you should not need to have distinction
>>> of committer vs mentors anymore.
>>> 
>>> You do not have to be listed as committer for the proposal.
>>> 
>>> - Henry
>>> 
>>>> On Tue, May 17, 2016 at 1:57 PM, Suneel Marthi <sm...@apache.org>
>>> wrote:
>>> 
>>>> Thanks for having me as a mentor for PIO.  I would like to be added to the
>>>> initial list of committers and am looking to actively participate in the
>>>> development too. I am not sure if my being a mentor automatically grants me
>>>> the 'commit' karma.
>>>> 
>>>> It's already been suggested earlier in this thread by Roman and
>>>> Jean-Baptiste that the project needs to be decoupled from Spark and
>>>> integrated with Beam.  It would be good to reduce the reliance on
>>>> Spark-Submit from what I have seen of the project so far. But let's not
>>>> talk architecture and design here when the project's not in incubator yet.
>>>> :)
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Tue, May 17, 2016 at 4:09 PM, Henry Saputra <
>> henry.saputra@gmail.com>
>>>> wrote:
>>>> 
>>>>> Cool, this will make the code grant process easier =)
>>>>> 
>>>>> The initial committers and mentors look great.
>>>>> I am sure more will come as contributions start pouring in to the project.
>>>>> 
>>>>> Looking forward to the VOTE thread soon.
>>>>> 
>>>>> - Henry
>>>>> 
>>>>>> On Mon, May 16, 2016 at 12:07 PM, Simon Chan <si...@salesforce.com>
>>>>> wrote:
>>>>> 
>>>>>> Yes, it includes everyone who previously contributed code from PredictionIO
>>>>>> before the acquisition and still wants to be involved in the project.
>>>>>> 
>>>>>> We may have missed "Alex Merritt", going to add him to the list soon.
>>>>>> 
>>>>>> Simon
>>>>>> 
>>>>>> 
>>>>>> On Mon, May 16, 2016 at 11:58 AM, Suneel Marthi <
>> smarthi@apache.org>
>>>>>> wrote:
>>>>>> 
>>>>>>> I do have a question about the proposed list of committers.
>>>>>>> 
>>>>>>> Does the list also include all of those folks who were with PredictionIO
>>>>>>> (and had contributed to the project) and then chose to leave when PIO was
>>>>>>> acquired by Salesforce?
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Mon, May 16, 2016 at 1:13 PM, Jean-Baptiste Onofré <
>>>> jb@nanthrax.net
>>>>>> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> By the way, we have some discussion about integrating Zeppelin with Beam ;)
>>>>>>>> 
>>>>>>>> Regards
>>>>>>>> JB
>>>>>>>> 
>>>>>>>>> On 05/15/2016 02:32 AM, Roman Shaposhnik wrote:
>>>>>>>>> 
>>>>>>>>> Super excited to see this proposal! This will finally allow us to have
>>>>>>>>> an ASF managed backend for next generation data-driven apps that I see
>>>>>>>>> emerging quite rapidly.
>>>>>>>>> 
>>>>>>>>> The proposal looks great to me (although I'd recommend calling out Scala
>>>>>>>>> as an implementation language more prominently since it may attract
>>>>>>>>> additional developers with affinity to it).
>>>>>>>>> 
>>>>>>>>> I do have two questions about technology:
>>>>>>>>>    1. do you think it would be possible to leverage Apache Beam
>>>>>>>>>        (incubating) for abstracting away dependency on execution
>>>>>>>>>        frameworks? My understanding is that PredictionIO currently
>>>>>>>>>        only runs on Spark.
>>>>>>>>>    2. is there a potential integration with Apache Zeppelin possible?
>>>>>>>>> Thanks,
>>>>>>>>> Roman.
>>>>>>>>> 
>> 



Re: [DISCUSS] PredictionIO incubation proposal

Posted by Henry Saputra <he...@gmail.com>.
You are welcome, and great to have you as one of the mentors for the
PredictionIO podling.

Should be a fun project to be part of =)

- Henry

On Tue, May 17, 2016 at 2:14 PM, Suneel Marthi <sm...@apache.org> wrote:

> Thanks Henry
>
> On Tue, May 17, 2016 at 5:11 PM, Henry Saputra <he...@gmail.com>
> wrote:
>
> > As mentor, you will have karma to commit to the source repository.
> >
> > As you probably know, the initial committers and mentors will form the
> > initial PPMCs for the podling.
> > Hopefully for day to day operations you should not need to have distinction
> > of committer vs mentors anymore.
> >
> > You do not have to be listed as committer for the proposal.
> >
> > - Henry
> >
> > On Tue, May 17, 2016 at 1:57 PM, Suneel Marthi <sm...@apache.org>
> wrote:
> >
> > > Thanks for having me as a mentor for PIO.  I would like to be added to the
> > > initial list of committers and am looking to actively participate in the
> > > development too. I am not sure if my being a mentor automatically grants me
> > > the 'commit' karma.
> > >
> > > It's already been suggested earlier in this thread by Roman and
> > > Jean-Baptiste that the project needs to be decoupled from Spark and
> > > integrated with Beam.  It would be good to reduce the reliance on
> > > Spark-Submit from what I have seen of the project so far. But let's not
> > > talk architecture and design here when the project's not in incubator yet.
> > > :)
> > >
> > >
> > >
> > >
> > > On Tue, May 17, 2016 at 4:09 PM, Henry Saputra <
> henry.saputra@gmail.com>
> > > wrote:
> > >
> > > > Cool, this will make the code grant process easier =)
> > > >
> > > > The initial committers and mentors look great.
> > > > I am sure more will come as contributions start pouring in to the project.
> > > >
> > > > Looking forward to the VOTE thread soon.
> > > >
> > > > - Henry
> > > >
> > > > On Mon, May 16, 2016 at 12:07 PM, Simon Chan <si...@salesforce.com>
> > > wrote:
> > > >
> > > > > Yes, it includes everyone who previously contributed code from PredictionIO
> > > > > before the acquisition and still wants to be involved in the project.
> > > > >
> > > > > We may have missed "Alex Merritt", going to add him to the list soon.
> > > > >
> > > > > Simon
> > > > >
> > > > >
> > > > > On Mon, May 16, 2016 at 11:58 AM, Suneel Marthi <
> smarthi@apache.org>
> > > > > wrote:
> > > > >
> > > > > > I do have a question about the proposed list of committers.
> > > > > >
> > > > > > Does the list also include all of those folks who were with PredictionIO
> > > > > > (and had contributed to the project) and then chose to leave when PIO was
> > > > > > acquired by Salesforce?
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Mon, May 16, 2016 at 1:13 PM, Jean-Baptiste Onofré <
> > > jb@nanthrax.net
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > By the way, we have some discussion about integrating Zeppelin with Beam ;)
> > > > > > >
> > > > > > > Regards
> > > > > > > JB
> > > > > > >
> > > > > > > On 05/15/2016 02:32 AM, Roman Shaposhnik wrote:
> > > > > > >
> > > > > > > >> Super excited to see this proposal! This will finally allow us to have
> > > > > > > >> an ASF managed backend for next generation data-driven apps that I see
> > > > > > > >> emerging quite rapidly.
> > > > > > > >>
> > > > > > > >> The proposal looks great to me (although I'd recommend calling out Scala
> > > > > > > >> as an implementation language more prominently since it may attract
> > > > > > > >> additional developers with affinity to it).
> > > > > > > >>
> > > > > > > >> I do have two questions about technology:
> > > > > > > >>     1. do you think it would be possible to leverage Apache Beam
> > > > > > > >>         (incubating) for abstracting away dependency on execution
> > > > > > > >>         frameworks? My understanding is that PredictionIO currently
> > > > > > > >>         only runs on Spark.
> > > > > > > >>     2. is there a potential integration with Apache Zeppelin possible?
> > > > > > >>
> > > > > > >> Thanks,
> > > > > > >> Roman.
> > > > > > >>
> > > > > > > --
> > > > > > > Jean-Baptiste Onofré
> > > > > > > jbonofre@apache.org
> > > > > > > http://blog.nanthrax.net
> > > > > > > Talend - http://www.talend.com
> > > > > > >
> > > > > > >
> > > ---------------------------------------------------------------------
> > > > > > > To unsubscribe, e-mail:
> general-unsubscribe@incubator.apache.org
> > > > > > > For additional commands, e-mail:
> > general-help@incubator.apache.org
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] PredictionIO incubation proposal

Posted by Suneel Marthi <sm...@apache.org>.
Thanks Henry

On Tue, May 17, 2016 at 5:11 PM, Henry Saputra <he...@gmail.com>
wrote:

> As mentor, you will have karma to commit to the source repository.
>
> As you probably know, the initial committers and mentors will form the
> initial PPMC for the podling.
> Hopefully, for day-to-day operations you should not need a distinction
> between committers and mentors anymore.
>
> You do not have to be listed as committer for the proposal.
>
> - Henry
>
> On Tue, May 17, 2016 at 1:57 PM, Suneel Marthi <sm...@apache.org> wrote:
>
> > Thanks for having me as a mentor for PIO.  I would like to be added to the
> > initial list of committers and am looking to actively participate in the
> > development too. I am not sure if my being a mentor automatically grants me
> > the 'commit' karma.
> >
> > It's already been suggested earlier in this thread by Roman and
> > Jean-Baptiste that the project needs to be decoupled from Spark and
> > integrated with Beam.  It would be good to reduce the reliance on
> > Spark-Submit from what I have seen of the project so far. But let's not
> > talk architecture and design here when the project's not in incubator yet.
> > :)
> >
> >
> >
> >
> > On Tue, May 17, 2016 at 4:09 PM, Henry Saputra <he...@gmail.com>
> > wrote:
> >
> > > Cool, this will make the code grant process easier =)
> > >
> > > The initial committers and mentors look great.
> > > I am sure more will come as contributions start pouring in to the project.
> > >
> > > Looking forward to the VOTE thread soon.
> > >
> > > - Henry
> > >
> > > On Mon, May 16, 2016 at 12:07 PM, Simon Chan <si...@salesforce.com>
> > > wrote:
> > >
> > > > Yes, it includes everyone who previously contributed code from PredictionIO
> > > > before the acquisition and still wants to be involved in the project.
> > > >
> > > > We may have missed "Alex Merritt", going to add him to the list soon.
> > > >
> > > > Simon
> > > >
> > > >
> > > > On Mon, May 16, 2016 at 11:58 AM, Suneel Marthi <sm...@apache.org>
> > > > wrote:
> > > >
> > > > > I do have a question about the proposed list of committers.
> > > > >
> > > > > Does the list also include all of those folks who were with PredictionIO
> > > > > (and had contributed to the project) and then chose to leave when PIO was
> > > > > acquired by Salesforce?
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Mon, May 16, 2016 at 1:13 PM, Jean-Baptiste Onofré <jb@nanthrax.net>
> > > > > wrote:
> > > > >
> > > > > > By the way, we have some discussion about integrating Zeppelin with Beam ;)
> > > > > >
> > > > > > Regards
> > > > > > JB
> > > > > >
> > > > > > On 05/15/2016 02:32 AM, Roman Shaposhnik wrote:
> > > > > >
> > > > > > >> Super excited to see this proposal! This will finally allow us to have
> > > > > > >> an ASF managed backend for next generation data-driven apps that I see
> > > > > > >> emerging quite rapidly.
> > > > > > >>
> > > > > > >> The proposal looks great to me (although I'd recommend calling out Scala
> > > > > > >> as an implementation language more prominently since it may attract
> > > > > > >> additional developers with affinity to it).
> > > > > > >>
> > > > > > >> I do have two questions about technology:
> > > > > > >>     1. do you think it would be possible to leverage Apache Beam (incubating)
> > > > > > >>         for abstracting away dependency on execution frameworks? My understanding
> > > > > > >>         is that PredictionIO currently only runs on Spark.
> > > > > > >>     2. is there a potential integration with Apache Zeppelin possible?
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Roman.
> > > > > >>
> > > > > > >> On Fri, May 13, 2016 at 1:41 PM, Andrew Purtell <apurtell@apache.org>
> > > > > > >> wrote:
> > > > > >>
> > > > > >>> Greetings,
> > > > > >>>
> > > > > >>> It is my pleasure to
> > > > > >>>
> > > > > >>> propose the PredictionIO project for incubation at the Apache
> > > > Software
> > > > > >>> Foundation.
> > > > > >>>
> > > > > >>> PredictionIO is a
> > > > > >>> popular
> > > > > >>> open
> > > > > >>>
> > > > > >>> source Machine Learning Server built on top of a
> state-of-the-art
> > > > open
> > > > > >>> source stack, including several Apache technologies, that
> > > > > >>>
> > > > > >>> enables developers to manage and deploy production-ready
> > predictive
> > > > > >>> services for various kinds of machine learning tasks
> > > > > >>> , with more than 400 production deployments around the world
> and
> > a
> > > > > >>> growing
> > > > > >>> contributor community.
> > > > > >>>
> > > > > >>>
> > > > > >>> The text of the proposal is included below and is also
> available
> > at
> > > > > >>> https://wiki.apache.org/incubator/PredictionIO
> > > > > >>>
> > > > > >>> Best regards,
> > > > > >>> Andrew Purtell
> > > > > >>>
> > > > > >>>
> > > > > >>> = PredictionIO Proposal =
> > > > > >>>
> > > > > >>> === Abstract ===
> > > > > >>> PredictionIO is an open source Machine Learning Server built on
> > top
> > > > of
> > > > > >>> state-of-the-art open source stack, that enables developers to
> > > manage
> > > > > and
> > > > > >>> deploy production-ready predictive services for various kinds
> of
> > > > > machine
> > > > > >>> learning tasks.
> > > > > >>>
> > > > > >>> === Proposal ===
> > > > > >>> The PredictionIO platform consists of the following components:
> > > > > >>>
> > > > > >>>   * PredictionIO framework - provides the machine learning
> stack
> > > for
> > > > > >>>   building, evaluating and deploying engines with machine
> > learning
> > > > > >>>   algorithms. It uses Apache Spark for processing.
> > > > > >>>
> > > > > >>>   * Event Server - the machine learning analytics layer for
> > > unifying
> > > > > >>> events
> > > > > >>>   from multiple platforms. It can use Apache HBase or any JDBC
> > > > backends
> > > > > >>>   as its data store.
> > > > > >>>
> > > > > >>> The PredictionIO community also maintains a
> > > > > >>>
> > > > > >>> Template Gallery, a place to
> > > > > >>> publish and download (free or proprietary) engine templates for
> > > > > different
> > > > > >>> types of machine learning applications, and is a complemental
> > part
> > > of
> > > > > the
> > > > > >>> project. At this point we exclude the Template Gallery from the
> > > > > proposal,
> > > > > >>> as it has a separate set of contributors and we’re not familiar
> > > with
> > > > an
> > > > > >>> Apache approved mechanism to maintain such a gallery.
> > > > > >>>
> > > > > >>> You can find the Template Gallery at
> > > > https://templates.prediction.io/
> > > > > >>>
> > > > > >>> === Background ===
> > > > > >>> PredictionIO was started with a mission to democratize and
> bring
> > > > > machine
> > > > > >>> learning to the masses.
> > > > > >>>
> > > > > >>> Machine learning has traditionally been a luxury for big
> > companies
> > > > like
> > > > > >>> Google, Facebook, and Netflix. There are ML libraries and tools
> > > lying
> > > > > >>> around the internet but the effort of putting them all together
> > as
> > > a
> > > > > >>> production-ready infrastructure is a very resource-intensive
> task
> > > > that
> > > > > is
> > > > > >>> remotely reachable by individuals or small businesses.
> > > > > >>>
> > > > > >>> PredictionIO is a production-ready, full stack machine learning
> > > > system
> > > > > >>> that
> > > > > >>> allows organizations of any scale to quickly deploy machine
> > > learning
> > > > > >>> capabilities. It comes with official and community-contributed
> > > > machine
> > > > > >>> learning engine templates that are easy to customize.
> > > > > >>>
> > > > > >>> === Rationale ===
> > > > > >>> As usage and number of contributors to PredictionIO has grown
> > > bigger
> > > > > and
> > > > > >>> more diverse, we have sought for an independent framework for
> the
> > > > > project
> > > > > >>> to keep thriving. We believe the Apache foundation is a great
> > fit.
> > > > > >>> Joining
> > > > > >>> Apache would ensure that tried and true processes and
> procedures
> > > are
> > > > in
> > > > > >>> place for the growing number of organizations interested in
> > > > > contributing
> > > > > >>> to PredictionIO. PredictionIO is also a good fit for the Apache
> > > > > >>> foundation.
> > > > > >>> PredictionIO was built on top of several Apache projects
> (HBase,
> > > > Spark,
> > > > > >>> Hadoop). We are familiar with the Apache process and believe
> that
> > > the
> > > > > >>> democratic and meritocratic nature of the foundation aligns
> with
> > > the
> > > > > >>> project goals.
> > > > > >>>
> > > > > >>> === Initial Goals ===
> > > > > >>> The initial milestones will be to move the existing codebase to
> > > > Apache
> > > > > >>> and
> > > > > >>> integrate with the Apache development process. Once this is
> > > > > accomplished,
> > > > > >>> we plan for incremental development and releases that follow
> the
> > > > Apache
> > > > > >>> guidelines, as well as growing our developer and user
> > communities.
> > > > > >>>
> > > > > >>> === Current Status ===
> > > > > >>> PredictionIO has undergone nine minor releases and many
> patches.
> > > > > >>> PredictionIO is being used in production by Salesforce.com as
> > well
> > > as
> > > > > >>> many
> > > > > >>> other organizations and apps. The PredictionIO codebase is
> > > currently
> > > > > >>> hosted at GitHub, which will form the basis of the Apache git
> > > > > repository.
> > > > > >>>
> > > > > >>> ==== Meritocracy ====
> > > > > >>> We plan to invest in supporting a meritocracy. We will discuss
> > the
> > > > > >>> requirements in an open forum. We intend to invite additional
> > > > > developers
> > > > > >>> to participate. We will encourage and monitor community
> > > participation
> > > > > so
> > > > > >>> that privileges can be extended to those that contribute.
> > > > > >>>
> > > > > >>> ==== Community ====
> > > > > >>> Acceptance into the Apache foundation would bolster the already
> > > > strong
> > > > > >>> user and developer community around PredictionIO. That
> community
> > > > > includes
> > > > > >>> many contributors from various other companies, and an active
> > > mailing
> > > > > >>> list
> > > > > >>> composed of hundreds of users.
> > > > > >>>
> > > > > >>> ==== Core Developers ====
> > > > > >>> The core developers of our project are listed in our
> contributors
> > > and
> > > > > >>> initial PPMC below. Though many are employed at Salesforce.com,
> > > there
> > > > > are
> > > > > >>> also engineers from ActionML, and independent developers.
> > > > > >>>
> > > > > >>> === Alignment ===
> > > > > >>> The ASF is the natural choice to host the PredictionIO project
> as
> > > its
> > > > > >>> goal
> > > > > >>> is democratizing Machine Learning by making it more easily
> > > accessible
> > > > > to
> > > > > >>> every user/developer. PredictionIO is built on top of several
> top
> > > > level
> > > > > >>> Apache projects as outlined above.
> > > > > >>>
> > > > > >>> === Known Risks ===
> > > > > >>>
> > > > > >>> ==== Orphaned products ====
> > > > > >>> PredictionIO has a solid and growing community. It is deployed
> on
> > > > > >>> production environments by companies of all sizes to run
> various
> > > > kinds
> > > > > of
> > > > > >>> predictive engines.
> > > > > >>>
> > > > > >>> In addition to the community contribution to PredictionIO
> > > framework,
> > > > > the
> > > > > >>> community is also actively contributing new engines to the
> > Template
> > > > > >>> Gallery as well as SDKs and documentation for the project.
> > > Salesforce
> > > > > is
> > > > > >>> committed to utilize and advance the PredictionIO code base and
> > > > support
> > > > > >>> its user community.
> > > > > >>>
> > > > > >>> ==== Inexperience with Open Source ====
> > > > > >>> PredictionIO has existed as a healthy open source project for
> > > almost
> > > > > two
> > > > > >>> years and is the most starred Scala project on GitHub. All of
> the
> > > > > >>> proposed
> > > > > >>> committers have contributed to ASF and Linux Foundation open
> > source
> > > > > >>> projects. Several current committers on Apache projects and
> > Apache
> > > > > >>> Members
> > > > > >>> are involved in this proposal and intend to provide mentorship.
> > > > > >>>
> > > > > >>> ==== Homogeneous Developers ====
> > > > > >>> The initial list of committers includes developers from several
> > > > > >>> institutions, including Salesforce, ActionML, Channel4, USC as
> > well
> > > > as
> > > > > >>> unaffiliated developers.
> > > > > >>>
> > > > > >>> ==== Reliance on Salaried Developers ====
> > > > > >>> Like most open source projects, PredictionIO receives
> substantial
> > > > > support
> > > > > >>> from salaried developers. PredictionIO development is partially
> > > > > supported
> > > > > >>> by Salesforce.com, but there are many contributors from various
> > > other
> > > > > >>> companies, and an active mailing list composed of hundreds of
> > > users.
> > > > We
> > > > > >>> will continue our efforts to ensure stewardship of the project
> to
> > > be
> > > > > >>> independent of salaried developers by meritocratically
> promoting
> > > > those
> > > > > >>> contributors to committers.
> > > > > >>>
> > > > > >>> ==== Relationships with Other Apache Products ====
> > > > > >>> PredictionIO relies heavily on top level Apache projects such as
> > > > > >>> Apache Spark, HBase and Hadoop. However, it brings distinct
> > > > > >>> functionality, rather than just an abstraction - Machine Learning in a
> > > > > >>> plug-and-play fashion.
> > > > > >>>
> > > > > >>> Compared to Apache Mahout, which focuses on the development of a wide
> > > > > >>> variety of algorithms, PredictionIO offers a platform to manage the
> > > > > >>> whole machine learning workflow, including data collection, data
> > > > > >>> preparation, modeling, deployment and management of predictive services
> > > > > >>> in production environments.
> > > > > >>>
> > > > > >>> ==== An Excessive Fascination with the Apache Brand ====
> > > > > >>> PredictionIO is already a widely known open source project. This
> > > > > >>> proposal is not for the purpose of generating publicity. Rather, the
> > > > > >>> primary benefits to joining Apache are those outlined in the Rationale
> > > > > >>> section.
> > > > > >>>
> > > > > >>> === Documentation ===
> > > > > >>> PredictionIO boasts rich and live documentation, which is included in
> > > > > >>> the code repo (docs/manual directory), built with Middleman, and
> > > > > >>> publicly hosted at https://docs.prediction.io
> > > > > >>>
> > > > > >>> === Initial Source and Intellectual Property Submission Plan ===
> > > > > >>> Currently, the PredictionIO codebase is distributed under the Apache
> > > > > >>> 2.0 License and hosted on GitHub:
> > > > > >>> https://github.com/PredictionIO/PredictionIO
> > > > > >>>
> > > > > >>> === External Dependencies ===
> > > > > >>> PredictionIO has the following external dependencies:
> > > > > >>>   * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS are
> > > > > >>>     needed)
> > > > > >>>   * Apache Spark 1.3.0 for Hadoop 2.4
> > > > > >>>   * Java SE Development Kit 8
> > > > > >>>   * and one of the following sets:
> > > > > >>>     * PostgreSQL 9.1
> > > > > >>>     or
> > > > > >>>     * MySQL 5.1
> > > > > >>>     or
> > > > > >>>     * Apache HBase 0.98.6
> > > > > >>>     * Elasticsearch 1.4.0
> > > > > >>>
> > > > > >>> Upon acceptance to the incubator, we would begin a thorough analysis
> > > > > >>> of all transitive dependencies to verify this information and
> > > > > >>> introduce license checking into the build and release process by
> > > > > >>> integrating with Apache RAT.
> > > > > >>>
> > > > > >>> === Cryptography ===
> > > > > >>> PredictionIO does not include cryptographic code. We utilize standard
> > > > > >>> JCE and JSSE APIs provided by the Java Runtime Environment.
> > > > > >>>
> > > > > >>> === Required Resources ===
> > > > > >>> We request that the following resources be created for the project to
> > > > > >>> use:
> > > > > >>>
> > > > > >>> ==== Mailing lists ====
> > > > > >>>
> > > > > >>> predictionio-private@incubator.apache.org (with moderated subscriptions)
> > > > > >>>
> > > > > >>> predictionio-dev
> > > > > >>>
> > > > > >>> predictionio-user
> > > > > >>>
> > > > > >>> predictionio-commits
> > > > > >>>
> > > > > >>> We will migrate the existing PredictionIO mailing lists.
> > > > > >>>
> > > > > >>> ==== Git repository ====
> > > > > >>> The PredictionIO team would like to use Git for source control, due to
> > > > > >>> our current use of GitHub.
> > > > > >>>
> > > > > >>> git://git.apache.org/incubator-predictionio
> > > > > >>>
> > > > > >>> ==== Documentation ====
> > > > > >>> https://predictionio.incubator.apache.org/docs/
> > > > > >>>
> > > > > >>> ==== JIRA instance ====
> > > > > >>> PredictionIO currently uses the GitHub issue tracking system associated
> > > > > >>> with its repository: https://github.com/PredictionIO/PredictionIO/issues.
> > > > > >>> We will migrate to Apache JIRA.
> > > > > >>>
> > > > > >>> JIRA PREDICTIONIO
> > > > > >>> https://issues.apache.org/jira/browse/PREDICTIONIO
> > > > > >>>
> > > > > >>> ==== Other Resources ====
> > > > > >>> * TravisCI for builds and test running.
> > > > > >>>
> > > > > >>> * PredictionIO's documentation, included in the code repo (docs/manual
> > > > > >>> directory), is built with Middleman and publicly hosted at
> > > > > >>> https://docs.prediction.io
> > > > > >>>
> > > > > >>> * A blog to drive adoption and excitement at https://blog.prediction.io
> > > > > >>>
> > > > > >>> === Initial Committers ===
> > > > > >>>
> > > > > >>> * Pat Ferrell
> > > > > >>>
> > > > > >>> * Tamas Jambor
> > > > > >>>
> > > > > >>> * Justin Yip
> > > > > >>>
> > > > > >>> * Xusen Yin
> > > > > >>>
> > > > > >>> * Lee Moon Soo
> > > > > >>>
> > > > > >>> * Donald Szeto
> > > > > >>>
> > > > > >>> * Kenneth Chan
> > > > > >>>
> > > > > >>> * Tom Chan
> > > > > >>>
> > > > > >>> * Simon Chan
> > > > > >>>
> > > > > >>> * Marco Vivero
> > > > > >>>
> > > > > >>> * Matthew Tovbin
> > > > > >>>
> > > > > >>> * Yevgeny Khodorkovsky
> > > > > >>>
> > > > > >>> * Felipe Oliveira
> > > > > >>>
> > > > > >>> * Vitaly Gordon
> > > > > >>>
> > > > > >>> === Affiliations ===
> > > > > >>>
> > > > > >>> * Pat Ferrell - ActionML
> > > > > >>>
> > > > > >>> * Tamas Jambor - Channel4
> > > > > >>>
> > > > > >>> * Justin Yip - independent
> > > > > >>>
> > > > > >>> * Xusen Yin - USC
> > > > > >>>
> > > > > >>> * Lee Moon Soo - NFLabs
> > > > > >>>
> > > > > >>> * Donald Szeto - Salesforce
> > > > > >>>
> > > > > >>> * Kenneth Chan - Salesforce
> > > > > >>>
> > > > > >>> * Tom Chan - Salesforce
> > > > > >>>
> > > > > >>> * Simon Chan - Salesforce
> > > > > >>>
> > > > > >>> * Marco Vivero - Salesforce
> > > > > >>>
> > > > > >>> * Matthew Tovbin - Salesforce
> > > > > >>>
> > > > > >>> * Yevgeny Khodorkovsky - Salesforce
> > > > > >>>
> > > > > >>> * Felipe Oliveira - Salesforce
> > > > > >>>
> > > > > >>> * Vitaly Gordon - Salesforce
> > > > > >>>
> > > > > >>> === Sponsors ===
> > > > > >>>
> > > > > >>> ==== Champion ====
> > > > > >>>
> > > > > >>> Andrew Purtell <apurtell at apache dot org>
> > > > > >>>
> > > > > >>> ==== Nominated Mentors ====
> > > > > >>>
> > > > > >>> * Andrew Purtell <apurtell at apache dot org>
> > > > > >>>
> > > > > >>> * James Taylor <jtaylor at apache dot org>
> > > > > >>>
> > > > > >>> * Lars Hofhansl <larsh at apache dot org>
> > > > > >>>
> > > > > >>> * Suneel Marthi <smarthi at apache dot org>
> > > > > >>>
> > > > > >>> * Xiangrui Meng <meng at apache dot org>
> > > > > >>>
> > > > > >>> * Luciano Resende <lresende at apache dot org>
> > > > > >>>
> > > > > >>> ==== Sponsoring Entity ====
> > > > > >>>
> > > > > >>> Apache Incubator PMC
> > > > > >>>
> > > > > >>
> > > > > >>
> > > > > >> ---------------------------------------------------------------------
> > > > > >> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> > > > > >> For additional commands, e-mail: general-help@incubator.apache.org
> > > > > >>
> > > > > >>
> > > > > > --
> > > > > > Jean-Baptiste Onofré
> > > > > > jbonofre@apache.org
> > > > > > http://blog.nanthrax.net
> > > > > > Talend - http://www.talend.com
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] PredictionIO incubation proposal

Posted by Henry Saputra <he...@gmail.com>.
As a mentor, you will have karma to commit to the source repository.

As you probably know, the initial committers and mentors will form the
initial PPMC for the podling.
Hopefully, for day-to-day operations, you should not need the distinction
between committers and mentors anymore.

You do not have to be listed as a committer in the proposal.

- Henry

On Tue, May 17, 2016 at 1:57 PM, Suneel Marthi <sm...@apache.org> wrote:

> Thanks for having me as a mentor for PIO.  I would like to be added to the
> initial list of committers and am looking to actively participate in the
> development too. I am not sure if my being a mentor automatically grants me
> the 'commit' karma.
>
> It's already been suggested earlier in this thread by Roman and
> Jean-Baptiste that the project needs to be decoupled from Spark and
> integrated with Beam.  It would be good to reduce the reliance on
> Spark-Submit from what I have seen of the project so far. But let's not
> talk architecture and design here when the project's not in the incubator yet.
> :)
>
>
>
>
> On Tue, May 17, 2016 at 4:09 PM, Henry Saputra <he...@gmail.com> wrote:
>
> > Cool, this will make the code grant process easier =)
> >
> > The initial committers and mentors look great.
> > I am sure more will come as contributions start pouring in to the project.
> >
> > Looking forward to the VOTE thread soon.
> >
> > - Henry
> >
> > On Mon, May 16, 2016 at 12:07 PM, Simon Chan <si...@salesforce.com> wrote:
> >
> > > Yes, it includes everyone who previously contributed code from PredictionIO
> > > before the acquisition and still want to be involved in the project.
> > >
> > > We may have missed "Alex Merritt", going to add him to the list soon.
> > >
> > > Simon
> > >
> > >
> > > On Mon, May 16, 2016 at 11:58 AM, Suneel Marthi <sm...@apache.org> wrote:
> > >
> > > > I do have a question about the proposed list of committers.
> > > >
> > > > Does the list also include all of those folks who were with PredictionIO
> > > > (and had contributed to the project) and then chose to leave when PIO was
> > > > acquired by Salesforce?
> > > >
> > > >
> > > >
> > > >
> > > > On Mon, May 16, 2016 at 1:13 PM, Jean-Baptiste Onofré <jb@nanthrax.net>
> > > > wrote:
> > > >
> > > > > By the way, we have some discussion about integrating Zeppelin with Beam ;)
> > > > >
> > > > > Regards
> > > > > JB
> > > > >
> > > > > On 05/15/2016 02:32 AM, Roman Shaposhnik wrote:
> > > > >
> > > > > >> Super excited to see this proposal! This will finally allow us to have an
> > > > > >> ASF managed backend for next generation data-driven apps that I see
> > > > > >> emerging quite rapidly.
> > > > > >>
> > > > > >> The proposal looks great to me (although I'd recommend calling out Scala
> > > > > >> as an implementation language more prominently since it may attract
> > > > > >> additional developers with affinity to it).
> > > > > >>
> > > > > >> I do have two questions about technology:
> > > > > >>     1. do you think it would be possible to leverage Apache Beam
> > > > > >>         (incubating) for abstracting away dependency on execution
> > > > > >>         frameworks? My understanding is that PredictionIO currently
> > > > > >>         only runs on Spark.
> > > > > >>     2. is there a potential integration with Apache Zeppelin possible?
> > > > >>
> > > > >> Thanks,
> > > > >> Roman.
> > > > >>
> > > > > --
> > > > > Jean-Baptiste Onofré
> > > > > jbonofre@apache.org
> > > > > http://blog.nanthrax.net
> > > > > Talend - http://www.talend.com
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] PredictionIO incubation proposal

Posted by Suneel Marthi <sm...@apache.org>.
Thanks for having me as a mentor for PIO.  I would like to be added to the
initial list of committers and am looking to actively participate in the
development too. I am not sure if my being a mentor automatically grants me
the 'commit' karma.

It's already been suggested earlier in this thread by Roman and
Jean-Baptiste that the project needs to be decoupled from Spark and
integrated with Beam.  It would be good to reduce the reliance on
Spark-Submit from what I have seen of the project so far. But let's not
talk architecture and design here when the project's not in the incubator yet.
:)




On Tue, May 17, 2016 at 4:09 PM, Henry Saputra <he...@gmail.com>
wrote:

> Cool, this will make the code grant process easier =)
>
> The initial committers and mentors look great.
> I am sure more will come as contributions start pouring in to the project.
>
> Looking forward to the VOTE thread soon.
>
> - Henry
>
> On Mon, May 16, 2016 at 12:07 PM, Simon Chan <si...@salesforce.com> wrote:
>
> > Yes, it includes everyone who previously contributed code from PredictionIO
> > before the acquisition and still want to be involved in the project.
> >
> > We may have missed "Alex Merritt", going to add him to the list soon.
> >
> > Simon
> >
> >
> > On Mon, May 16, 2016 at 11:58 AM, Suneel Marthi <sm...@apache.org>
> > wrote:
> >
> > > I do have a question about the proposed list of committers.
> > >
> > > Does the list also include all of those folks who were with PredictionIO
> > > (and had contributed to the project) and then chose to leave when PIO was
> > > acquired by Salesforce?
> > >
> > >
> > >
> > >
> > > On Mon, May 16, 2016 at 1:13 PM, Jean-Baptiste Onofré <jb@nanthrax.net>
> > > wrote:
> > >
> > > > By the way, we have some discussion about integrating Zeppelin with Beam ;)
> > > >
> > > > Regards
> > > > JB
> > > >
> > > > On 05/15/2016 02:32 AM, Roman Shaposhnik wrote:
> > > >
> > > > >> Super excited to see this proposal! This will finally allow us to have an
> > > > >> ASF managed backend for next generation data-driven apps that I see
> > > > >> emerging quite rapidly.
> > > > >>
> > > > >> The proposal looks great to me (although I'd recommend calling out Scala
> > > > >> as an implementation language more prominently since it may attract
> > > > >> additional developers with affinity to it).
> > > > >>
> > > > >> I do have two questions about technology:
> > > > >>     1. do you think it would be possible to leverage Apache Beam
> > > > >>         (incubating) for abstracting away dependency on execution
> > > > >>         frameworks? My understanding is that PredictionIO currently
> > > > >>         only runs on Spark.
> > > > >>     2. is there a potential integration with Apache Zeppelin possible?
> > > >>
> > > >> Thanks,
> > > >> Roman.
> > > >>
> > > > >> On Fri, May 13, 2016 at 1:41 PM, Andrew Purtell <apurtell@apache.org>
> > > > >> wrote:
> > > >>
> > > >>> Greetings,
> > > >>>
> > > > >>> It is my pleasure to propose the PredictionIO project for incubation at
> > > > >>> the Apache Software Foundation.
> > > > >>>
> > > > >>> PredictionIO is a popular open source Machine Learning Server built on
> > > > >>> top of a state-of-the-art open source stack, including several Apache
> > > > >>> technologies, that enables developers to manage and deploy
> > > > >>> production-ready predictive services for various kinds of machine
> > > > >>> learning tasks, with more than 400 production deployments around the
> > > > >>> world and a growing contributor community.
> > > >>>
> > > >>>
> > > >>> The text of the proposal is included below and is also available at
> > > >>> https://wiki.apache.org/incubator/PredictionIO
> > > >>>
> > > >>> Best regards,
> > > >>> Andrew Purtell
> > > >>>
> > > >>>
> > > >>> = PredictionIO Proposal =
> > > >>>
> > > >>> === Abstract ===
> > > > >>> PredictionIO is an open source Machine Learning Server built on top of a
> > > > >>> state-of-the-art open source stack, that enables developers to manage and
> > > > >>> deploy production-ready predictive services for various kinds of machine
> > > > >>> learning tasks.
> > > >>>
> > > >>> === Proposal ===
> > > >>> The PredictionIO platform consists of the following components:
> > > >>>
> > > > >>>   * PredictionIO framework - provides the machine learning stack for
> > > > >>>   building, evaluating and deploying engines with machine learning
> > > > >>>   algorithms. It uses Apache Spark for processing.
> > > > >>>
> > > > >>>   * Event Server - the machine learning analytics layer for unifying
> > > > >>>   events from multiple platforms. It can use Apache HBase or any JDBC
> > > > >>>   backends as its data store.
> > > >>>
> > > >>> The PredictionIO community also maintains a
> > > >>>
> > > >>> Template Gallery, a place to
> > > >>> publish and download (free or proprietary) engine templates for
> > > different
> > > >>> types of machine learning applications, and is a complemental part
> of
> > > the
> > > >>> project. At this point we exclude the Template Gallery from the
> > > proposal,
> > > >>> as it has a separate set of contributors and we’re not familiar
> with
> > an
> > > >>> Apache approved mechanism to maintain such a gallery.
> > > >>>
> > > >>> You can find the Template Gallery at
> > https://templates.prediction.io/
> > > >>>
> > > >>> === Background ===
> > > >>> PredictionIO was started with a mission to democratize and bring
> > > machine
> > > >>> learning to the masses.
> > > >>>
> > > >>> Machine learning has traditionally been a luxury for big companies
> > like
> > > >>> Google, Facebook, and Netflix. There are ML libraries and tools
> lying
> > > >>> around the internet but the effort of putting them all together as
> a
> > > >>> production-ready infrastructure is a very resource-intensive task
> > that
> > > is
> > > >>> remotely reachable by individuals or small businesses.
> > > >>>
> > > >>> PredictionIO is a production-ready, full stack machine learning
> > system
> > > >>> that
> > > >>> allows organizations of any scale to quickly deploy machine
> learning
> > > >>> capabilities. It comes with official and community-contributed
> > machine
> > > >>> learning engine templates that are easy to customize.
> > > >>>
> > > >>> === Rationale ===
> > > >>> As usage and number of contributors to PredictionIO has grown
> bigger
> > > and
> > > >>> more diverse, we have sought for an independent framework for the
> > > project
> > > >>> to keep thriving. We believe the Apache foundation is a great fit.
> > > >>> Joining
> > > >>> Apache would ensure that tried and true processes and procedures
> are
> > in
> > > >>> place for the growing number of organizations interested in
> > > contributing
> > > >>> to PredictionIO. PredictionIO is also a good fit for the Apache
> > > >>> foundation.
> > > >>> PredictionIO was built on top of several Apache projects (HBase,
> > Spark,
> > > >>> Hadoop). We are familiar with the Apache process and believe that
> the
> > > >>> democratic and meritocratic nature of the foundation aligns with
> the
> > > >>> project goals.
> > > >>>
> > > >>> === Initial Goals ===
> > > >>> The initial milestones will be to move the existing codebase to
> > Apache
> > > >>> and
> > > >>> integrate with the Apache development process. Once this is
> > > accomplished,
> > > >>> we plan for incremental development and releases that follow the
> > Apache
> > > >>> guidelines, as well as growing our developer and user communities.
> > > >>>
> > > >>> === Current Status ===
> > > >>> PredictionIO has undergone nine minor releases and many patches.
> > > >>> PredictionIO is being used in production by Salesforce.com as well
> as
> > > >>> many
> > > >>> other organizations and apps. The PredictionIO codebase is
> currently
> > > >>> hosted at GitHub, which will form the basis of the Apache git
> > > repository.
> > > >>>
> > > >>> ==== Meritocracy ====
> > > >>> We plan to invest in supporting a meritocracy. We will discuss the
> > > >>> requirements in an open forum. We intend to invite additional
> > > developers
> > > >>> to participate. We will encourage and monitor community
> participation
> > > so
> > > >>> that privileges can be extended to those that contribute.
> > > >>>
> > > >>> ==== Community ====
> > > >>> Acceptance into the Apache foundation would bolster the already
> > strong
> > > >>> user and developer community around PredictionIO. That community
> > > includes
> > > >>> many contributors from various other companies, and an active
> mailing
> > > >>> list
> > > >>> composed of hundreds of users.
> > > >>>
> > > >>> ==== Core Developers ====
> > > >>> The core developers of our project are listed in our contributors
> and
> > > >>> initial PPMC below. Though many are employed at Salesforce.com,
> there
> > > are
> > > >>> also engineers from ActionML, and independent developers.
> > > >>>
> > > >>> === Alignment ===
> > > >>> The ASF is the natural choice to host the PredictionIO project as
> its
> > > >>> goal
> > > >>> is democratizing Machine Learning by making it more easily
> accessible
> > > to
> > > >>> every user/developer. PredictionIO is built on top of several top
> > level
> > > >>> Apache projects as outlined above.
> > > >>>
> > > >>> === Known Risks ===
> > > >>>
> > > >>> ==== Orphaned products ====
> > > >>> PredictionIO has a solid and growing community. It is deployed on
> > > >>> production environments by companies of all sizes to run various
> > kinds
> > > of
> > > >>> predictive engines.
> > > >>>
> > > >>> In addition to the community contribution to PredictionIO
> framework,
> > > the
> > > >>> community is also actively contributing new engines to the Template
> > > >>> Gallery as well as SDKs and documentation for the project.
> Salesforce
> > > is
> > > >>> committed to utilize and advance the PredictionIO code base and
> > support
> > > >>> its user community.
> > > >>>
> > > >>> ==== Inexperience with Open Source ====
> > > >>> PredictionIO has existed as a healthy open source project for
> almost
> > > two
> > > >>> years and is the most starred Scala project on GitHub. All of the
> > > >>> proposed
> > > >>> committers have contributed to ASF and Linux Foundation open source
> > > >>> projects. Several current committers on Apache projects and Apache
> > > >>> Members
> > > >>> are involved in this proposal and intend to provide mentorship.
> > > >>>
> > > >>> ==== Homogeneous Developers ====
> > > >>> The initial list of committers includes developers from several
> > > >>> institutions, including Salesforce, ActionML, Channel4, USC as well
> > as
> > > >>> unaffiliated developers.
> > > >>>
> > > >>> ==== Reliance on Salaried Developers ====
> > > >>> Like most open source projects, PredictionIO receives substantial
> > > support
> > > >>> from salaried developers. PredictionIO development is partially
> > > supported
> > > >>> by Salesforce.com, but there are many contributors from various
> other
> > > >>> companies, and an active mailing list composed of hundreds of
> users.
> > We
> > > >>> will continue our efforts to ensure stewardship of the project to
> be
> > > >>> independent of salaried developers by meritocratically promoting
> > those
> > > >>> contributors to committers.
> > > >>>
> > > >>> ==== Relationships with Other Apache Product ====
> > > >>> PredictionIO relies heavily on top level apache projects such as
> > Apache
> > > >>> Spark, HBase and Hadoop. However it brings a distinguished
> > > functionality,
> > > >>> rather than just an abstraction - Machine Learning in a
> plug-and-play
> > > >>> fashion.
> > > >>>
> > > >>> Compared to Apache Mahout, which focuses on the development of a
> wide
> > > >>> variety of algorithms, PredictionIO offers a platform to manage the
> > > whole
> > > >>> machine learning workflow, including data collection, data
> > preparation,
> > > >>> modeling, deployment and management of predictive services in
> > > production
> > > >>> environments.
> > > >>>
> > > >>> ==== An Excessive Fascination with the Apache Brand ====
> > > >>> PredictionIO is already a widely known open source project. This
> > > proposal
> > > >>> is not for the purpose of generating publicity. Rather, the primary
> > > >>> benefits to joining Apache are those outlined in the Rationale
> > section.
> > > >>>
> > > >>> === Documentation ===
> > > >>> PredictionIO boasts rich and live documentation, included in the
> code
> > > >>> repo
> > > >>> (docs/manual directory), is built with Middleman, and publicly
> hosted
> > > at
> > > >>> https://docs.prediction.io
> > > >>>
> > > >>> === Initial Source and Intellectual Property Submission Plan ===
> > > >>> Currently, the PredictionIO codebase is distributed under the
> Apache
> > > 2.0
> > > >>> License and hosted on GitHub:
> > > >>> https://github.com/PredictionIO/PredictionIO
> > > >>>
> > > >>> === External Dependencies ===
> > > >>> PredictionIO has the following external dependencies:
> > > >>>   * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS
> are
> > > >>> needed)
> > > >>>   * Apache Spark 1.3.0 for Hadoop 2.4
> > > >>>   * Java SE Development Kit 8
> > > >>>   * and one of the following sets:
> > > >>>
> > > >>>     * PostgreSQL 9.1
> > > >>>
> > > >>>
> > > >>> or
> > > >>>
> > > >>>
> > > >>> * MySQL 5.1
> > > >>>
> > > >>>   or
> > > >>>
> > > >>>
> > > >>>   * Apache HBase 0.98.6
> > > >>>
> > > >>>
> > > >>> * Elasticsearch 1.4.0
> > > >>>
> > > >>> Upon acceptance to the incubator, we would begin a thorough
> analysis
> > of
> > > >>> all transitive dependencies to verify this information and
> introduce
> > > >>> license checking into the build and release process by integrating
> > with
> > > >>> Apache RAT.
> > > >>>
> > > >>> === Cryptography ===
> > > >>> PredictionIO does not include cryptographic code. We utilize
> standard
> > > >>> JCE and JSSE APIs provided by the Java Runtime Environment.
> > > >>>
> > > >>> === Required Resources ===
> > > >>> We request that following resources be created for the project to
> use
> > > >>>
> > > >>> ==== Mailing lists ====
> > > >>>
> > > >>> predictionio-private@incubator.apache.org (with moderated
> > > subscriptions)
> > > >>>
> > > >>> predictionio-dev
> > > >>>
> > > >>> predictionio-user
> > > >>>
> > > >>> predictionio-commits
> > > >>>
> > > >>> We will migrate the existing PredictionIO mailing lists.
> > > >>>
> > > >>> ==== Git repository ====
> > > >>> The PredictionIO team would like to use Git for source control, due
> > to
> > > >>> our
> > > >>> current use of GitHub.
> > > >>>
> > > >>> git://git.apache.org/incubator-predictionio
> > > >>>
> > > >>> ==== Documentation ====
> > > >>> https://predictionio.incubator.apache.org/docs/
> > > >>>
> > > >>> ==== JIRA instance ====
> > > >>> PredictionIO currently uses the GitHub issue tracking system
> > associated
> > > >>> with its repository:
> > > https://github.com/PredictionIO/PredictionIO/issues
> > > >>> .
> > > >>> We will migrate to Apache JIRA.
> > > >>>
> > > >>> JIRA PREDICTIONIO
> > > >>> https://issues.apache.org/jira/browse/PREDICTIONIO
> > > >>>
> > > >>> ==== Other Resources ====
> > > >>> * TravisCI for builds and test running.
> > > >>>
> > > >>> * PredictionIO's documentation, included in the code repo
> > (docs/manual
> > > >>> directory), is built with Middleman and publicly hosted
> > > >>> https://docs.prediction.io
> > > >>>
> > > >>> * A blog to drive adoption and excitement at
> > > https://blog.prediction.io
> > > >>>
> > > >>> === Initial Committers ===
> > > >>>
> > > >>> * Pat Ferrell
> > > >>>
> > > >>> * Tamas Jambor
> > > >>>
> > > >>> * Justin Yip
> > > >>>
> > > >>> * Xusen Yin
> > > >>>
> > > >>> * Lee Moon Soo
> > > >>>
> > > >>> * Donald Szeto
> > > >>>
> > > >>> * Kenneth Chan
> > > >>>
> > > >>> * Tom Chan
> > > >>>
> > > >>> * Simon Chan
> > > >>>
> > > >>> * Marco Vivero
> > > >>>
> > > >>> * Matthew Tovbin
> > > >>>
> > > >>> * Yevgeny Khodorkovsky
> > > >>>
> > > >>> * Felipe Oliveira
> > > >>>
> > > >>> * Vitaly Gordon
> > > >>>
> > > >>> === Affiliations ===
> > > >>>
> > > >>> * Pat Ferrell - ActionML
> > > >>>
> > > >>> * Tamas Jambor - Channel4
> > > >>>
> > > >>> * Justin Yip - independent
> > > >>>
> > > >>> * Xusen Yin - USC
> > > >>>
> > > >>> * Lee Moon Soo - NFLabs
> > > >>>
> > > >>> * Donald Szeto - Salesforce
> > > >>>
> > > >>> * Kenneth Chan - Salesforce
> > > >>>
> > > >>> * Tom Chan - Salesforce
> > > >>>
> > > >>> * Simon Chan - Salesforce
> > > >>>
> > > >>> * Marco Vivero - Salesforce
> > > >>>
> > > >>> * Matthew Tovbin - Salesforce
> > > >>>
> > > >>> * Yevgeny Khodorkovsky - Salesforce
> > > >>>
> > > >>> * Felipe Oliveira - Salesforce
> > > >>>
> > > >>> * Vitaly Gordon - Salesforce
> > > >>>
> > > >>> === Sponsors ===
> > > >>>
> > > >>> ==== Champion ====
> > > >>>
> > > >>> Andrew Purtell <apurtell at apache dot org>
> > > >>>
> > > >>> ==== Nominated Mentors ====
> > > >>>
> > > >>> * Andrew Purtell <apurtell at apache dot org>
> > > >>>
> > > >>> * James Taylor <jtaylor at apache dot org>
> > > >>>
> > > >>> * Lars Hofhansl <larsh at apache dot org>
> > > >>>
> > > >>> * Suneel Marthi <smarthi at apache dot org>
> > > >>>
> > > >>> * Xiangrui Meng <meng at apache dot org>
> > > >>>
> > > >>> * Luciano Resende <lresende at apache dot org>
> > > >>>
> > > >>> ==== Sponsoring Entity ====
> > > >>>
> > > >>> Apache Incubator PMC
> > > >>>
> > > >>
> > > >>
> ---------------------------------------------------------------------
> > > >> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> > > >> For additional commands, e-mail: general-help@incubator.apache.org
> > > >>
> > > >>
> > > > --
> > > > Jean-Baptiste Onofré
> > > > jbonofre@apache.org
> > > > http://blog.nanthrax.net
> > > > Talend - http://www.talend.com
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> > > > For additional commands, e-mail: general-help@incubator.apache.org
> > > >
> > > >
> > >
> >
>

Re: [DISCUSS] PredictionIO incubation proposal

Posted by Henry Saputra <he...@gmail.com>.
Cool, this will make the code grant process easier =)

The initial committers and mentors look great.
I am sure more will come as contributions start pouring into the project.

Looking forward to the VOTE thread soon.

- Henry

On Mon, May 16, 2016 at 12:07 PM, Simon Chan <si...@salesforce.com> wrote:

> Yes, it includes everyone who previously contributed code from PredictionIO
> before the acquisition and still want to be involved in the project.
>
> We may have missed "Alex Merritt", going to add him to the list soon.
>
> Simon
>
>
> On Mon, May 16, 2016 at 11:58 AM, Suneel Marthi <sm...@apache.org>
> wrote:
>
> > I do have a question about the proposed list of committers.
> >
> > Does the list also include all of those folks who were with PredictionIO
> > (and had contributed to the project) and then chose to leave when PIO was
> > acquired by Salesforce?
> >
> >
> >
> >
> > On Mon, May 16, 2016 at 1:13 PM, Jean-Baptiste Onofré <jb...@nanthrax.net>
> > wrote:
> >
> > > By the way, we have some discussion about integrating Zeppelin with
> > > Beam ;)
> > >
> > > Regards
> > > JB
> > >
> > > On 05/15/2016 02:32 AM, Roman Shaposhnik wrote:
> > >
> > > >> Super excited to see this proposal! This will finally allow us to have
> > > >> an ASF managed backend for next generation data-driven apps that I see
> > > >> emerging quite rapidly.
> > > >>
> > > >> The proposal looks great to me (although I'd recommend calling Scala
> > > >> as an implementation language more prominently since it may attract
> > > >> additional developers with affinity to it).
> > > >>
> > > >> I do have two questions about technology:
> > > >>     1. do you think it would be possible to leverage Apache Beam
> > > >>         (incubating) for abstracting away dependency on execution
> > > >>         frameworks? My understanding is that PredictionIO currently
> > > >>         only runs on Spark.
> > > >>     2. is there a potential integration with Apache Zeppelin possible?
> > >>
> > >> Thanks,
> > >> Roman.
> > >>
> > >>
> > > --
> > > Jean-Baptiste Onofré
> > > jbonofre@apache.org
> > > http://blog.nanthrax.net
> > > Talend - http://www.talend.com
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> > > For additional commands, e-mail: general-help@incubator.apache.org
> > >
> > >
> >
>

Re: [DISCUSS] PredictionIO incubation proposal

Posted by Andrew Purtell <ap...@apache.org>.
I am just waiting on an acceptance from Alex to join the initial committers
list, and will then update the proposal on the wiki.


On Mon, May 16, 2016 at 12:07 PM, Simon Chan <si...@salesforce.com> wrote:

> Yes, it includes everyone who previously contributed code from PredictionIO
> before the acquisition and still want to be involved in the project.
>
> We may have missed "Alex Merritt", going to add him to the list soon.
>
> Simon
>
>
> On Mon, May 16, 2016 at 11:58 AM, Suneel Marthi <sm...@apache.org>
> wrote:
>
> > I do have a question about the proposed list of committers.
> >
> > Does the list also include all of those folks who were with PredictionIO
> > (and had contributed to the project) and then chose to leave when PIO was
> > acquired by Salesforce?
> >
> >
> >
> >
> > On Mon, May 16, 2016 at 1:13 PM, Jean-Baptiste Onofré <jb...@nanthrax.net>
> > wrote:
> >
> > > By the way, we have some discussion about integrating Zeppelin with
> > > Beam ;)
> > >
> > > Regards
> > > JB
> > >
> > > On 05/15/2016 02:32 AM, Roman Shaposhnik wrote:
> > >
> > > >> Super excited to see this proposal! This will finally allow us to have
> > > >> an ASF managed backend for next generation data-driven apps that I see
> > > >> emerging quite rapidly.
> > > >>
> > > >> The proposal looks great to me (although I'd recommend calling Scala
> > > >> as an implementation language more prominently since it may attract
> > > >> additional developers with affinity to it).
> > > >>
> > > >> I do have two questions about technology:
> > > >>     1. do you think it would be possible to leverage Apache Beam
> > > >>         (incubating) for abstracting away dependency on execution
> > > >>         frameworks? My understanding is that PredictionIO currently
> > > >>         only runs on Spark.
> > > >>     2. is there a potential integration with Apache Zeppelin possible?
> > >>
> > >> Thanks,
> > >> Roman.
> > >>
> > > --
> > > Jean-Baptiste Onofré
> > > jbonofre@apache.org
> > > http://blog.nanthrax.net
> > > Talend - http://www.talend.com
> >
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: [DISCUSS] PredictionIO incubation proposal

Posted by Simon Chan <si...@salesforce.com>.
Yes, it includes everyone who contributed code at PredictionIO before the
acquisition and still wants to be involved in the project.

We may have missed "Alex Merritt"; we are going to add him to the list soon.

Simon


On Mon, May 16, 2016 at 11:58 AM, Suneel Marthi <sm...@apache.org> wrote:

> I do have a question about the proposed list of committers.
>
> Does the list also include all of those folks who were with PredictionIO
> (and had contributed to the project) and then chose to leave when PIO was
> acquired by Salesforce?
>
>
>
>
> On Mon, May 16, 2016 at 1:13 PM, Jean-Baptiste Onofré <jb...@nanthrax.net>
> wrote:
>
> > By the way, we have some discussion about integrating Zeppelin with
> > Beam ;)
> >
> > Regards
> > JB
> >
> > On 05/15/2016 02:32 AM, Roman Shaposhnik wrote:
> >
> >> Super excited to see this proposal! This will finally allow us to have
> >> an ASF managed backend for next generation data-driven apps that I see
> >> emerging quite rapidly.
> >>
> >> The proposal looks great to me (although I'd recommend calling out Scala
> >> as an implementation language more prominently since it may attract
> >> additional developers with affinity to it).
> >>
> >> I do have two questions about technology:
> >>     1. do you think it would be possible to leverage Apache Beam
> >>        (incubating) for abstracting away dependency on execution
> >>        frameworks? My understanding is that PredictionIO currently only
> >>        runs on Spark.
> >>     2. is there a potential integration with Apache Zeppelin possible?
> >>
> >> Thanks,
> >> Roman.
> >>
> > --
> > Jean-Baptiste Onofré
> > jbonofre@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
>

Re: [DISCUSS] PredictionIO incubation proposal

Posted by Suneel Marthi <sm...@apache.org>.
I do have a question about the proposed list of committers.

Does the list also include all of those folks who were with PredictionIO
(and had contributed to the project) and then chose to leave when PIO was
acquired by Salesforce?




On Mon, May 16, 2016 at 1:13 PM, Jean-Baptiste Onofré <jb...@nanthrax.net>
wrote:

> By the way, we have some discussion about integrating Zeppelin with Beam ;)
>
> Regards
> JB
>
> On 05/15/2016 02:32 AM, Roman Shaposhnik wrote:
>
>> Super excited to see this proposal! This will finally allow us to have
>> an ASF managed backend for next generation data-driven apps that I see
>> emerging quite rapidly.
>>
>> The proposal looks great to me (although I'd recommend calling out Scala
>> as an implementation language more prominently since it may attract
>> additional developers with affinity to it).
>>
>> I do have two questions about technology:
>>     1. do you think it would be possible to leverage Apache Beam
>>        (incubating) for abstracting away dependency on execution
>>        frameworks? My understanding is that PredictionIO currently only
>>        runs on Spark.
>>     2. is there a potential integration with Apache Zeppelin possible?
>>
>> Thanks,
>> Roman.
>>
>> On Fri, May 13, 2016 at 1:41 PM, Andrew Purtell <ap...@apache.org>
>> wrote:
>>
>>> Greetings,
>>>
>>> It is my pleasure to propose the PredictionIO project for incubation at
>>> the Apache Software Foundation.
>>>
>>> PredictionIO is a popular open source Machine Learning Server built on top
>>> of a state-of-the-art open source stack, including several Apache
>>> technologies, that enables developers to manage and deploy production-ready
>>> predictive services for various kinds of machine learning tasks, with more
>>> than 400 production deployments around the world and a growing contributor
>>> community.
>>>
>>>
>>> The text of the proposal is included below and is also available at
>>> https://wiki.apache.org/incubator/PredictionIO
>>>
>>> Best regards,
>>> Andrew Purtell
>>>
>>>
>>> = PredictionIO Proposal =
>>>
>>> === Abstract ===
>>> PredictionIO is an open source Machine Learning Server built on top of
>>> state-of-the-art open source stack, that enables developers to manage and
>>> deploy production-ready predictive services for various kinds of machine
>>> learning tasks.
>>>
>>> === Proposal ===
>>> The PredictionIO platform consists of the following components:
>>>
>>>   * PredictionIO framework - provides the machine learning stack for
>>>   building, evaluating and deploying engines with machine learning
>>>   algorithms. It uses Apache Spark for processing.
>>>
>>>   * Event Server - the machine learning analytics layer for unifying
>>> events
>>>   from multiple platforms. It can use Apache HBase or any JDBC backends
>>>   as its data store.
>>>
>>> The PredictionIO community also maintains a
>>>
>>> Template Gallery, a place to
>>> publish and download (free or proprietary) engine templates for different
>>> types of machine learning applications, and is a complemental part of the
>>> project. At this point we exclude the Template Gallery from the proposal,
>>> as it has a separate set of contributors and we’re not familiar with an
>>> Apache approved mechanism to maintain such a gallery.
>>>
>>> You can find the Template Gallery at https://templates.prediction.io/
>>>
>>> === Background ===
>>> PredictionIO was started with a mission to democratize and bring machine
>>> learning to the masses.
>>>
>>> Machine learning has traditionally been a luxury for big companies like
>>> Google, Facebook, and Netflix. There are ML libraries and tools lying
>>> around the internet but the effort of putting them all together as a
>>> production-ready infrastructure is a very resource-intensive task that is
>>> remotely reachable by individuals or small businesses.
>>>
>>> PredictionIO is a production-ready, full stack machine learning system
>>> that
>>> allows organizations of any scale to quickly deploy machine learning
>>> capabilities. It comes with official and community-contributed machine
>>> learning engine templates that are easy to customize.
>>>
>>> === Rationale ===
>>> As usage and number of contributors to PredictionIO has grown bigger and
>>> more diverse, we have sought for an independent framework for the project
>>> to keep thriving. We believe the Apache foundation is a great fit.
>>> Joining
>>> Apache would ensure that tried and true processes and procedures are in
>>> place for the growing number of organizations interested in contributing
>>> to PredictionIO. PredictionIO is also a good fit for the Apache
>>> foundation.
>>> PredictionIO was built on top of several Apache projects (HBase, Spark,
>>> Hadoop). We are familiar with the Apache process and believe that the
>>> democratic and meritocratic nature of the foundation aligns with the
>>> project goals.
>>>
>>> === Initial Goals ===
>>> The initial milestones will be to move the existing codebase to Apache
>>> and
>>> integrate with the Apache development process. Once this is accomplished,
>>> we plan for incremental development and releases that follow the Apache
>>> guidelines, as well as growing our developer and user communities.
>>>
>>> === Current Status ===
>>> PredictionIO has undergone nine minor releases and many patches.
>>> PredictionIO is being used in production by Salesforce.com as well as
>>> many
>>> other organizations and apps. The PredictionIO codebase is currently
>>> hosted at GitHub, which will form the basis of the Apache git repository.
>>>
>>> ==== Meritocracy ====
>>> We plan to invest in supporting a meritocracy. We will discuss the
>>> requirements in an open forum. We intend to invite additional developers
>>> to participate. We will encourage and monitor community participation so
>>> that privileges can be extended to those that contribute.
>>>
>>> ==== Community ====
>>> Acceptance into the Apache foundation would bolster the already strong
>>> user and developer community around PredictionIO. That community includes
>>> many contributors from various other companies, and an active mailing
>>> list
>>> composed of hundreds of users.
>>>
>>> ==== Core Developers ====
>>> The core developers of our project are listed in our contributors and
>>> initial PPMC below. Though many are employed at Salesforce.com, there are
>>> also engineers from ActionML, and independent developers.
>>>
>>> === Alignment ===
>>> The ASF is the natural choice to host the PredictionIO project as its
>>> goal
>>> is democratizing Machine Learning by making it more easily accessible to
>>> every user/developer. PredictionIO is built on top of several top level
>>> Apache projects as outlined above.
>>>
>>> === Known Risks ===
>>>
>>> ==== Orphaned products ====
>>> PredictionIO has a solid and growing community. It is deployed on
>>> production environments by companies of all sizes to run various kinds of
>>> predictive engines.
>>>
>>> In addition to the community contribution to PredictionIO framework, the
>>> community is also actively contributing new engines to the Template
>>> Gallery as well as SDKs and documentation for the project. Salesforce is
>>> committed to utilize and advance the PredictionIO code base and support
>>> its user community.
>>>
>>> ==== Inexperience with Open Source ====
>>> PredictionIO has existed as a healthy open source project for almost two
>>> years and is the most starred Scala project on GitHub. All of the
>>> proposed
>>> committers have contributed to ASF and Linux Foundation open source
>>> projects. Several current committers on Apache projects and Apache
>>> Members
>>> are involved in this proposal and intend to provide mentorship.
>>>
>>> ==== Homogeneous Developers ====
>>> The initial list of committers includes developers from several
>>> institutions, including Salesforce, ActionML, Channel4, USC as well as
>>> unaffiliated developers.
>>>
>>> ==== Reliance on Salaried Developers ====
>>> Like most open source projects, PredictionIO receives substantial support
>>> from salaried developers. PredictionIO development is partially supported
>>> by Salesforce.com, but there are many contributors from various other
>>> companies, and an active mailing list composed of hundreds of users. We
>>> will continue our efforts to ensure stewardship of the project to be
>>> independent of salaried developers by meritocratically promoting those
>>> contributors to committers.
>>>
>>> ==== Relationships with Other Apache Product ====
>>> PredictionIO relies heavily on top level apache projects such as Apache
>>> Spark, HBase and Hadoop. However it brings a distinguished functionality,
>>> rather than just an abstraction - Machine Learning in a plug-and-play
>>> fashion.
>>>
>>> Compared to Apache Mahout, which focuses on the development of a wide
>>> variety of algorithms, PredictionIO offers a platform to manage the whole
>>> machine learning workflow, including data collection, data preparation,
>>> modeling, deployment and management of predictive services in production
>>> environments.
>>>
>>> ==== An Excessive Fascination with the Apache Brand ====
>>> PredictionIO is already a widely known open source project. This proposal
>>> is not for the purpose of generating publicity. Rather, the primary
>>> benefits to joining Apache are those outlined in the Rationale section.
>>>
>>> === Documentation ===
>>> PredictionIO boasts rich and live documentation, included in the code
>>> repo
>>> (docs/manual directory), is built with Middleman, and publicly hosted at
>>> https://docs.prediction.io
>>>
>>> === Initial Source and Intellectual Property Submission Plan ===
>>> Currently, the PredictionIO codebase is distributed under the Apache 2.0
>>> License and hosted on GitHub:
>>> https://github.com/PredictionIO/PredictionIO
>>>
>>> === External Dependencies ===
>>> PredictionIO has the following external dependencies:
>>>   * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS are
>>> needed)
>>>   * Apache Spark 1.3.0 for Hadoop 2.4
>>>   * Java SE Development Kit 8
>>>   * and one of the following sets:
>>>
>>>     * PostgreSQL 9.1
>>>
>>>
>>> or
>>>
>>>
>>> * MySQL 5.1
>>>
>>>   or
>>>
>>>
>>>   * Apache HBase 0.98.6
>>>
>>>
>>> * Elasticsearch 1.4.0
>>>
>>> Upon acceptance to the incubator, we would begin a thorough analysis of
>>> all transitive dependencies to verify this information and introduce
>>> license checking into the build and release process by integrating with
>>> Apache RAT.
>>>
>>> === Cryptography ===
>>> PredictionIO does not include cryptographic code. We utilize standard
>>> JCE and JSSE APIs provided by the Java Runtime Environment.
>>>
>>> === Required Resources ===
>>> We request that following resources be created for the project to use
>>>
>>> ==== Mailing lists ====
>>>
>>> predictionio-private@incubator.apache.org (with moderated subscriptions)
>>>
>>> predictionio-dev
>>>
>>> predictionio-user
>>>
>>> predictionio-commits
>>>
>>> We will migrate the existing PredictionIO mailing lists.
>>>
>>> ==== Git repository ====
>>> The PredictionIO team would like to use Git for source control, due to
>>> our
>>> current use of GitHub.
>>>
>>> git://git.apache.org/incubator-predictionio
>>>
>>> ==== Documentation ====
>>> https://predictionio.incubator.apache.org/docs/
>>>
>>> ==== JIRA instance ====
>>> PredictionIO currently uses the GitHub issue tracking system associated
>>> with its repository: https://github.com/PredictionIO/PredictionIO/issues
>>> .
>>> We will migrate to Apache JIRA.
>>>
>>> JIRA PREDICTIONIO
>>> https://issues.apache.org/jira/browse/PREDICTIONIO
>>>
>>> ==== Other Resources ====
>>> * TravisCI for builds and test running.
>>>
>>> * PredictionIO's documentation, included in the code repo (docs/manual
>>> directory), is built with Middleman and publicly hosted
>>> https://docs.prediction.io
>>>
>>> * A blog to drive adoption and excitement at https://blog.prediction.io
>>>
>>> === Initial Committers ===
>>>
>>> * Pat Ferrell
>>>
>>> * Tamas Jambor
>>>
>>> * Justin Yip
>>>
>>> * Xusen Yin
>>>
>>> * Lee Moon Soo
>>>
>>> * Donald Szeto
>>>
>>> * Kenneth Chan
>>>
>>> * Tom Chan
>>>
>>> * Simon Chan
>>>
>>> * Marco Vivero
>>>
>>> * Matthew Tovbin
>>>
>>> * Yevgeny Khodorkovsky
>>>
>>> * Felipe Oliveira
>>>
>>> * Vitaly Gordon
>>>
>>> === Affiliations ===
>>>
>>> * Pat Ferrell - ActionML
>>>
>>> * Tamas Jambor - Channel4
>>>
>>> * Justin Yip - independent
>>>
>>> * Xusen Yin - USC
>>>
>>> * Lee Moon Soo - NFLabs
>>>
>>> * Donald Szeto - Salesforce
>>>
>>> * Kenneth Chan - Salesforce
>>>
>>> * Tom Chan - Salesforce
>>>
>>> * Simon Chan - Salesforce
>>>
>>> * Marco Vivero - Salesforce
>>>
>>> * Matthew Tovbin - Salesforce
>>>
>>> * Yevgeny Khodorkovsky - Salesforce
>>>
>>> * Felipe Oliveira - Salesforce
>>>
>>> * Vitaly Gordon - Salesforce
>>>
>>> === Sponsors ===
>>>
>>> ==== Champion ====
>>>
>>> Andrew Purtell <apurtell at apache dot org>
>>>
>>> ==== Nominated Mentors ====
>>>
>>> * Andrew Purtell <apurtell at apache dot org>
>>>
>>> * James Taylor <jtaylor at apache dot org>
>>>
>>> * Lars Hofhansl <larsh at apache dot org>
>>>
>>> * Suneel Marthi <smarthi at apache dot org>
>>>
>>> * Xiangrui Meng <meng at apache dot org>
>>>
>>> * Luciano Resende <lresende at apache dot org>
>>>
>>> ==== Sponsoring Entity ====
>>>
>>> Apache Incubator PMC
>>>
>>
> --
> Jean-Baptiste Onofré
> jbonofre@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com

Re: [DISCUSS] PredictionIO incubation proposal

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
By the way, we have some discussion about integrating Zeppelin with Beam ;)

Regards
JB

On 05/15/2016 02:32 AM, Roman Shaposhnik wrote:
> Super excited to see this proposal! This will finally allow us to have
> an ASF-managed backend for next-generation data-driven apps that I see
> emerging quite rapidly.
>
> The proposal looks great to me (although I'd recommend calling out Scala
> as the implementation language more prominently, since it may attract
> additional developers with an affinity for it).
>
> I do have two questions about technology:
>     1. Do you think it would be possible to leverage Apache Beam (incubating)
>         for abstracting away the dependency on execution frameworks? My
>         understanding is that PredictionIO currently only runs on Spark.
>     2. Is there a potential integration with Apache Zeppelin?
>
> Thanks,
> Roman.
>

-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org


Re: [DISCUSS] PredictionIO incubation proposal

Posted by Simon Chan <si...@salesforce.com>.
Thanks Roman.

1. Apache Beam looks promising. I agree it could be extremely useful in,
for example, the Data Preparator of a DASE-architecture engine in
PredictionIO, so that it can leverage Spark, Flink, or Google Dataflow. I
look forward to hearing more about it.

2. The integration with Apache Zeppelin is definitely a great suggestion.
In fact, Lee Moon Soo, an initial committer of Zeppelin, is also listed as
a committer in this proposal. Some work has been done previously (
https://docs.prediction.io/datacollection/analytics-zeppelin/), but I
anticipate a tighter collaboration with Apache Zeppelin after PredictionIO
becomes an Apache project.

Regards,
Simon

On Saturday, May 14, 2016, Andrew Purtell <an...@gmail.com> wrote:

> Yikes, apologies for the formatting. It looked fine in Gmail when I sent
> it, alas.
>
> I must let the proposers respond to the technical questions, but I think I
> can make the general observation that would-be contributors proposing and
> performing work on new and better Apache ecosystem integrations would be
> excellent for the health of the new podling and the ecosystem at large.
>
>
> > On May 14, 2016, at 5:32 PM, Roman Shaposhnik <roman@shaposhnik.org> wrote:
> >
> > Super excited to see this proposal! This will finally allow us to have
> > an ASF-managed backend for next-generation data-driven apps that I see
> > emerging quite rapidly.
> >
> > The proposal looks great to me (although I'd recommend calling out Scala
> > as the implementation language more prominently, since it may attract
> > additional developers with an affinity for it).
> >
> > I do have two questions about technology:
> >   1. Do you think it would be possible to leverage Apache Beam (incubating)
> >       for abstracting away the dependency on execution frameworks? My
> >       understanding is that PredictionIO currently only runs on Spark.
> >   2. Is there a potential integration with Apache Zeppelin?
> >
> > Thanks,
> > Roman.
> >
> >> On Fri, May 13, 2016 at 1:41 PM, Andrew Purtell <apurtell@apache.org> wrote:
> >> Greetings,
> >>
> >> It is my pleasure to propose the PredictionIO project for incubation at
> >> the Apache Software Foundation.
> >>
> >> PredictionIO is a popular open source Machine Learning Server built on
> >> top of a state-of-the-art open source stack, including several Apache
> >> technologies, that enables developers to manage and deploy
> >> production-ready predictive services for various kinds of machine
> >> learning tasks, with more than 400 production deployments around the
> >> world and a growing contributor community.
> >>
> >>
> >> The text of the proposal is included below and is also available at
> >> https://wiki.apache.org/incubator/PredictionIO
> >>
> >> Best regards,
> >> Andrew Purtell
> >>
> >>
> >> = PredictionIO Proposal =
> >>
> >> === Abstract ===
> >> PredictionIO is an open source Machine Learning Server built on top of a
> >> state-of-the-art open source stack that enables developers to manage and
> >> deploy production-ready predictive services for various kinds of machine
> >> learning tasks.
> >>
> >> === Proposal ===
> >> The PredictionIO platform consists of the following components:
> >>
> >> * PredictionIO framework - provides the machine learning stack for
> >> building, evaluating and deploying engines with machine learning
> >> algorithms. It uses Apache Spark for processing.
> >>
> >> * Event Server - the machine learning analytics layer for unifying events
> >> from multiple platforms. It can use Apache HBase or any JDBC backend
> >> as its data store.
> >>
> >> The PredictionIO community also maintains a Template Gallery, a place to
> >> publish and download (free or proprietary) engine templates for different
> >> types of machine learning applications, which is a complementary part of
> >> the project. At this point we exclude the Template Gallery from the
> >> proposal, as it has a separate set of contributors and we’re not familiar
> >> with an Apache-approved mechanism to maintain such a gallery.
> >>
> >> You can find the Template Gallery at https://templates.prediction.io/
> >>
> >> === Background ===
> >> PredictionIO was started with a mission to democratize and bring machine
> >> learning to the masses.
> >>
> >> Machine learning has traditionally been a luxury for big companies like
> >> Google, Facebook, and Netflix. There are ML libraries and tools lying
> >> around the internet, but the effort of putting them all together as a
> >> production-ready infrastructure is a very resource-intensive task that is
> >> barely within reach of individuals or small businesses.
> >>
> >> PredictionIO is a production-ready, full stack machine learning system
> >> that allows organizations of any scale to quickly deploy machine learning
> >> capabilities. It comes with official and community-contributed machine
> >> learning engine templates that are easy to customize.
> >>
> >> === Rationale ===
> >> As the usage of and number of contributors to PredictionIO have grown
> >> bigger and more diverse, we have sought an independent framework for the
> >> project to keep thriving. We believe the Apache foundation is a great fit.
> >> Joining Apache would ensure that tried and true processes and procedures
> >> are in place for the growing number of organizations interested in
> >> contributing to PredictionIO. PredictionIO is also a good fit for the
> >> Apache foundation. PredictionIO was built on top of several Apache
> >> projects (HBase, Spark, Hadoop). We are familiar with the Apache process
> >> and believe that the democratic and meritocratic nature of the foundation
> >> aligns with the project goals.
> >>
> >> === Initial Goals ===
> >> The initial milestones will be to move the existing codebase to Apache
> >> and integrate with the Apache development process. Once this is
> >> accomplished, we plan for incremental development and releases that
> >> follow the Apache guidelines, as well as growing our developer and user
> >> communities.
> >>
> >> === Current Status ===
> >> PredictionIO has undergone nine minor releases and many patches.
> >> PredictionIO is being used in production by Salesforce.com as well as
> >> many other organizations and apps. The PredictionIO codebase is currently
> >> hosted at GitHub, which will form the basis of the Apache git repository.
> >>
> >> ==== Meritocracy ====
> >> We plan to invest in supporting a meritocracy. We will discuss the
> >> requirements in an open forum. We intend to invite additional developers
> >> to participate. We will encourage and monitor community participation so
> >> that privileges can be extended to those that contribute.
> >>
> >> ==== Community ====
> >> Acceptance into the Apache foundation would bolster the already strong
> >> user and developer community around PredictionIO. That community
> >> includes many contributors from various other companies, and an active
> >> mailing list composed of hundreds of users.
> >>
> >> ==== Core Developers ====
> >> The core developers of our project are listed in our contributors and
> >> initial PPMC below. Though many are employed at Salesforce.com, there
> are
> >> also engineers from ActionML, and independent developers.
> >>
> >> === Alignment ===
> >> The ASF is the natural choice to host the PredictionIO project as its
> goal
> >> is democratizing Machine Learning by making it more easily accessible to
> >> every user/developer. PredictionIO is built on top of several top level
> >> Apache projects as outlined above.
> >>
> >> === Known Risks ===
> >>
> >> ==== Orphaned products ====
> >> PredictionIO has a solid and growing community. It is deployed on
> >> production environments by companies of all sizes to run various kinds
> of
> >> predictive engines.
> >>
> >> In addition to the community contribution to PredictionIO framework, the
> >> community is also actively contributing new engines to the Template
> >> Gallery as well as SDKs and documentation for the project. Salesforce is
> >> committed to utilize and advance the PredictionIO code base and support
> >> its user community.
> >>
> >> ==== Inexperience with Open Source ====
> >> PredictionIO has existed as a healthy open source project for almost two
> >> years and is the most starred Scala project on GitHub. All of the
> proposed
> >> committers have contributed to ASF and Linux Foundation open source
> >> projects. Several current committers on Apache projects and Apache
> Members
> >> are involved in this proposal and intend to provide mentorship.
> >>
> >> ==== Homogeneous Developers ====
> >> The initial list of committers includes developers from several
> >> institutions, including Salesforce, ActionML, Channel4, USC as well as
> >> unaffiliated developers.
> >>
> >> ==== Reliance on Salaried Developers ====
> >> Like most open source projects, PredictionIO receives substantial
> support
> >> from salaried developers. PredictionIO development is partially
> supported
> >> by Salesforce.com, but there are many contributors from various other
> >> companies, and an active mailing list composed of hundreds of users. We
> >> will continue our efforts to ensure stewardship of the project to be
> >> independent of salaried developers by meritocratically promoting those
> >> contributors to committers.
> >>
> >> ==== Relationships with Other Apache Product ====
> >> PredictionIO relies heavily on top level apache projects such as Apache
> >> Spark, HBase and Hadoop. However it brings a distinguished
> functionality,
> >> rather than just an abstraction - Machine Learning in a plug-and-play
> >> fashion.
> >>
> >> Compared to Apache Mahout, which focuses on the development of a wide
> >> variety of algorithms, PredictionIO offers a platform to manage the
> whole
> >> machine learning workflow, including data collection, data preparation,
> >> modeling, deployment and management of predictive services in production
> >> environments.
> >>
> >> ==== An Excessive Fascination with the Apache Brand ====
> >> PredictionIO is already a widely known open source project. This
> proposal
> >> is not for the purpose of generating publicity. Rather, the primary
> >> benefits to joining Apache are those outlined in the Rationale section.
> >>
> >> === Documentation ===
> >> PredictionIO boasts rich and live documentation, included in the code
> repo
> >> (docs/manual directory), is built with Middleman, and publicly hosted at
> >> https://docs.prediction.io
> >>
> >> === Initial Source and Intellectual Property Submission Plan ===
> >> Currently, the PredictionIO codebase is distributed under the Apache 2.0
> >> License and hosted on GitHub:
> https://github.com/PredictionIO/PredictionIO
> >>
> >> === External Dependencies ===
> >> PredictionIO has the following external dependencies:
> >> * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS are
> needed)
> >> * Apache Spark 1.3.0 for Hadoop 2.4
> >> * Java SE Development Kit 8
> >> * and one of the following sets:
> >>
> >>   * PostgreSQL 9.1
> >>
> >>
> >> or
> >>
> >>
> >> * MySQL 5.1
> >>
> >> or
> >>
> >>
> >> * Apache HBase 0.98.6
> >>
> >>
> >> * Elasticsearch 1.4.0
> >>
> >> Upon acceptance to the incubator, we would begin a thorough analysis of
> >> all transitive dependencies to verify this information and introduce
> >> license checking into the build and release process by integrating with
> >> Apache RAT.
> >>
> >> === Cryptography ===
> >> PredictionIO does not include cryptographic code. We utilize standard
> >> JCE and JSSE APIs provided by the Java Runtime Environment.
> >>
> >> === Required Resources ===
> >> We request that following resources be created for the project to use
> >>
> >> ==== Mailing lists ====
> >>
> >> predictionio-private@incubator.apache.org <javascript:;> (with
> moderated subscriptions)
> >>
> >> predictionio-dev
> >>
> >> predictionio-user
> >>
> >> predictionio-commits
> >>
> >> We will migrate the existing PredictionIO mailing lists.
> >>
> >> ==== Git repository ====
> >> The PredictionIO team would like to use Git for source control, due to
> our
> >> current use of GitHub.
> >>
> >> git://git.apache.org/incubator-predictionio
> >>
> >> ==== Documentation ====
> >> https://predictionio.incubator.apache.org/docs/
> >>
> >> ==== JIRA instance ====
> >> PredictionIO currently uses the GitHub issue tracking system associated
> >> with its repository:
> https://github.com/PredictionIO/PredictionIO/issues.
> >> We will migrate to Apache JIRA.
> >>
> >> JIRA PREDICTIONIO
> >> https://issues.apache.org/jira/browse/PREDICTIONIO
> >>
> >> ==== Other Resources ====
> >> * TravisCI for builds and test running.
> >>
> >> * PredictionIO's documentation, included in the code repo (docs/manual
> >> directory), is built with Middleman and publicly hosted
> >> https://docs.prediction.io
> >>
> >> * A blog to drive adoption and excitement at https://blog.prediction.io
> >>
> >> === Initial Committers ===
> >>
> >> * Pat Ferrell
> >>
> >> * Tamas Jambor
> >>
> >> * Justin Yip
> >>
> >> * Xusen Yin
> >>
> >> * Lee Moon Soo
> >>
> >> * Donald Szeto
> >>
> >> * Kenneth Chan
> >>
> >> * Tom Chan
> >>
> >> * Simon Chan
> >>
> >> * Marco Vivero
> >>
> >> * Matthew Tovbin
> >>
> >> * Yevgeny Khodorkovsky
> >>
> >> * Felipe Oliveira
> >>
> >> * Vitaly Gordon
> >>
> >> === Affiliations ===
> >>
> >> * Pat Ferrell - ActionML
> >>
> >> * Tamas Jambor - Channel4
> >>
> >> * Justin Yip - independent
> >>
> >> * Xusen Yin - USC
> >>
> >> * Lee Moon Soo - NFLabs
> >>
> >> * Donald Szeto - Salesforce
> >>
> >> * Kenneth Chan - Salesforce
> >>
> >> * Tom Chan - Salesforce
> >>
> >> * Simon Chan - Salesforce
> >>
> >> * Marco Vivero - Salesforce
> >>
> >> * Matthew Tovbin - Salesforce
> >>
> >> * Yevgeny Khodorkovsky - Salesforce
> >>
> >> * Felipe Oliveira - Salesforce
> >>
> >> * Vitaly Gordon - Salesforce
> >>
> >> === Sponsors ===
> >>
> >> ==== Champion ====
> >>
> >> Andrew Purtell <apurtell at apache dot org>
> >>
> >> ==== Nominated Mentors ====
> >>
> >> * Andrew Purtell <apurtell at apache dot org>
> >>
> >> * James Taylor <jtaylor at apache dot org>
> >>
> >> * Lars Hofhansl <larsh at apache dot org>
> >>
> >> * Suneel Marthi <smarthi at apache dot org>
> >>
> >> * Xiangrui Meng <meng at apache dot org>
> >>
> >> * Luciano Resende <lresende at apache dot org>
> >>
> >> ==== Sponsoring Entity ====
> >>
> >> Apache Incubator PMC
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> <javascript:;>
> > For additional commands, e-mail: general-help@incubator.apache.org
> <javascript:;>
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> <javascript:;>
> For additional commands, e-mail: general-help@incubator.apache.org
> <javascript:;>
>
>

Re: [DISCUSS] PredictionIO incubation proposal

Posted by Andrew Purtell <an...@gmail.com>.
Yikes, apologies for the formatting. It looked fine in Gmail when I sent it, alas.

I must let the proposers respond to the technical questions, but I can make one general observation: would-be contributors proposing and performing work on new and better Apache ecosystem integrations would be excellent for the health of the new podling and the ecosystem at large.


> On May 14, 2016, at 5:32 PM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
> 
> Super excited to see this proposal! This will finally allow us to have an
> ASF-managed backend for the next-generation data-driven apps that I see
> emerging quite rapidly.
> 
> The proposal looks great to me (although I'd recommend calling out Scala
> as the implementation language more prominently, since it may attract
> additional developers with an affinity for it).
> 
> I do have two questions about technology:
>   1. Do you think it would be possible to leverage Apache Beam (incubating)
>       to abstract away the dependency on execution frameworks? My understanding
>       is that PredictionIO currently runs only on Spark.
>   2. Is an integration with Apache Zeppelin possible?
> 
> Thanks,
> Roman.
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org


Re: [DISCUSS] PredictionIO incubation proposal

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
Super excited to see this proposal! This will finally allow us to have an
ASF-managed backend for the next-generation data-driven apps that I see
emerging quite rapidly.

The proposal looks great to me (although I'd recommend calling out Scala
as the implementation language more prominently, since it may attract
additional developers with an affinity for it).

I do have two questions about technology:
   1. Do you think it would be possible to leverage Apache Beam (incubating)
       to abstract away the dependency on execution frameworks? My understanding
       is that PredictionIO currently runs only on Spark.
   2. Is an integration with Apache Zeppelin possible?

Thanks,
Roman.

On Fri, May 13, 2016 at 1:41 PM, Andrew Purtell <ap...@apache.org> wrote:
> Greetings,
>
> It is my pleasure to
>
> propose the PredictionIO project for incubation at the Apache Software
> Foundation.
>
> PredictionIO is a
> popular
> open
>
> source Machine Learning Server built on top of a state-of-the-art open
> source stack, including several Apache technologies, that
>
> enables developers to manage and deploy production-ready predictive
> services for various kinds of machine learning tasks
> , with more than 400 production deployments around the world and a growing
> contributor community.
>
>
> The text of the proposal is included below and is also available at
> https://wiki.apache.org/incubator/PredictionIO
>
> Best regards,
> Andrew Purtell
>
>
> = PredictionIO Proposal =
>
> === Abstract ===
> PredictionIO is an open source Machine Learning Server built on top of
> state-of-the-art open source stack, that enables developers to manage and
> deploy production-ready predictive services for various kinds of machine
> learning tasks.
>
> === Proposal ===
> The PredictionIO platform consists of the following components:
>
>  * PredictionIO framework - provides the machine learning stack for
>  building, evaluating and deploying engines with machine learning
>  algorithms. It uses Apache Spark for processing.
>
>  * Event Server - the machine learning analytics layer for unifying events
>  from multiple platforms. It can use Apache HBase or any JDBC backends
>  as its data store.
>
> The PredictionIO community also maintains a
>
> Template Gallery, a place to
> publish and download (free or proprietary) engine templates for different
> types of machine learning applications, and is a complemental part of the
> project. At this point we exclude the Template Gallery from the proposal,
> as it has a separate set of contributors and we’re not familiar with an
> Apache approved mechanism to maintain such a gallery.
>
> You can find the Template Gallery at https://templates.prediction.io/
>
> === Background ===
> PredictionIO was started with a mission to democratize and bring machine
> learning to the masses.
>
> Machine learning has traditionally been a luxury for big companies like
> Google, Facebook, and Netflix. There are ML libraries and tools lying
> around the internet but the effort of putting them all together as a
> production-ready infrastructure is a very resource-intensive task that is
> remotely reachable by individuals or small businesses.
>
> PredictionIO is a production-ready, full stack machine learning system that
> allows organizations of any scale to quickly deploy machine learning
> capabilities. It comes with official and community-contributed machine
> learning engine templates that are easy to customize.
>
> === Rationale ===
> As usage and number of contributors to PredictionIO has grown bigger and
> more diverse, we have sought for an independent framework for the project
> to keep thriving. We believe the Apache foundation is a great fit. Joining
> Apache would ensure that tried and true processes and procedures are in
> place for the growing number of organizations interested in contributing
> to PredictionIO. PredictionIO is also a good fit for the Apache foundation.
> PredictionIO was built on top of several Apache projects (HBase, Spark,
> Hadoop). We are familiar with the Apache process and believe that the
> democratic and meritocratic nature of the foundation aligns with the
> project goals.
>
> === Initial Goals ===
> The initial milestones will be to move the existing codebase to Apache and
> integrate with the Apache development process. Once this is accomplished,
> we plan for incremental development and releases that follow the Apache
> guidelines, as well as growing our developer and user communities.
>
> === Current Status ===
> PredictionIO has undergone nine minor releases and many patches.
> PredictionIO is being used in production by Salesforce.com as well as many
> other organizations and apps. The PredictionIO codebase is currently
> hosted at GitHub, which will form the basis of the Apache git repository.
>
> ==== Meritocracy ====
> We plan to invest in supporting a meritocracy. We will discuss the
> requirements in an open forum. We intend to invite additional developers
> to participate. We will encourage and monitor community participation so
> that privileges can be extended to those that contribute.
>
> ==== Community ====
> Acceptance into the Apache foundation would bolster the already strong
> user and developer community around PredictionIO. That community includes
> many contributors from various other companies, and an active mailing list
> composed of hundreds of users.
>
> ==== Core Developers ====
> The core developers of our project are listed in our contributors and
> initial PPMC below. Though many are employed at Salesforce.com, there are
> also engineers from ActionML, and independent developers.
>
> === Alignment ===
> The ASF is the natural choice to host the PredictionIO project as its goal
> is democratizing Machine Learning by making it more easily accessible to
> every user/developer. PredictionIO is built on top of several top level
> Apache projects as outlined above.
>
> === Known Risks ===
>
> ==== Orphaned products ====
> PredictionIO has a solid and growing community. It is deployed on
> production environments by companies of all sizes to run various kinds of
> predictive engines.
>
> In addition to the community contribution to PredictionIO framework, the
> community is also actively contributing new engines to the Template
> Gallery as well as SDKs and documentation for the project. Salesforce is
> committed to utilize and advance the PredictionIO code base and support
> its user community.
>
> ==== Inexperience with Open Source ====
> PredictionIO has existed as a healthy open source project for almost two
> years and is the most starred Scala project on GitHub. All of the proposed
> committers have contributed to ASF and Linux Foundation open source
> projects. Several current committers on Apache projects and Apache Members
> are involved in this proposal and intend to provide mentorship.
>
> ==== Homogeneous Developers ====
> The initial list of committers includes developers from several
> institutions, including Salesforce, ActionML, Channel4, USC as well as
> unaffiliated developers.
>
> ==== Reliance on Salaried Developers ====
> Like most open source projects, PredictionIO receives substantial support
> from salaried developers. PredictionIO development is partially supported
> by Salesforce.com, but there are many contributors from various other
> companies, and an active mailing list composed of hundreds of users. We
> will continue our efforts to ensure stewardship of the project to be
> independent of salaried developers by meritocratically promoting those
> contributors to committers.
>
> ==== Relationships with Other Apache Product ====
> PredictionIO relies heavily on top level apache projects such as Apache
> Spark, HBase and Hadoop. However it brings a distinguished functionality,
> rather than just an abstraction - Machine Learning in a plug-and-play
> fashion.
>
> Compared to Apache Mahout, which focuses on the development of a wide
> variety of algorithms, PredictionIO offers a platform to manage the whole
> machine learning workflow, including data collection, data preparation,
> modeling, deployment and management of predictive services in production
> environments.
>
> ==== An Excessive Fascination with the Apache Brand ====
> PredictionIO is already a widely known open source project. This proposal
> is not for the purpose of generating publicity. Rather, the primary
> benefits to joining Apache are those outlined in the Rationale section.
>
> === Documentation ===
> PredictionIO boasts rich and live documentation, included in the code repo
> (docs/manual directory), is built with Middleman, and publicly hosted at
> https://docs.prediction.io
>
> === Initial Source and Intellectual Property Submission Plan ===
> Currently, the PredictionIO codebase is distributed under the Apache 2.0
> License and hosted on GitHub: https://github.com/PredictionIO/PredictionIO
>
> === External Dependencies ===
> PredictionIO has the following external dependencies:
>  * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS are needed)
>  * Apache Spark 1.3.0 for Hadoop 2.4
>  * Java SE Development Kit 8
>  * and one of the following sets:
>
>    * PostgreSQL 9.1
>
>
> or
>
>
> * MySQL 5.1
>
>  or
>
>
>  * Apache HBase 0.98.6
>
>
> * Elasticsearch 1.4.0
>
> Upon acceptance to the incubator, we would begin a thorough analysis of
> all transitive dependencies to verify this information and introduce
> license checking into the build and release process by integrating with
> Apache RAT.
>
> === Cryptography ===
> PredictionIO does not include cryptographic code. We utilize standard
> JCE and JSSE APIs provided by the Java Runtime Environment.
>
> === Required Resources ===
> We request that following resources be created for the project to use
>
> ==== Mailing lists ====
>
> predictionio-private@incubator.apache.org (with moderated subscriptions)
>
> predictionio-dev
>
> predictionio-user
>
> predictionio-commits
>
> We will migrate the existing PredictionIO mailing lists.
>
> ==== Git repository ====
> The PredictionIO team would like to use Git for source control, due to our
> current use of GitHub.
>
> git://git.apache.org/incubator-predictionio
>
> ==== Documentation ====
> https://predictionio.incubator.apache.org/docs/
>
> ==== JIRA instance ====
> PredictionIO currently uses the GitHub issue tracking system associated
> with its repository: https://github.com/PredictionIO/PredictionIO/issues.
> We will migrate to Apache JIRA.
>
> JIRA PREDICTIONIO
> https://issues.apache.org/jira/browse/PREDICTIONIO
>
> ==== Other Resources ====
> * TravisCI for builds and test running.
>
> * PredictionIO's documentation, included in the code repo (docs/manual
> directory), is built with Middleman and publicly hosted at
> https://docs.prediction.io
>
> * A blog to drive adoption and excitement at https://blog.prediction.io
>
> === Initial Committers ===
>
> * Pat Ferrell
>
> * Tamas Jambor
>
> * Justin Yip
>
> * Xusen Yin
>
> * Lee Moon Soo
>
> * Donald Szeto
>
> * Kenneth Chan
>
> * Tom Chan
>
> * Simon Chan
>
> * Marco Vivero
>
> * Matthew Tovbin
>
> * Yevgeny Khodorkovsky
>
> * Felipe Oliveira
>
> * Vitaly Gordon
>
> === Affiliations ===
>
> * Pat Ferrell - ActionML
>
> * Tamas Jambor - Channel4
>
> * Justin Yip - independent
>
> * Xusen Yin - USC
>
> * Lee Moon Soo - NFLabs
>
> * Donald Szeto - Salesforce
>
> * Kenneth Chan - Salesforce
>
> * Tom Chan - Salesforce
>
> * Simon Chan - Salesforce
>
> * Marco Vivero - Salesforce
>
> * Matthew Tovbin - Salesforce
>
> * Yevgeny Khodorkovsky - Salesforce
>
> * Felipe Oliveira - Salesforce
>
> * Vitaly Gordon - Salesforce
>
> === Sponsors ===
>
> ==== Champion ====
>
> Andrew Purtell <apurtell at apache dot org>
>
> ==== Nominated Mentors ====
>
> * Andrew Purtell <apurtell at apache dot org>
>
> * James Taylor <jtaylor at apache dot org>
>
> * Lars Hofhansl <larsh at apache dot org>
>
> * Suneel Marthi <smarthi at apache dot org>
>
> * Xiangrui Meng <meng at apache dot org>
>
> * Luciano Resende <lresende at apache dot org>
>
> ==== Sponsoring Entity ====
>
> Apache Incubator PMC
