Posted to general@incubator.apache.org by Poorna Chandra <po...@apache.org> on 2016/03/04 02:29:49 UTC

[VOTE] Accept Tephra into the Apache Incubator

Hi All,

The Tephra proposal was sent out for discussion last week. The proposal is
available at https://wiki.apache.org/incubator/TephraProposal

Please vote to accept Tephra into the Apache Incubator. The vote will be
open for the next 72 hours.

[ ] +1 Accept Tephra as an Apache Incubator podling.
[ ] +0 Abstain.
[ ] -1 Don’t accept Tephra as an Apache Incubator podling because ...

Thanks,
Poorna.

------

= Abstract =

Tephra is a system for providing globally consistent transactions on
top of Apache HBase and other storage engines.

= Proposal =

Tephra is a transaction engine for distributed data stores like Apache HBase.
It provides ACID semantics for concurrent data operations that span region
boundaries in HBase, using Optimistic Concurrency Control.

= Background =

HBase provides strong consistency with row- or region-level ACID
operations. However, it sacrifices cross-region and cross-table
consistency in favor of scalability. This trade-off requires application
developers to handle the complexity of ensuring consistency when their
modifications span region boundaries. By providing support for global
transactions that span regions, tables, or multiple RPCs,
Tephra simplifies application development on top of HBase, without a
significant impact on performance or scalability for many workloads.

Tephra leverages HBase’s native data versioning to provide multi-versioned
concurrency control (MVCC) for transactional reads and writes.
With MVCC capability, each transaction sees its own consistent “snapshot” of
data, providing snapshot isolation of concurrent transactions.
MVCC along with conflict detection and handling enables Optimistic Concurrency
Control.
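
The interplay of snapshots and conflict detection described above can be
sketched roughly as follows. This is a toy in-memory model, not Tephra's API:
the names ToyTransactionManager and ConflictError are hypothetical, and
buffering writes until commit is a simplification made for the example rather
than a description of how Tephra stores data in HBase.

```python
class ConflictError(Exception):
    """Raised when a commit overlaps a concurrent, already committed write."""


class ToyTransactionManager:
    """Hypothetical in-memory model for illustration; not Tephra's API."""

    def __init__(self):
        self._clock = 0
        self._store = {}      # key -> [(commit_id, value), ...] in commit order
        self._commits = []    # [(commit_id, frozenset_of_keys), ...]

    def _tick(self):
        self._clock += 1
        return self._clock

    def start(self):
        # The start id fixes the snapshot this transaction reads from.
        return {"start": self._tick(), "writes": {}}

    def read(self, tx, key):
        if key in tx["writes"]:          # a transaction sees its own writes
            return tx["writes"][key]
        # Only versions committed before this transaction started are visible.
        visible = [v for cid, v in self._store.get(key, []) if cid < tx["start"]]
        return visible[-1] if visible else None

    def write(self, tx, key, value):
        tx["writes"][key] = value        # buffered until commit (assumption)

    def commit(self, tx):
        keys = frozenset(tx["writes"])
        # Optimistic check: fail if any transaction that committed after we
        # started changed a key that we also changed.
        for cid, changed in self._commits:
            overlap = keys & changed
            if cid > tx["start"] and overlap:
                raise ConflictError(sorted(overlap))
        cid = self._tick()
        for key, value in tx["writes"].items():
            self._store.setdefault(key, []).append((cid, value))
        self._commits.append((cid, keys))
        return cid


tm = ToyTransactionManager()
seed = tm.start()
tm.write(seed, "row1", "a")
tm.commit(seed)

t1, t2, reader = tm.start(), tm.start(), tm.start()  # three concurrent txns
tm.write(t1, "row1", "b")
tm.write(t2, "row1", "c")
tm.commit(t1)                            # first committer wins

snapshot_read = tm.read(reader, "row1")  # reader still sees its own snapshot
try:
    tm.commit(t2)
    t2_committed = True
except ConflictError:
    t2_committed = False                 # t2 would roll back and retry
```

In this sketch no locks are taken while the transactions run; the only
coordination point is the overlap check at commit time, which is what makes
the scheme optimistic.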

Tephra consists of three main components:
 * Transaction Server – maintains a global view of transaction state, assigns
   new transaction IDs, and performs conflict detection;
 * Transaction Client – coordinates start, commit, and rollback of
   transactions; and
 * Transaction Processor Coprocessor – applies filtering to the data read
   (based on a given transaction’s state) and cleans up any data from old
   (no longer visible) transactions.
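
The read-side filtering mentioned for the coprocessor can be illustrated with
a small sketch. It assumes, for illustration only, that each cell version is
tagged with the ID of the transaction that wrote it and that a reader carries
a snapshot listing the in-progress and invalid transactions it must skip. The
names Snapshot, is_visible, and read_latest are hypothetical and are not taken
from Tephra.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical view of transaction state handed to a reader."""
    tx_id: int                          # the reader's own transaction ID
    read_pointer: int                   # newest transaction ID it may see
    in_progress: frozenset = frozenset()
    invalid: frozenset = frozenset()


def is_visible(version, snap):
    """Decide whether a version written by transaction `version` is readable."""
    if version == snap.tx_id:           # a transaction sees its own writes
        return True
    return (version <= snap.read_pointer
            and version not in snap.in_progress
            and version not in snap.invalid)


def read_latest(cell_versions, snap):
    """Return the newest visible value; cell_versions maps writer ID -> value."""
    for version in sorted(cell_versions, reverse=True):
        if is_visible(version, snap):
            return cell_versions[version]
    return None


cell = {10: "a", 12: "b", 15: "c", 20: "d"}
snap = Snapshot(tx_id=18, read_pointer=16,
                in_progress=frozenset({15}), invalid=frozenset({12}))
value = read_latest(cell, snap)
```

Here version 20 is newer than the read pointer, 15 is still in progress, and
12 was invalidated, so the reader falls back to the value written by
transaction 10.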

Although Tephra only supports HBase now, it can be extended to support
transactions on any store that has multi-versioning and rollback
support. The transactions can span multiple stores and storage paradigms.

= Rationale =

Tephra has simple abstractions which can be used by an application to
add transaction support over HBase. By abstracting away transaction
handling using Tephra, the application is freed of transaction logic, and
the application developer can focus on the use case.
Also, Tephra can be extended to support transactions on data sources other
than HBase.

By making Tephra an Apache open source project, we believe that there will
be wider adoption and more opportunities for Tephra to be integrated
into other Apache projects.

= Current Status =

Tephra was built at Cask Data Inc., initially as part of the
open-source framework Cask Data Application Platform (CDAP)
[[http://cdap.io/]].
It was later converted into an independent open source project under the
Apache 2.0 License [[https://github.com/caskdata/tephra]].

Tephra is used in CDAP as the transaction engine. As part of CDAP, Tephra
has been deployed at multiple companies.

Apache Phoenix is using Tephra as its transaction engine in the next release.

== Meritocracy ==

Our intent with this incubator proposal is to start building a diverse
developer community around Tephra following the Apache meritocracy model.
Since Tephra was initially developed in early 2013, we have had fast
adoption and contributions within Cask Data. We are looking forward to
new contributors. We wish to build a community based on Apache's
meritocracy principles, working with those who contribute significantly to
the project and welcoming them to be committers both during the incubation
process and beyond.

== Community ==

Core developers of Tephra are at Cask Data. Recently, the developer community
has expanded to include folks from Apache Phoenix. We hope to extend our
contributor base significantly, and we will invite all who are interested
in working on a distributed transaction engine.

== Core Developers ==

A few engineers from Cask Data and outside have developed Tephra:
Andreas Neumann, Terence Yim, Gary Helmling, Andrew Purtell and
Poorna Chandra.


== Alignment ==

The ASF is the natural choice to host the Tephra project as its goal of
encouraging community-driven open source projects fits with our vision for
Tephra.

Additionally, many of the projects that we are familiar with and expect
Tephra to integrate with, such as Phoenix, ZooKeeper, HDFS, log4j, and others
mentioned in the External Dependencies section, are Apache projects, and
Tephra will benefit from close proximity to them.

= Known Risks =

== Orphaned Products ==

There is very little risk of Tephra being orphaned, as it is a key part of
Cask Data’s products. The core Tephra developers plan to continue to work
on Tephra, and Cask Data has funding in place to support their efforts
going forward.
Also, with Phoenix using Tephra for transactions, Phoenix developers are
keen on contributing to Tephra.


== Inexperience with Open Source ==

Several of the core developers have experience with open source
development. Andreas Neumann is an Apache committer for Oozie and Twill.
Terence Yim is an Apache committer for Helix and Twill. Poorna Chandra
is an Apache committer for Twill. Gary Helmling is a committer for
Apache Twill and a committer and PMC member for Apache HBase.
James Taylor is PMC chair for Apache Phoenix, PMC member of Apache Calcite,
and an IPMC member.

== Homogeneous Developers ==

The current core developers are all Cask Data employees. However, we
intend to establish a developer community that includes independent and
corporate contributors. We are encouraging new contributors via our mailing
lists, public presentations, and personal contacts, and we will continue to
do so.

Apache Phoenix developers have already contributed several patches to Tephra,
and have expressed interest in becoming long term contributors.

== Reliance on Salaried Developers ==

Currently, these developers are paid to work on Tephra. Once the project has
built a community, we expect to attract committers, developers, and community
members beyond the current core developers. However, because Cask Data
products use Tephra internally, the reliance on salaried developers is
unlikely to change, at least in the near term.

== Relationships with Other Apache Products ==

Tephra is deeply integrated with Apache projects. Tephra provides transactions
over Apache HBase, and uses Apache Twill and Apache ZooKeeper for coordination.
A number of other Apache projects are Tephra dependencies, and are
listed in the External Dependencies section.

In addition, Apache Phoenix is using Tephra as the transaction engine.

== An Excessive Fascination with the Apache Brand ==

While we respect the reputation of the Apache brand and have no doubt that
it will attract contributors and users, our interest is primarily to give
Tephra a solid home as an open source project following an established
development model. We have also given additional reasons in the Rationale
and Alignment sections.

= Documentation =

The current documentation for Tephra is at https://github.com/caskdata/tephra.

= Initial Source =

The Tephra codebase is currently hosted at https://github.com/caskdata/tephra.

= Source and Intellectual Property Submission Plan =

The Tephra codebase is currently licensed under the Apache 2.0 license.
Cask Data owns the trademark for "Tephra". As part of the incubation process,
Cask Data will transfer the trademark to the Apache Software Foundation.

= External Dependencies =

The dependencies all have Apache-compatible licenses:
 * dropwizard metrics (Apache 2.0)
 * fastutil (Apache 2.0)
 * gson (Apache 2.0)
 * guava-libraries (Apache 2.0)
 * guice (Apache 2.0)
 * hadoop (Apache 2.0)
 * hbase (Apache 2.0)
 * hdfs (Apache 2.0)
 * junit (EPL v1.0)
 * logback (EPL v1.0)
 * slf4j (MIT)
 * thrift (Apache 2.0)
 * twill (Apache 2.0)
 * zookeeper (Apache 2.0)

= Cryptography =

Tephra does not use cryptography itself; however, it can run on secure Hadoop,
which uses Kerberos.

= Required Resources =

== Mailing Lists ==

 * tephra-private for private PMC discussions (with moderated subscriptions)
 * tephra-dev for technical discussions among contributors
 * tephra-commits for notification about commits

== Subversion Directory ==

Git is the preferred source control system: git://git.apache.org/tephra

== Issue Tracking ==

JIRA Tephra (TEPHRA)

== Other Resources ==

The existing code already has unit tests, so we would like a Hudson
instance to run them whenever a new patch is submitted. This can be added
after project creation.

= Initial Committers =

 * Andreas Neumann <anew at apache dot org>
 * Terence Yim <chtyim at apache dot org>
 * Poorna Chandra <poorna at apache dot org>
 * Gokul Gunasekaran <gokul at cask dot co>
 * James Taylor <jamestaylor at apache dot org>
 * Thomas D'Silva <tdsilva at apache dot org>
 * Gary Helmling <garyh at apache dot org>

= Affiliations =

 * Andreas Neumann (Cask Data)
 * Terence Yim (Cask Data)
 * Poorna Chandra (Cask Data)
 * Gokul Gunasekaran (Cask Data)
 * James Taylor (Salesforce.com)
 * Thomas D'Silva (Salesforce.com)
 * Gary Helmling (Facebook)

= Sponsors =

== Champion ==

James Taylor <jamestaylor at apache dot org> (V.P., Apache Phoenix)

== Nominated Mentors ==

 * James Taylor <jamestaylor at apache dot org>
 * Lars Hofhansl <larsh at apache dot org>
 * Andrew Purtell <apurtell at apache dot org>
 * Alan Gates <gates at apache dot org>
 * Henry Saputra <hsaputra at apache dot org>

== Sponsoring Entity ==

We are requesting that the Incubator sponsor this project.

Re: [VOTE] Accept Tephra into the Apache Incubator

Posted by Phillip Rhodes <mo...@gmail.com>.
+1


This message optimized for indexing by NSA PRISM

On Fri, Mar 18, 2016 at 2:53 PM, Stack <st...@duboce.net> wrote:

> I'm late, but let me add my +1 anyways.
> St.Ack
>

Re: [VOTE] Accept Tephra into the Apache Incubator

Posted by Stack <st...@duboce.net>.
I'm late, but let me add my +1 anyways.
St.Ack


Re: [VOTE] Accept Tephra into the Apache Incubator

Posted by Naresh Agarwal <na...@gmail.com>.
+1 (non-binding)

Looking forward to this project.

Thanks
Naresh Agarwal
On 7 Mar 2016 02:09, <la...@apache.org> wrote:

> +1 (binding)
> Exciting!

Re: [VOTE] Accept Tephra into the Apache Incubator

Posted by la...@apache.org.
+1 (binding)
Exciting!


Re: [VOTE] Accept Tephra into the Apache Incubator

Posted by Julian Hyde <jh...@apache.org>.
+1 (binding)

> On Mar 4, 2016, at 12:57 PM, Alan Gates <al...@gmail.com> wrote:
> 
> +1 (binding).
> 
> Alan.



Re: [VOTE] Accept Tephra into the Apache Incubator

Posted by Alan Gates <al...@gmail.com>.
+1 (binding).

Alan.

> On Mar 3, 2016, at 17:29, Poorna Chandra <po...@apache.org> wrote:
> 
> Hi All,
> 
> Tephra proposal was sent out for discussion last week. The proposal is
> available at https://wiki.apache.org/incubator/TephraProposal
> 
> Please vote to accept Tephra into the Apache Incubator. The vote will be
> open for the next 72 hours.
> 
> [ ] +1 Accept Tephra as an Apache Incubator podling.
> [ ] +0 Abstain.
> [ ] -1 Don’t accept Tephra as an Apache Incubator podling because ...
> 
> Thanks,
> Poorna.
> 



Re: [VOTE] Accept Tephra into the Apache Incubator

Posted by James Taylor <ja...@apache.org>.
+1 (binding)

On Thursday, March 3, 2016, Henry Saputra <he...@gmail.com> wrote:

> +1 (binding)
>
> On Thu, Mar 3, 2016 at 5:29 PM, Poorna Chandra <poorna@apache.org> wrote:
>
> > Hi All,
> >
> > Tephra proposal was sent out for discussion last week. The proposal is
> > available at https://wiki.apache.org/incubator/TephraProposal
> >
> > Please vote to accept Tephra into the Apache Incubator. The vote will be
> > open for the next 72 hours.
> >
> > [ ] +1 Accept Tephra as an Apache Incubator podling.
> > [ ] +0 Abstain.
> > [ ] -1 Don’t accept Tephra as an Apache Incubator podling because ...
> >
> > Thanks,
> > Poorna.
> >
>

Re: [VOTE] Accept Tephra into the Apache Incubator

Posted by Henry Saputra <he...@gmail.com>.
+1 (binding)

On Thu, Mar 3, 2016 at 5:29 PM, Poorna Chandra <po...@apache.org> wrote:

> Hi All,
>
> Tephra proposal was sent out for discussion last week. The proposal is
> available at https://wiki.apache.org/incubator/TephraProposal
>
> Please vote to accept Tephra into the Apache Incubator. The vote will be
> open for the next 72 hours.
>
> [ ] +1 Accept Tephra as an Apache Incubator podling.
> [ ] +0 Abstain.
> [ ] -1 Don’t accept Tephra as an Apache Incubator podling because ...
>
> Thanks,
> Poorna.
>

RE: [VOTE] Accept Tephra into the Apache Incubator

Posted by "Vasudevan, Ramkrishna S" <ra...@intel.com>.
+1 (non-binding)

Regards
Ram

-----Original Message-----
From: Andrew Purtell [mailto:andrew.purtell@gmail.com] 
Sent: Friday, March 4, 2016 11:55 AM
To: general@incubator.apache.org
Subject: Re: [VOTE] Accept Tephra into the Apache Incubator

+1 (binding)

> On Mar 3, 2016, at 5:29 PM, Poorna Chandra <po...@apache.org> wrote:
> 
> Hi All,
> 
> Tephra proposal was sent out for discussion last week. The proposal is 
> available at https://wiki.apache.org/incubator/TephraProposal
> 
> Please vote to accept Tephra into the Apache Incubator. The vote will 
> be open for the next 72 hours.
> 
> [ ] +1 Accept Tephra as an Apache Incubator podling.
> [ ] +0 Abstain.
> [ ] -1 Don’t accept Tephra as an Apache Incubator podling because ...
> 
> Thanks,
> Poorna.
> 
> ------
> 
> = Abstract =
> 
> Tephra is a system for providing globally consistent transactions on 
> top of Apache HBase and other storage engines.
> 
> = Proposal =
> 
> Tephra is a transaction engine for distributed data stores like Apache HBase.
> It provides ACID semantics for concurrent data operations that span 
> over region boundaries in HBase using Optimistic Concurrency Control.
> 
> = Background =
> 
> HBase provides strong consistency with row- or region-level ACID 
> operations. However, it sacrifices cross-region and cross-table 
> consistency in favor of scalability. This trade-off requires 
> application developers to handle the complexity of ensuring
> consistency when their modifications span region boundaries. By 
> providing support for global transactions that span regions, tables, 
> or multiple RPCs, Tephra simplifies application development on top of 
> HBase, without a significant impact on performance or scalability for many workloads.
> 
> Tephra leverages HBase’s native data versioning to provide 
> multi-versioned concurrency control (MVCC) for transactional reads and writes.
> With MVCC capability, each transaction sees its own consistent 
> “snapshot” of data, providing snapshot isolation of concurrent transactions.
> MVCC along with conflict detection and handling enables Optimistic 
> Concurrency Control.
> 
> Tephra consists of three main components:
> * Transaction Server – maintains global view of transaction state, assigns
>   new transaction IDs and performs conflict detection;
> * Transaction Client – coordinates start, commit, and rollback of 
> transactions; and
> * Transaction Processor Coprocessor – applies filtering to the data read (based
>   on a given transaction’s state) and cleans up any data from old
>   (no longer visible) transactions.
> 
> Although Tephra only supports HBase now, it can be extended to support 
> transactions on any store that has multi-versioning and rollback 
> support. The transactions can span over multiple stores and storage 
> paradigms.
> 
> = Rationale =
> 
> Tephra has simple abstractions which can be used by an application to 
> add transaction support over HBase. By abstracting away transaction 
> handling using Tephra, the application is freed of transaction logic, 
> and the application developer can focus on the use case.
> Also, Tephra can be extended to support transactions on data sources 
> other than HBase.
> 
> By making Tephra an Apache open source project, we believe that there 
> will be wider adoption and more opportunities for Tephra to be 
> integrated into other Apache projects.
> 
> = Current Status =
> 
> Tephra was built at Cask Data Inc. initially as part of open-source 
> framework Cask Data Application Platform (CDAP) [[http://cdap.io/]].
> It was later converted into an independent open source project with 
> Apache 2.0 License [[https://github.com/caskdata/tephra]].
> 
> Tephra is used in CDAP as the transaction engine. As part of CDAP, 
> Tephra has been deployed at multiple companies.
> 
> Apache Phoenix is using Tephra as its transaction engine in the next release.
> 
> == Meritocracy ==
> 
> Our intent with this incubator proposal is to start building a diverse 
> developer community around Tephra following the Apache meritocracy model.
> Since Tephra was initially developed in early 2013, we have had fast 
> adoption and contributions within Cask Data. We are looking forward to 
> new contributors. We wish to build a community based on Apache's 
> meritocracy principles, working with those who contribute 
> significantly to the project and welcoming them to be committers both 
> during the incubation process and beyond.
> 
> == Community ==
> 
> Core developers of Tephra are at Cask Data. Recently the developer 
> community has expanded to include folks from Apache Phoenix. We hope 
> to extend our contributor base significantly and we will invite all 
> who are interested in working on a distributed transaction engine.
> 
> == Core Developers ==
> 
> A few engineers from Cask Data and outside have developed Tephra:
> Andreas Neumann, Terence Yim, Gary Helmling, Andrew Purtell and Poorna 
> Chandra.
> 
> 
> == Alignment ==
> 
> The ASF is the natural choice to host the Tephra project as its goal 
> of encouraging community-driven open source projects fits with our 
> vision for Tephra.
> 
> Additionally, many other projects with which we are familiar and 
> expect Tephra to integrate with, such as Phoenix, Zookeeper, HDFS, 
> log4j, and others mentioned in the External Dependencies section are 
> Apache projects, and Tephra will benefit by close proximity to them.
> 
> = Known Risks =
> 
> == Orphaned Products ==
> 
> There is very little risk of Tephra being orphaned, as it is a key 
> part of Cask Data’s products. The core Tephra developers plan to 
> continue to work on Tephra, and Cask Data has funding in place to 
> support their efforts going forward.
> Also with Phoenix using Tephra for transactions, Phoenix developers 
> are keen on contributing to Tephra.
> 
> 
> == Inexperience with Open Source ==
> 
> Several of the core developers have experience with open source 
> development. Andreas Neumann is an Apache committer for Oozie and Twill.
> Terence Yim is an Apache committer for Helix and Twill. Poorna Chandra 
> is an Apache committer for Twill. Gary Helmling is a committer for 
> Apache Twill and a committer and PMC member for Apache HBase.
> James Taylor is PMC chair for Apache Phoenix, PMC member of Apache 
> Calcite, and an IPMC member.
> 
> == Homogeneous Developers ==
> 
> The current core developers are all Cask Data employees. However, we 
> intend to establish a developer community that includes independent 
> and corporate contributors. We are encouraging new contributors via 
> our mailing lists, public presentations, and personal contacts, and we 
> will continue to do so.
> 
> Apache Phoenix developers have already contributed several patches to 
> Tephra, and have expressed interest in becoming long term contributors.
> 
> == Reliance on Salaried Developers ==
> 
> Currently, these developers are paid to work on Tephra. Once the 
> project has built a community, we expect to attract committers, 
> developers and community other than the current core developers. 
> However, because Cask Data products use Tephra internally, the 
> reliance on salaried developers is unlikely to change, at least in the near term.
> 
> == Relationships with Other Apache Products ==
> 
> Tephra is deeply integrated with Apache projects. Tephra provides 
> transactions over Apache HBase, and uses Apache Twill and Apache Zookeeper for coordination.
> A number of other Apache projects are Tephra dependencies, and are 
> listed in the External Dependencies section.
> 
> In addition, Apache Phoenix is using Tephra as the transaction engine.
> 
> == An Excessive Fascination with the Apache Brand ==
> 
> While we respect the reputation of the Apache brand and have no doubt 
> that it will attract contributors and users, our interest is primarily 
> to give Tephra a solid home as an open source project following an 
> established development model. We have also given additional reasons 
> in the Rationale and Alignment sections.
> 
> = Documentation =
> 
> The current documentation for Tephra is at https://github.com/caskdata/tephra.
> 
> = Initial Source =
> 
> Tephra codebase is currently hosted at https://github.com/caskdata/tephra.
> 
> = Source and Intellectual Property Submission Plan =
> 
> Tephra codebase is currently licensed under Apache 2.0 license.
> Cask Data owns the trademark for "Tephra". As part of the incubation 
> process Cask Data will transfer the trademark to Apache Foundation.
> 
> = External Dependencies =
> 
> The dependencies all have Apache-compatible licenses:
> * dropwizard metrics (Apache 2.0)
> * fastutil (Apache 2.0)
> * gson (Apache 2.0)
> * guava-libraries (Apache 2.0)
> * guice (Apache 2.0)
> * hadoop (Apache 2.0)
> * hbase (Apache 2.0)
> * hdfs (Apache 2.0)
> * junit (EPL v1.0)
> * logback (EPL v1.0 )
> * slf4j (MIT)
> * thrift (Apache 2.0)
> * twill (Apache 2.0)
> * zookeeper (Apache 2.0)
> 
> = Cryptography =
> 
> Tephra does not use cryptography itself, however it can run on secure 
> Hadoop, which uses Kerberos.
> 
> = Required Resources =
> 
> == Mailing Lists ==
> 
> * tephra-private for private PMC discussions (with moderated 
> subscriptions)
> * tephra-dev for technical discussions among contributors
> * tephra-commits for notification about commits
> 
> == Subversion Directory ==
> 
> Git is the preferred source control system: 
> git://git.apache.org/tephra
> 
> == Issue Tracking ==
> 
> JIRA Tephra (TEPHRA)
> 
> == Other Resources ==
> 
> The existing code already has unit tests, so we would like a Hudson 
> instance to run them whenever a new patch is submitted. This can be 
> added after project creation.
> 
> = Initial Committers =
> 
> * Andreas Neumann <anew at apache dot org>
> * Terence Yim <chtyim at apache dot org>
> * Poorna Chandra <poorna at apache dot org>
> * Gokul Gunasekaran <gokul at cask dot co>
> * James Taylor <jamestaylor at apache dot org>
> * Thomas D'Silva <tdsilva at apache dot org>
> * Gary Helmling <garyh at apache dot org>
> 
> = Affiliations =
> 
> * Andreas Neumann (Cask Data)
> * Terence Yim (Cask Data)
> * Poorna Chandra (Cask Data)
> * Gokul Gunasekaran (Cask Data)
> * James Taylor (Salesforce.com)
> * Thomas D'Silva (Salesforce.com)
> * Gary Helmling (Facebook)
> 
> = Sponsors =
> 
> == Champion ==
> 
> James Taylor <jamestaylor at apache dot org> (V.P., Apache Phoenix)
> 
> == Nominated Mentors ==
> 
> * James Taylor <jamestaylor at apache dot org>
> * Lars Hofhansl <larsh at apache dot org>
> * Andrew Purtell <apurtell at apache dot org>
> * Alan Gates <gates at apache dot org>
> * Henry Saputra <hsaputra at apache dot org>
> 
> == Sponsoring Entity ==
> 
> We are requesting that the Incubator sponsor this project.

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org


Re: [VOTE] Accept Tephra into the Apache Incubator

Posted by Andrew Purtell <an...@gmail.com>.
+1 (binding)

> On Mar 3, 2016, at 5:29 PM, Poorna Chandra <po...@apache.org> wrote:
> 
> Hi All,
> 
> Tephra proposal was sent out for discussion last week. The proposal is
> available at https://wiki.apache.org/incubator/TephraProposal
> 
> Please vote to accept Tephra into the Apache Incubator. The vote will be
> open for the next 72 hours.
> 
> [ ] +1 Accept Tephra as an Apache Incubator podling.
> [ ] +0 Abstain.
> [ ] -1 Don’t accept Tephra as an Apache Incubator podling because ...
> 
> Thanks,
> Poorna.
> 
> ------
> 
> = Abstract =
> 
> Tephra is a system for providing globally consistent transactions on
> top of Apache HBase and other storage engines.
> 
> = Proposal =
> 
> Tephra is a transaction engine for distributed data stores like Apache HBase.
> It provides ACID semantics for concurrent data operations that span over region
> boundaries in HBase using Optimistic Concurrency Control.
> 
> = Background =
> 
> HBase provides strong consistency with row- or region-level ACID
> operations. However, it sacrifices cross-region and cross-table
> consistency in favor of scalability. This trade-off requires application
> developers to handle the complexity of ensuring consistency when their
> modifications span region boundaries. By providing support for global
> transactions that span regions, tables, or multiple RPCs,
> Tephra simplifies application development on top of HBase, without a
> significant impact on performance or scalability for many workloads.
> 
> Tephra leverages HBase’s native data versioning to provide multi-versioned
> concurrency control (MVCC) for transactional reads and writes.
> With MVCC capability, each transaction sees its own consistent “snapshot” of
> data, providing snapshot isolation of concurrent transactions.
> MVCC along with conflict detection and handling enables Optimistic Concurrency
> Control.
> 
> Tephra consists of three main components:
> * Transaction Server – maintains global view of transaction state, assigns
>   new transaction IDs and performs conflict detection;
> * Transaction Client – coordinates start, commit, and rollback of
> transactions; and
> * Transaction Processor Coprocessor – applies filtering to the data read (based
>   on a given transaction’s state) and cleans up any data from old
>   (no longer visible) transactions.
> 
> Although Tephra only supports HBase now, it can be extended to support
> transactions on any store that has multi-versioning and rollback
> support. The transactions
> can span over multiple stores and storage paradigms.
> 
> = Rationale =
> 
> Tephra has simple abstractions which can be used by an application to
> add transaction support over HBase. By abstracting away transaction
> handling using Tephra, the application is freed of
> transaction logic, and the application developer can focus on the use case.
> Also, Tephra can be extended to support transactions on data sources other
> than HBase.
> 
> By making Tephra an Apache open source project, we believe that there will
> be wider adoption and more opportunities for Tephra to be integrated
> into other Apache projects.
> 
> = Current Status =
> 
> Tephra was built at Cask Data Inc. initially as part of
> open-source framework Cask Data Application Platform (CDAP)
> [[http://cdap.io/]].
> It was later converted into an independent open source project with
> Apache 2.0 License [[https://github.com/caskdata/tephra]].
> 
> Tephra is used in CDAP as the transaction engine. As part of CDAP, Tephra
> has been deployed at multiple companies.
> 
> Apache Phoenix is using Tephra as its transaction engine in the next release.
> 
> == Meritocracy ==
> 
> Our intent with this incubator proposal is to start building a diverse
> developer community around Tephra following the Apache meritocracy model.
> Since Tephra was initially developed in early 2013, we have had fast
> adoption and contributions within Cask Data. We are looking forward to
> new contributors. We wish to build a community based on Apache's
> meritocracy principles, working with those who contribute significantly to
> the project and welcoming them to be committers both during the incubation
> process and beyond.
> 
> == Community ==
> 
> Core developers of Tephra are at Cask Data. Recently the developer community
> has expanded to include folks from Apache Phoenix. We hope to extend our
> contributor base significantly and we will invite all who are interested
> in working on a distributed transaction engine.
> 
> == Core Developers ==
> 
> A few engineers from Cask Data and outside have developed Tephra:
> Andreas Neumann, Terence Yim, Gary Helmling, Andrew Purtell and
> Poorna Chandra.
> 
> 
> == Alignment ==
> 
> The ASF is the natural choice to host the Tephra project as its goal of
> encouraging community-driven open source projects fits with our vision for
> Tephra.
> 
> Additionally, many other projects with which we are familiar and expect
> Tephra to integrate with, such as Phoenix, Zookeeper, HDFS, log4j, and others
> mentioned in the External Dependencies section are Apache projects, and
> Tephra will benefit by close proximity to them.
> 
> = Known Risks =
> 
> == Orphaned Products ==
> 
> There is very little risk of Tephra being orphaned, as it is a key part of
> Cask Data’s products. The core Tephra developers plan to continue to work
> on Tephra, and Cask Data has funding in place to support their efforts
> going forward.
> Also with Phoenix using Tephra for transactions, Phoenix developers are
> keen on contributing to Tephra.
> 
> 
> == Inexperience with Open Source ==
> 
> Several of the core developers have experience with open source
> development. Andreas Neumann is an Apache committer for Oozie and Twill.
> Terence Yim is an Apache committer for Helix and Twill. Poorna Chandra
> is an Apache committer for Twill. Gary Helmling is a committer for
> Apache Twill and a committer and PMC member for Apache HBase.
> James Taylor is PMC chair for Apache Phoenix, PMC member of Apache Calcite,
> and an IPMC member.
> 
> == Homogeneous Developers ==
> 
> The current core developers are all Cask Data employees. However, we
> intend to establish a developer community that includes independent and
> corporate contributors. We are encouraging new contributors via our mailing
> lists, public presentations, and personal contacts, and we will continue to
> do so.
> 
> Apache Phoenix developers have already contributed several patches to Tephra,
> and have expressed interest in becoming long term contributors.
> 
> == Reliance on Salaried Developers ==
> 
> Currently, these developers are paid to work on Tephra. Once the project has
> built a community, we expect to attract committers, developers and community
> other than the current core developers. However, because Cask Data
> products use Tephra internally, the reliance on salaried developers is
> unlikely to change, at least in the near term.
> 
> == Relationships with Other Apache Products ==
> 
> Tephra is deeply integrated with Apache projects. Tephra provides transactions
> over Apache HBase, and uses Apache Twill and Apache Zookeeper for coordination.
> A number of other Apache projects are Tephra dependencies, and are
> listed in the External Dependencies section.
> 
> In addition, Apache Phoenix is using Tephra as the transaction engine.
> 
> == An Excessive Fascination with the Apache Brand ==
> 
> While we respect the reputation of the Apache brand and have no doubt that
> it will attract contributors and users, our interest is primarily to give
> Tephra a solid home as an open source project following an established
> development model. We have also given additional reasons in the Rationale
> and Alignment sections.
> 
> = Documentation =
> 
> The current documentation for Tephra is at https://github.com/caskdata/tephra.
> 
> = Initial Source =
> 
> Tephra codebase is currently hosted at https://github.com/caskdata/tephra.
> 
> = Source and Intellectual Property Submission Plan =
> 
> Tephra codebase is currently licensed under Apache 2.0 license.
> Cask Data owns the trademark for "Tephra". As part of the incubation process
> Cask Data will transfer the trademark to Apache Foundation.
> 
> = External Dependencies =
> 
> The dependencies all have Apache-compatible licenses:
> * dropwizard metrics (Apache 2.0)
> * fastutil (Apache 2.0)
> * gson (Apache 2.0)
> * guava-libraries (Apache 2.0)
> * guice (Apache 2.0)
> * hadoop (Apache 2.0)
> * hbase (Apache 2.0)
> * hdfs (Apache 2.0)
> * junit (EPL v1.0)
> * logback (EPL v1.0 )
> * slf4j (MIT)
> * thrift (Apache 2.0)
> * twill (Apache 2.0)
> * zookeeper (Apache 2.0)
> 
> = Cryptography =
> 
> Tephra does not use cryptography itself, however it can run on secure Hadoop,
> which uses Kerberos.
> 
> = Required Resources =
> 
> == Mailing Lists ==
> 
> * tephra-private for private PMC discussions (with moderated subscriptions)
> * tephra-dev for technical discussions among contributors
> * tephra-commits for notification about commits
> 
> == Subversion Directory ==
> 
> Git is the preferred source control system: git://git.apache.org/tephra
> 
> == Issue Tracking ==
> 
> JIRA Tephra (TEPHRA)
> 
> == Other Resources ==
> 
> The existing code already has unit tests, so we would like a Hudson
> instance to run them whenever a new patch is submitted. This can be added
> after project creation.
> 
> = Initial Committers =
> 
> * Andreas Neumann <anew at apache dot org>
> * Terence Yim <chtyim at apache dot org>
> * Poorna Chandra <poorna at apache dot org>
> * Gokul Gunasekaran <gokul at cask dot co>
> * James Taylor <jamestaylor at apache dot org>
> * Thomas D'Silva <tdsilva at apache dot org>
> * Gary Helmling <garyh at apache dot org>
> 
> = Affiliations =
> 
> * Andreas Neumann (Cask Data)
> * Terence Yim (Cask Data)
> * Poorna Chandra (Cask Data)
> * Gokul Gunasekaran (Cask Data)
> * James Taylor (Salesforce.com)
> * Thomas D'Silva (Salesforce.com)
> * Gary Helmling (Facebook)
> 
> = Sponsors =
> 
> == Champion ==
> 
> James Taylor <jamestaylor at apache dot org> (V.P., Apache Phoenix)
> 
> == Nominated Mentors ==
> 
> * James Taylor <jamestaylor at apache dot org>
> * Lars Hofhansl <larsh at apache dot org>
> * Andrew Purtell <apurtell at apache dot org>
> * Alan Gates <gates at apache dot org>
> * Henry Saputra <hsaputra at apache dot org>
> 
> == Sponsoring Entity ==
> 
> We are requesting that the Incubator sponsor this project.

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org


Re: [VOTE] Accept Tephra into the Apache Incubator

Posted by "Gangumalla, Uma" <um...@intel.com>.
+1 (non binding)

Regards,
Uma

On 3/3/16, 5:29 PM, "Poorna Chandra" <po...@apache.org> wrote:

>Hi All,
>
>Tephra proposal was sent out for discussion last week. The proposal is
>available at https://wiki.apache.org/incubator/TephraProposal
>
>Please vote to accept Tephra into the Apache Incubator. The vote will be
>open for the next 72 hours.
>
>[ ] +1 Accept Tephra as an Apache Incubator podling.
>[ ] +0 Abstain.
>[ ] -1 Don’t accept Tephra as an Apache Incubator podling because ...
>
>Thanks,
>Poorna.
>
>------
>
>= Abstract =
>
>Tephra is a system for providing globally consistent transactions on
>top of Apache HBase and other storage engines.
>
>= Proposal =
>
>Tephra is a transaction engine for distributed data stores like Apache HBase.
>It provides ACID semantics for concurrent data operations that span over region
>boundaries in HBase using Optimistic Concurrency Control.
>
>= Background =
>
>HBase provides strong consistency with row- or region-level ACID
>operations. However, it sacrifices cross-region and cross-table
>consistency in favor of scalability. This trade-off requires application
>developers to handle  the complexity of ensuring consistency when their
>modifications span region boundaries. By providing support for global
>transactions that span regions, tables, or multiple RPCs,
>Tephra simplifies application development on top of HBase, without a
>significant impact on performance or scalability for many workloads.
>
>Tephra leverages HBase’s native data versioning to provide multi-versioned
>concurrency control (MVCC) for transactional reads and writes.
>With MVCC capability, each transaction sees its own consistent “snapshot” of
>data, providing snapshot isolation of concurrent transactions.
>MVCC along with conflict detection and handling enables Optimistic Concurrency
>Control.
>
>Tephra consists of three main components:
> * Transaction Server – maintains global view of transaction state, assigns
>   new transaction IDs and performs conflict detection;
> * Transaction Client – coordinates start, commit, and rollback of
>   transactions; and
> * Transaction Processor Coprocessor – applies filtering to the data read (based
>   on a given transaction’s state) and cleans up any data from old
>   (no longer visible) transactions.
>
>Although Tephra only supports HBase now, it can be extended to support
>transactions on any store that has multi-versioning and rollback
>support. The transactions
>can span over multiple stores and storage paradigms.
>
>= Rationale =
>
>Tephra has simple abstractions which can be used by an application to
>add transaction support over HBase. By abstracting away transaction
>handling using Tephra, the application is freed of
>transaction logic, and the application developer can focus on the use case.
>Also, Tephra can be extended to support transactions on data sources other
>than HBase.
>
>By making Tephra an Apache open source project, we believe that there will
>be wider adoption and more opportunities for Tephra to be integrated
>into other Apache projects.
>
>= Current Status =
>
>Tephra was built at Cask Data Inc. initially as part of
>open-source framework Cask Data Application Platform (CDAP)
>[[http://cdap.io/]].
>It was later converted into an independent open source project with
>Apache 2.0 License [[https://github.com/caskdata/tephra]].
>
>Tephra is used in CDAP as the transaction engine. As part of CDAP, Tephra
>has been deployed at multiple companies.
>
>Apache Phoenix is using Tephra as transaction engine in the next release.
>
>== Meritocracy ==
>
>Our intent with this incubator proposal is to start building a diverse
>developer community around Tephra following the Apache meritocracy model.
>Since Tephra was initially developed in early 2013, we have had fast
>adoption and contributions within Cask Data. We are looking forward to
>new contributors. We wish to build a community based on Apache's
>meritocracy principles, working with those who contribute significantly to
>the project and welcoming them to be committers both during the incubation
>process and beyond.
>
>== Community ==
>
>Core developers of Tephra are at Cask Data. Recently the developer community
>has expanded to include folks from Apache Phoenix. We hope to extend our
>contributor base significantly and we will invite all who are interested
>in working on a distributed transaction engine.
>
>== Core Developers ==
>
>A few engineers from Cask Data and outside have developed Tephra:
>Andreas Neumann, Terence Yim, Gary Helmling, Andrew Purtell and
>Poorna Chandra.
>
>
>== Alignment ==
>
>The ASF is the natural choice to host the Tephra project as its goal of
>encouraging community-driven open source projects fits with our vision for
>Tephra.
>
>Additionally, many other projects with which we are familiar and expect
>Tephra to integrate with, such as Phoenix, Zookeeper, HDFS, log4j, and others
>mentioned in the External Dependencies section are Apache projects, and
>Tephra will benefit by close proximity to them.
>
>= Known Risks =
>
>== Orphaned Products ==
>
>There is very little risk of Tephra being orphaned, as it is a key part of
>Cask Data’s products. The core Tephra developers plan to continue to work
>on Tephra, and Cask Data has funding in place to support their efforts
>going forward.
>Also with Phoenix using Tephra for transactions, Phoenix developers are
>keen on contributing to Tephra.
>
>
>== Inexperience with Open Source ==
>
>Several of the core developers have experience with open source
>development. Andreas Neumann is an Apache committer for Oozie and Twill.
>Terence Yim is an Apache committer for Helix and Twill. Poorna Chandra
>is an Apache committer for Twill. Gary Helmling is a committer for
>Apache Twill and a committer and PMC member for Apache HBase.
>James Taylor is PMC chair for Apache Phoenix, PMC member of Apache Calcite,
>and an IPMC member.
>
>== Homogeneous Developers ==
>
>The current core developers are all Cask Data employees. However, we
>intend to establish a developer community that includes independent and
>corporate contributors. We are encouraging new contributors via our mailing
>lists, public presentations, and personal contacts, and we will continue to
>do so.
>
>Apache Phoenix developers have already contributed several patches to Tephra,
>and have expressed interest in becoming long term contributors.
>
>== Reliance on Salaried Developers ==
>
>Currently, these developers are paid to work on Tephra. Once the project has
>built a community, we expect to attract committers, developers and community
>members other than the current core developers. However, because Cask Data
>products use Tephra internally, the reliance on salaried developers is
>unlikely to change, at least in the near term.
>
>== Relationships with Other Apache Products ==
>
>Tephra is deeply integrated with Apache projects. Tephra provides transactions
>over Apache HBase, and uses Apache Twill and Apache Zookeeper for coordination.
>A number of other Apache projects are Tephra dependencies, and are
>listed in the External Dependencies section.
>
>In addition, Apache Phoenix is using Tephra as the transaction engine.
>
>== An Excessive Fascination with the Apache Brand ==
>
>While we respect the reputation of the Apache brand and have no doubt that
>it will attract contributors and users, our interest is primarily to give
>Tephra a solid home as an open source project following an established
>development model. We have also given additional reasons in the Rationale
>and Alignment sections.
>
>= Documentation =
>
>The current documentation for Tephra is at https://github.com/caskdata/tephra.
>
>= Initial Source =
>
>Tephra codebase is currently hosted at https://github.com/caskdata/tephra.
>
>= Source and Intellectual Property Submission Plan =
>
>The Tephra codebase is currently licensed under the Apache 2.0 license.
>Cask Data owns the trademark for "Tephra". As part of the incubation process,
>Cask Data will transfer the trademark to the Apache Software Foundation.
>
>= External Dependencies =
>
>The dependencies all have Apache-compatible licenses:
> * dropwizard metrics (Apache 2.0)
> * fastutil (Apache 2.0)
> * gson (Apache 2.0)
> * guava-libraries (Apache 2.0)
> * guice (Apache 2.0)
> * hadoop (Apache 2.0)
> * hbase (Apache 2.0)
> * hdfs (Apache 2.0)
> * junit (EPL v1.0)
> * logback (EPL v1.0 )
> * slf4j (MIT)
> * thrift (Apache 2.0)
> * twill (Apache 2.0)
> * zookeeper (Apache 2.0)
>
>= Cryptography =
>
>Tephra does not use cryptography itself; however, it can run on secure Hadoop,
>which uses Kerberos.
>
>= Required Resources =
>
>== Mailing Lists ==
>
> * tephra-private for private PMC discussions (with moderated subscriptions)
> * tephra-dev for technical discussions among contributors
> * tephra-commits for notification about commits
>
>== Subversion Directory ==
>
>Git is the preferred source control system: git://git.apache.org/tephra
>
>== Issue Tracking ==
>
>JIRA Tephra (TEPHRA)
>
>== Other Resources ==
>
>The existing code already has unit tests, so we would like a Hudson
>instance to run them whenever a new patch is submitted. This can be added
>after project creation.
>
>= Initial Committers =
>
> * Andreas Neumann <anew at apache dot org>
> * Terence Yim <chtyim at apache dot org>
> * Poorna Chandra <poorna at apache dot org>
> * Gokul Gunasekaran <gokul at cask dot co>
> * James Taylor <jamestaylor at apache dot org>
> * Thomas D'Silva <tdsilva at apache dot org>
> * Gary Helmling <garyh at apache dot org>
>
>= Affiliations =
>
> * Andreas Neumann (Cask Data)
> * Terence Yim (Cask Data)
> * Poorna Chandra (Cask Data)
> * Gokul Gunasekaran (Cask Data)
> * James Taylor (Salesforce.com)
> * Thomas D'Silva (Salesforce.com)
> * Gary Helmling (Facebook)
>
>= Sponsors =
>
>== Champion ==
>
>James Taylor <jamestaylor at apache dot org> (V.P., Apache Phoenix)
>
>== Nominated Mentors ==
>
> * James Taylor <jamestaylor at apache dot org>
> * Lars Hofhansl <larsh at apache dot org>
> * Andrew Purtell <apurtell at apache dot org>
> * Alan Gates <gates at apache dot org>
> * Henry Saputra <hsaputra at apache dot org>
>
>== Sponsoring Entity ==
>
>We are requesting that the Incubator sponsor this project.



Re: [VOTE] Accept Tephra into the Apache Incubator

Posted by Andreas Neumann <an...@apache.org>.
+1 (non-binding)

-Andreas.

On Fri, Mar 4, 2016 at 12:19 PM, Terence Yim <ch...@gmail.com> wrote:

> +1 (non-binding)
>
> Terence
>
> On Fri, Mar 4, 2016 at 1:13 AM, Jean-Baptiste Onofré <jb...@nanthrax.net>
> wrote:
>
> > +1 (binding)
> >
> > Regards
> > JB
> >
> >
> > On 03/04/2016 02:29 AM, Poorna Chandra wrote:
> >
> >> Hi All,
> >>
> >> Tephra proposal was sent out for discussion last week. The proposal is
> >> available at https://wiki.apache.org/incubator/TephraProposal
> >>
> >> Please vote to accept Tephra into the Apache Incubator. The vote will be
> >> open for the next 72 hours.
> >>
> >> [ ] +1 Accept Tephra as an Apache Incubator podling.
> >> [ ] +0 Abstain.
> >> [ ] -1 Don’t accept Tephra as an Apache Incubator podling because ...
> >>
> >> Thanks,
> >> Poorna.
> >>
> >> ------
> >>
> >> = Abstract =
> >>
> >> Tephra is a system for providing globally consistent transactions on
> >> top of Apache HBase and other storage engines.
> >>
> >> = Proposal =
> >>
> >> Tephra is a transaction engine for distributed data stores like Apache
> >> HBase.
> >> It provides ACID semantics for concurrent data operations that span over
> >> region
> >> boundaries in HBase using Optimistic Concurrency Control.
> >>
> >> = Background =
> >>
> >> HBase provides strong consistency with row- or region-level ACID
> >> operations. However, it sacrifices cross-region and cross-table
> >> consistency in favor of scalability. This trade-off requires application
> >> developers to handle  the complexity of ensuring consistency when their
> >> modifications span region boundaries. By providing support for global
> >> transactions that span regions, tables, or multiple RPCs,
> >> Tephra simplifies application development on top of HBase, without a
> >> significant impact on performance or scalability for many workloads.
> >>
> >> Tephra leverages HBase’s native data versioning to provide multi-versioned
> >> concurrency control (MVCC) for transactional reads and writes.
> >> With MVCC capability, each transaction sees its own consistent “snapshot” of
> >> data, providing snapshot isolation of concurrent transactions.
> >> MVCC along with conflict detection and handling enables Optimistic
> >> Concurrency Control.
> >>
> >> Tephra consists of three main components:
> >>   * Transaction Server – maintains global view of transaction state, assigns
> >>     new transaction IDs and performs conflict detection;
> >>   * Transaction Client – coordinates start, commit, and rollback of
> >>     transactions; and
> >>   * Transaction Processor Coprocessor – applies filtering to the data read
> >>     (based on a given transaction’s state) and cleans up any data from old
> >>     (no longer visible) transactions.
> >>
> >> Although Tephra only supports HBase now, it can be extended to support
> >> transactions on any store that has multi-versioning and rollback
> >> support. The transactions
> >> can span over multiple stores and storage paradigms.
> >>
> >> = Rationale =
> >>
> >> Tephra has simple abstractions which can be used by an application to
> >> add transaction support over HBase. By abstracting away transaction
> >> handling using Tephra, the application is freed of
> >> transaction logic, and the application developer can focus on the use case.
> >> Also, Tephra can be extended to support transactions on data sources other
> >> than HBase.
> >>
> >> By making Tephra an Apache open source project, we believe that there will
> >> be wider adoption and more opportunities for Tephra to be integrated
> >> into other Apache projects.
> >>
> >> = Current Status =
> >>
> >> Tephra was built at Cask Data Inc. initially as part of
> >> open-source framework Cask Data Application Platform (CDAP)
> >> [[http://cdap.io/]].
> >> It was later converted into an independent open source project with
> >> Apache 2.0 License [[https://github.com/caskdata/tephra]].
> >>
> >> Tephra is used in CDAP as the transaction engine. As part of CDAP, Tephra
> >> has been deployed at multiple companies.
> >>
> >> Apache Phoenix is using Tephra as transaction engine in the next release.
> >>
> >> == Meritocracy ==
> >>
> >> Our intent with this incubator proposal is to start building a diverse
> >> developer community around Tephra following the Apache meritocracy model.
> >> Since Tephra was initially developed in early 2013, we have had fast
> >> adoption and contributions within Cask Data. We are looking forward to
> >> new contributors. We wish to build a community based on Apache's
> >> meritocracy principles, working with those who contribute significantly to
> >> the project and welcoming them to be committers both during the incubation
> >> process and beyond.
> >>
> >> == Community ==
> >>
> >> Core developers of Tephra are at Cask Data. Recently the developer
> >> community
> >> has expanded to include folks from Apache Phoenix. We hope to extend our
> >> contributor base significantly and we will invite all who are interested
> >> in working on distributed transaction engine.
> >>
> >> == Core Developers ==
> >>
> >> A few engineers from Cask Data and outside have developed Tephra:
> >> Andreas Neumann, Terence Yim, Gary Helmling, Andrew Purtell and
> >> Poorna Chandra.
> >>
> >>
> >> == Alignment ==
> >>
> >> The ASF is the natural choice to host the Tephra project as its goal of
> >> encouraging community-driven open source projects fits with our vision for
> >> Tephra.
> >>
> >> Additionally, many other projects with which we are familiar and expect
> >> Tephra to integrate with, such as Phoenix, Zookeeper, HDFS, log4j, and
> >> others
> >> mentioned in the External Dependencies section are Apache projects, and
> >> Tephra will benefit by close proximity to them.
> >>
> >> = Known Risks =
> >>
> >> == Orphaned Products ==
> >>
> >> There is very little risk of Tephra being orphaned, as it is a key part of
> >> Cask Data’s products. The core Tephra developers plan to continue to work
> >> on Tephra, and Cask Data has funding in place to support their efforts
> >> going forward.
> >> Also with Phoenix using Tephra for transactions, Phoenix developers are
> >> keen on contributing to Tephra.
> >>
> >>
> >> == Inexperience with Open Source ==
> >>
> >> Several of the core developers have experience with open source
> >> development. Andreas Neumann is an Apache committer for Oozie and Twill.
> >> Terence Yim is an Apache committer for Helix and Twill. Poorna Chandra
> >> is an Apache committer for Twill. Gary Helmling is a committer for
> >> Apache Twill and a committer and PMC member for Apache HBase.
> >> James Taylor is PMC chair for Apache Phoenix, PMC member of Apache
> >> Calcite,
> >> and an IPMC member.
> >>
> >> == Homogeneous Developers ==
> >>
> >> The current core developers are all Cask Data employees. However, we
> >> intend to establish a developer community that includes independent and
> >> corporate contributors. We are encouraging new contributors via our
> >> mailing
> >> lists, public presentations, and personal contacts, and we will continue
> >> to
> >> do so.
> >>
> >> Apache Phoenix developers have already contributed several patches to
> >> Tephra,
> >> and have expressed interest in becoming long term contributors.
> >>
> >> == Reliance on Salaried Developers ==
> >>
> >> Currently, these developers are paid to work on Tephra. Once the project
> >> has
> >> built a community, we expect to attract committers, developers and
> >> community
> >> other than the current core developers. However, because Cask Data
> >> products use Tephra internally, the reliance on salaried developers is
> >> unlikely to change, at least in the near term.
> >>
> >> == Relationships with Other Apache Products ==
> >>
> >> Tephra is deeply integrated with Apache projects. Tephra provides
> >> transactions
> >> over Apache HBase, and uses Apache Twill and Apache Zookeeper for
> >> coordination.
> >> A number of other Apache projects are Tephra dependencies, and are
> >> listed in the External Dependencies section.
> >>
> >> In addition, Apache Phoenix is using Tephra as the transaction engine.
> >>
> >> == An Excessive Fascination with the Apache Brand ==
> >>
> >> While we respect the reputation of the Apache brand and have no doubt that
> >> it will attract contributors and users, our interest is primarily to give
> >> Tephra a solid home as an open source project following an established
> >> development model. We have also given additional reasons in the Rationale
> >> and Alignment sections.
> >>
> >> = Documentation =
> >>
> >> The current documentation for Tephra is at
> >> https://github.com/caskdata/tephra.
> >>
> >> = Initial Source =
> >>
> >> Tephra codebase is currently hosted at https://github.com/caskdata/tephra.
> >>
> >> = Source and Intellectual Property Submission Plan =
> >>
> >> Tephra codebase is currently licensed under Apache 2.0 license.
> >> Cask Data owns the trademark for "Tephra". As part of the incubation
> >> process
> >> Cask Data will transfer the trademark to Apache Foundation.
> >>
> >> = External Dependencies =
> >>
> >> The dependencies all have Apache-compatible licenses:
> >>   * dropwizard metrics (Apache 2.0)
> >>   * fastutil (Apache 2.0)
> >>   * gson (Apache 2.0)
> >>   * guava-libraries (Apache 2.0)
> >>   * guice (Apache 2.0)
> >>   * hadoop (Apache 2.0)
> >>   * hbase (Apache 2.0)
> >>   * hdfs (Apache 2.0)
> >>   * junit (EPL v1.0)
> >>   * logback (EPL v1.0 )
> >>   * slf4j (MIT)
> >>   * thrift (Apache 2.0)
> >>   * twill (Apache 2.0)
> >>   * zookeeper (Apache 2.0)
> >>
> >> = Cryptography =
> >>
> >> Tephra does not use cryptography itself, however it can run on secure
> >> Hadoop,
> >> which uses Kerberos.
> >>
> >> = Required Resources =
> >>
> >> == Mailing Lists ==
> >>
> >>   * tephra-private for private PMC discussions (with moderated
> >> subscriptions)
> >>   * tephra-dev for technical discussions among contributors
> >>   * tephra-commits for notification about commits
> >>
> >> == Subversion Directory ==
> >>
> >> Git is the preferred source control system: git://git.apache.org/tephra
> >>
> >> == Issue Tracking ==
> >>
> >> JIRA Tephra (TEPHRA)
> >>
> >> == Other Resources ==
> >>
> >> The existing code already has unit tests, so we would like a Hudson
> >> instance to run them whenever a new patch is submitted. This can be added
> >> after project creation.
> >>
> >> = Initial Committers =
> >>
> >>   * Andreas Neumann <anew at apache dot org>
> >>   * Terence Yim <chtyim at apache dot org>
> >>   * Poorna Chandra <poorna at apache dot org>
> >>   * Gokul Gunasekaran <gokul at cask dot co>
> >>   * James Taylor <jamestaylor at apache dot org>
> >>   * Thomas D'Silva <tdsilva at apache dot org>
> >>   * Gary Helmling <garyh at apache dot org>
> >>
> >> = Affiliations =
> >>
> >>   * Andreas Neumann (Cask Data)
> >>   * Terence Yim (Cask Data)
> >>   * Poorna Chandra (Cask Data)
> >>   * Gokul Gunasekaran (Cask Data)
> >>   * James Taylor (Salesforce.com)
> >>   * Thomas D'Silva (Salesforce.com)
> >>   * Gary Helmling (Facebook)
> >>
> >> = Sponsors =
> >>
> >> == Champion ==
> >>
> >> James Taylor <jamestaylor at apache dot org> (V.P., Apache Phoenix)
> >>
> >> == Nominated Mentors ==
> >>
> >>   * James Taylor <jamestaylor at apache dot org>
> >>   * Lars Hofhansl <larsh at apache dot org>
> >>   * Andrew Purtell <apurtell at apache dot org>
> >>   * Alan Gates <gates at apache dot org>
> >>   * Henry Saputra <hsaputra at apache dot org>
> >>
> >> == Sponsoring Entity ==
> >>
> >> We are requesting that the Incubator sponsor this project.
> >>
> >>
> > --
> > Jean-Baptiste Onofré
> > jbonofre@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>

Re: [VOTE] Accept Tephra into the Apache Incubator

Posted by Terence Yim <ch...@gmail.com>.
+1 (non-binding)

Terence

On Fri, Mar 4, 2016 at 1:13 AM, Jean-Baptiste Onofré <jb...@nanthrax.net>
wrote:

> +1 (binding)
>
> Regards
> JB
>
>
> On 03/04/2016 02:29 AM, Poorna Chandra wrote:
>
>> Hi All,
>>
>> Tephra proposal was sent out for discussion last week. The proposal is
>> available at https://wiki.apache.org/incubator/TephraProposal
>>
>> Please vote to accept Tephra into the Apache Incubator. The vote will be
>> open for the next 72 hours.
>>
>> [ ] +1 Accept Tephra as an Apache Incubator podling.
>> [ ] +0 Abstain.
>> [ ] -1 Don’t accept Tephra as an Apache Incubator podling because ...
>>
>> Thanks,
>> Poorna.
>>
>> ------
>>
>> = Abstract =
>>
>> Tephra is a system for providing globally consistent transactions on
>> top of Apache HBase and other storage engines.
>>
>> = Proposal =
>>
>> Tephra is a transaction engine for distributed data stores like Apache HBase.
>> It provides ACID semantics for concurrent data operations that span over
>> region boundaries in HBase using Optimistic Concurrency Control.
>>
>> = Background =
>>
>> HBase provides strong consistency with row- or region-level ACID
>> operations. However, it sacrifices cross-region and cross-table
>> consistency in favor of scalability. This trade-off requires application
>> developers to handle  the complexity of ensuring consistency when their
>> modifications span region boundaries. By providing support for global
>> transactions that span regions, tables, or multiple RPCs,
>> Tephra simplifies application development on top of HBase, without a
>> significant impact on performance or scalability for many workloads.
>>
>> Tephra leverages HBase’s native data versioning to provide multi-versioned
>> concurrency control (MVCC) for transactional reads and writes.
>> With MVCC capability, each transaction sees its own consistent “snapshot” of
>> data, providing snapshot isolation of concurrent transactions.
>> MVCC along with conflict detection and handling enables Optimistic
>> Concurrency Control.
>>
>> Tephra consists of three main components:
>>   * Transaction Server – maintains global view of transaction state, assigns
>>     new transaction IDs and performs conflict detection;
>>   * Transaction Client – coordinates start, commit, and rollback of
>>     transactions; and
>>   * Transaction Processor Coprocessor – applies filtering to the data read
>>     (based on a given transaction’s state) and cleans up any data from old
>>     (no longer visible) transactions.
>>
>> Although Tephra only supports HBase now, it can be extended to support
>> transactions on any store that has multi-versioning and rollback
>> support. The transactions
>> can span over multiple stores and storage paradigms.
>>
>> = Rationale =
>>
>> Tephra has simple abstractions which can be used by an application to
>> add transaction support over HBase. By abstracting away transaction
>> handling using Tephra, the application is freed of
>> transaction logic, and the application developer can focus on the use
>> case.
>> Also, Tephra can be extended to support transactions on data sources other
>> than HBase.
>>
>> By making Tephra an Apache open source project, we believe that there will
>> be wider adoption and more opportunities for Tephra to be integrated
>> into other Apache projects.
>>
>> = Current Status =
>>
>> Tephra was built at Cask Data Inc. initially as part of
>> open-source framework Cask Data Application Platform (CDAP)
>> [[http://cdap.io/]].
>> It was later converted into an independent open source project with
>> Apache 2.0 License [[https://github.com/caskdata/tephra]].
>>
>> Tephra is used in CDAP as the transaction engine. As part of CDAP, Tephra
>> has been deployed at multiple companies.
>>
>> Apache Phoenix is using Tephra as transaction engine in the next release.
>>
>> == Meritocracy ==
>>
>> Our intent with this incubator proposal is to start building a diverse
>> developer community around Tephra following the Apache meritocracy model.
>> Since Tephra was initially developed in early 2013, we have had fast
>> adoption and contributions within Cask Data. We are looking forward to
>> new contributors. We wish to build a community based on Apache's
>> meritocracy principles, working with those who contribute significantly to
>> the project and welcoming them to be committers both during the incubation
>> process and beyond.
>>
>> == Community ==
>>
>> Core developers of Tephra are at Cask Data. Recently the developer
>> community
>> has expanded to include folks from Apache Phoenix. We hope to extend our
>> contributor base significantly and we will invite all who are interested
>> in working on distributed transaction engine.
>>
>> == Core Developers ==
>>
>> A few engineers from Cask Data and outside have developed Tephra:
>> Andreas Neumann, Terence Yim, Gary Helmling, Andrew Purtell and
>> Poorna Chandra.
>>
>>
>> == Alignment ==
>>
>> The ASF is the natural choice to host the Tephra project as its goal of
>> encouraging community-driven open source projects fits with our vision for
>> Tephra.
>>
>> Additionally, many other projects with which we are familiar and expect
>> Tephra to integrate with, such as Phoenix, Zookeeper, HDFS, log4j, and
>> others
>> mentioned in the External Dependencies section are Apache projects, and
>> Tephra will benefit by close proximity to them.
>>
>> = Known Risks =
>>
>> == Orphaned Products ==
>>
>> There is very little risk of Tephra being orphaned, as it is a key part of
>> Cask Data’s products. The core Tephra developers plan to continue to work
>> on Tephra, and Cask Data has funding in place to support their efforts
>> going forward.
>> Also with Phoenix using Tephra for transactions, Phoenix developers are
>> keen on contributing to Tephra.
>>
>>
>> == Inexperience with Open Source ==
>>
>> Several of the core developers have experience with open source
>> development. Andreas Neumann is an Apache committer for Oozie and Twill.
>> Terence Yim is an Apache committer for Helix and Twill. Poorna Chandra
>> is an Apache committer for Twill. Gary Helmling is a committer for
>> Apache Twill and a committer and PMC member for Apache HBase.
>> James Taylor is PMC chair for Apache Phoenix, PMC member of Apache
>> Calcite,
>> and an IPMC member.
>>
>> == Homogeneous Developers ==
>>
>> The current core developers are all Cask Data employees. However, we
>> intend to establish a developer community that includes independent and
>> corporate contributors. We are encouraging new contributors via our
>> mailing
>> lists, public presentations, and personal contacts, and we will continue
>> to
>> do so.
>>
>> Apache Phoenix developers have already contributed several patches to
>> Tephra,
>> and have expressed interest in becoming long term contributors.
>>
>> == Reliance on Salaried Developers ==
>>
>> Currently, these developers are paid to work on Tephra. Once the project
>> has
>> built a community, we expect to attract committers, developers and
>> community
>> other than the current core developers. However, because Cask Data
>> products use Tephra internally, the reliance on salaried developers is
>> unlikely to change, at least in the near term.
>>
>> == Relationships with Other Apache Products ==
>>
>> Tephra is deeply integrated with Apache projects. Tephra provides
>> transactions
>> over Apache HBase, and uses Apache Twill and Apache Zookeeper for
>> coordination.
>> A number of other Apache projects are Tephra dependencies, and are
>> listed in the External Dependencies section.
>>
>> In addition, Apache Phoenix is using Tephra as the transaction engine.
>>
>> == An Excessive Fascination with the Apache Brand ==
>>
>> While we respect the reputation of the Apache brand and have no doubt that
>> it will attract contributors and users, our interest is primarily to give
>> Tephra a solid home as an open source project following an established
>> development model. We have also given additional reasons in the Rationale
>> and Alignment sections.
>>
>> = Documentation =
>>
>> The current documentation for Tephra is at https://github.com/caskdata/tephra.
>>
>> = Initial Source =
>>
>> Tephra codebase is currently hosted at https://github.com/caskdata/tephra.
>>
>> = Source and Intellectual Property Submission Plan =
>>
>> Tephra codebase is currently licensed under Apache 2.0 license.
>> Cask Data owns the trademark for "Tephra". As part of the incubation process
>> Cask Data will transfer the trademark to Apache Foundation.
>>
>> = External Dependencies =
>>
>> The dependencies all have Apache-compatible licenses:
>>   * dropwizard metrics (Apache 2.0)
>>   * fastutil (Apache 2.0)
>>   * gson (Apache 2.0)
>>   * guava-libraries (Apache 2.0)
>>   * guice (Apache 2.0)
>>   * hadoop (Apache 2.0)
>>   * hbase (Apache 2.0)
>>   * hdfs (Apache 2.0)
>>   * junit (EPL v1.0)
>>   * logback (EPL v1.0 )
>>   * slf4j (MIT)
>>   * thrift (Apache 2.0)
>>   * twill (Apache 2.0)
>>   * zookeeper (Apache 2.0)
>>
>> = Cryptography =
>>
>> Tephra does not use cryptography itself, however it can run on secure Hadoop,
>> which uses Kerberos.
>>
>> = Required Resources =
>>
>> == Mailing Lists ==
>>
>>   * tephra-private for private PMC discussions (with moderated subscriptions)
>>   * tephra-dev for technical discussions among contributors
>>   * tephra-commits for notification about commits
>>
>> == Subversion Directory ==
>>
>> Git is the preferred source control system: git://git.apache.org/tephra
>>
>> == Issue Tracking ==
>>
>> JIRA Tephra (TEPHRA)
>>
>> == Other Resources ==
>>
>> The existing code already has unit tests, so we would like a Hudson
>> instance to run them whenever a new patch is submitted. This can be added
>> after project creation.
>>
>> = Initial Committers =
>>
>>   * Andreas Neumann <anew at apache dot org>
>>   * Terence Yim <chtyim at apache dot org>
>>   * Poorna Chandra <poorna at apache dot org>
>>   * Gokul Gunasekaran <gokul at cask dot co>
>>   * James Taylor <jamestaylor at apache dot org>
>>   * Thomas D'Silva <tdsilva at apache dot org>
>>   * Gary Helmling <garyh at apache dot org>
>>
>> = Affiliations =
>>
>>   * Andreas Neumann (Cask Data)
>>   * Terence Yim (Cask Data)
>>   * Poorna Chandra (Cask Data)
>>   * Gokul Gunasekaran (Cask Data)
>>   * James Taylor (Salesforce.com)
>>   * Thomas D'Silva (Salesforce.com)
>>   * Gary Helmling (Facebook)
>>
>> = Sponsors =
>>
>> == Champion ==
>>
>> James Taylor <jamestaylor at apache dot org> (V.P., Apache Phoenix)
>>
>> == Nominated Mentors ==
>>
>>   * James Taylor <jamestaylor at apache dot org>
>>   * Lars Hofhansl <larsh at apache dot org>
>>   * Andrew Purtell <apurtell at apache dot org>
>>   * Alan Gates <gates at apache dot org>
>>   * Henry Saputra <hsaputra at apache dot org>
>>
>> == Sponsoring Entity ==
>>
>> We are requesting that the Incubator sponsor this project.
>>
>>
> --
> Jean-Baptiste Onofré
> jbonofre@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com

Re: [VOTE] Accept Tephra into the Apache Incubator

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
+1 (binding)

Regards
JB

-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com
