Posted to general@incubator.apache.org by Henry Robinson <he...@cloudera.com> on 2015/11/24 22:03:50 UTC

[VOTE] Accept Impala into the Apache Incubator

Hi -

The [DISCUSS] thread has been quiet for a few days, so I think there's been
sufficient opportunity for discussion around our proposal to bring Impala
to the ASF Incubator.

I'd like to call a VOTE on that proposal, which is on the wiki at
https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
below.

During the discussion period, the proposal has been amended to add Brock
Noland as a new mentor, to add one committer who was missing from the list,
and to correct some issues with the dependency list.

Please cast your votes as follows:

[] +1, accept Impala into the Incubator
[] +/-0, non-counted vote to express a disposition
[] -1, do not accept Impala into the Incubator (please give your reason(s))

As with the concurrent Kudu vote, I propose leaving the vote open for a
full seven days (closing on Tuesday, December 1st at noon PST), due to the
upcoming US holiday.

Thanks,
Henry

--------

= Abstract =
Impala is a high-performance C++ and Java SQL query engine for data stored
in Apache Hadoop-based clusters.

= Proposal =

We propose to contribute the Impala codebase and associated artifacts (e.g.
documentation, web-site content etc.) to the Apache Software Foundation
with the intent of forming a productive, meritocratic and open community
around Impala’s continued development, according to the ‘Apache Way’.

Cloudera owns several trademarks regarding Impala, and proposes to transfer
ownership of those trademarks in full to the ASF.

= Background =
Engineers at Cloudera developed Impala and released it as an
Apache-licensed open-source project in Fall 2012. Impala was written as a
brand-new, modern C++ SQL engine targeted from the start for data stored in
Apache Hadoop clusters.

Impala’s most important benefit to users is high performance, which makes it
well suited to common enterprise analytic and business intelligence
workloads. This is achieved through a number of software techniques,
including native support for data stored in HDFS and related filesystems,
just-in-time compilation and optimization of individual query plans, a
high-performance C++ codebase, and a massively parallel distributed
architecture. In benchmarks, Impala is routinely among the
highest-performing SQL query engines.

= Rationale =

Despite the exciting innovation in the so-called ‘big-data’ space, SQL
remains by far the most common interface for interacting with data in both
traditional warehouses and modern ‘big-data’ clusters. There is clearly a
need, as evidenced by the eager adoption of Impala and other SQL engines in
enterprise contexts, for a query engine that offers the familiar SQL
interface, but that has been specifically designed to operate in massive,
distributed clusters rather than in traditional, fixed-hardware,
warehouse-specific deployments. Impala is one such query engine.

We believe that the ASF is the right venue to foster an open-source
community around Impala’s development. We expect that Impala will benefit
from more productive collaboration with related Apache projects, and under
the auspices of the ASF will attract talented contributors who will push
Impala’s development forward at pace.

We believe that the timing is right for Impala’s development to move
wholesale to the ASF: Impala is well-established, has been Apache-licensed
open-source for more than three years, and the core project is relatively
stable. We are excited to see where an ASF-based community can take Impala
from this strong starting point.

= Initial Goals =
Our initial goals are as follows:

 * Establish ASF-compatible engineering practices and workflows
 * Refactor and publish existing internal build scripts and test
infrastructure, in order to make them usable by any community member.
 * Transfer source code, documentation and associated artifacts to the ASF.
 * Grow the user and developer communities

= Current Status =

Impala is developed as an Apache-licensed open-source project. The source
code is available at http://github.com/cloudera/Impala, and developer
documentation is at https://github.com/cloudera/Impala/wiki. The majority
of commits to the project have come from Cloudera-employed developers, but
we have accepted some contributions from individuals from other
organizations.

All code reviews are done via a public instance of the Gerrit review tool
at http://gerrit.cloudera.org:8080/, and discussed on a public mailing
list. All patches must be reviewed before they are accepted into the
codebase, via a voting mechanism that is similar to that used on Apache
projects such as Hadoop and HBase.

Before a patch is committed, it must pass a suite of pre-commit tests.
These tests are currently run on Cloudera’s internal infrastructure. One of
our initial goals will be to work with the ASF Infrastructure team to find
a way to run these tests in an acceptable way on publicly accessible
machines.

Issues are tracked in JIRA at https://issues.cloudera.org/projects/IMPALA,
in a way that is extremely similar to existing practices at other ASF
projects.

= Meritocracy =

We understand the central importance of meritocracy to the Apache Way. We
will work to establish a welcoming, fair and meritocratic community, in
part by expanding the set of committers on the project. Although Impala’s
committer list will initially be dominated by members of the Impala
engineering team at Cloudera, we look forward to growing a rich user and
developer community.

= Community =
Impala has a strong user community (see
https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user), and a
growing developer community (see
https://groups.google.com/a/cloudera.org/forum/#!forum/impala-dev). We wish
to attract more developers to the project, and we believe that the ASF’s
open and meritocratic philosophy will help us with this. We note the
success of other, similar projects already part of the ASF.

= Core Developers =
Most - but not all - of Impala’s core developers are not currently
affiliated with the ASF, and will require new ICLAs.

= Alignment =
Impala is related to several other Apache projects:

 * Data that is read by Impala is very often stored in Apache Hadoop
clusters powered by the HDFS filesystem.
 * Impala can also read data stored in Apache HBase
 * Metadata for databases, tables and so on is read by Impala from Apache
Hive.
 * The preferred data format for HDFS-based tables is Apache Parquet, and
Apache Avro is also a supported data format.
 * Impala is closely integrated with Kudu, which is also being proposed to
the Incubator.
 * Impala uses Apache Thrift as its RPC and serialization framework of
choice.

= Known Risks =

== Orphaned Products ==
Impala is used by most of Cloudera’s customers, and Cloudera remains
committed to developing and supporting the project. Cloudera has a strong
track record of standing behind projects that were contributed to the ASF
by its employees, including Apache Flume, Apache Sqoop, and others. Other
companies both ship and support Impala, lending credence to the idea that
Impala is not at risk of being suddenly orphaned.

== Inexperience with Open Source ==
Although all committers on the initial list have significant experience
with at least one open-source project - namely Impala - fewer have much
experience with ASF-based software projects as contributors and community
members. However, with the guidance of our mentors, committers who do have
ASF experience, and time to learn during Incubation, we are confident that
the project can be run in accordance with Apache principles on an ongoing
basis.

== Homogeneous Developers ==

The initial committers are employees of Cloudera.

The project has received some contributions from developers outside of
Cloudera, including individuals from organizations such as Intel and
Google, hobbyists, and students using Impala to advance their
understanding of distributed databases. The project has attracted an active
user community as well. We hope to continue to encourage contributions from
these developers and community members, and to grow them into committers as
their contributions continue.

== Reliance on Salaried Developers ==

Many of Impala’s initial set of committers work full-time on Impala, and
are paid to do so. However, as mentioned elsewhere, we anticipate growth in
the developer community, which we hope will include hobbyists and academics
who have an interest in distributed data systems.

== An Excessive Fascination with the Apache Brand ==
Although we hope that Impala benefits from the Apache Brand, any reflected
goodwill to Cloudera as the contributing entity is not the goal of
establishing Impala as an Apache project. We will work with the Incubator
PMC and the PRC to ensure that the Apache Brand is respected.

= Documentation =
Impala: A Modern, Open-Source SQL Engine for Hadoop (
http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf)

Impala’s developer wiki (https://github.com/cloudera/Impala/wiki)

Impala’s auto-generated API documentation (
http://impala.io/doc/html/index.html)

= Initial Source =
Impala’s initial source contribution will come from
http://github.com/cloudera/Impala/.

= External Dependencies =

Impala depends upon a number of third-party libraries, which we list below.
We intend to compile a LICENSE.txt file in the very short term (see
https://issues.cloudera.org/browse/IMPALA-2670).

 * Google gflags (BSD)
 * Google glog (BSD)
 * Apache Thrift (Apache Software License v2.0)
 * Apache Commons (Apache Software License v2.0)
 * Apache Hadoop (Apache Software License v2.0)
 * Apache HBase (Apache Software License v2.0)
 * Apache Hive (Apache Software License v2.0)
 * Boost (Boost Software License)
 * OpenLDAP (OpenLDAP Software License)
 * rapidjson (MIT)
 * Google RE2 (BSD-style)
 * lz4 (BSD)
 * snappy (BSD)
 * cyrus-sasl (CMU License)
 * Apache Avro (Apache Software License v2.0)
 * Cloudera squeasel (Apache Software License v2.0)
 * Apache htrace (Incubating) (Apache Software License v2.0)
 * Apache Sentry (Incubating) (Apache Software License v2.0)
 * Apache Shiro (Apache Software License v2.0)
 * Twitter Bootstrap (Apache Software License v2.0)
 * d3 (BSD)
 * LLVM (BSD-like)

Build and test dependencies:

 * ant (Apache Software License v2.0)
 * Apache Maven (Apache Software License v2.0)
 * cmake (BSD)
 * clang (BSD)
 * Google gtest (BSD)

= Required Resources =

We request that the following resources be created for the project to use:

== Mailing lists ==

 * private@impala.incubator.apache.org (moderated subscriptions)
 * commits@impala.incubator.apache.org
 * dev@impala.incubator.apache.org
 * issues@impala.incubator.apache.org
 * user@impala.incubator.apache.org

== Git repository ==
https://git.apache.org/impala.git

== JIRA instance ==
JIRA project IMPALA (IMPALA or IMP)

== Other Resources ==
We hope to continue using Gerrit for our code review and commit workflow.
We are involved in the discussions that the Kudu team at Cloudera has
started with Jake Farrell about how Gerrit can fit into the ASF. We know
that several other ASF projects or podlings are also interested in Gerrit.

If the Infrastructure team does not have the bandwidth to support Gerrit,
we will continue to support our own instance of Gerrit for Impala, and make
the necessary integrations so that commits are properly authenticated and
maintain sufficient provenance to uphold ASF standards (e.g. via the
solution adopted by the AsterixDB podling).

= Initial Committers =

 * Tim Armstrong
 * Alex Behm
 * Taras Bobrovytsky
 * Casey Ching
 * Martin Grund
 * Daniel Hecht
 * Michael Ho
 * Matthew Jacobs
 * Ishaan Joshi
 * Lenni Kuff
 * Marcel Kornacker
 * Sailesh Mukil
 * Henry Robinson
 * John Russell
 * Dimitris Tsirogiannis
 * Skye Wanderman-Milne
 * Juan Yu

== Affiliations ==
All: Cloudera Inc.

= Sponsors =

== Champion ==
Tom White

== Nominated Mentors ==
 * Tom White (Cloudera)
 * Todd Lipcon (Cloudera)
 * Carl Steinbach (LinkedIn)
 * Brock Noland (StreamSets)


= Sponsoring Entity =
We ask that the Incubator PMC sponsor this proposal.

Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Patrick Angeles <pa...@gmail.com>.
+1 (non binding)

On Tue, Nov 24, 2015 at 4:21 PM, Arvind Prabhakar <ar...@apache.org> wrote:

> +1 (binding)
>
> Regards,
> Arvind Prabhakar

Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Arvind Prabhakar <ar...@apache.org>.
+1 (binding)

Regards,
Arvind Prabhakar

On Tue, Nov 24, 2015 at 1:03 PM, Henry Robinson <he...@cloudera.com> wrote:

> Hi -
>
> The [DISCUSS] thread has been quiet for a few days, so I think there's been
> sufficient opportunity for discussion around our proposal to bring Impala
> to the ASF Incubator.
>
> I'd like to call a VOTE on that proposal, which is on the wiki at
> https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
> below.
>
> During the discussion period, the proposal has been amended to add Brock
> Noland as a new mentor, to add one missed committer from the list and to
> correct some issues with the dependency list.
>
> Please cast your votes as follows:
>
> [] +1, accept Impala into the Incubator
> [] +/-0, non-counted vote to express a disposition
> [] -1, do not accept Impala into the Incubator (please give your reason(s))
>
> As with the concurrent Kudu vote, I propose leaving the vote open for a
> full seven days (to close at Tuesday, December 1st at noon PST), due to the
> upcoming US holiday.
>
> Thanks,
> Henry
>
> --------
>
> = Abstract =
> Impala is a high-performance C++ and Java SQL query engine for data stored
> in Apache Hadoop-based clusters.
>
> = Proposal =
>
> We propose to contribute the Impala codebase and associated artifacts (e.g.
> documentation, web-site content etc.) to the Apache Software Foundation
> with the intent of forming a productive, meritocratic and open community
> around Impala’s continued development, according to the ‘Apache Way’.
>
> Cloudera owns several trademarks regarding Impala, and proposes to transfer
> ownership of those trademarks in full to the ASF.
>
> = Background =
> Engineers at Cloudera developed Impala and released it as an
> Apache-licensed open-source project in Fall 2012. Impala was written as a
> brand-new, modern C++ SQL engine targeted from the start for data stored in
> Apache Hadoop clusters.
>
> Impala’s most important benefit to users is high-performance, making it
> extremely appropriate for common enterprise analytic and business
> intelligence workloads. This is achieved by a number of software
> techniques, including: native support for data stored in HDFS and related
> filesystems, just-in-time compilation and optimization of individual query
> plans, high-performance C++ codebase and massively-parallel distributed
> architecture. In benchmarks, Impala is routinely amongst the very highest
> performing SQL query engines.
>
> = Rationale =
>
> Despite the exciting innovation in the so-called ‘big-data’ space, SQL
> remains by far the most common interface for interacting with data in both
> traditional warehouses and modern ‘big-data’ clusters. There is clearly a
> need, as evidenced by the eager adoption of Impala and other SQL engines in
> enterprise contexts, for a query engine that offers the familiar SQL
> interface, but that has been specifically designed to operate in massive,
> distributed clusters rather than in traditional, fixed-hardware,
> warehouse-specific deployments. Impala is one such query engine.
>
> We believe that the ASF is the right venue to foster an open-source
> community around Impala’s development. We expect that Impala will benefit
> from more productive collaboration with related Apache projects, and under
> the auspices of the ASF will attract talented contributors who will push
> Impala’s development forward at pace.
>
> We believe that the timing is right for Impala’s development to move
> wholesale to the ASF: Impala is well-established, has been Apache-licensed
> open-source for more than three years, and the core project is relatively
> stable. We are excited to see where an ASF-based community can take Impala
> from this strong starting point.
>
> = Initial Goals =
> Our initial goals are as follows:
>
>  * Establish ASF-compatible engineering practices and workflows
>  * Refactor and publish existing internal build scripts and test
> infrastructure, in order to make them usable by any community member.
>  * Transfer source code, documentation and associated artifacts to the ASF.
>  * Grow the user and developer communities
>
> = Current Status =
>
> Impala is developed as an Apache-licensed open-source project. The source
> code is available at http://github.com/cloudera/Impala, and developer
> documentation is at https://github.com/cloudera/Impala/wiki. The majority
> of commits to the project have come from Cloudera-employed developers, but
> we have accepted some contributions from individuals from other
> organizations.
>
> All code reviews are done via a public instance of the Gerrit review tool
> at http://gerrit.cloudera.org:8080/, and discussed on a public mailing
> list. All patches must be reviewed before they are accepted into the
> codebase, via a voting mechanism that is similar to that used on Apache
> projects such as Hadoop and HBase.
>
> Before a patch is committed, it must pass a suite of pre-commit tests.
> These tests are currently run on Cloudera’s internal infrastructure. One of
> our initial goals will be to work with the ASF Infrastructure team to find
> a way to run these tests in an acceptable way on publicly accessible
> machines.
>
> Issues are tracked in JIRA at https://issues.cloudera.org/projects/IMPALA,
> in a way that is extremely similar to existing practices at other ASF
> projects.
>
> = Meritocracy =
>
> We understand the central importance of meritocracy to the Apache Way. We
> will work to establish a welcoming, fair and meritocratic community, in
> part by expanding the set of committers on the project. Although Impala’s
> committer list will initially be dominated by members of the Impala
> engineering team at Cloudera, we look forward to growing a rich user and
> developer community.
>
> = Community =
> Impala has a strong user community (see
> https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user), and a
> growing developer community (see
> https://groups.google.com/a/cloudera.org/forum/#!forum/impala-dev). We
> wish to attract more developers to the project, and we believe that the
> ASF’s open and meritocratic philosophy will help us with this. We note the
> success of other, similar projects already part of the ASF.
>
> = Core Developers =
> Most - but not all - of Impala’s core developers are not currently
> affiliated with the ASF, and will require new ICLAs.
>
> = Alignment =
> Impala is related to several other Apache projects:
>
>  * Data that is read by Impala is very often stored in Apache Hadoop
> clusters powered by the HDFS filesystem.
>  * Impala can also read data stored in Apache HBase
>  * Metadata for databases, tables and so on is read by Impala from Apache
> Hive.
>  * The preferred data format for HDFS-based tables is Apache Parquet, and
> Apache Avro is also a supported data format.
>  * Impala is closely integrated with Kudu, which is also being proposed to
> the Incubator.
>  * Impala uses Apache Thrift as its RPC and serialization framework of
> choice.
>
> = Known Risks =
>
> == Orphaned Products ==
> Impala is used by most of Cloudera’s customers, and Cloudera remains
> committed to developing and supporting the project. Cloudera has a strong
> track record in standing behind projects that were contributed to the ASF
> by its employees, including Apache Flume, Apache Sqoop, and others. Other
> companies both ship and support Impala, lending credence to the idea that
> Impala is not at risk of being suddenly orphaned.
>
> == Inexperience with Open Source ==
> Although all committers on the initial list have significant experience
> with at least one open-source project - namely Impala - fewer have much
> experience with ASF-based software projects as contributors and community
> members. However, with the guidance of our mentors, committers who do have
> ASF experience, and time to learn during Incubation, we are confident that
> the project can be run in accordance with Apache principles on an ongoing
> basis.
>
> == Homogeneous Developers ==
>
> The initial committers are employees of Cloudera.
>
> The project has received some contributions from developers outside of
> Cloudera, from individuals belonging to organizations such as Intel and
> Google, from hobbyists and from students using Impala to advance their
> understanding of distributed databases. The project attracted an active
> user community as well. We hope to continue to encourage contributions from
> these developers and community members and grow them into committers after
> they have had time to continue their contributions.
>
> == Reliance on Salaried Developers ==
>
> Many of Impala’s initial set of committers work full-time on Impala, and
> are paid to do so. However, as mentioned elsewhere, we anticipate growth in
> the developer community which we hope will include hobbyists and academics
> who have an interest in distributed data systems.
>
> == An Excessive Fascination with the Apache Brand ==
> Although we hope that Impala benefits from the Apache Brand, any reflected
> goodwill to Cloudera as the contributing entity is not the goal of
> establishing Impala as an Apache project. We will work with the Incubator
> PMC and the PRC to ensure that the Apache Brand is respected.
>
> = Documentation =
> Impala: A Modern, Open-Source SQL Engine for Hadoop (
> http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf)
>
> Impala’s developer wiki (https://github.com/cloudera/Impala/wiki)
>
> Impala’s auto-generated API documentation (
> http://impala.io/doc/html/index.html)
>
> = Initial Source =
> Impala’s initial source contribution will come from
> http://github.com/cloudera/Impala/.
>
> = External Dependencies =
>
> Impala depends upon a number of third-party libraries, which we list below.
> We intend to compile a LICENSE.txt file in the very short term (see
> https://issues.cloudera.org/browse/IMPALA-2670).
>
>  * Google gflags (BSD)
>  * Google glog (BSD)
>  * Apache Thrift (Apache Software License v2.0)
>  * Apache Commons (Apache Software License v2.0)
>  * Apache Hadoop (Apache Software License v2.0)
>  * Apache HBase (Apache Software License v2.0)
>  * Apache Hive (Apache Software License v2.0)
>  * Boost (Boost Software License)
>  * OpenLDAP (OpenLDAP Software License)
>  * rapidjson (MIT)
>  * Google RE2 (BSD-style)
>  * lz4 (BSD)
>  * snappy (BSD)
>  * cyrus-sasl (CMU License)
>  * Apache Avro (Apache Software License v2.0)
>  * Cloudera squeasel (Apache Software License v2.0)
>  * Apache htrace (Incubating) (Apache Software License v2.0)
>  * Apache Sentry (Incubating) (Apache Software License v2.0)
>  * Apache Shiro (Apache Software License v2.0)
>  * Twitter Bootstrap (Apache Software License v2.0)
>  * d3 (BSD)
>  * LLVM (BSD-like)
>
> Build and test dependencies:
>
>  * ant (Apache Software License v2.0)
>  * Apache Maven (Apache Software License v2.0)
>  * cmake (BSD)
>  * clang (BSD)
>  * Google gtest (Apache Software License v2.0)
>
> = Required Resources =
>
> We request that the following resources be created for the project to use:
>
> == Mailing lists ==
>
>  * private@impala.incubator.apache.org (moderated subscriptions)
>  * commits@impala.incubator.apache.org
>  * dev@impala.incubator.apache.org
>  * issues@impala.incubator.apache.org
>  * user@impala.incubator.apache.org
>
> == Git repository ==
> https://git.apache.org/impala.git
>
> == JIRA instance ==
> JIRA project IMPALA (IMPALA or IMP)
>
> == Other Resources ==
> We hope to continue using Gerrit for our code review and commit workflow.
> We are taking part in the discussions that the Kudu team at Cloudera has
> been having with Jake Farrell about how Gerrit can fit into the ASF. We
> know that several other ASF projects or podlings are also interested in
> Gerrit.
>
> If the Infrastructure team does not have the bandwidth to support Gerrit,
> we will continue to support our own instance of Gerrit for Impala, and make
> the necessary integrations such that commits are properly authenticated and
> maintain sufficient provenance to uphold the ASF standards (e.g. via the
> solution adopted by the AsterixDB podling).
>
> = Initial Committers =
>
>  * Tim Armstrong
>  * Alex Behm
>  * Taras Bobrovytsky
>  * Casey Ching
>  * Martin Grund
>  * Daniel Hecht
>  * Michael Ho
>  * Matthew Jacobs
>  * Ishaan Joshi
>  * Lenni Kuff
>  * Marcel Kornacker
>  * Sailesh Mukil
>  * Henry Robinson
>  * John Russell
>  * Dimitris Tsirogiannis
>  * Skye Wanderman-Milne
>  * Juan Yu
>
> == Affiliations ==
> All: Cloudera Inc.
>
> = Sponsors =
>
> == Champion ==
> Tom White
>
> == Nominated Mentors ==
>  * Tom White (Cloudera)
>  * Todd Lipcon (Cloudera)
>  * Carl Steinbach (LinkedIn)
>  * Brock Noland (StreamSets)
>
>
> = Sponsoring Entity =
> We ask that the Incubator PMC sponsor this proposal.
>

Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Roman Shaposhnik <rv...@apache.org>.
On Tue, Nov 24, 2015 at 1:03 PM, Henry Robinson <he...@cloudera.com> wrote:
> Hi -
>
> The [DISCUSS] thread has been quiet for a few days, so I think there's been
> sufficient opportunity for discussion around our proposal to bring Impala
> to the ASF Incubator.
>
> I'd like to call a VOTE on that proposal, which is on the wiki at
> https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
> below.
>
> During the discussion period, the proposal has been amended to add Brock
> Noland as a new mentor, to add one missed committer from the list and to
> correct some issues with the dependency list.
>
> Please cast your votes as follows:
>
> [] +1, accept Impala into the Incubator
> [] +/-0, non-counted vote to express a disposition
> [] -1, do not accept Impala into the Incubator (please give your reason(s))

-1 (binding)

I wasn't convinced by the results of the RTC vs. CTR discussion
and given the initial composition of the community, I'd like to see
an initial commitment to erring on the side of inclusiveness rather
than the walled-garden community protected by Gerrit.

Thanks,
Roman.

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org


Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Amol Kekre <am...@datatorrent.com>.
+1 (non-binding)

Amol


On Wed, Nov 25, 2015 at 12:44 AM, Tom White <to...@apache.org> wrote:

> +1 (binding)
>
> Tom
>

Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Alex Karasulu <ak...@apache.org>.
+1 (binding)

On Wed, Nov 25, 2015 at 10:44 AM, Tom White <to...@apache.org> wrote:

> +1 (binding)
>
> Tom
>
> On Tue, Nov 24, 2015 at 9:03 PM, Henry Robinson <he...@cloudera.com>
> wrote:
> > Hi -
> >
> > The [DISCUSS] thread has been quiet for a few days, so I think there's
> > been sufficient opportunity for discussion around our proposal to bring
> > Impala to the ASF Incubator.
> >
> > I'd like to call a VOTE on that proposal, which is on the wiki at
> > https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
> > below.
> >
> > During the discussion period, the proposal has been amended to add Brock
> > Noland as a new mentor, to add one missed committer from the list and to
> > correct some issues with the dependency list.
> >
> > Please cast your votes as follows:
> >
> > [] +1, accept Impala into the Incubator
> > [] +/-0, non-counted vote to express a disposition
> > [] -1, do not accept Impala into the Incubator (please give your
> > reason(s))
> >
> > As with the concurrent Kudu vote, I propose leaving the vote open for a
> > full seven days (to close at Tuesday, December 1st at noon PST), due to
> > the upcoming US holiday.
> >
> > Thanks,
> > Henry
> >
> > --------
> >
> > = Abstract =
> > Impala is a high-performance C++ and Java SQL query engine for data
> > stored in Apache Hadoop-based clusters.
> >
> > = Proposal =
> >
> > We propose to contribute the Impala codebase and associated artifacts
> > (e.g. documentation, web-site content etc.) to the Apache Software
> > Foundation with the intent of forming a productive, meritocratic and open
> > community around Impala’s continued development, according to the ‘Apache
> > Way’.
> >
> > Cloudera owns several trademarks regarding Impala, and proposes to
> > transfer ownership of those trademarks in full to the ASF.
> >
> > = Background =
> > Engineers at Cloudera developed Impala and released it as an
> > Apache-licensed open-source project in Fall 2012. Impala was written as a
> > brand-new, modern C++ SQL engine targeted from the start for data stored
> > in Apache Hadoop clusters.
> >
> > Impala’s most important benefit to users is high-performance, making it
> > extremely appropriate for common enterprise analytic and business
> > intelligence workloads. This is achieved by a number of software
> > techniques, including: native support for data stored in HDFS and related
> > filesystems, just-in-time compilation and optimization of individual
> > query plans, high-performance C++ codebase and massively-parallel
> > distributed architecture. In benchmarks, Impala is routinely amongst the
> > very highest performing SQL query engines.
> >
> > = Rationale =
> >
> > Despite the exciting innovation in the so-called ‘big-data’ space, SQL
> > remains by far the most common interface for interacting with data in
> > both traditional warehouses and modern ‘big-data’ clusters. There is
> > clearly a need, as evidenced by the eager adoption of Impala and other SQL
> > engines in enterprise contexts, for a query engine that offers the familiar
> > SQL interface, but that has been specifically designed to operate in
> > massive, distributed clusters rather than in traditional, fixed-hardware,
> > warehouse-specific deployments. Impala is one such query engine.
> >
> > We believe that the ASF is the right venue to foster an open-source
> > community around Impala’s development. We expect that Impala will benefit
> > from more productive collaboration with related Apache projects, and
> > under the auspices of the ASF will attract talented contributors who will
> > push Impala’s development forward at pace.
> >
> > We believe that the timing is right for Impala’s development to move
> > wholesale to the ASF: Impala is well-established, has been
> > Apache-licensed open-source for more than three years, and the core
> > project is relatively stable. We are excited to see where an ASF-based
> > community can take Impala from this strong starting point.
> >
> > = Initial Goals =
> > Our initial goals are as follows:
> >
> >  * Establish ASF-compatible engineering practices and workflows
> >  * Refactor and publish existing internal build scripts and test
> > infrastructure, in order to make them usable by any community member.
> >  * Transfer source code, documentation and associated artifacts to the
> > ASF.
> >  * Grow the user and developer communities
> >
> > = Current Status =
> >
> > Impala is developed as an Apache-licensed open-source project. The source
> > code is available at http://github.com/cloudera/Impala, and developer
> > documentation is at https://github.com/cloudera/Impala/wiki. The
> > majority of commits to the project have come from Cloudera-employed
> > developers, but we have accepted some contributions from individuals from
> > other organizations.
> >
> > All code reviews are done via a public instance of the Gerrit review tool
> > at http://gerrit.cloudera.org:8080/, and discussed on a public mailing
> > list. All patches must be reviewed before they are accepted into the
> > codebase, via a voting mechanism that is similar to that used on Apache
> > projects such as Hadoop and HBase.
> >
> > Before a patch is committed, it must pass a suite of pre-commit tests.
> > These tests are currently run on Cloudera’s internal infrastructure. One
> > of our initial goals will be to work with the ASF Infrastructure team to
> > find a way to run these tests in an acceptable way on publicly accessible
> > machines.
> >
> > Issues are tracked in JIRA at
> > https://issues.cloudera.org/projects/IMPALA,
> > in a way that is extremely similar to existing practices at other ASF
> > projects.
> >
> > = Meritocracy =
> >
> > We understand the central importance of meritocracy to the Apache Way. We
> > will work to establish a welcoming, fair and meritocratic community, in
> > part by expanding the set of committers on the project. Although Impala’s
> > committer list will initially be dominated by members of the Impala
> > engineering team at Cloudera, we look forward to growing a rich user and
> > developer community.
> >
> > = Community =
> > Impala has a strong user community (see
> > https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user),
> > and a growing developer community (see
> > https://groups.google.com/a/cloudera.org/forum/#!forum/impala-dev). We
> > wish to attract more developers to the project, and we believe that the
> > ASF’s open and meritocratic philosophy will help us with this. We note the
> > success of other, similar projects already part of the ASF.
> >
> > = Core Developers =
> > Most - but not all - of Impala’s core developers are not currently
> > affiliated with the ASF, and will require new ICLAs.
> >
> > = Alignment =
> > Impala is related to several other Apache projects:
> >
> >  * Data that is read by Impala is very often stored in Apache Hadoop
> > clusters powered by the HDFS filesystem.
> >  * Impala can also read data stored in Apache HBase
> >  * Metadata for databases, tables and so on is read by Impala from Apache
> > Hive.
> >  * The preferred data format for HDFS-based tables is Apache Parquet, and
> > Apache Avro is also a supported data format.
> >  * Impala is closely integrated with Kudu, which is also being proposed
> > to the Incubator.
> >  * Impala uses Apache Thrift as its RPC and serialization framework of
> > choice.
> >
> > = Known Risks =
> >
> > == Orphaned Products ==
> > Impala is used by most of Cloudera’s customers, and Cloudera remains
> > committed to developing and supporting the project. Cloudera has a strong
> > track record in standing behind projects that were contributed to the ASF
> > by its employees, including Apache Flume, Apache Sqoop, and others. Other
> > companies both ship and support Impala, lending credence to the idea that
> > Impala is not at risk of being suddenly orphaned.
> >
> > == Inexperience with Open Source ==
> > Although all committers on the initial list have significant experience
> > with at least one open-source project - namely Impala - fewer have much
> > experience with ASF-based software projects as contributors and community
> > members. However, with the guidance of our mentors, committers who do
> > have ASF experience, and time to learn during Incubation, we are confident
> > that the project can be run in accordance with Apache principles on an
> > ongoing basis.
> >
> > == Homogeneous Developers ==
> >
> > The initial committers are employees of Cloudera.
> >
> > The project has received some contributions from developers outside of
> > Cloudera, from individuals belonging to organizations such as Intel and
> > Google, from hobbyists and from students using Impala to advance their
> > understanding of distributed databases. The project attracted an active
> > user community as well. We hope to continue to encourage contributions
> > from these developers and community members and grow them into committers
> > after they have had time to continue their contributions.
> >
> > == Reliance on Salaried Developers ==
> >
> > Many of Impala’s initial set of committers work full-time on Impala, and
> > are paid to do so. However, as mentioned elsewhere, we anticipate growth
> > in the developer community which we hope will include hobbyists and
> > academics who have an interest in distributed data systems.
> >
> > == An Excessive Fascination with the Apache Brand ==
> > Although we hope that Impala benefits from the Apache Brand, any
> > reflected goodwill to Cloudera as the contributing entity is not the goal
> > of establishing Impala as an Apache project. We will work with the
> > Incubator PMC and the PRC to ensure that the Apache Brand is respected.
> >
> > = Documentation =
> > Impala: A Modern, Open-Source SQL Engine for Hadoop (
> > http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf)
> >
> > Impala’s developer wiki (https://github.com/cloudera/Impala/wiki)
> >
> > Impala’s auto-generated API documentation (
> > http://impala.io/doc/html/index.html)
> >
> > = Initial Source =
> > Impala’s initial source contribution will come from
> > http://github.com/cloudera/Impala/.
> >
> > = External Dependencies =
> >
> > Impala depends upon a number of third-party libraries, which we list below.
> > We intend to compile a LICENSE.txt file in the very short term (see
> > https://issues.cloudera.org/browse/IMPALA-2670).
> >
> >  * Google gflags (BSD)
> >  * Google glog (BSD)
> >  * Apache Thrift (Apache Software License v2.0)
> >  * Apache Commons (Apache Software License v2.0)
> >  * Apache Hadoop (Apache Software License v2.0)
> >  * Apache HBase (Apache Software License v2.0)
> >  * Apache Hive (Apache Software License v2.0)
> >  * Boost (Boost Software License)
> >  * OpenLdap (OpenLDAP Software License)
> >  * rapidjson (MIT)
> >  * Google RE2 (BSD-style)
> >  * lz4 (BSD)
> >  * snappy (BSD)
> >  * cyrus-sasl (CMU License)
> >  * Apache Avro (Apache Software License v2.0)
> >  * Cloudera squeasel (Apache Software License v2.0)
> >  * Apache htrace (Incubating) (Apache Software License v2.0)
> >  * Apache Sentry (Incubating) (Apache Software License v2.0)
> >  * Apache Shiro (Apache Software License v2.0)
> >  * Twitter Bootstrap (Apache Software License v2.0)
> >  * d3 (BSD)
> >  * LLVM (BSD-like)
> >
> > Build and test dependencies:
> >
> >  * ant (Apache Software License v2.0)
> >  * Apache Maven (Apache Software License v2.0)
> >  * cmake (BSD)
> >  * clang (BSD)
> >  * Google gtest (Apache Software License v2.0)
> >
> > = Required Resources =
> >
> > We request that the following resources be created for the project to use:
> >
> > == Mailing lists ==
> >
> >  * private@impala.incubator.apache.org (moderated subscriptions)
> >  * commits@impala.incubator.apache.org
> >  * dev@impala.incubator.apache.org
> >  * issues@impala.incubator.apache.org
> >  * user@impala.incubator.apache.org
> >
> > == Git repository ==
> > https://git.apache.org/impala.git
> >
> > == JIRA instance ==
> > JIRA project IMPALA (IMPALA or IMP)
> >
> > == Other Resources ==
> > We hope to continue using Gerrit for our code review and commit workflow.
> > We are involved with discussions that the Kudu team at Cloudera have been
> > having with Jake Farrell to start discussions on how Gerrit can fit into
> > the ASF. We know that several other ASF projects or podlings are also
> > interested in Gerrit.
> >
> > If the Infrastructure team does not have the bandwidth to support gerrit,
> > we will continue to support our own instance of gerrit for Impala, and make
> > the necessary integrations such that commits are properly authenticated and
> > maintain sufficient provenance to uphold the ASF standards (e.g. via the
> > solution adopted by the AsterixDB podling).
> >
> > = Initial Committers =
> >
> >  * Tim Armstrong
> >  * Alex Behm
> >  * Taras Bobrovytsky
> >  * Casey Ching
> >  * Martin Grund
> >  * Daniel Hecht
> >  * Michael Ho
> >  * Matthew Jacobs
> >  * Ishaan Joshi
> >  * Lenni Kuff
> >  * Marcel Kornacker
> >  * Sailesh Mukil
> >  * Henry Robinson
> >  * John Russell
> >  * Dimitris Tsirogiannis
> >  * Skye Wanderman-Milne
> >  * Juan Yu
> >
> > == Affiliations ==
> > All: Cloudera Inc.
> >
> > = Sponsors =
> >
> > == Champion ==
> > Tom White
> >
> > == Nominated Mentors ==
> >  * Tom White (Cloudera)
> >  * Todd Lipcon (Cloudera)
> >  * Carl Steinbach (LinkedIn)
> >  * Brock Noland (StreamSets)
> >
> >
> > = Sponsoring Entity =
> > We ask that the Incubator PMC sponsor this proposal.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> For additional commands, e-mail: general-help@incubator.apache.org
>
>


-- 
Best Regards,
-- Alex

Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Tom White <to...@apache.org>.
+1 (binding)

Tom




Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Greg Stein <gs...@gmail.com>.
-1 (binding).

I'd like to see the community start with CTR, rather than mandatory reviews
via gerrit.



Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Doug Cutting <cu...@apache.org>.
+1 (binding)

Doug


Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Ashish <pa...@gmail.com>.
+1 (non-binding)




-- 
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal



Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Sree V <sr...@yahoo.com.INVALID>.
+1 (non-binding)

Thanking you.
With Regards
Sree


    On Monday, November 30, 2015 9:34 AM, stack <sa...@gmail.com> wrote:
 

 +1 (binding)
St.Ack
On Nov 24, 2015 1:04 PM, "Henry Robinson" <he...@cloudera.com> wrote:

> Hi -
>
> The [DISCUSS] thread has been quiet for a few days, so I think there's been
> sufficient opportunity for discussion around our proposal to bring Impala
> to the ASF Incubator.
>
> I'd like to call a VOTE on that proposal, which is on the wiki at
> https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
> below.
>
> During the discussion period, the proposal has been amended to add Brock
> Noland as a new mentor, to add one missed committer from the list and to
> correct some issues with the dependency list.
>
> Please cast your votes as follows:
>
> [] +1, accept Impala into the Incubator
> [] +/-0, non-counted vote to express a disposition
> [] -1, do not accept Impala into the Incubator (please give your reason(s))
>
> As with the concurrent Kudu vote, I propose leaving the vote open for a
> full seven days (to close at Tuesday, December 1st at noon PST), due to the
> upcoming US holiday.
>
> Thanks,
> Henry
>
> --------
>
> = Abstract =
> Impala is a high-performance C++ and Java SQL query engine for data stored
> in Apache Hadoop-based clusters.
>
> = Proposal =
>
> We propose to contribute the Impala codebase and associated artifacts (e.g.
> documentation, web-site content etc.) to the Apache Software Foundation
> with the intent of forming a productive, meritocratic and open community
> around Impala’s continued development, according to the ‘Apache Way’.
>
> Cloudera owns several trademarks regarding Impala, and proposes to transfer
> ownership of those trademarks in full to the ASF.
>
> = Background =
> Engineers at Cloudera developed Impala and released it as an
> Apache-licensed open-source project in Fall 2012. Impala was written as a
> brand-new, modern C++ SQL engine targeted from the start for data stored in
> Apache Hadoop clusters.
>
> Impala’s most important benefit to users is high-performance, making it
> extremely appropriate for common enterprise analytic and business
> intelligence workloads. This is achieved by a number of software
> techniques, including: native support for data stored in HDFS and related
> filesystems, just-in-time compilation and optimization of individual query
> plans, high-performance C++ codebase and massively-parallel distributed
> architecture. In benchmarks, Impala is routinely amongst the very highest
> performing SQL query engines.
>
> = Rationale =
>
> Despite the exciting innovation in the so-called ‘big-data’ space, SQL
> remains by far the most common interface for interacting with data in both
> traditional warehouses and modern ‘big-data’ clusters. There is clearly a
> need, as evidenced by the eager adoption of Impala and other SQL engines in
> enterprise contexts, for a query engine that offers the familiar SQL
> interface, but that has been specifically designed to operate in massive,
> distributed clusters rather than in traditional, fixed-hardware,
> warehouse-specific deployments. Impala is one such query engine.
>
> We believe that the ASF is the right venue to foster an open-source
> community around Impala’s development. We expect that Impala will benefit
> from more productive collaboration with related Apache projects, and under
> the auspices of the ASF will attract talented contributors who will push
> Impala’s development forward at pace.
>
> We believe that the timing is right for Impala’s development to move
> wholesale to the ASF: Impala is well-established, has been Apache-licensed
> open-source for more than three years, and the core project is relatively
> stable. We are excited to see where an ASF-based community can take Impala
> from this strong starting point.
>
> = Initial Goals =
> Our initial goals are as follows:
>
>  * Establish ASF-compatible engineering practices and workflows
>  * Refactor and publish existing internal build scripts and test
> infrastructure, in order to make them usable by any community member.
>  * Transfer source code, documentation and associated artifacts to the ASF.
>  * Grow the user and developer communities
>
> = Current Status =
>
> Impala is developed as an Apache-licensed open-source project. The source
> code is available at http://github.com/cloudera/Impala, and developer
> documentation is at https://github.com/cloudera/Impala/wiki. The majority
> of commits to the project have come from Cloudera-employed developers, but
> we have accepted some contributions from individuals from other
> organizations.
>
> All code reviews are done via a public instance of the Gerrit review tool
> at http://gerrit.cloudera.org:8080/, and discussed on a public mailing
> list. All patches must be reviewed before they are accepted into the
> codebase, via a voting mechanism that is similar to that used on Apache
> projects such as Hadoop and HBase.
>
> Before a patch is committed, it must pass a suite of pre-commit tests.
> These tests are currently run on Cloudera’s internal infrastructure. One of
> our initial goals will be to work with the ASF Infrastructure team to find
> a way to run these tests in an acceptable way on publicly accessible
> machines.
>
> Issues are tracked in JIRA at https://issues.cloudera.org/projects/IMPALA,
> in a way that is extremely similar to existing practices at other ASF
> projects.
>
> = Meritocracy =
>
> We understand the central importance of meritocracy to the Apache Way. We
> will work to establish a welcoming, fair and meritocratic community, in
> part by expanding the set of committers on the project. Although Impala’s
> committer list will initially be dominated by members of the Impala
> engineering team at Cloudera, we look forward to growing a rich user and
> developer community.
>
> = Community =
> Impala has a strong user community (see
> https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user), and a
> growing developer community (see
> https://groups.google.com/a/cloudera.org/forum/#!forum/impala-dev). We wish
> to attract more developers to the project, and we believe that the ASF’s
> open and meritocratic philosophy will help us with this. We note the
> success of other, similar projects already part of the ASF.
>
> = Core Developers =
> Most - but not all - of Impala’s core developers are not currently
> affiliated with the ASF, and will require new ICLAs.
>
> = Alignment =
> Impala is related to several other Apache projects:
>
>  * Data that is read by Impala is very often stored in Apache Hadoop
> clusters powered by the HDFS filesystem.
>  * Impala can also read data stored in Apache HBase
>  * Metadata for databases, tables and so on is read by Impala from Apache
> Hive.
>  * The preferred data format for HDFS-based tables is Apache Parquet, and
> Apache Avro is also a supported data format.
>  * Impala is closely integrated with Kudu, which is also being proposed to
> the Incubator.
>  * Impala uses Apache Thrift as its RPC and serialization framework of
> choice.
>
> = Known Risks =
>
> == Orphaned Products ==
> Impala is used by most of Cloudera’s customers, and Cloudera remains
> committed to developing and supporting the project. Cloudera has a strong
> track record in standing behind projects that were contributed to the ASF
> by its employees, including Apache Flume, Apache Sqoop, and others. Other
> companies both ship and support Impala, lending credence to the idea that
> Impala is not at risk of being suddenly orphaned.
>
> == Inexperience with Open Source ==
> Although all committers on the initial list have significant experience
> with at least one open-source project - namely Impala - fewer have much
> experience with ASF-based software projects as contributors and community
> members. However, with the guidance of our mentors, committers who do have
> ASF experience, and time to learn during Incubation, we are confident that
> the project can be run in accordance with Apache principles on an ongoing
> basis.
>
> == Homogeneous Developers ==
>
> The initial committers are employees of Cloudera.
>
> The project has received some contributions from developers outside of
> Cloudera, from individuals belonging to organizations such as Intel and
> Google, from hobbyists and from students using Impala to advance their
> understanding of distributed databases. The project has attracted an active
> user community as well. We hope to continue to encourage contributions from
> these developers and community members and grow them into committers after
> they have had time to continue their contributions.
>
> == Reliance on Salaried Developers ==
>
> Many of Impala’s initial set of committers work full-time on Impala, and
> are paid to do so. However, as mentioned elsewhere, we anticipate growth in
> the developer community which we hope will include hobbyists and academics
> who have an interest in distributed data systems.
>
> == An Excessive Fascination with the Apache Brand ==
> Although we hope that Impala benefits from the Apache Brand, any reflected
> goodwill to Cloudera as the contributing entity is not the goal of
> establishing Impala as an Apache project. We will work with the Incubator
> PMC and the PRC to ensure that the Apache Brand is respected.
>
> = Documentation =
> Impala: A Modern, Open-Source SQL Engine for Hadoop (
> http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf)
>
> Impala’s developer wiki (https://github.com/cloudera/Impala/wiki)
>
> Impala’s auto-generated API documentation (
> http://impala.io/doc/html/index.html)
>
> = Initial Source =
> Impala’s initial source contribution will come from
> http://github.com/cloudera/Impala/.
>
> = External Dependencies =
>
> Impala depends upon a number of third-party libraries, which we list below.
> We intend to compile a LICENSE.txt file in the very short term (see
> https://issues.cloudera.org/browse/IMPALA-2670).
>
>  * Google gflags (BSD)
>  * Google glog (BSD)
>  * Apache Thrift (Apache Software License v2.0)
>  * Apache Commons (Apache Software License v2.0)
>  * Apache Hadoop (Apache Software License v2.0)
>  * Apache HBase (Apache Software License v2.0)
>  * Apache Hive (Apache Software License v2.0)
>  * Boost (Boost Software License)
>  * OpenLDAP (OpenLDAP Software License)
>  * rapidjson (MIT)
>  * Google RE2 (BSD-style)
>  * lz4 (BSD)
>  * snappy (BSD)
>  * cyrus-sasl (CMU License)
>  * Apache Avro (Apache Software License v2.0)
>  * Cloudera squeasel (Apache Software License v2.0)
>  * Apache htrace (Incubating) (Apache Software License v2.0)
>  * Apache Sentry (Incubating) (Apache Software License v2.0)
>  * Apache Shiro (Apache Software License v2.0)
>  * Twitter Bootstrap (Apache Software License v2.0)
>  * d3 (BSD)
>  * LLVM (BSD-like)
>
> Build and test dependencies:
>
>  * ant (Apache Software License v2.0)
>  * Apache Maven (Apache Software License v2.0)
>  * cmake (BSD)
>  * clang (BSD)
>  * Google gtest (Apache Software License v2.0)
>
> = Required Resources =
>
> We request that the following resources be created for the project to use:
>
> == Mailing lists ==
>
>  * private@impala.incubator.apache.org (moderated subscriptions)
>  * commits@impala.incubator.apache.org
>  * dev@impala.incubator.apache.org
>  * issues@impala.incubator.apache.org
>  * user@impala.incubator.apache.org
>
> == Git repository ==
> https://git.apache.org/impala.git
>
> == JIRA instance ==
> JIRA project IMPALA (IMPALA or IMP)
>
> == Other Resources ==
> We hope to continue using Gerrit for our code review and commit workflow.
> We are taking part in the discussions that the Kudu team at Cloudera has
> been having with Jake Farrell about how Gerrit can fit into the ASF. We
> know that several other ASF projects or podlings are also interested in
> Gerrit.
>
> If the Infrastructure team does not have the bandwidth to support Gerrit,
> we will continue to support our own instance of Gerrit for Impala, and make
> the necessary integrations such that commits are properly authenticated and
> maintain sufficient provenance to uphold the ASF standards (e.g. via the
> solution adopted by the AsterixDB podling).
>
> = Initial Committers =
>
>  * Tim Armstrong
>  * Alex Behm
>  * Taras Bobrovytsky
>  * Casey Ching
>  * Martin Grund
>  * Daniel Hecht
>  * Michael Ho
>  * Matthew Jacobs
>  * Ishaan Joshi
>  * Lenni Kuff
>  * Marcel Kornacker
>  * Sailesh Mukil
>  * Henry Robinson
>  * John Russell
>  * Dimitris Tsirogiannis
>  * Skye Wanderman-Milne
>  * Juan Yu
>
> == Affiliations ==
> All: Cloudera Inc.
>
> = Sponsors =
>
> == Champion ==
> Tom White
>
> == Nominated Mentors ==
>  * Tom White (Cloudera)
>  * Todd Lipcon (Cloudera)
>  * Carl Steinbach (LinkedIn)
>  * Brock Noland (StreamSets)
>
>
> = Sponsoring Entity =
> We ask that the Incubator PMC sponsor this proposal.
>

  

Re: [VOTE] Accept Impala into the Apache Incubator

Posted by stack <sa...@gmail.com>.
+1 (binding)
St.Ack

Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Owen O'Malley <om...@apache.org>.
+1 (binding)

On Tue, Nov 24, 2015 at 9:10 PM, Ralph Goers <ra...@dslextreme.com>
wrote:

> -1 (binding)
> I’d like to see the project start with CTR and use RTC only for specific
> cases (like where tests must be modified, over X (1000 lines?) of code
> added, etc.).
>
> Ralph

Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Ralph Goers <ra...@dslextreme.com>.
-1 (binding)
I’d like to see the project start with CTR and use RTC only for specific cases (such as where tests must be modified, or where more than X (1000?) lines of code are added, etc.).

Ralph







Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Andrei Savu <as...@apache.org>.
+1 (binding)

-- Andrei Savu

On Sun, Nov 29, 2015 at 12:05 PM, Jean-Baptiste Onofré <jb...@nanthrax.net>
wrote:

> +1 (binding)
>
> Regards
> JB
>
> --
> Jean-Baptiste Onofré
> jbonofre@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
>
>

Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
+1 (binding)

Regards
JB

On 11/24/2015 10:03 PM, Henry Robinson wrote:
> Hi -
>
> The [DISCUSS] thread has been quiet for a few days, so I think there's been
> sufficient opportunity for discussion around our proposal to bring Impala
> to the ASF Incubator.
>
> I'd like to call a VOTE on that proposal, which is on the wiki at
> https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
> below.
>
> During the discussion period, the proposal has been amended to add Brock
> Noland as a new mentor, to add one missed committer from the list and to
> correct some issues with the dependency list.
>
> Please cast your votes as follows:
>
> [] +1, accept Impala into the Incubator
> [] +/-0, non-counted vote to express a disposition
> [] -1, do not accept Impala into the Incubator (please give your reason(s))
>
> As with the concurrent Kudu vote, I propose leaving the vote open for a
> full seven days (to close at Tuesday, December 1st at noon PST), due to the
> upcoming US holiday.
>
> Thanks,
> Henry
>
> --------
>
> = Abstract =
> Impala is a high-performance C++ and Java SQL query engine for data stored
> in Apache Hadoop-based clusters.
>
> = Proposal =
>
> We propose to contribute the Impala codebase and associated artifacts (e.g.
> documentation, web-site content etc.) to the Apache Software Foundation
> with the intent of forming a productive, meritocratic and open community
> around Impala’s continued development, according to the ‘Apache Way’.
>
> Cloudera owns several trademarks regarding Impala, and proposes to transfer
> ownership of those trademarks in full to the ASF.
>
> = Background =
> Engineers at Cloudera developed Impala and released it as an
> Apache-licensed open-source project in Fall 2012. Impala was written as a
> brand-new, modern C++ SQL engine targeted from the start for data stored in
> Apache Hadoop clusters.
>
> Impala’s most important benefit to users is high-performance, making it
> extremely appropriate for common enterprise analytic and business
> intelligence workloads. This is achieved by a number of software
> techniques, including: native support for data stored in HDFS and related
> filesystems, just-in-time compilation and optimization of individual query
> plans, high-performance C++ codebase and massively-parallel distributed
> architecture. In benchmarks, Impala is routinely amongst the very highest
> performing SQL query engines.
>
> = Rationale =
>
> Despite the exciting innovation in the so-called ‘big-data’ space, SQL
> remains by far the most common interface for interacting with data in both
> traditional warehouses and modern ‘big-data’ clusters. There is clearly a
> need, as evidenced by the eager adoption of Impala and other SQL engines in
> enterprise contexts, for a query engine that offers the familiar SQL
> interface, but that has been specifically designed to operate in massive,
> distributed clusters rather than in traditional, fixed-hardware,
> warehouse-specific deployments. Impala is one such query engine.
>
> We believe that the ASF is the right venue to foster an open-source
> community around Impala’s development. We expect that Impala will benefit
> from more productive collaboration with related Apache projects, and under
> the auspices of the ASF will attract talented contributors who will push
> Impala’s development forward at pace.
>
> We believe that the timing is right for Impala’s development to move
> wholesale to the ASF: Impala is well-established, has been Apache-licensed
> open-source for more than three years, and the core project is relatively
> stable. We are excited to see where an ASF-based community can take Impala
> from this strong starting point.
>
> = Initial Goals =
> Our initial goals are as follows:
>
>   * Establish ASF-compatible engineering practices and workflows
>   * Refactor and publish existing internal build scripts and test
> infrastructure, in order to make them usable by any community member.
>   * Transfer source code, documentation and associated artifacts to the ASF.
>   * Grow the user and developer communities
>
> = Current Status =
>
> Impala is developed as an Apache-licensed open-source project. The source
> code is available at http://github.com/cloudera/Impala, and developer
> documentation is at https://github.com/cloudera/Impala/wiki. The majority
> of commits to the project have come from Cloudera-employed developers, but
> we have accepted some contributions from individuals from other
> organizations.
>
> All code reviews are done via a public instance of the Gerrit review tool
> at http://gerrit.cloudera.org:8080/, and discussed on a public mailing
> list. All patches must be reviewed before they are accepted into the
> codebase, via a voting mechanism that is similar to that used on Apache
> projects such as Hadoop and HBase.
>
> Before a patch is committed, it must pass a suite of pre-commit tests.
> These tests are currently run on Cloudera’s internal infrastructure. One of
> our initial goals will be to work with the ASF Infrastructure team to find
> a way to run these tests in an acceptable way on publicly accessible
> machines.
>
> Issues are tracked in JIRA at https://issues.cloudera.org/projects/IMPALA,
> in a way that is extremely similar to existing practices at other ASF
> projects.
>
> = Meritocracy =
>
> We understand the central importance of meritocracy to the Apache Way. We
> will work to establish a welcoming, fair and meritocratic community, in
> part by expanding the set of committers on the project. Although Impala’s
> committer list will initially be dominated by members of the Impala
> engineering team at Cloudera, we look forward to growing a rich user and
> developer community.
>
> = Community =
> Impala has a strong user community (see
> https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user), and a
> growing developer community (see
> https://groups.google.com/a/cloudera.org/forum/#!forum/impala-dev). We wish
> to attract more developers to the project, and we believe that the ASF’s
> open and meritocratic philosophy will help us with this. We note the
> success of other, similar projects already part of the ASF.
>
> = Core Developers =
> Most - but not all - of Impala’s core developers are not currently
> affiliated with the ASF, and will require new ICLAs.
>
> = Alignment =
> Impala is related to several other Apache projects:
>
>   * Data that is read by Impala is very often stored in Apache Hadoop
> clusters powered by the HDFS filesystem.
>   * Impala can also read data stored in Apache HBase
>   * Metadata for databases, tables and so on is read by Impala from Apache
> Hive.
>   * The preferred data format for HDFS-based tables is Apache Parquet, and
> Apache Avro is also a supported data format.
>   * Impala is closely integrated with Kudu, which is also being proposed to
> the Incubator.
>   * Impala uses Apache Thrift as its RPC and serialization framework of
> choice.
>
> = Known Risks =
>
> == Orphaned Products ==
> Impala is used by most of Cloudera’s customers, and Cloudera remains
> committed to developing and supporting the project. Cloudera has a strong
> track record in standing behind projects that were contributed to the ASF
> by its employees, including Apache Flume, Apache Sqoop, and others. Other
> companies both ship and support Impala, lending credence to the idea that
> Impala is not at risk of being suddenly orphaned.
>
> == Inexperience with Open Source ==
> Although all committers on the initial list have significant experience
> with at least one open-source project - namely Impala - fewer have much
> experience with ASF-based software projects as contributors and community
> members. However, with the guidance of our mentors, committers who do have
> ASF experience, and time to learn during Incubation, we are confident that
> the project can be run in accordance with Apache principles on an ongoing
> basis.
>
> == Homogeneous Developers ==
>
> The initial committers are employees of Cloudera.
>
> The project has received some contributions from developers outside of
> Cloudera, from individuals belonging to organizations such as Intel and
> Google, from hobbyists and from students using Impala to advance their
> understanding of distributed databases. The project attracted an active
> user community as well. We hope to continue to encourage contributions from
> these developers and community members and grow them into committers after
> they have had time to continue their contributions.
>
> == Reliance on Salaried Developers ==
>
> Many of Impala’s initial set of committers work full-time on Impala, and
> are paid to do so. However, as mentioned elsewhere, we anticipate growth in
> the developer community which we hope will include hobbyists and academics
> who have an interest in distributed data systems.
>
> == An Excessive Fascination with the Apache Brand ==
> Although we hope that Impala benefits from the Apache Brand, any reflected
> goodwill to Cloudera as the contributing entity is not the goal of
> establishing Impala as an Apache project. We will work with the Incubator
> PMC and the PRC to ensure that the Apache Brand is respected.
>
> = Documentation =
> Impala: A Modern, Open-Source SQL Engine for Hadoop (
> http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf)
>
> Impala’s developer wiki (https://github.com/cloudera/Impala/wiki)
>
> Impala’s auto-generated API documentation (
> http://impala.io/doc/html/index.html)
>
> = Initial Source =
> Impala’s initial source contribution will come from
> http://github.com/cloudera/Impala/.
>
> = External Dependencies =
>
> Impala depends upon a number of third-party libraries, which we list below.
> We intend to compile a LICENSE.txt file in the very short term (see
> https://issues.cloudera.org/browse/IMPALA-2670).
>
>   * Google gflags (BSD)
>   * Google glog (BSD)
>   * Apache Thrift (Apache Software License v2.0)
>   * Apache Commons (Apache Software License v2.0)
>   * Apache Hadoop (Apache Software License v2.0)
>   * Apache HBase (Apache Software License v2.0)
>   * Apache Hive (Apache Software License v2.0)
>   * Boost (Boost Software License)
>   * OpenLdap (OpenLDAP Software License)
>   * rapidjson (MIT)
>   * Google RE2 (BSD-style)
>   * lz4 (BSD)
>   * snappy (BSD)
>   * cyrus-sasl (CMU License)
>   * Apache Avro (Apache Software License v2.0)
>   * Cloudera squeasel (Apache Software License v2.0)
>   * Apache htrace (Incubating) (Apache Software License v2.0)
>   * Apache Sentry (Incubating) (Apache Software License v2.0)
>   * Apache Shiro (Apache Software License v2.0)
>   * Twitter Bootstrap (Apache Software License v2.0)
>   * d3 (BSD)
>   * LLVM (BSD-like)
>
> Build and test dependencies:
>
>   * ant (Apache Software License v2.0)
>   * Apache Maven (Apache Software License v2.0)
>   * cmake (BSD)
>   * clang (BSD)
>   * Google gtest (Apache Software License v2.0)
>
> = Required Resources =
>
> We request that the following resources be created for the project to use:
>
> == Mailing lists ==
>
>   * private@impala.incubator.apache.org (moderated subscriptions)
>   * commits@impala.incubator.apache.org
>   * dev@impala.incubator.apache.org
>   * issues@impala.incubator.apache.org
>   * user@impala.incubator.apache.org
>
> == Git repository ==
> https://git.apache.org/impala.git
>
> == JIRA instance ==
> JIRA project IMPALA (IMPALA or IMP)
>
> == Other Resources ==
> We hope to continue using Gerrit for our code review and commit workflow.
> We are involved with discussions that the Kudu team at Cloudera have been
> having with Jake Farrell to start discussions on how Gerrit can fit into
> the ASF. We know that several other ASF projects or podlings are also
> interested in Gerrit.
>
> If the Infrastructure team does not have the bandwidth to support gerrit,
> we will continue to support our own instance of gerrit for Impala, and make
> the necessary integrations such that commits are properly authenticated and
> maintain sufficient provenance to uphold the ASF standards (e.g. via the
> solution adopted by the AsterixDB podling).
>
> = Initial Committers =
>
>   * Tim Armstrong
>   * Alex Behm
>   * Taras Bobrovytsky
>   * Casey Ching
>   * Martin Grund
>   * Daniel Hecht
>   * Michael Ho
>   * Matthew Jacobs
>   * Ishaan Joshi
>   * Lenni Kuff
>   * Marcel Kornacker
>   * Sailesh Mukil
>   * Henry Robinson
>   * John Russell
>   * Dimitris Tsirogiannis
>   * Skye Wanderman-Milne
>   * Juan Yu
>
> == Affiliations ==
> All: Cloudera Inc.
>
> = Sponsors =
>
> == Champion ==
> Tom White
>
> == Nominated Mentors ==
>   * Tom White (Cloudera)
>   * Todd Lipcon (Cloudera)
>   * Carl Steinbach (LinkedIn)
>   * Brock Noland (StreamSets)
>
>
> = Sponsoring Entity =
> We ask that the Incubator PMC sponsor this proposal.
>

-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org


Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
[X] +1, accept Impala into the Incubator

(Binding)

Jarcec





Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Hitesh Shah <hi...@apache.org>.
+1 (binding)

— Hitesh

On Nov 24, 2015, at 1:03 PM, Henry Robinson <he...@cloudera.com> wrote:

> Hi -
> 
> The [DISCUSS] thread has been quiet for a few days, so I think there's been
> sufficient opportunity for discussion around our proposal to bring Impala
> to the ASF Incubator.
> 
> I'd like to call a VOTE on that proposal, which is on the wiki at
> https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
> below.
> 
> During the discussion period, the proposal has been amended to add Brock
> Noland as a new mentor, to add one missed committer from the list and to
> correct some issues with the dependency list.
> 
> Please cast your votes as follows:
> 
> [] +1, accept Impala into the Incubator
> [] +/-0, non-counted vote to express a disposition
> [] -1, do not accept Impala into the Incubator (please give your reason(s))
> 
> As with the concurrent Kudu vote, I propose leaving the vote open for a
> full seven days (to close at Tuesday, December 1st at noon PST), due to the
> upcoming US holiday.
> 
> Thanks,
> Henry
> 
> --------
> 
> = Abstract =
> Impala is a high-performance C++ and Java SQL query engine for data stored
> in Apache Hadoop-based clusters.
> 
> = Proposal =
> 
> We propose to contribute the Impala codebase and associated artifacts (e.g.
> documentation, web-site content etc.) to the Apache Software Foundation
> with the intent of forming a productive, meritocratic and open community
> around Impala’s continued development, according to the ‘Apache Way’.
> 
> Cloudera owns several trademarks regarding Impala, and proposes to transfer
> ownership of those trademarks in full to the ASF.
> 
> = Background =
> Engineers at Cloudera developed Impala and released it as an
> Apache-licensed open-source project in Fall 2012. Impala was written as a
> brand-new, modern C++ SQL engine targeted from the start for data stored in
> Apache Hadoop clusters.
> 
> Impala’s most important benefit to users is high-performance, making it
> extremely appropriate for common enterprise analytic and business
> intelligence workloads. This is achieved by a number of software
> techniques, including: native support for data stored in HDFS and related
> filesystems, just-in-time compilation and optimization of individual query
> plans, high-performance C++ codebase and massively-parallel distributed
> architecture. In benchmarks, Impala is routinely amongst the very highest
> performing SQL query engines.
> 
> = Rationale =
> 
> Despite the exciting innovation in the so-called ‘big-data’ space, SQL
> remains by far the most common interface for interacting with data in both
> traditional warehouses and modern ‘big-data’ clusters. There is clearly a
> need, as evidenced by the eager adoption of Impala and other SQL engines in
> enterprise contexts, for a query engine that offers the familiar SQL
> interface, but that has been specifically designed to operate in massive,
> distributed clusters rather than in traditional, fixed-hardware,
> warehouse-specific deployments. Impala is one such query engine.
> 
> We believe that the ASF is the right venue to foster an open-source
> community around Impala’s development. We expect that Impala will benefit
> from more productive collaboration with related Apache projects, and under
> the auspices of the ASF will attract talented contributors who will push
> Impala’s development forward at pace.
> 
> We believe that the timing is right for Impala’s development to move
> wholesale to the ASF: Impala is well-established, has been Apache-licensed
> open-source for more than three years, and the core project is relatively
> stable. We are excited to see where an ASF-based community can take Impala
> from this strong starting point.
> 
> = Initial Goals =
> Our initial goals are as follows:
> 
> * Establish ASF-compatible engineering practices and workflows
> * Refactor and publish existing internal build scripts and test
> infrastructure, in order to make them usable by any community member.
> * Transfer source code, documentation and associated artifacts to the ASF.
> * Grow the user and developer communities
> 
> = Current Status =
> 
> Impala is developed as an Apache-licensed open-source project. The source
> code is available at http://github.com/cloudera/Impala, and developer
> documentation is at https://github.com/cloudera/Impala/wiki. The majority
> of commits to the project have come from Cloudera-employed developers, but
> we have accepted some contributions from individuals from other
> organizations.
> 
> All code reviews are done via a public instance of the Gerrit review tool
> at http://gerrit.cloudera.org:8080/, and discussed on a public mailing
> list. All patches must be reviewed before they are accepted into the
> codebase, via a voting mechanism that is similar to that used on Apache
> projects such as Hadoop and HBase.
> 
> Before a patch is committed, it must pass a suite of pre-commit tests.
> These tests are currently run on Cloudera’s internal infrastructure. One of
> our initial goals will be to work with the ASF Infrastructure team to find
> a way to run these tests in an acceptable way on publicly accessible
> machines.
> 
> Issues are tracked in JIRA at https://issues.cloudera.org/projects/IMPALA,
> in a way that is extremely similar to existing practices at other ASF
> projects.
> 
> = Meritocracy =
> 
> We understand the central importance of meritocracy to the Apache Way. We
> will work to establish a welcoming, fair and meritocratic community, in
> part by expanding the set of committers on the project. Although Impala’s
> committer list will initially be dominated by members of the Impala
> engineering team at Cloudera, we look forward to growing a rich user and
> developer community.
> 
> = Community =
> Impala has a strong user community (see
> https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user), and a
> growing developer community (see
> https://groups.google.com/a/cloudera.org/forum/#!forum/impala-dev). We wish
> to attract more developers to the project, and we believe that the ASF’s
> open and meritocratic philosophy will help us with this. We note the
> success of other, similar projects already part of the ASF.
> 
> = Core Developers =
> Most - but not all - of Impala’s core developers are not currently
> affiliated with the ASF, and will require new ICLAs.
> 
> = Alignment =
> Impala is related to several other Apache projects:
> 
> * Data that is read by Impala is very often stored in Apache Hadoop
> clusters powered by the HDFS filesystem.
> * Impala can also read data stored in Apache HBase
> * Metadata for databases, tables and so on is read by Impala from Apache
> Hive.
> * The preferred data format for HDFS-based tables is Apache Parquet, and
> Apache Avro is also a supported data format.
> * Impala is closely integrated with Kudu, which is also being proposed to
> the Incubator.
> * Impala uses Apache Thrift as its RPC and serialization framework of
> choice.
> 
> = Known Risks =
> 
> == Orphaned Products ==
> Impala is used by most of Cloudera’s customers, and Cloudera remains
> committed to developing and supporting the project. Cloudera has a strong
> track record in standing behind projects that were contributed to the ASF
> by its employees, including Apache Flume, Apache Sqoop, and others. Other
> companies both ship and support Impala, lending credence to the idea that
> Impala is not at risk of being suddenly orphaned.
> 
> == Inexperience with Open Source ==
> Although all committers on the initial list have significant experience
> with at least one open-source project - namely Impala - fewer have much
> experience with ASF-based software projects as contributors and community
> members. However, with the guidance of our mentors, committers who do have
> ASF experience, and time to learn during Incubation, we are confident that
> the project can be run in accordance with Apache principles on an ongoing
> basis.
> 
> == Homogeneous Developers ==
> 
> The initial committers are employees of Cloudera.
> 
> The project has received some contributions from developers outside of
> Cloudera, from individuals belonging to organizations such as Intel and
> Google, from hobbyists and from students using Impala to advance their
> understanding of distributed databases. The project attracted an active
> user community as well. We hope to continue to encourage contributions from
> these developers and community members and grow them into committers after
> they have had time to continue their contributions.
> 
> == Reliance on Salaried Developers ==
> 
> Many of Impala’s initial set of committers work full-time on Impala, and
> are paid to do so. However, as mentioned elsewhere, we anticipate growth in
> the developer community which we hope will include hobbyists and academics
> who have an interest in distributed data systems.
> 
> == An Excessive Fascination with the Apache Brand ==
> Although we hope that Impala benefits from the Apache Brand, any reflected
> goodwill to Cloudera as the contributing entity is not the goal of
> establishing Impala as an Apache project. We will work with the Incubator
> PMC and the PRC to ensure that the Apache Brand is respected.
> 
> = Documentation =
> Impala: A Modern, Open-Source SQL Engine for Hadoop (
> http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf)
> 
> Impala’s developer wiki (https://github.com/cloudera/Impala/wiki)
> 
> Impala’s auto-generated API documentation (
> http://impala.io/doc/html/index.html)
> 
> = Initial Source =
> Impala’s initial source contribution will come from
> http://github.com/cloudera/Impala/.
> 
> = External Dependencies =
> 
> Impala depends upon a number of third-party libraries, which we list below.
> We intend to compile a LICENSE.txt file in the very short term (see
> https://issues.cloudera.org/browse/IMPALA-2670).
> 
> * Google gflags (BSD)
> * Google glog (BSD)
> * Apache Thrift (Apache Software License v2.0)
> * Apache Commons (Apache Software License v2.0)
> * Apache Hadoop (Apache Software License v2.0)
> * Apache HBase (Apache Software License v2.0)
> * Apache Hive (Apache Software License v2.0)
> * Boost (Boost Software License)
> * OpenLdap (OpenLDAP Software License)
> * rapidjson (MIT)
> * Google RE2 (BSD-style)
> * lz4 (BSD)
> * snappy (BSD)
> * cyrus-sasl (CMU License)
> * Apache Avro (Apache Software License v2.0)
> * Cloudera squeasel (Apache Software License v2.0)
> * Apache htrace (Incubating) (Apache Software License v2.0)
> * Apache Sentry (Incubating) (Apache Software License v2.0)
> * Apache Shiro (Apache Software License v2.0)
> * Twitter Bootstrap (Apache Software License v2.0)
> * d3 (BSD)
> * LLVM (BSD-like)
> 
> Build and test dependencies:
> 
> * ant (Apache Software License v2.0)
> * Apache Maven (Apache Software License v2.0)
> * cmake (BSD)
> * clang (BSD)
> * Google gtest (Apache Software License v2.0)
> 
> = Required Resources =
> 
> We request that the following resources be created for the project to use:
> 
> == Mailing lists ==
> 
> * private@impala.incubator.apache.org (moderated subscriptions)
> * commits@impala.incubator.apache.org
> * dev@impala.incubator.apache.org
> * issues@impala.incubator.apache.org
> * user@impala.incubator.apache.org
> 
> == Git repository ==
> https://git.apache.org/impala.git
> 
> == JIRA instance ==
> JIRA project IMPALA (IMPALA or IMP)
> 
> == Other Resources ==
> We hope to continue using Gerrit for our code review and commit workflow.
> We are involved in the discussions that the Kudu team at Cloudera has been
> having with Jake Farrell about how Gerrit can fit into the ASF. We know
> that several other ASF projects or podlings are also interested in Gerrit.
> 
> If the Infrastructure team does not have the bandwidth to support Gerrit,
> we will continue to support our own instance of Gerrit for Impala, and make
> the necessary integrations such that commits are properly authenticated and
> maintain sufficient provenance to uphold the ASF standards (e.g. via the
> solution adopted by the AsterixDB podling).
> 
> = Initial Committers =
> 
> * Tim Armstrong
> * Alex Behm
> * Taras Bobrovytsky
> * Casey Ching
> * Martin Grund
> * Daniel Hecht
> * Michael Ho
> * Matthew Jacobs
> * Ishaan Joshi
> * Lenni Kuff
> * Marcel Kornacker
> * Sailesh Mukil
> * Henry Robinson
> * John Russell
> * Dimitris Tsirogiannis
> * Skye Wanderman-Milne
> * Juan Yu
> 
> == Affiliations ==
> All: Cloudera Inc.
> 
> = Sponsors =
> 
> == Champion ==
> Tom White
> 
> == Nominated Mentors ==
> * Tom White (Cloudera)
> * Todd Lipcon (Cloudera)
> * Carl Steinbach (LinkedIn)
> * Brock Noland (StreamSets)
> 
> 
> = Sponsoring Entity =
> We ask that the Incubator PMC sponsor this proposal.


---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org


Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Julian Hyde <jh...@apache.org>.
+1 (binding)

> On Nov 26, 2015, at 11:50 AM, Konstantin Boudnik <co...@apache.org> wrote:
> 
> Come to think of it a bit more, yes I am not satisfied with the outcome of
> the CTR/RTC exchange in the project.
> 
> Hence changing my vote to
> -1 [binding]
> 
> On Thu, Nov 26, 2015 at 11:47AM, Konstantin Boudnik wrote:
>> -0 [binding]
>> 
>> On Tue, Nov 24, 2015 at 01:03PM, Henry Robinson wrote:
>>> Hi -
>>> 
>>> The [DISCUSS] thread has been quiet for a few days, so I think there's been
>>> sufficient opportunity for discussion around our proposal to bring Impala
>>> to the ASF Incubator.
>>> 
>>> I'd like to call a VOTE on that proposal, which is on the wiki at
>>> https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
>>> below.
>>> 
>>> During the discussion period, the proposal has been amended to add Brock
>>> Noland as a new mentor, to add one missed committer from the list and to
>>> correct some issues with the dependency list.
>>> 
>>> Please cast your votes as follows:
>>> 
>>> [] +1, accept Impala into the Incubator
>>> [] +/-0, non-counted vote to express a disposition
>>> [] -1, do not accept Impala into the Incubator (please give your reason(s))
>>> 
>>> As with the concurrent Kudu vote, I propose leaving the vote open for a
>>> full seven days (to close at Tuesday, December 1st at noon PST), due to the
>>> upcoming US holiday.
>>> 
>>> Thanks,
>>> Henry
> 
> 



Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Andrew Bayer <an...@gmail.com>.
+1 binding

On Thursday, November 26, 2015, Ted Dunning <te...@gmail.com> wrote:

> +1 binding
>
>
>
> On Fri, Nov 27, 2015 at 6:50 AM, Konstantin Boudnik <cos@apache.org> wrote:
>
> > Come to think of it a bit more, yes I am not satisfied with the outcome of
> > the CTR/RTC exchange in the project.
> >
> > Hence changing my vote to
> >  -1 [binding]
> >
> > On Thu, Nov 26, 2015 at 11:47AM, Konstantin Boudnik wrote:
> > > -0 [binding]
> > >
> > > On Tue, Nov 24, 2015 at 01:03PM, Henry Robinson wrote:
> > > > Hi -
> > > >
> > > > The [DISCUSS] thread has been quiet for a few days, so I think there's been
> > > > sufficient opportunity for discussion around our proposal to bring Impala
> > > > to the ASF Incubator.
> > > >
> > > > I'd like to call a VOTE on that proposal, which is on the wiki at
> > > > https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
> > > > below.
> > > >
> > > > During the discussion period, the proposal has been amended to add Brock
> > > > Noland as a new mentor, to add one missed committer from the list and to
> > > > correct some issues with the dependency list.
> > > >
> > > > Please cast your votes as follows:
> > > >
> > > > [] +1, accept Impala into the Incubator
> > > > [] +/-0, non-counted vote to express a disposition
> > > > [] -1, do not accept Impala into the Incubator (please give your reason(s))
> > > >
> > > > As with the concurrent Kudu vote, I propose leaving the vote open for a
> > > > full seven days (to close at Tuesday, December 1st at noon PST), due to the
> > > > upcoming US holiday.
> > > >
> > > > Thanks,
> > > > Henry
> >
> >
> >
>

Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Ted Dunning <te...@gmail.com>.
+1 binding



On Fri, Nov 27, 2015 at 6:50 AM, Konstantin Boudnik <co...@apache.org> wrote:

> Come to think of it a bit more, yes I am not satisfied with the outcome of
> the CTR/RTC exchange in the project.
>
> Hence changing my vote to
>  -1 [binding]
>
> On Thu, Nov 26, 2015 at 11:47AM, Konstantin Boudnik wrote:
> > -0 [binding]

Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Joe Witt <jo...@gmail.com>.
+1 (non-binding)

On Thu, Nov 26, 2015 at 2:50 PM, Konstantin Boudnik <co...@apache.org> wrote:
> Come to think of it a bit more, yes I am not satisfied with the outcome of
> the CTR/RTC exchange in the project.
>
> Hence changing my vote to
>  -1 [binding]
>
> On Thu, Nov 26, 2015 at 11:47AM, Konstantin Boudnik wrote:
>> -0 [binding]



Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Konstantin Boudnik <co...@apache.org>.
Come to think of it a bit more, yes I am not satisfied with the outcome of
the CTR/RTC exchange in the project.

Hence changing my vote to
 -1 [binding]

On Thu, Nov 26, 2015 at 11:47AM, Konstantin Boudnik wrote:
> -0 [binding]
> 
> On Tue, Nov 24, 2015 at 01:03PM, Henry Robinson wrote:
> > Hi -
> > 
> > The [DISCUSS] thread has been quiet for a few days, so I think there's been
> > sufficient opportunity for discussion around our proposal to bring Impala
> > to the ASF Incubator.
> > 
> > I'd like to call a VOTE on that proposal, which is on the wiki at
> > https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
> > below.
> > 
> > During the discussion period, the proposal has been amended to add Brock
> > Noland as a new mentor, to add one missed committer from the list and to
> > correct some issues with the dependency list.
> > 
> > Please cast your votes as follows:
> > 
> > [] +1, accept Impala into the Incubator
> > [] +/-0, non-counted vote to express a disposition
> > [] -1, do not accept Impala into the Incubator (please give your reason(s))
> > 
> > As with the concurrent Kudu vote, I propose leaving the vote open for a
> > full seven days (to close at Tuesday, December 1st at noon PST), due to the
> > upcoming US holiday.
> > 
> > Thanks,
> > Henry
> > 
> > --------
> > 
> > = Abstract =
> > Impala is a high-performance C++ and Java SQL query engine for data stored
> > in Apache Hadoop-based clusters.
> > 
> > = Proposal =
> > 
> > We propose to contribute the Impala codebase and associated artifacts (e.g.
> > documentation, web-site content etc.) to the Apache Software Foundation
> > with the intent of forming a productive, meritocratic and open community
> > around Impala’s continued development, according to the ‘Apache Way’.
> > 
> > Cloudera owns several trademarks regarding Impala, and proposes to transfer
> > ownership of those trademarks in full to the ASF.
> > 
> > = Background =
> > Engineers at Cloudera developed Impala and released it as an
> > Apache-licensed open-source project in Fall 2012. Impala was written as a
> > brand-new, modern C++ SQL engine targeted from the start for data stored in
> > Apache Hadoop clusters.
> > 
> > Impala’s most important benefit to users is high-performance, making it
> > extremely appropriate for common enterprise analytic and business
> > intelligence workloads. This is achieved by a number of software
> > techniques, including: native support for data stored in HDFS and related
> > filesystems, just-in-time compilation and optimization of individual query
> > plans, high-performance C++ codebase and massively-parallel distributed
> > architecture. In benchmarks, Impala is routinely amongst the very highest
> > performing SQL query engines.
> > 
> > = Rationale =
> > 
> > Despite the exciting innovation in the so-called ‘big-data’ space, SQL
> > remains by far the most common interface for interacting with data in both
> > traditional warehouses and modern ‘big-data’ clusters. There is clearly a
> > need, as evidenced by the eager adoption of Impala and other SQL engines in
> > enterprise contexts, for a query engine that offers the familiar SQL
> > interface, but that has been specifically designed to operate in massive,
> > distributed clusters rather than in traditional, fixed-hardware,
> > warehouse-specific deployments. Impala is one such query engine.
> > 
> > We believe that the ASF is the right venue to foster an open-source
> > community around Impala’s development. We expect that Impala will benefit
> > from more productive collaboration with related Apache projects, and under
> > the auspices of the ASF will attract talented contributors who will push
> > Impala’s development forward at pace.
> > 
> > We believe that the timing is right for Impala’s development to move
> > wholesale to the ASF: Impala is well-established, has been Apache-licensed
> > open-source for more than three years, and the core project is relatively
> > stable. We are excited to see where an ASF-based community can take Impala
> > from this strong starting point.
> > 
> > = Initial Goals =
> > Our initial goals are as follows:
> > 
> >  * Establish ASF-compatible engineering practices and workflows
> >  * Refactor and publish existing internal build scripts and test
> > infrastructure, in order to make them usable by any community member.
> >  * Transfer source code, documentation and associated artifacts to the ASF.
> >  * Grow the user and developer communities
> > 
> > = Current Status =
> > 
> > Impala is developed as an Apache-licensed open-source project. The source
> > code is available at http://github.com/cloudera/Impala, and developer
> > documentation is at https://github.com/cloudera/Impala/wiki. The majority
> > of commits to the project have come from Cloudera-employed developers, but
> > we have accepted some contributions from individuals from other
> > organizations.
> > 
> > All code reviews are done via a public instance of the Gerrit review tool
> > at http://gerrit.cloudera.org:8080/, and discussed on a public mailing
> > list. All patches must be reviewed before they are accepted into the
> > codebase, via a voting mechanism that is similar to that used on Apache
> > projects such as Hadoop and HBase.
> > 
> > Before a patch is committed, it must pass a suite of pre-commit tests.
> > These tests are currently run on Cloudera’s internal infrastructure. One of
> > our initial goals will be to work with the ASF Infrastructure team to find
> > a way to run these tests in an acceptable way on publicly accessible
> > machines.
> > 
> > Issues are tracked in JIRA at https://issues.cloudera.org/projects/IMPALA,
> > in a way that is extremely similar to existing practices at other ASF
> > projects.
> > 
> > = Meritocracy =
> > 
> > We understand the central importance of meritocracy to the Apache Way. We
> > will work to establish a welcoming, fair and meritocratic community, in
> > part by expanding the set of committers on the project. Although Impala’s
> > committer list will initially be dominated by members of the Impala
> > engineering team at Cloudera, we look forward to growing a rich user and
> > developer community.
> > 
> > = Community =
> > Impala has a strong user community (see
> > https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user), and a
> > growing developer community (see
> > https://groups.google.com/a/cloudera.org/forum/#!forum/impala-dev). We wish
> > to attract more developers to the project, and we believe that the ASF’s
> > open and meritocratic philosophy will help us with this. We note the
> > success of other, similar projects already part of the ASF.
> > 
> > = Core Developers =
> > Most - but not all - of Impala’s core developers are not currently
> > affiliated with the ASF, and will require new ICLAs.
> > 
> > = Alignment =
> > Impala is related to several other Apache projects:
> > 
> >  * Data that is read by Impala is very often stored in Apache Hadoop
> > clusters powered by the HDFS filesystem.
> >  * Impala can also read data stored in Apache HBase
> >  * Metadata for databases, tables and so on is read by Impala from Apache
> > Hive.
> >  * The preferred data format for HDFS-based tables is Apache Parquet, and
> > Apache Avro is also a supported data format.
> >  * Impala is closely integrated with Kudu, which is also being proposed to
> > the Incubator.
> >  * Impala uses Apache Thrift as its RPC and serialization framework of
> > choice.
> > 
> > = Known Risks =
> > 
> > == Orphaned Products ==
> > Impala is used by most of Cloudera’s customers, and Cloudera remains
> > committed to developing and supporting the project. Cloudera has a strong
> > track record in standing behind projects that were contributed to the ASF
> > by its employees, including Apache Flume, Apache Sqoop, and others. Other
> > companies both ship and support Impala, lending credence to the idea that
> > Impala is not at risk of being suddenly orphaned.
> > 
> > == Inexperience with Open Source ==
> > Although all committers on the initial list have significant experience
> > with at least one open-source project - namely Impala - fewer have much
> > experience with ASF-based software projects as contributors and community
> > members. However, with the guidance of our mentors, committers who do have
> > ASF experience, and time to learn during Incubation, we are confident that
> > the project can be run in accordance with Apache principles on an ongoing
> > basis.
> > 
> > == Homogeneous Developers ==
> > 
> > The initial committers are employees of Cloudera.
> > 
> > The project has received some contributions from developers outside of
> > Cloudera, from individuals belonging to organizations such as Intel and
> > Google, from hobbyists and from students using Impala to advance their
> > understanding of distributed databases. The project attracted an active
> > user community as well. We hope to continue to encourage contributions from
> > these developers and community members and grow them into committers after
> > they have had time to continue their contributions.
> > 
> > == Reliance on Salaried Developers ==
> > 
> > Many of Impala’s initial set of committers work full-time on Impala, and
> > are paid to do so. However, as mentioned elsewhere, we anticipate growth in
> > the developer community which we hope will include hobbyists and academics
> > who have an interest in distributed data systems.
> > 
> > == An Excessive Fascination with the Apache Brand ==
> > Although we hope that Impala benefits from the Apache Brand, any reflected
> > goodwill to Cloudera as the contributing entity is not the goal of
> > establishing Impala as an Apache project. We will work with the Incubator
> > PMC and the PRC to ensure that the Apache Brand is respected.
> > 
> > = Documentation =
> > Impala: A Modern, Open-Source SQL Engine for Hadoop (
> > http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf)
> > 
> > Impala’s developer wiki (https://github.com/cloudera/Impala/wiki)
> > 
> > Impala’s auto-generated API documentation (
> > http://impala.io/doc/html/index.html)
> > 
> > = Initial Source =
> > Impala’s initial source contribution will come from
> > http://github.com/cloudera/Impala/.
> > 
> > = External Dependencies =
> > 
> > Impala depends upon a number of third-party libraries, which we list below.
> > We intend to compile a LICENSE.txt file in the very short term (see
> > https://issues.cloudera.org/browse/IMPALA-2670).
> > 
> >  * Google gflags (BSD)
> >  * Google glog (BSD)
> >  * Apache Thrift (Apache Software License v2.0)
> >  * Apache Commons (Apache Software License v2.0)
> >  * Apache Hadoop (Apache Software License v2.0)
> >  * Apache HBase (Apache Software License v2.0)
> >  * Apache Hive (Apache Software License v2.0)
> >  * Boost (Boost Software License)
> >  * OpenLdap (OpenLDAP Software License)
> >  * rapidjson (MIT)
> >  * Google RE2 (BSD-style)
> >  * lz4 (BSD)
> >  * snappy (BSD)
> >  * cyrus-sasl (CMU License)
> >  * Apache Avro (Apache Software License v2.0)
> >  * Cloudera squeasel (Apache Software License v2.0)
> >  * Apache htrace (Incubating) (Apache Software License v2.0)
> >  * Apache Sentry (Incubating) (Apache Software License v2.0)
> >  * Apache Shiro (Apache Software License v2.0)
> >  * Twitter Bootstrap (Apache Software License v2.0)
> >  * d3 (BSD)
> >  * LLVM (BSD-like)
> > 
> > Build and test dependencies:
> > 
> >  * ant (Apache Software License v2.0)
> >  * Apache Maven (Apache Software License v2.0)
> >  * cmake (BSD)
> >  * clang (BSD)
> >  * Google gtest (Apache Software License v2.0)
> > 
> > = Required Resources =
> > 
> > We request that the following resources be created for the project to use:
> > 
> > == Mailing lists ==
> > 
> >  * private@impala.incubator.apache.org (moderated subscriptions)
> >  * commits@impala.incubator.apache.org
> >  * dev@impala.incubator.apache.org
> >  * issues@impala.incubator.apache.org
> >  * user@impala.incubator.apache.org
> > 
> > == Git repository ==
> > https://git.apache.org/impala.git
> > 
> > == JIRA instance ==
> > JIRA project IMPALA (IMPALA or IMP)
> > 
> > == Other Resources ==
> > We hope to continue using Gerrit for our code review and commit workflow.
> > We are involved with discussions that the Kudu team at Cloudera have been
> > having with Jake Farrell to start discussions on how Gerrit can fit into
> > the ASF. We know that several other ASF projects or podlings are also
> > interested in Gerrit.
> > 
> > If the Infrastructure team does not have the bandwidth to support gerrit,
> > we will continue to support our own instance of gerrit for Impala, and make
> > the necessary integrations such that commits are properly authenticated and
> > maintain sufficient provenance to uphold the ASF standards (e.g. via the
> > solution adopted by the AsterixDB podling).
> > 
> > = Initial Committers =
> > 
> >  * Tim Armstrong
> >  * Alex Behm
> >  * Taras Bobrovytsky
> >  * Casey Ching
> >  * Martin Grund
> >  * Daniel Hecht
> >  * Michael Ho
> >  * Matthew Jacobs
> >  * Ishaan Joshi
> >  * Lenni Kuff
> >  * Marcel Kornacker
> >  * Sailesh Mukil
> >  * Henry Robinson
> >  * John Russell
> >  * Dimitris Tsirogiannis
> >  * Skye Wanderman-Milne
> >  * Juan Yu
> > 
> > == Affiliations ==
> > All: Cloudera Inc.
> > 
> > = Sponsors =
> > 
> > == Champion ==
> > Tom White
> > 
> > == Nominated Mentors ==
> >  * Tom White (Cloudera)
> >  * Todd Lipcon (Cloudera)
> >  * Carl Steinbach (LinkedIn)
> >  * Brock Noland (StreamSets)
> > 
> > 
> > = Sponsoring Entity =
> > We ask that the Incubator PMC sponsor this proposal.



Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Konstantin Boudnik <co...@apache.org>.
-0 [binding]

On Tue, Nov 24, 2015 at 01:03PM, Henry Robinson wrote:
> Hi -
> 
> The [DISCUSS] thread has been quiet for a few days, so I think there's been
> sufficient opportunity for discussion around our proposal to bring Impala
> to the ASF Incubator.
> 
> I'd like to call a VOTE on that proposal, which is on the wiki at
> https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
> below.
> 
> During the discussion period, the proposal has been amended to add Brock
> Noland as a new mentor, to add one missed committer from the list and to
> correct some issues with the dependency list.
> 
> Please cast your votes as follows:
> 
> [] +1, accept Impala into the Incubator
> [] +/-0, non-counted vote to express a disposition
> [] -1, do not accept Impala into the Incubator (please give your reason(s))
> 
> As with the concurrent Kudu vote, I propose leaving the vote open for a
> full seven days (to close at Tuesday, December 1st at noon PST), due to the
> upcoming US holiday.
> 
> Thanks,
> Henry
> 
> --------
> 
> = Abstract =
> Impala is a high-performance C++ and Java SQL query engine for data stored
> in Apache Hadoop-based clusters.
> 
> = Proposal =
> 
> We propose to contribute the Impala codebase and associated artifacts (e.g.
> documentation, web-site content etc.) to the Apache Software Foundation
> with the intent of forming a productive, meritocratic and open community
> around Impala’s continued development, according to the ‘Apache Way’.
> 
> Cloudera owns several trademarks regarding Impala, and proposes to transfer
> ownership of those trademarks in full to the ASF.
> 
> = Background =
> Engineers at Cloudera developed Impala and released it as an
> Apache-licensed open-source project in Fall 2012. Impala was written as a
> brand-new, modern C++ SQL engine targeted from the start for data stored in
> Apache Hadoop clusters.
> 
> Impala’s most important benefit to users is high-performance, making it
> extremely appropriate for common enterprise analytic and business
> intelligence workloads. This is achieved by a number of software
> techniques, including: native support for data stored in HDFS and related
> filesystems, just-in-time compilation and optimization of individual query
> plans, high-performance C++ codebase and massively-parallel distributed
> architecture. In benchmarks, Impala is routinely amongst the very highest
> performing SQL query engines.
> 
> = Rationale =
> 
> Despite the exciting innovation in the so-called ‘big-data’ space, SQL
> remains by far the most common interface for interacting with data in both
> traditional warehouses and modern ‘big-data’ clusters. There is clearly a
> need, as evidenced by the eager adoption of Impala and other SQL engines in
> enterprise contexts, for a query engine that offers the familiar SQL
> interface, but that has been specifically designed to operate in massive,
> distributed clusters rather than in traditional, fixed-hardware,
> warehouse-specific deployments. Impala is one such query engine.
> 
> We believe that the ASF is the right venue to foster an open-source
> community around Impala’s development. We expect that Impala will benefit
> from more productive collaboration with related Apache projects, and under
> the auspices of the ASF will attract talented contributors who will push
> Impala’s development forward at pace.
> 
> We believe that the timing is right for Impala’s development to move
> wholesale to the ASF: Impala is well-established, has been Apache-licensed
> open-source for more than three years, and the core project is relatively
> stable. We are excited to see where an ASF-based community can take Impala
> from this strong starting point.
> 
> = Initial Goals =
> Our initial goals are as follows:
> 
>  * Establish ASF-compatible engineering practices and workflows
>  * Refactor and publish existing internal build scripts and test
> infrastructure, in order to make them usable by any community member.
>  * Transfer source code, documentation and associated artifacts to the ASF.
>  * Grow the user and developer communities
> 
> = Current Status =
> 
> Impala is developed as an Apache-licensed open-source project. The source
> code is available at http://github.com/cloudera/Impala, and developer
> documentation is at https://github.com/cloudera/Impala/wiki. The majority
> of commits to the project have come from Cloudera-employed developers, but
> we have accepted some contributions from individuals from other
> organizations.
> 
> All code reviews are done via a public instance of the Gerrit review tool
> at http://gerrit.cloudera.org:8080/, and discussed on a public mailing
> list. All patches must be reviewed before they are accepted into the
> codebase, via a voting mechanism that is similar to that used on Apache
> projects such as Hadoop and HBase.
> 
> Before a patch is committed, it must pass a suite of pre-commit tests.
> These tests are currently run on Cloudera’s internal infrastructure. One of
> our initial goals will be to work with the ASF Infrastructure team to find
> a way to run these tests in an acceptable way on publicly accessible
> machines.
> 
> Issues are tracked in JIRA at https://issues.cloudera.org/projects/IMPALA,
> in a way that is extremely similar to existing practices at other ASF
> projects.
> 
> = Meritocracy =
> 
> We understand the central importance of meritocracy to the Apache Way. We
> will work to establish a welcoming, fair and meritocratic community, in
> part by expanding the set of committers on the project. Although Impala’s
> committer list will initially be dominated by members of the Impala
> engineering team at Cloudera, we look forward to growing a rich user and
> developer community.
> 
> = Community =
> Impala has a strong user community (see
> https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user), and a
> growing developer community (see
> https://groups.google.com/a/cloudera.org/forum/#!forum/impala-dev). We wish
> to attract more developers to the project, and we believe that the ASF’s
> open and meritocratic philosophy will help us with this. We note the
> success of other, similar projects already part of the ASF.
> 
> = Core Developers =
> Most - but not all - of Impala’s core developers are not currently
> affiliated with the ASF, and will require new ICLAs.
> 
> = Alignment =
> Impala is related to several other Apache projects:
> 
>  * Data that is read by Impala is very often stored in Apache Hadoop
> clusters powered by the HDFS filesystem.
>  * Impala can also read data stored in Apache HBase
>  * Metadata for databases, tables and so on is read by Impala from Apache
> Hive.
>  * The preferred data format for HDFS-based tables is Apache Parquet, and
> Apache Avro is also a supported data format.
>  * Impala is closely integrated with Kudu, which is also being proposed to
> the Incubator.
>  * Impala uses Apache Thrift as its RPC and serialization framework of
> choice.
> 
> = Known Risks =
> 
> == Orphaned Products ==
> Impala is used by most of Cloudera’s customers, and Cloudera remains
> committed to developing and supporting the project. Cloudera has a strong
> track record in standing behind projects that were contributed to the ASF
> by its employees, including Apache Flume, Apache Sqoop, and others. Other
> companies both ship and support Impala, lending credence to the idea that
> Impala is not at risk of being suddenly orphaned.
> 
> == Inexperience with Open Source ==
> Although all committers on the initial list have significant experience
> with at least one open-source project - namely Impala - fewer have much
> experience with ASF-based software projects as contributors and community
> members. However, with the guidance of our mentors, committers who do have
> ASF experience, and time to learn during Incubation, we are confident that
> the project can be run in accordance with Apache principles on an ongoing
> basis.
> 
> == Homogeneous Developers ==
> 
> The initial committers are employees of Cloudera.
> 
> The project has received some contributions from developers outside of
> Cloudera, from individuals belonging to organizations such as Intel and
> Google, from hobbyists and from students using Impala to advance their
> understanding of distributed databases. The project attracted an active
> user community as well. We hope to continue to encourage contributions from
> these developers and community members and grow them into committers after
> they have had time to continue their contributions.
> 
> == Reliance on Salaried Developers ==
> 
> Many of Impala’s initial set of committers work full-time on Impala, and
> are paid to do so. However, as mentioned elsewhere, we anticipate growth in
> the developer community which we hope will include hobbyists and academics
> who have an interest in distributed data systems.
> 
> == An Excessive Fascination with the Apache Brand ==
> Although we hope that Impala benefits from the Apache Brand, any reflected
> goodwill to Cloudera as the contributing entity is not the goal of
> establishing Impala as an Apache project. We will work with the Incubator
> PMC and the PRC to ensure that the Apache Brand is respected.
> 
> = Documentation =
> Impala: A Modern, Open-Source SQL Engine for Hadoop (
> http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf)
> 
> Impala’s developer wiki (https://github.com/cloudera/Impala/wiki)
> 
> Impala’s auto-generated API documentation (
> http://impala.io/doc/html/index.html)
> 
> = Initial Source =
> Impala’s initial source contribution will come from
> http://github.com/cloudera/Impala/.
> 
> = External Dependencies =
> 
> Impala depends upon a number of third-party libraries, which we list below.
> We intend to compile a LICENSE.txt file in the very short term (see
> https://issues.cloudera.org/browse/IMPALA-2670).
> 
>  * Google gflags (BSD)
>  * Google glog (BSD)
>  * Apache Thrift (Apache Software License v2.0)
>  * Apache Commons (Apache Software License v2.0)
>  * Apache Hadoop (Apache Software License v2.0)
>  * Apache HBase (Apache Software License v2.0)
>  * Apache Hive (Apache Software License v2.0)
>  * Boost (Boost Software License)
>  * OpenLdap (OpenLDAP Software License)
>  * rapidjson (MIT)
>  * Google RE2 (BSD-style)
>  * lz4 (BSD)
>  * snappy (BSD)
>  * cyrus-sasl (CMU License)
>  * Apache Avro (Apache Software License v2.0)
>  * Cloudera squeasel (Apache Software License v2.0)
>  * Apache htrace (Incubating) (Apache Software License v2.0)
>  * Apache Sentry (Incubating) (Apache Software License v2.0)
>  * Apache Shiro (Apache Software License v2.0)
>  * Twitter Bootstrap (Apache Software License v2.0)
>  * d3 (BSD)
>  * LLVM (BSD-like)
> 
> Build and test dependencies:
> 
>  * ant (Apache Software License v2.0)
>  * Apache Maven (Apache Software License v2.0)
>  * cmake (BSD)
>  * clang (BSD)
>  * Google gtest (Apache Software License v2.0)
> 
> = Required Resources =
> 
> We request that the following resources be created for the project to use:
> 
> == Mailing lists ==
> 
>  * private@impala.incubator.apache.org (moderated subscriptions)
>  * commits@impala.incubator.apache.org
>  * dev@impala.incubator.apache.org
>  * issues@impala.incubator.apache.org
>  * user@impala.incubator.apache.org
> 
> == Git repository ==
> https://git.apache.org/impala.git
> 
> == JIRA instance ==
> JIRA project IMPALA (IMPALA or IMP)
> 
> == Other Resources ==
> We hope to continue using Gerrit for our code review and commit workflow.
> We are involved with discussions that the Kudu team at Cloudera have been
> having with Jake Farrell to start discussions on how Gerrit can fit into
> the ASF. We know that several other ASF projects or podlings are also
> interested in Gerrit.
> 
> If the Infrastructure team does not have the bandwidth to support gerrit,
> we will continue to support our own instance of gerrit for Impala, and make
> the necessary integrations such that commits are properly authenticated and
> maintain sufficient provenance to uphold the ASF standards (e.g. via the
> solution adopted by the AsterixDB podling).
> 
> = Initial Committers =
> 
>  * Tim Armstrong
>  * Alex Behm
>  * Taras Bobrovytsky
>  * Casey Ching
>  * Martin Grund
>  * Daniel Hecht
>  * Michael Ho
>  * Matthew Jacobs
>  * Ishaan Joshi
>  * Lenni Kuff
>  * Marcel Kornacker
>  * Sailesh Mukil
>  * Henry Robinson
>  * John Russell
>  * Dimitris Tsirogiannis
>  * Skye Wanderman-Milne
>  * Juan Yu
> 
> == Affiliations ==
> All: Cloudera Inc.
> 
> = Sponsors =
> 
> == Champion ==
> Tom White
> 
> == Nominated Mentors ==
>  * Tom White (Cloudera)
>  * Todd Lipcon (Cloudera)
>  * Carl Steinbach (LinkedIn)
>  * Brock Noland (StreamSets)
> 
> 
> = Sponsoring Entity =
> We ask that the Incubator PMC sponsor this proposal.

Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Brock Noland <br...@apache.org>.
+1 (binding)

On Tuesday, November 24, 2015, Carl Steinbach <cw...@apache.org> wrote:

> +1 (binding)
>
>
> On Tue, Nov 24, 2015 at 4:56 PM, Luke Han <luke.hq@gmail.com> wrote:
>
> > +1 (non-binding)
> >
> >
> > Best Regards!
> > ---------------------
> >
> > Luke Han
> >
> > On Wed, Nov 25, 2015 at 8:07 AM, Julien Le Dem <julien@dremio.com> wrote:
> >
> > > +1 (binding)
> > >
> > > On Tue, Nov 24, 2015 at 3:49 PM, Mike Percy <mpercy@apache.org> wrote:
> > >
> > > > On Tue, Nov 24, 2015 at 1:03 PM, Henry Robinson <henry@cloudera.com>
> > > > wrote:
> > > >
> > > > > I'd like to call a VOTE on that proposal, which is on the wiki at
> > > > > https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
> > > > > below.
> > > > >
> > > > > During the discussion period, the proposal has been amended to add Brock
> > > > > Noland as a new mentor, to add one missed committer from the list and to
> > > > > correct some issues with the dependency list.
> > > > >
> > > > > Please cast your votes as follows:
> > > > >
> > > >
> > > > +1 (non-binding)
> > > >
> > > > Mike
> > > >
> > >
> > >
> > >
> > > --
> > > Julien
> > >
> >
>

Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Carl Steinbach <cw...@apache.org>.
+1 (binding)


On Tue, Nov 24, 2015 at 4:56 PM, Luke Han <lu...@gmail.com> wrote:

> +1 (non-binding)
>
>
> Best Regards!
> ---------------------
>
> Luke Han
>
> On Wed, Nov 25, 2015 at 8:07 AM, Julien Le Dem <ju...@dremio.com> wrote:
>
> > +1 (binding)
> >
> > On Tue, Nov 24, 2015 at 3:49 PM, Mike Percy <mp...@apache.org> wrote:
> >
> > > On Tue, Nov 24, 2015 at 1:03 PM, Henry Robinson <he...@cloudera.com>
> > > wrote:
> > >
> > > > I'd like to call a VOTE on that proposal, which is on the wiki at
> > > > https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
> > > > below.
> > > >
> > > > During the discussion period, the proposal has been amended to add Brock
> > > > Noland as a new mentor, to add one missed committer from the list and to
> > > > correct some issues with the dependency list.
> > > >
> > > > Please cast your votes as follows:
> > > >
> > >
> > > +1 (non-binding)
> > >
> > > Mike
> > >
> >
> >
> >
> > --
> > Julien
> >
>

Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Luke Han <lu...@gmail.com>.
+1 (non-binding)


Best Regards!
---------------------

Luke Han

On Wed, Nov 25, 2015 at 8:07 AM, Julien Le Dem <ju...@dremio.com> wrote:

> +1 (binding)
>
> On Tue, Nov 24, 2015 at 3:49 PM, Mike Percy <mp...@apache.org> wrote:
>
> > On Tue, Nov 24, 2015 at 1:03 PM, Henry Robinson <he...@cloudera.com>
> > wrote:
> >
> > > I'd like to call a VOTE on that proposal, which is on the wiki at
> > > https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
> > > below.
> > >
> > > During the discussion period, the proposal has been amended to add Brock
> > > Noland as a new mentor, to add one missed committer from the list and to
> > > correct some issues with the dependency list.
> > >
> > > Please cast your votes as follows:
> > >
> >
> > +1 (non-binding)
> >
> > Mike
> >
>
>
>
> --
> Julien
>

Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Julien Le Dem <ju...@dremio.com>.
+1 (binding)

On Tue, Nov 24, 2015 at 3:49 PM, Mike Percy <mp...@apache.org> wrote:

> On Tue, Nov 24, 2015 at 1:03 PM, Henry Robinson <he...@cloudera.com>
> wrote:
>
> > I'd like to call a VOTE on that proposal, which is on the wiki at
> > https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
> > below.
> >
> > During the discussion period, the proposal has been amended to add Brock
> > Noland as a new mentor, to add one missed committer from the list and to
> > correct some issues with the dependency list.
> >
> > Please cast your votes as follows:
> >
>
> +1 (non-binding)
>
> Mike
>



-- 
Julien

Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Mike Percy <mp...@apache.org>.
On Tue, Nov 24, 2015 at 1:03 PM, Henry Robinson <he...@cloudera.com> wrote:

> I'd like to call a VOTE on that proposal, which is on the wiki at
> https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
> below.
>
> During the discussion period, the proposal has been amended to add Brock
> Noland as a new mentor, to add one missed committer from the list and to
> correct some issues with the dependency list.
>
> Please cast your votes as follows:
>

+1 (non-binding)

Mike

Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Chris Douglas <cd...@apache.org>.
+1 (binding) -C

On Tue, Nov 24, 2015 at 1:03 PM, Henry Robinson <he...@cloudera.com> wrote:
> Hi -
>
> The [DISCUSS] thread has been quiet for a few days, so I think there's been
> sufficient opportunity for discussion around our proposal to bring Impala
> to the ASF Incubator.
>
> I'd like to call a VOTE on that proposal, which is on the wiki at
> https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
> below.
>
> During the discussion period, the proposal has been amended to add Brock
> Noland as a new mentor, to add one missed committer from the list and to
> correct some issues with the dependency list.
>
> Please cast your votes as follows:
>
> [] +1, accept Impala into the Incubator
> [] +/-0, non-counted vote to express a disposition
> [] -1, do not accept Impala into the Incubator (please give your reason(s))
>
> As with the concurrent Kudu vote, I propose leaving the vote open for a
> full seven days (to close at Tuesday, December 1st at noon PST), due to the
> upcoming US holiday.
>
> Thanks,
> Henry
>
> --------
>
> = Abstract =
> Impala is a high-performance C++ and Java SQL query engine for data stored
> in Apache Hadoop-based clusters.
>
> = Proposal =
>
> We propose to contribute the Impala codebase and associated artifacts (e.g.
> documentation, web-site content etc.) to the Apache Software Foundation
> with the intent of forming a productive, meritocratic and open community
> around Impala’s continued development, according to the ‘Apache Way’.
>
> Cloudera owns several trademarks regarding Impala, and proposes to transfer
> ownership of those trademarks in full to the ASF.
>
> = Background =
> Engineers at Cloudera developed Impala and released it as an
> Apache-licensed open-source project in Fall 2012. Impala was written as a
> brand-new, modern C++ SQL engine targeted from the start for data stored in
> Apache Hadoop clusters.
>
> Impala’s most important benefit to users is high-performance, making it
> extremely appropriate for common enterprise analytic and business
> intelligence workloads. This is achieved by a number of software
> techniques, including: native support for data stored in HDFS and related
> filesystems, just-in-time compilation and optimization of individual query
> plans, high-performance C++ codebase and massively-parallel distributed
> architecture. In benchmarks, Impala is routinely amongst the very highest
> performing SQL query engines.
>
> = Rationale =
>
> Despite the exciting innovation in the so-called ‘big-data’ space, SQL
> remains by far the most common interface for interacting with data in both
> traditional warehouses and modern ‘big-data’ clusters. There is clearly a
> need, as evidenced by the eager adoption of Impala and other SQL engines in
> enterprise contexts, for a query engine that offers the familiar SQL
> interface, but that has been specifically designed to operate in massive,
> distributed clusters rather than in traditional, fixed-hardware,
> warehouse-specific deployments. Impala is one such query engine.
>
> We believe that the ASF is the right venue to foster an open-source
> community around Impala’s development. We expect that Impala will benefit
> from more productive collaboration with related Apache projects, and under
> the auspices of the ASF will attract talented contributors who will push
> Impala’s development forward at pace.
>
> We believe that the timing is right for Impala’s development to move
> wholesale to the ASF: Impala is well-established, has been Apache-licensed
> open-source for more than three years, and the core project is relatively
> stable. We are excited to see where an ASF-based community can take Impala
> from this strong starting point.
>
> = Initial Goals =
> Our initial goals are as follows:
>
>  * Establish ASF-compatible engineering practices and workflows
>  * Refactor and publish existing internal build scripts and test
> infrastructure, in order to make them usable by any community member.
>  * Transfer source code, documentation and associated artifacts to the ASF.
>  * Grow the user and developer communities
>
> = Current Status =
>
> Impala is developed as an Apache-licensed open-source project. The source
> code is available at http://github.com/cloudera/Impala, and developer
> documentation is at https://github.com/cloudera/Impala/wiki. The majority
> of commits to the project have come from Cloudera-employed developers, but
> we have accepted some contributions from individuals from other
> organizations.
>
> All code reviews are done via a public instance of the Gerrit review tool
> at http://gerrit.cloudera.org:8080/, and discussed on a public mailing
> list. All patches must be reviewed before they are accepted into the
> codebase, via a voting mechanism that is similar to that used on Apache
> projects such as Hadoop and HBase.
>
> Before a patch is committed, it must pass a suite of pre-commit tests.
> These tests are currently run on Cloudera’s internal infrastructure. One of
> our initial goals will be to work with the ASF Infrastructure team to find
> a way to run these tests in an acceptable way on publicly accessible
> machines.
>
> Issues are tracked in JIRA at https://issues.cloudera.org/projects/IMPALA,
> in a way that is extremely similar to existing practices at other ASF
> projects.
>
> = Meritocracy =
>
> We understand the central importance of meritocracy to the Apache Way. We
> will work to establish a welcoming, fair and meritocratic community, in
> part by expanding the set of committers on the project. Although Impala’s
> committer list will initially be dominated by members of the Impala
> engineering team at Cloudera, we look forward to growing a rich user and
> developer community.
>
> = Community =
> Impala has a strong user community (see
> https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user), and a
> growing developer community (see
> https://groups.google.com/a/cloudera.org/forum/#!forum/impala-dev). We wish
> to attract more developers to the project, and we believe that the ASF’s
> open and meritocratic philosophy will help us with this. We note the
> success of other, similar projects already part of the ASF.
>
> = Core Developers =
> Most - but not all - of Impala’s core developers are not currently
> affiliated with the ASF, and will require new ICLAs.
>
> = Alignment =
> Impala is related to several other Apache projects:
>
>  * Data that is read by Impala is very often stored in Apache Hadoop
> clusters powered by the HDFS filesystem.
>  * Impala can also read data stored in Apache HBase
>  * Metadata for databases, tables and so on is read by Impala from Apache
> Hive.
>  * The preferred data format for HDFS-based tables is Apache Parquet, and
> Apache Avro is also a supported data format.
>  * Impala is closely integrated with Kudu, which is also being proposed to
> the Incubator.
>  * Impala uses Apache Thrift as its RPC and serialization framework of
> choice.
>
> = Known Risks =
>
> == Orphaned Products ==
> Impala is used by most of Cloudera’s customers, and Cloudera remains
> committed to developing and supporting the project. Cloudera has a strong
> track record in standing behind projects that were contributed to the ASF
> by its employees, including Apache Flume, Apache Sqoop, and others. Other
> companies both ship and support Impala, lending credence to the idea that
> Impala is not at risk of being suddenly orphaned.
>
> == Inexperience with Open Source ==
> Although all committers on the initial list have significant experience
> with at least one open-source project - namely Impala - fewer have much
> experience with ASF-based software projects as contributors and community
> members. However, with the guidance of our mentors, committers who do have
> ASF experience, and time to learn during Incubation, we are confident that
> the project can be run in accordance with Apache principles on an ongoing
> basis.
>
> == Homogeneous Developers ==
>
> The initial committers are employees of Cloudera.
>
> The project has received some contributions from developers outside of
> Cloudera, from individuals belonging to organizations such as Intel and
> Google, from hobbyists and from students using Impala to advance their
> understanding of distributed databases. The project attracted an active
> user community as well. We hope to continue to encourage contributions from
> these developers and community members and grow them into committers after
> they have had time to continue their contributions.
>
> == Reliance on Salaried Developers ==
>
> Many of Impala’s initial set of committers work full-time on Impala, and
> are paid to do so. However, as mentioned elsewhere, we anticipate growth in
> the developer community which we hope will include hobbyists and academics
> who have an interest in distributed data systems.
>
> == An Excessive Fascination with the Apache Brand ==
> Although we hope that Impala benefits from the Apache Brand, any reflected
> goodwill to Cloudera as the contributing entity is not the goal of
> establishing Impala as an Apache project. We will work with the Incubator
> PMC and the PRC to ensure that the Apache Brand is respected.
>
> = Documentation =
> Impala: A Modern, Open-Source SQL Engine for Hadoop (
> http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf)
>
> Impala’s developer wiki (https://github.com/cloudera/Impala/wiki)
>
> Impala’s auto-generated API documentation (
> http://impala.io/doc/html/index.html)
>
> = Initial Source =
> Impala’s initial source contribution will come from
> http://github.com/cloudera/Impala/.
>
> = External Dependencies =
>
> Impala depends upon a number of third-party libraries, which we list below.
> We intend to compile a LICENSE.txt file in the very short term (see
> https://issues.cloudera.org/browse/IMPALA-2670).
>
>  * Google gflags (BSD)
>  * Google glog (BSD)
>  * Apache Thrift (Apache Software License v2.0)
>  * Apache Commons (Apache Software License v2.0)
>  * Apache Hadoop (Apache Software License v2.0)
>  * Apache HBase (Apache Software License v2.0)
>  * Apache Hive (Apache Software License v2.0)
>  * Boost (Boost Software License)
>  * OpenLdap (OpenLDAP Software License)
>  * rapidjson (MIT)
>  * Google RE2 (BSD-style)
>  * lz4 (BSD)
>  * snappy (BSD)
>  * cyrus-sasl (CMU License)
>  * Apache Avro (Apache Software License v2.0)
>  * Cloudera squeasel (Apache Software License v2.0)
>  * Apache htrace (Incubating) (Apache Software License v2.0)
>  * Apache Sentry (Incubating) (Apache Software License v2.0)
>  * Apache Shiro (Apache Software License v2.0)
>  * Twitter Bootstrap (Apache Software License v2.0)
>  * d3 (BSD)
>  * LLVM (BSD-like)
>
> Build and test dependencies:
>
>  * ant (Apache Software License v2.0)
>  * Apache Maven (Apache Software License v2.0)
>  * cmake (BSD)
>  * clang (BSD)
>  * Google gtest (Apache Software License v2.0)
>
> = Required Resources =
>
> We request that the following resources be created for the project to use:
>
> == Mailing lists ==
>
>  * private@impala.incubator.apache.org (moderated subscriptions)
>  * commits@impala.incubator.apache.org
>  * dev@impala.incubator.apache.org
>  * issues@impala.incubator.apache.org
>  * user@impala.incubator.apache.org
>
> == Git repository ==
> https://git.apache.org/impala.git
>
> == JIRA instance ==
> JIRA project IMPALA (IMPALA or IMP)
>
> == Other Resources ==
> We hope to continue using Gerrit for our code review and commit workflow.
> We are taking part in the discussions that the Kudu team at Cloudera has
> been having with Jake Farrell about how Gerrit can fit into
> the ASF. We know that several other ASF projects or podlings are also
> interested in Gerrit.
>
> If the Infrastructure team does not have the bandwidth to support gerrit,
> we will continue to support our own instance of gerrit for Impala, and make
> the necessary integrations such that commits are properly authenticated and
> maintain sufficient provenance to uphold the ASF standards (e.g. via the
> solution adopted by the AsterixDB podling).
>
> = Initial Committers =
>
>  * Tim Armstrong
>  * Alex Behm
>  * Taras Bobrovytsky
>  * Casey Ching
>  * Martin Grund
>  * Daniel Hecht
>  * Michael Ho
>  * Matthew Jacobs
>  * Ishaan Joshi
>  * Lenni Kuff
>  * Marcel Kornacker
>  * Sailesh Mukil
>  * Henry Robinson
>  * John Russell
>  * Dimitris Tsirogiannis
>  * Skye Wanderman-Milne
>  * Juan Yu
>
> == Affiliations ==
> All: Cloudera Inc.
>
> = Sponsors =
>
> == Champion ==
> Tom White
>
> == Nominated Mentors ==
>  * Tom White (Cloudera)
>  * Todd Lipcon (Cloudera)
>  * Carl Steinbach (LinkedIn)
>  * Brock Noland (StreamSets)
>
>
> = Sponsoring Entity =
> We ask that the Incubator PMC sponsor this proposal.



Re: [VOTE] Accept Impala into the Apache Incubator

Posted by Todd Lipcon <to...@apache.org>.
On Tue, Nov 24, 2015 at 1:03 PM, Henry Robinson <he...@cloudera.com> wrote:

> Hi -
>
> The [DISCUSS] thread has been quiet for a few days, so I think there's
> been sufficient opportunity for discussion around our proposal to bring
> Impala to the ASF Incubator.
>
> I'd like to call a VOTE on that proposal, which is on the wiki at
> https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
> below.
>
> During the discussion period, the proposal has been amended to add Brock
> Noland as a new mentor, to add one missed committer from the list and to
> correct some issues with the dependency list.
>
> Please cast your votes as follows:
>
> [] +1, accept Impala into the Incubator
> [] +/-0, non-counted vote to express a disposition
> [] -1, do not accept Impala into the Incubator (please give your reason(s))
>
> As with the concurrent Kudu vote, I propose leaving the vote open for a
> full seven days (to close at Tuesday, December 1st at noon PST), due to the
> upcoming US holiday.
>
> Thanks,
> Henry
>

+1 (binding)

Re: [VOTE] Accept Impala into the Apache Incubator

Posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>.
+1 from me binding

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++





-----Original Message-----
From: Henry Robinson <he...@cloudera.com>
Reply-To: "general@incubator.apache.org" <ge...@incubator.apache.org>
Date: Tuesday, November 24, 2015 at 1:03 PM
To: "general@incubator.apache.org" <ge...@incubator.apache.org>
Subject: [VOTE] Accept Impala into the Apache Incubator

>Hi -
>
>The [DISCUSS] thread has been quiet for a few days, so I think there's
>been sufficient opportunity for discussion around our proposal to bring
>Impala to the ASF Incubator.
>
>I'd like to call a VOTE on that proposal, which is on the wiki at
>https://wiki.apache.org/incubator/ImpalaProposal, and which I've pasted
>below.
>
>During the discussion period, the proposal has been amended to add Brock
>Noland as a new mentor, to add one missed committer from the list and to
>correct some issues with the dependency list.
>
>Please cast your votes as follows:
>
>[] +1, accept Impala into the Incubator
>[] +/-0, non-counted vote to express a disposition
>[] -1, do not accept Impala into the Incubator (please give your
>reason(s))
>
>As with the concurrent Kudu vote, I propose leaving the vote open for a
>full seven days (to close at Tuesday, December 1st at noon PST), due to
>the upcoming US holiday.
>
>Thanks,
>Henry
>
>--------
>
>= Abstract =
>Impala is a high-performance C++ and Java SQL query engine for data stored
>in Apache Hadoop-based clusters.
>
>= Proposal =
>
>We propose to contribute the Impala codebase and associated artifacts
>(e.g. documentation, web-site content etc.) to the Apache Software
>Foundation with the intent of forming a productive, meritocratic and open
>community around Impala’s continued development, according to the ‘Apache
>Way’.
>
>Cloudera owns several trademarks regarding Impala, and proposes to
>transfer ownership of those trademarks in full to the ASF.
>
>= Background =
>Engineers at Cloudera developed Impala and released it as an
>Apache-licensed open-source project in Fall 2012. Impala was written as a
>brand-new, modern C++ SQL engine targeted from the start for data stored
>in Apache Hadoop clusters.
>
>Impala’s most important benefit to users is high performance, making it
>extremely appropriate for common enterprise analytic and business
>intelligence workloads. This is achieved by a number of software
>techniques, including: native support for data stored in HDFS and related
>filesystems, just-in-time compilation and optimization of individual query
>plans, high-performance C++ codebase and massively-parallel distributed
>architecture. In benchmarks, Impala is routinely amongst the very highest
>performing SQL query engines.
>
>= Rationale =
>
>Despite the exciting innovation in the so-called ‘big-data’ space, SQL
>remains by far the most common interface for interacting with data in both
>traditional warehouses and modern ‘big-data’ clusters. There is clearly a
>need, as evidenced by the eager adoption of Impala and other SQL engines
>in enterprise contexts, for a query engine that offers the familiar SQL
>interface, but that has been specifically designed to operate in massive,
>distributed clusters rather than in traditional, fixed-hardware,
>warehouse-specific deployments. Impala is one such query engine.
>
>We believe that the ASF is the right venue to foster an open-source
>community around Impala’s development. We expect that Impala will benefit
>from more productive collaboration with related Apache projects, and under
>the auspices of the ASF will attract talented contributors who will push
>Impala’s development forward at pace.
>
>We believe that the timing is right for Impala’s development to move
>wholesale to the ASF: Impala is well-established, has been Apache-licensed
>open-source for more than three years, and the core project is relatively
>stable. We are excited to see where an ASF-based community can take Impala
>from this strong starting point.
>
>= Initial Goals =
>Our initial goals are as follows:
>
> * Establish ASF-compatible engineering practices and workflows
> * Refactor and publish existing internal build scripts and test
>infrastructure, in order to make them usable by any community member.
> * Transfer source code, documentation and associated artifacts to the
>ASF.
> * Grow the user and developer communities
>
>= Current Status =
>
>Impala is developed as an Apache-licensed open-source project. The source
>code is available at http://github.com/cloudera/Impala, and developer
>documentation is at https://github.com/cloudera/Impala/wiki. The majority
>of commits to the project have come from Cloudera-employed developers, but
>we have accepted some contributions from individuals from other
>organizations.
>
>All code reviews are done via a public instance of the Gerrit review tool
>at http://gerrit.cloudera.org:8080/, and discussed on a public mailing
>list. All patches must be reviewed before they are accepted into the
>codebase, via a voting mechanism that is similar to that used on Apache
>projects such as Hadoop and HBase.
>
>Before a patch is committed, it must pass a suite of pre-commit tests.
>These tests are currently run on Cloudera’s internal infrastructure. One
>of our initial goals will be to work with the ASF Infrastructure team to find
>a way to run these tests in an acceptable way on publicly accessible
>machines.
>
>Issues are tracked in JIRA at https://issues.cloudera.org/projects/IMPALA,
>in a way that is extremely similar to existing practices at other ASF
>projects.
>
>= Meritocracy =
>
>We understand the central importance of meritocracy to the Apache Way. We
>will work to establish a welcoming, fair and meritocratic community, in
>part by expanding the set of committers on the project. Although Impala’s
>committer list will initially be dominated by members of the Impala
>engineering team at Cloudera, we look forward to growing a rich user and
>developer community.
>
>= Community =
>Impala has a strong user community (see
>https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user), and a
>growing developer community (see
>https://groups.google.com/a/cloudera.org/forum/#!forum/impala-dev). We
>wish to attract more developers to the project, and we believe that the ASF’s
>open and meritocratic philosophy will help us with this. We note the
>success of other, similar projects already part of the ASF.
>
>= Core Developers =
>Most - but not all - of Impala’s core developers are not currently
>affiliated with the ASF, and will require new ICLAs.
>
>= Alignment =
>Impala is related to several other Apache projects:
>
> * Data that is read by Impala is very often stored in Apache Hadoop
>clusters powered by the HDFS filesystem.
> * Impala can also read data stored in Apache HBase
> * Metadata for databases, tables and so on is read by Impala from Apache
>Hive.
> * The preferred data format for HDFS-based tables is Apache Parquet, and
>Apache Avro is also a supported data format.
> * Impala is closely integrated with Kudu, which is also being proposed to
>the Incubator.
> * Impala uses Apache Thrift as its RPC and serialization framework of
>choice.
>
>= Known Risks =
>
>== Orphaned Products ==
>Impala is used by most of Cloudera’s customers, and Cloudera remains
>committed to developing and supporting the project. Cloudera has a strong
>track record in standing behind projects that were contributed to the ASF
>by its employees, including Apache Flume, Apache Sqoop, and others. Other
>companies both ship and support Impala, lending credence to the idea that
>Impala is not at risk of being suddenly orphaned.
>
>== Inexperience with Open Source ==
>Although all committers on the initial list have significant experience
>with at least one open-source project - namely Impala - fewer have much
>experience with ASF-based software projects as contributors and community
>members. However, with the guidance of our mentors, committers who do have
>ASF experience, and time to learn during Incubation, we are confident that
>the project can be run in accordance with Apache principles on an ongoing
>basis.
>
>== Homogeneous Developers ==
>
>The initial committers are employees of Cloudera.
>
>The project has received some contributions from developers outside of
>Cloudera, from individuals belonging to organizations such as Intel and
>Google, from hobbyists and from students using Impala to advance their
>understanding of distributed databases. The project attracted an active
>user community as well. We hope to continue to encourage contributions
>from these developers and community members and grow them into committers
>after they have had time to continue their contributions.
>
>== Reliance on Salaried Developers ==
>
>Many of Impala’s initial set of committers work full-time on Impala, and
>are paid to do so. However, as mentioned elsewhere, we anticipate growth
>in the developer community which we hope will include hobbyists and
>academics who have an interest in distributed data systems.
>
>== An Excessive Fascination with the Apache Brand ==
>Although we hope that Impala benefits from the Apache Brand, any reflected
>goodwill to Cloudera as the contributing entity is not the goal of
>establishing Impala as an Apache project. We will work with the Incubator
>PMC and the PRC to ensure that the Apache Brand is respected.
>
>= Documentation =
>Impala: A Modern, Open-Source SQL Engine for Hadoop (
>http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf)
>
>Impala’s developer wiki (https://github.com/cloudera/Impala/wiki)
>
>Impala’s auto-generated API documentation (
>http://impala.io/doc/html/index.html)
>
>= Initial Source =
>Impala’s initial source contribution will come from
>http://github.com/cloudera/Impala/.
>
>= External Dependencies =
>
>Impala depends upon a number of third-party libraries, which we list
>below.
>We intend to compile a LICENSE.txt file in the very short term (see
>https://issues.cloudera.org/browse/IMPALA-2670).
>
> * Google gflags (BSD)
> * Google glog (BSD)
> * Apache Thrift (Apache Software License v2.0)
> * Apache Commons (Apache Software License v2.0)
> * Apache Hadoop (Apache Software License v2.0)
> * Apache HBase (Apache Software License v2.0)
> * Apache Hive (Apache Software License v2.0)
> * Boost (Boost Software License)
> * OpenLdap (OpenLDAP Software License)
> * rapidjson (MIT)
> * Google RE2 (BSD-style)
> * lz4 (BSD)
> * snappy (BSD)
> * cyrus-sasl (CMU License)
> * Apache Avro (Apache Software License v2.0)
> * Cloudera squeasel (Apache Software License v2.0)
> * Apache htrace (Incubating) (Apache Software License v2.0)
> * Apache Sentry (Incubating) (Apache Software License v2.0)
> * Apache Shiro (Apache Software License v2.0)
> * Twitter Bootstrap (Apache Software License v2.0)
> * d3 (BSD)
> * LLVM (BSD-like)
>
>Build and test dependencies:
>
> * ant (Apache Software License v2.0)
> * Apache Maven (Apache Software License v2.0)
> * cmake (BSD)
> * clang (BSD)
> * Google gtest (Apache Software License v2.0)
>
>= Required Resources =
>
>We request that the following resources be created for the project to use:
>
>== Mailing lists ==
>
> * private@impala.incubator.apache.org (moderated subscriptions)
> * commits@impala.incubator.apache.org
> * dev@impala.incubator.apache.org
> * issues@impala.incubator.apache.org
> * user@impala.incubator.apache.org
>
>== Git repository ==
>https://git.apache.org/impala.git
>
>== JIRA instance ==
>JIRA project IMPALA (IMPALA or IMP)
>
>== Other Resources ==
>We hope to continue using Gerrit for our code review and commit workflow.
>We are taking part in the discussions that the Kudu team at Cloudera has
>been having with Jake Farrell about how Gerrit can fit into
>the ASF. We know that several other ASF projects or podlings are also
>interested in Gerrit.
>
>If the Infrastructure team does not have the bandwidth to support gerrit,
>we will continue to support our own instance of gerrit for Impala, and
>make the necessary integrations such that commits are properly
>authenticated and maintain sufficient provenance to uphold the ASF
>standards (e.g. via the solution adopted by the AsterixDB podling).
>
>= Initial Committers =
>
> * Tim Armstrong
> * Alex Behm
> * Taras Bobrovytsky
> * Casey Ching
> * Martin Grund
> * Daniel Hecht
> * Michael Ho
> * Matthew Jacobs
> * Ishaan Joshi
> * Lenni Kuff
> * Marcel Kornacker
> * Sailesh Mukil
> * Henry Robinson
> * John Russell
> * Dimitris Tsirogiannis
> * Skye Wanderman-Milne
> * Juan Yu
>
>== Affiliations ==
>All: Cloudera Inc.
>
>= Sponsors =
>
>== Champion ==
>Tom White
>
>== Nominated Mentors ==
> * Tom White (Cloudera)
> * Todd Lipcon (Cloudera)
> * Carl Steinbach (LinkedIn)
> * Brock Noland (StreamSets)
>
>
>= Sponsoring Entity =
>We ask that the Incubator PMC sponsor this proposal.