Posted to general@hadoop.apache.org by Eli Collins <el...@cloudera.com> on 2010/07/12 21:39:45 UTC

HEP proposal

A while back we started discussing on list (http://bit.ly/aFj9Ya) and
at the contributors meeting (http://bit.ly/aj4Y7I) a more coordinated
way to describe, socialize and shepherd enhancements to Hadoop.
Thanks for all the feedback.  Most of it was encouraging so I wrote up
a draft proposal with specifics to discuss here.  After incorporating
feedback I'll send out another revision for vote.

Thanks,
Eli


HEP: 1
Title: HEP Purpose and Guidelines
Author: Eli Collins
Status: Draft


What is a HEP?
==============

HEP stands for Hadoop Enhancement Proposal, and is based on Python's
PEP (Python Enhancement Proposal) [1].  A HEP is a document that
describes a new feature, its rationale, and the issues the feature needs
to address in order to be successfully incorporated.

The intent is for HEPs to be the primary mechanism for proposing
significant new features to core Hadoop (common, HDFS and MapReduce),
incorporating community feedback, and recording the proposal.  Going
through the HEP process should improve the chances that a proposal is
successful.

While HEPs do not need to come with code, they are a mechanism to
propose features to the community with the intent of contributing the
feature, rather than to request that the community implement a feature.

HEPs must be consistent with Apache bylaws [2]; for example, the HEP
workflow takes place on the public Apache Hadoop lists.


When is a HEP Required?
=======================

HEPs should not impede casual contribution to Hadoop.  Small
improvements and bugs do not require HEPs.  Not all features need
HEPs.  While the decision is subjective, here are some guidelines to
indicate a HEP should be considered:

- The feature impacts backwards compatibility (e.g. modifies released
public APIs in an incompatible way).

- The feature requires that an existing component be substantially
re-designed (e.g. NameNode modified to use BookKeeper).

- The implementation impacts multiple parts of the system (e.g. symbolic
links versus adding a pluggable component like a codec).

- The feature impacts the entire development community (e.g. converts
the build system to use Maven).


HEP Workflow
============

The author of a HEP should first try to determine if their idea is
HEP-able by sending mail to the general list, or to the project-specific
list if the scope of the idea is limited to that project.  This gives the
author a chance to flesh out the proposal, address initial concerns,
and figure out whether it has a chance of being accepted.  The
author's role is to build consensus and gather dissenting opinions.

Following this discussion the author should draft a HEP proposal
following the HEP template. The proposal should accurately reflect and
address feedback and dissenting opinions.  For example, flesh out
sections on backwards compatibility or testing. The author should send
the draft of the proposal to hep@hadoop.apache.org for review.  This
is a new, public list for editors and those interested in following
the review process.

A set of editors reviews incoming HEPs. Each HEP is assigned a single
primary editor. An editor may volunteer if they feel particular
functional expertise is required; otherwise HEPs are assigned to
editors round robin.

The editor reviews the proposal and may request it be updated if it
does not sufficiently address feedback raised during discussion, e.g.
if it does not explain why the proposal is not redundant with existing
functionality, or does not show that it is technically sound,
sufficiently motivated, and covers backwards compatibility. When
updates are necessary, the HEP author can check in new versions if
they have commit permissions, or can email new HEP versions to the
editor for committing. In order to ensure HEP proposals make progress
the editor should respond to proposal drafts within two weeks of
receiving them (or the proposer can request another editor), and the
proposer should generate updates to the draft within two weeks of
receiving feedback from the editor.

The editor's role is to determine if the proposal is complete, so that
the proposal can be voted on, not whether they agree with the proposal
itself.  The editor's involvement should increase the chance that a
HEP proposal makes it to a vote.

Once the editor deems the proposal complete they add it to a
versioned HEP repository and the author posts the proposal to
general@hadoop.apache.org for vote.  HEP votes, like Apache procedural
votes, use majority rule [3]. Successful HEPs are assigned a number;
unsuccessful HEPs remain drafts.
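
As an illustration only, the vote outcome could be tallied roughly as
below.  This is a sketch under assumptions the draft text above does not
state: it counts only binding votes (the revised draft later in this
thread says only PMC members have binding votes), and it uses the ASF
glossary's notion of majority approval as I understand it, i.e. at least
three binding +1 votes and more +1 than -1 votes.  The function name and
the ``min_plus_ones`` parameter are hypothetical.

```python
def hep_vote_passes(binding_votes, min_plus_ones=3):
    """Return True if a HEP vote passes under majority approval.

    binding_votes: iterable of +1, 0 or -1, one entry per binding voter.
    min_plus_ones: assumed minimum number of binding +1 votes (the value
    3 is taken from the ASF glossary as understood here, not from the HEP).
    """
    votes = list(binding_votes)
    plus = votes.count(1)    # binding votes in favour
    minus = votes.count(-1)  # binding votes against (no veto under majority rule)
    return plus >= min_plus_ones and plus > minus


if __name__ == "__main__":
    print(hep_vote_passes([1, 1, 1, -1]))   # three in favour, one against
    print(hep_vote_passes([1, 1, -1]))      # too few +1 votes
```

Unlike consensus approval, a single -1 does not block the proposal here;
it only counts against the +1 total.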

The editors are appointed and removed by the PMC informally, similar to
how the Apache Board appoints shepherds to projects.


HEP Contents
============

Each HEP should contain the following:

1. Preamble -- Including the HEP number, a short descriptive title,
and the names of the authors.

2. Abstract -- A short (~200 word) description of the technical issue
being addressed.

3. Copyright/public domain -- Each HEP must either be explicitly
labelled as placed in the public domain (see this HEP as an example).

4. Design -- A high-level explanation of the design. It should cover
intended use cases, failure scenarios, and impact on the existing
system.

5. Motivation -- The motivation spells out the use case for the
feature and the benefits it provides.

6. Rationale -- The rationale describes what motivated the design and
why particular design decisions were made.  It should describe
alternate designs that were considered and related work, e.g. how the
feature is designed in other systems. It should also consider whether
the feature could be achieved by layering atop the existing system
rather than modifying it.

The rationale should provide evidence of consensus within the
community and discuss important objections or concerns raised during
discussion.

7. Backwards Compatibility -- All HEPs that introduce backwards
incompatibilities must include a section describing these
incompatibilities and their severity.  The HEP must explain how the
author proposes to deal with these incompatibilities.  HEP submissions
without a sufficient backwards compatibility treatise may be rejected
outright.


HEP Template
============

HEPs should be plain text with minimal structural markup that adheres
to a rigid style.  You can use this HEP as an example. Each HEP starts
with a header that contains the HEP number (or empty if the number has
not yet been assigned), title, list of authors and status (Draft,
Accepted, Rejected, or Withdrawn).
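
A minimal sketch of how such a header might be read by a tool, assuming
the "Key: value" layout used at the top of this HEP (HEP, Title, Author,
Status).  The function name and the exact parsing rules are hypothetical;
the HEP itself does not define a parser.

```python
# Status values listed in the HEP Template section.
VALID_STATUSES = {"Draft", "Accepted", "Rejected", "Withdrawn"}


def parse_hep_header(text):
    """Parse the leading 'Key: value' lines of a HEP into a dict.

    Parsing stops at the first blank line after the header, or at the
    first line that has no colon.  An empty HEP field is returned as
    None, meaning the number has not been assigned yet.
    """
    header = {}
    for line in text.splitlines():
        if not line.strip():
            if header:
                break      # blank line ends the header
            continue       # skip blank lines before the header
        key, sep, value = line.partition(":")
        if not sep:
            break          # not a header line
        header[key.strip()] = value.strip()

    status = header.get("Status")
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown or missing status: {status!r}")

    number = header.get("HEP", "")
    header["HEP"] = int(number) if number else None
    return header


if __name__ == "__main__":
    sample = (
        "HEP: 1\n"
        "Title: HEP Purpose and Guidelines\n"
        "Author: Eli Collins\n"
        "Status: Draft\n"
        "\n"
        "What is a HEP?\n"
    )
    print(parse_hep_header(sample))
```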


Auxiliary Files
===============

HEPs may include auxiliary files such as diagrams.  Such files must be
named ``hep-XXXX-Y.ext``, where "XXXX" is the HEP number, "Y" is a
serial number (starting at 1), and "ext" is replaced by the actual
file extension (e.g. "png").
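
The naming rule can be sketched as a small helper.  This assumes that
"XXXX" means the HEP number zero-padded to at least four digits, which
is one reading of the pattern rather than something the text spells out;
the function names are hypothetical.

```python
import re

# Assumed reading of hep-XXXX-Y.ext: XXXX is the zero-padded HEP number,
# Y is a serial number starting at 1, ext is the file extension.
AUX_NAME = re.compile(r"^hep-(\d{4,})-([1-9]\d*)\.([A-Za-z0-9]+)$")


def aux_file_name(hep_number, serial, ext):
    """Build an auxiliary file name for the given HEP number and serial."""
    if hep_number < 0:
        raise ValueError("HEP number must not be negative")
    if serial < 1:
        raise ValueError("serial numbers start at 1")
    return f"hep-{hep_number:04d}-{serial}.{ext}"


def parse_aux_file_name(name):
    """Return (hep_number, serial, ext), or None if the name does not match."""
    match = AUX_NAME.match(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3)


if __name__ == "__main__":
    name = aux_file_name(1, 1, "png")
    print(name)
    print(parse_aux_file_name(name))
```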


References
==========

1. http://www.python.org/dev/peps/pep-0001

2. http://www.apache.org/foundation/bylaws.html

3. http://www.apache.org/foundation/voting.html


Copyright
=========

This document has been placed in the public domain.

Re: HEP proposal

Posted by Eli Collins <el...@cloudera.com>.
On Wed, Jul 14, 2010 at 12:12 PM, Konstantin Boudnik <co...@yahoo-inc.com> wrote:
> I have been following this discussion for some time now and the only question
> came to my mind: why mimicking PEP? Is it so astonishingly successful or is
> it much better than Apache voting or RFC process (from where it has been
> apparently derived).

I thought it would be more fruitful to adapt an existing, working
model rather than invent a new one. I based it on PEP after looking at
what other projects used; PEP seems to strike a balance between no
structure and more heavy weight processes (JSR, RFC).  I'm open to
other models if there's something you think is more suitable.

> So far I see HEP as an over-complicated process for a process sake. I'd appreciate if some one can chip-in and tell me if and where I'm wrong.

I think that's a reasonable opinion. There are communities that
function without process.

In the limited time I've been working on Hadoop there's been tension
between substantially modifying the system and preserving its
stability. It seemed to me that some additional structure around
discussing change up front would help relieve some of that tension.
There is value in thinking and discussing what you're doing up front
in a way that's visible to the community. HEP essentially enforces
that. If that happens naturally I agree we don't need the process. If
people think what we've currently got is sufficient I'm happy to chalk
it up as a learning experience; I don't want people to adopt it just
because I've taken the time to draft something.

Thanks,
Eli

> Thanks,
>  Cos
>
> On Wed, Jul 14, 2010 at 10:46AM, Eli Collins wrote:
>>    Hey Konstantin,
>>
>>    Thanks for taking a look, comments in-line.
>>
>>    On Tue, Jul 13, 2010 at 1:54 PM, Konstantin Shvachko <sh...@yahoo-inc.com>
>>    wrote:
>>    > Eli,
>>    >
>>    > Thanks for a really good proposal.
>>    > Some questions / comments:
>>    >
>>    > On voting
>>    > 1. Which voting rule?
>>    > http://www.apache.org/foundation/glossary.html#ConsensusApproval
>>    > http://www.apache.org/foundation/glossary.html#MajorityApproval
>>    > I think you mean the MajorityApproval as it does not have veto rule.
>>    > So may be it's just clarifying the reference.
>>
>>    Good point, clarified so it's majority approval.
>>
>>    > 2. Who can vote?
>>    > Usually PMCs have Binding Votes.
>>    > Would be good to have a sentence clarifying this.
>>
>>    Yup, added.
>>
>>    > 3. How long does the vote go?
>>    > Usual 3 days may not be enough. One week is reasonable?
>>
>>    Specified one week.
>>
>>    > 4. Discussion on public lists.
>>    > A HEP can evolve from a jira, then it should be counted as a public
>>    > discussion. I think it makes sense even to continue the discussion
>>    > there if so.
>>
>>    Agreed, changed the wording to "If the scope of the idea is limited to
>>    a specific project the discussion may happen on the project-specific
>>    list or jira."
>>
>>    > 5. How the set of editors is selected?
>>    >   "The editors are apointed and removed by the PMC informally, similar
>>    to
>>    >   how the Apache Board appoints shepherds to projects."
>>    > This needs a reference. How does Apache Board appoints shepherds?
>>
>>    Good question, anyone know? Since it's informal I imagine shepherds
>>    volunteer. The editors could be a subset of the PMC that either
>>    volunteers or is rotated periodically.
>>
>>    > 6. The level of design details.
>>    > I think HEP should have a pretty detailed design. When people vote they
>>    > will want to be sure the design can lead to a reasonable implementation.
>>    > Should we say "implementation-ready design", rather than
>>    > "A high-level explanation of the design."
>>    > Or just
>>    > "A _detailed_ explanation of the design."
>>
>>    Rewrote this section, tried to make it more explicit about giving both
>>    a high-level view and complete enough description so the design can
>>    lead to a reasonable implementation. Also added that this section
>>    should cover how to test the design.
>>
>>    > 7. Typos:
>>    > successuflly, apointed, intial
>>
>>    Fixed.
>>
>>    Updated draft follows.
>>
>>    Thanks,
>>    Eli
>>
>>    HEP: 1
>>    Title: HEP Purpose and Guidelines
>>    Author: Eli Collins
>>    Status: Draft
>>
>>    What is a HEP?
>>    ==============
>>
>>    HEP stands for Hadoop Enhancement Proposal, and is based on Python's
>>    PEP (Python Enhancement Proposal) [1].  A HEP is a document that
>>    describes a new feature, its rationale, and issues the feature needs
>>    to address in order to be successfully incorporated.
>>
>>    The intent is for HEPs to be the primary mechanism for proposing
>>    significant new features to core Hadoop (common, HDFS and MapReduce),
>>    incorporating community feedback, and recording the proposal.  Going
>>    through the HEP process should improve the chances that a proposal is
>>    successful.
>>
>>    While HEPs do not need to come with code, they are a mechanism to
>>    propose features to the community, with the intent of contributing the
>>    feature, rather than request the community implement a feature.
>>
>>    HEPs must be consistent with Apache bylaws [2], for example, the HEP
>>    workflow takes place on the public Apache Hadoop lists.
>>
>>    When is a HEP Required?
>>    =======================
>>
>>    HEPs should not impede casual contribution to Hadoop.  Small
>>    improvements and bugs do not require HEPs.  Not all features need
>>    HEPs.  While the decision is subjective, here are some guidelines to
>>    indicate a HEP should be considered:
>>
>>    - The feature impacts backwards compatibility (eg modifies released
>>    public APIs in an incompatible way).
>>
>>    - The feature requires that an existing component be substantially
>>    re-designed (eg NameNode modified to use Bookkeeper).
>>
>>    - The implementation impacts multiple parts of the system (eg symbolic
>>    links versus adding a pluggable component like a codec).
>>
>>    - The feature impacts the entire development community (eg converts
>>    the build system to use maven).
>>
>>    HEP Workflow
>>    ============
>>
>>    The author of a HEP should first try to determine if their idea is
>>    HEP-able by sending mail to the general list.  If the scope of the
>>    idea is limited to a specific project the discussion may happen on the
>>    project-specific list or jira.  This gives the author a chance to
>>    flesh out the proposal, address initial concerns, and figure out
>>    whether it has a chance of being accepted.  The author's role is to
>>    build consensus, and gather dissenting opinions.
>>
>>    Following this discussion the author should draft a HEP proposal
>>    following the HEP template. The proposal should accurately reflect and
>>    address feedback and dissenting opinions.  For example, flesh out
>>    sections on backwards compatibility or testing. The author should send
>>    the draft of the proposal to hep@hadoop.apache.org for review.  This
>>    is a new, public list for editors and those interested in following
>>    the review process.
>>
>>    A set of editors reviews incoming HEPs. Each HEP is assigned a single
>>    primary editor. An editor may volunteer if they feel particular
>>    functional expertise is required or assign HEPs to editors round
>>    robin.
>>
>>    The editor reviews the proposal and may request it be updated if it
>>    does not sufficiently address feedback raised during discussion, eg
>>    why the proposal is not redundant with existing functionality, or is
>>    technically sound, sufficiently motivated, covers backwards
>>    compatibility, etc. As updates are necessary, the HEP author can check
>>    in new versions if they have commit permissions, or can email new HEP
>>    versions to the editor for committing. In order to ensure HEP
>>    proposals make progress the editor should respond to proposal drafts
>>    within two weeks of receiving them (or the proposer can request
>>    another editor), and the proposer should generate updates to the draft
>>    within two weeks of receiving feedback from the editor.
>>
>>    The editor's role is to determine if the proposal is complete, so that
>>    the proposal can be voted on, not whether they agree with the proposal
>>    itself.  The editor's involvement should increase the chance that a
>>    HEP proposal makes it to a vote.
>>
>>    Once the editor deems the proposal is complete they add it to a
>>    versioned HEP repository and the author posts the proposal to
>>    general@hadoop.apache.org for vote.  HEP votes, like Apache procedural
>>    votes, use majority approval [3]. Only PMC members have binding votes.
>>    Votes are open for a period of 1 week to allow all active voters time
>>    to consider the proposal. Successful HEPs are assigned a number,
>>    unsuccessful HEPs remain drafts.
>>
>>    The editors are appointed and removed by the PMC informally, similar
>>    to how the Apache Board appoints shepherds to projects.
>>
>>    HEP Contents
>>    ============
>>
>>    Each HEP should contain the following:
>>
>>    1. Preamble -- Including the HEP number, a short descriptive title,
>>    and the names of the authors.
>>
>>    2. Abstract -- A short (~200 word) description of the technical issue
>>    being addressed.
>>
>>    3. Copyright/public domain -- Each HEP must either be explicitly
>>    labelled as placed in the public domain (see this HEP as an example).
>>
>>    4. Design -- This section should give both a high-level view and a
>>    complete description of the feature.  While the design does not need
>>    to cover implementation detail it should be clear to the reader that
>>    the design can lead to a reasonable implementation.  This section
>>    should cover intended use cases, failure scenarios, strategies for
>>    testing, and impact on the existing system.
>>
>>    5. Motivation -- The motivation spells out the use case for the
>>    feature and the benefits it provides.
>>
>>    6. Rationale -- The rationale describes what motivated the design and
>>    why particular design decisions were made.  It should describe
>>    alternate designs that were considered and related work, e.g. how the
>>    feature is designed in other systems. It should also consider whether
>>    the feature could be achieved by layering atop the existing system
>>    rather than modifying it.
>>
>>    The rationale should provide evidence of consensus within the
>>    community and discuss important objections or concerns raised during
>>    discussion.
>>
>>    7. Backwards Compatibility -- All HEPs that introduce backwards
>>    incompatibilities must include a section describing these
>>    incompatibilities and their severity.  The HEP must explain how the
>>    author proposes to deal with these incompatibilities.  HEP submissions
>>    without a sufficient backwards compatibility treatise may be rejected
>>    outright.
>>
>>    HEP Template
>>    ============
>>
>>    HEPs should be plain text with minimal structural markup that adheres
>>    to a rigid style.  You can use this HEP as an example. Each HEP starts
>>    with a header that contains the HEP number (or empty if the number has
>>    not yet been assigned), title, list of authors and status (Draft,
>>    Accepted, Rejected, or Withdrawn).
>>
>>    Auxiliary Files
>>    ===============
>>
>>    HEPs may include auxiliary files such as diagrams.  Such files must be
>>    named ``hep-XXXX-Y.ext``, where "XXXX" is the HEP number, "Y" is a
>>    serial number (starting at 1), and "ext" is replaced by the actual
>>    file extension (e.g. "png").
>>
>>    References
>>    ==========
>>
>>    1. http://www.python.org/dev/peps/pep-0001
>>
>>    2. http://www.apache.org/foundation/bylaws.html
>>
>>    3. http://www.apache.org/foundation/glossary.html#MajorityApproval
>>
>>    Copyright
>>    =========
>>
>>    This document has been placed in the public domain.
>

Re: HEP proposal

Posted by Konstantin Boudnik <co...@yahoo-inc.com>.
I have been following this discussion for some time now and the only question
came to my mind: why mimicking PEP? Is it so astonishingly successful or is
it much better than Apache voting or RFC process (from where it has been
apparently derived).

So far I see HEP as an over-complicated process for a process sake. I'd
appreciate if some one can chip-in and tell me if and where I'm wrong.

Thanks,
  Cos


Re: HEP proposal

Posted by Eli Collins <el...@cloudera.com>.
Yup.

"As updates are necessary, the HEP author can check
in new versions if they have commit permissions, or can email new HEP
versions to the editor for committing."

Thanks,
Eli

On Wed, Jul 14, 2010 at 10:36 PM, Dhruba Borthakur <dh...@gmail.com> wrote:
> on a related note, since a HEP-initiator might not be a apache committer,
> does it mean that it is the responsibility of the editor/champion to check
> versions of the HEP proposal into svn?
>
> thanks,
> dhruba

Re: HEP proposal

Posted by Dhruba Borthakur <dh...@gmail.com>.
On a related note: since a HEP initiator might not be an Apache committer,
is it the responsibility of the editor/champion to check versions of the
HEP into svn?

thanks,
dhruba


-- 
Connect to me at http://www.facebook.com/dhruba

Re: HEP proposal

Posted by Doug Cutting <cu...@apache.org>.
On 07/14/2010 11:41 PM, Eli Collins wrote:
> I was assuming HEPs would just be placed in the public domain, period.
> Like design docs posted on jira they're just text posted to a mailing
> list, not sure checking them into svn needs to require they have an
> alternate license. Something I'm missing?

If they were distributed with releases then they should be under the 
Apache license, but I don't see that HEPs will be distributed with 
releases, so public domain seems sufficient.

Doug

Re: HEP proposal

Posted by Eli Collins <el...@cloudera.com>.
On Wed, Jul 14, 2010 at 6:23 PM, Aaron Kimball <aa...@cloudera.com> wrote:
> Eli,
>
> Great work. I like where this is going.
>
> Here's something that I think might be problematic:
>
> 3. Copyright/public domain -- Each HEP must either be explicitly
> labelled as placed in the public domain (see this HEP as an example).
>
>
> "must either be placed in the public domain, or..?" I assume the alternative
> is an open source copyright license?

I was assuming HEPs would just be placed in the public domain, period.
Like design docs posted on jira they're just text posted to a mailing
list, not sure checking them into svn needs to require they have an
alternate license. Something I'm missing?

Thanks,
Eli



> You should state what this is. PEP's
> alternative is "licensed under the Open Publication License."  (
> http://www.opencontent.org/openpub/) Is that your intent?
>
> Upon acceptance of a PEP, the PEP workflow states that it should be checked
> into svn. Are we planning to host accepted HEPs in the Hadoop svn? If so, we
> should specify that non-public-domain HEPs should be placed under a license
> listed on http://www.apache.org/legal/resolved.html. The Open Publication
> License isn't listed there. Nor are the GFDL and other documentation
> licenses (presumably not because they're not specifically allowed, but
> because Apache deals with code more than documentation). Note that
> CC/Attribution/Share-Alike is allowed. You may want to recommend that one?
>
> - Aaron

Re: HEP proposal

Posted by Aaron Kimball <aa...@cloudera.com>.
Eli,

Great work. I like where this is going.

Here's something that I think might be problematic:

3. Copyright/public domain -- Each HEP must either be explicitly
labelled as placed in the public domain (see this HEP as an example).


"must either be placed in the public domain, or..?" I assume the alternative
is an open source copyright license? You should state what this is. PEP's
alternative is "licensed under the Open Publication License."  (
http://www.opencontent.org/openpub/) Is that your intent?

Upon acceptance of a PEP, the PEP workflow states that it should be checked
into svn. Are we planning to host accepted HEPs in the Hadoop svn? If so, we
should specify that non-public-domain HEPs should be placed under a license
listed on http://www.apache.org/legal/resolved.html. The Open Publication
License isn't listed there. Nor are the GFDL and other documentation
licenses (presumably not because they're not specifically allowed, but
because Apache deals with code more than documentation). Note that
CC/Attribution/Share-Alike is allowed. You may want to recommend that one?

- Aaron



Re: HEP proposal

Posted by Eli Collins <el...@cloudera.com>.
On Wed, Jul 14, 2010 at 11:57 AM, Doug Cutting <cu...@apache.org> wrote:

> On 07/14/2010 10:46 AM, Eli Collins wrote:
>
>> 5. How the set of editors is selected?
>>>   "The editors are apointed and removed by the PMC informally, similar to
>>>   how the Apache Board appoints shepherds to projects."
>>> This needs a reference. How does Apache Board appoints shepherds?
>>>
>>
>> Good question, anyone know? Since it's informal I imagine shepherds
>> volunteer. The editors could be a subset of the PMC that either
>> volunteers or is rotated periodically.
>>
>
> Shepherds are assigned randomly by the chair 48 hours before each meeting.
>  Perhaps a better analogy for an editor is an Incubator Champion?
>
>
> http://incubator.apache.org/incubation/Roles_and_Responsibilities.html#Champion
>
> Champions are volunteers.  So, rather than assigning an editor, a HEP might
> ask for one to volunteer.  Any PMC member might volunteer.
>

That sounds reasonable. Having a PMC member volunteer will likely result in
someone who has more expertise in the area and will perhaps be more
responsive.

Thanks,
Eli

Re: HEP proposal

Posted by Doug Cutting <cu...@apache.org>.
On 07/14/2010 10:46 AM, Eli Collins wrote:
>> 5. How the set of editors is selected?
>>    "The editors are apointed and removed by the PMC informally, similar to
>>    how the Apache Board appoints shepherds to projects."
>> This needs a reference. How does Apache Board appoints shepherds?
>
> Good question, anyone know? Since it's informal I imagine shepherds
> volunteer. The editors could be a subset of the PMC that either
> volunteers or is rotated periodically.

Shepherds are assigned randomly by the chair 48 hours before each 
meeting.  Perhaps a better analogy for an editor is an Incubator Champion?

http://incubator.apache.org/incubation/Roles_and_Responsibilities.html#Champion

Champions are volunteers.  So, rather than assigning an editor, a HEP 
might ask for one to volunteer.  Any PMC member might volunteer.

Doug

Re: HEP proposal

Posted by Eli Collins <el...@cloudera.com>.
Hey Konstantin,

Thanks for taking a look, comments in-line.

On Tue, Jul 13, 2010 at 1:54 PM, Konstantin Shvachko <sh...@yahoo-inc.com> wrote:
> Eli,
>
> Thanks for a really good proposal.
> Some questions / comments:
>
> On voting
> 1. Which voting rule?
> http://www.apache.org/foundation/glossary.html#ConsensusApproval
> http://www.apache.org/foundation/glossary.html#MajorityApproval
> I think you mean the MajorityApproval as it does not have veto rule.
> So may be it's just clarifying the reference.

Good point, clarified so it's majority approval.

> 2. Who can vote?
> Usually PMCs have Binding Votes.
> Would be good to have a sentence clarifying this.

Yup, added.

> 3. How long does the vote go?
> Usual 3 days may not be enough. One week is reasonable?

Specified one week.

> 4. Discussion on public lists.
> A HEP can evolve from a jira, then it should be counted as a public
> discussion. I think it makes sense even to continue the discussion
> there if so.

Agreed, changed the wording to "If the scope of the idea is limited to
a specific project the discussion may happen on the project-specific
list or jira."

> 5. How the set of editors is selected?
>   "The editors are apointed and removed by the PMC informally, similar to
>   how the Apache Board appoints shepherds to projects."
> This needs a reference. How does Apache Board appoints shepherds?

Good question, anyone know? Since it's informal I imagine shepherds
volunteer. The editors could be a subset of the PMC that either
volunteers or is rotated periodically.

> 6. The level of design details.
> I think HEP should have a pretty detailed design. When people vote they
> will want to be sure the design can lead to a reasonable implementation.
> Should we say "implementation-ready design", rather than
> "A high-level explanation of the design."
> Or just
> "A _detailed_ explanation of the design."

Rewrote this section, tried to make it more explicit about giving both
a high-level view and complete enough description so the design can
lead to a reasonable implementation. Also added that this section
should cover how to test the design.

> 7. Typos:
> successuflly, apointed, intial

Fixed.

Updated draft follows.

Thanks,
Eli



HEP: 1
Title: HEP Purpose and Guidelines
Author: Eli Collins
Status: Draft


What is a HEP?
==============

HEP stands for Hadoop Enhancement Proposal, and is based on Python's
PEP (Python Enhancement Proposal) [1].  A HEP is a document that
describes a new feature, its rationale, and issues the feature needs
to address in order to be successfully incorporated.

The intent is for HEPs to be the primary mechanism for proposing
significant new features to core Hadoop (common, HDFS and MapReduce),
incorporating community feedback, and recording the proposal.  Going
through the HEP process should improve the chances that a proposal is
successful.

While HEPs do not need to come with code, they are a mechanism to
propose features to the community with the intent of contributing the
feature, rather than to request that the community implement a feature.

HEPs must be consistent with Apache bylaws [2]; for example, the HEP
workflow takes place on the public Apache Hadoop lists.


When is a HEP Required?
=======================

HEPs should not impede casual contribution to Hadoop.  Small
improvements and bugs do not require HEPs.  Not all features need
HEPs.  While the decision is subjective, here are some guidelines to
indicate a HEP should be considered:

- The feature impacts backwards compatibility (eg modifies released
public APIs in an incompatible way).

- The feature requires that an existing component be substantially
re-designed (eg NameNode modified to use Bookkeeper).

- The implementation impacts multiple parts of the system (eg symbolic
links versus adding a pluggable component like a codec).

- The feature impacts the entire development community (eg converts
the build system to use maven).


HEP Workflow
============

The author of a HEP should first try to determine if their idea is
HEP-able by sending mail to the general list.  If the scope of the
idea is limited to a specific project the discussion may happen on the
project-specific list or jira.  This gives the author a chance to
flesh out the proposal, address initial concerns, and figure out
whether it has a chance of being accepted.  The author's role is to
build consensus, and gather dissenting opinions.

Following this discussion the author should draft a HEP proposal
following the HEP template. The proposal should accurately reflect and
address feedback and dissenting opinions.  For example, flesh out
sections on backwards compatibility or testing. The author should send
the draft of the proposal to hep@hadoop.apache.org for review.  This
is a new, public list for editors and those interested in following
the review process.

A set of editors reviews incoming HEPs. Each HEP is assigned a single
primary editor. An editor may volunteer for a HEP if they feel
particular functional expertise is required; otherwise HEPs are
assigned to editors round robin.

The editor reviews the proposal and may request it be updated if it
does not sufficiently address feedback raised during discussion, eg
why the proposal is not redundant with existing functionality, or is
technically sound, sufficiently motivated, covers backwards
compatibility, etc. As updates are necessary, the HEP author can check
in new versions if they have commit permissions, or can email new HEP
versions to the editor for committing. In order to ensure HEP
proposals make progress the editor should respond to proposal drafts
within two weeks of receiving them (or the proposer can request
another editor), and the proposer should generate updates to the draft
within two weeks of receiving feedback from the editor.

The editor's role is to determine if the proposal is complete, so that
the proposal can be voted on, not whether they agree with the proposal
itself.  The editor's involvement should increase the chance that a
HEP proposal makes it to a vote.

Once the editor deems the proposal is complete they add it to a
versioned HEP repository and the author posts the proposal to
general@hadoop.apache.org for vote.  HEP votes, like Apache procedural
votes, use majority approval [3]. Only PMC members have binding votes.
Votes are open for a period of 1 week to allow all active voters time
to consider the proposal. Successful HEPs are assigned a number;
unsuccessful HEPs remain drafts.
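
As an illustration only (not part of the proposal text), the voting
rule above could be tallied roughly as follows. This is a minimal
sketch that assumes the Apache glossary reading of majority approval
(at least three binding +1 votes and more +1 than -1 votes) and assumes
that only binding votes are counted; the function name and the vote
representation are hypothetical.

```python
def hep_vote_passes(votes):
    """Return True if a HEP vote passes under the assumed majority-approval rule.

    `votes` is an iterable of (value, is_binding) pairs, where value is
    +1, 0 or -1 and is_binding is True for PMC members.
    Assumption: non-binding votes are advisory and are ignored here.
    """
    binding = [value for value, is_binding in votes if is_binding]
    plus_ones = binding.count(1)
    minus_ones = binding.count(-1)
    # Assumed rule: at least three binding +1s, and more +1s than -1s.
    return plus_ones >= 3 and plus_ones > minus_ones


# Example: three binding +1s, one binding -1, one non-binding +1.
example_votes = [(+1, True), (+1, True), (+1, True), (-1, True), (+1, False)]
print(hep_vote_passes(example_votes))
```

The one-week voting window is not modelled here; a tally like this
would only be run once the window has closed.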

The editors are appointed and removed by the PMC informally, similar
to how the Apache Board appoints shepherds to projects.


HEP Contents
============

Each HEP should contain the following:

1. Preamble -- Including the HEP number, a short descriptive title,
and the names of the authors.

2. Abstract -- A short (~200 word) description of the technical issue
being addressed.

3. Copyright/public domain -- Each HEP must either be explicitly
labelled as placed in the public domain (see this HEP as an example).

4. Design -- This section should give both a high-level view and a
complete description of the feature.  While the design does not need
to cover implementation detail it should be clear to the reader that
the design can lead to a reasonable implementation.  This section
should cover intended use cases, failure scenarios, strategies for
testing, and impact on the existing system.

5. Motivation -- The motivation spells out the use case for the
feature and the benefits it provides.

6. Rationale -- The rationale describes what motivated the design and
why particular design decisions were made.  It should describe
alternate designs that were considered and related work, e.g. how the
feature is designed in other systems. It should also consider whether
the feature could be achieved by layering atop the existing system
rather than modifying it.

The rationale should provide evidence of consensus within the
community and discuss important objections or concerns raised during
discussion.

7. Backwards Compatibility -- All HEPs that introduce backwards
incompatibilities must include a section describing these
incompatibilities and their severity.  The HEP must explain how the
author proposes to deal with these incompatibilities.  HEP submissions
without a sufficient backwards compatibility treatise may be rejected
outright.


HEP Template
============

HEPs should be plain text with minimal structural markup that adheres
to a rigid style.  You can use this HEP as an example. Each HEP starts
with a header that contains the HEP number (or empty if the number has
not yet been assigned), title, list of authors and status (Draft,
Accepted, Rejected, or Withdrawn).
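
For illustration only (not part of the proposal), a header like the one
at the top of this HEP could be checked with a short script. This is a
sketch under assumptions: the header is taken to be "Key: value" lines
ending at the first blank line, the field names are copied from this
HEP's own header, and the function name is hypothetical.

```python
VALID_STATUSES = {"Draft", "Accepted", "Rejected", "Withdrawn"}
# Field names as they appear in this HEP's own header (an assumption
# about what a final template would require).
REQUIRED_FIELDS = ("HEP", "Title", "Author", "Status")


def parse_hep_header(text):
    """Parse the leading 'Key: value' lines of a HEP into a dict."""
    header = {}
    for line in text.splitlines():
        if not line.strip():
            break  # assumed: the header ends at the first blank line
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError("malformed header line: %r" % line)
        header[key.strip()] = value.strip()
    missing = [field for field in REQUIRED_FIELDS if field not in header]
    if missing:
        raise ValueError("missing header fields: %s" % ", ".join(missing))
    if header["Status"] not in VALID_STATUSES:
        raise ValueError("unknown status: %r" % header["Status"])
    return header


sample = """HEP: 1
Title: HEP Purpose and Guidelines
Author: Eli Collins
Status: Draft

What is a HEP?
"""
print(parse_hep_header(sample))
```

An unassigned number is accepted as an empty "HEP:" value, matching the
"or empty if the number has not yet been assigned" wording above.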


Auxiliary Files
===============

HEPs may include auxiliary files such as diagrams.  Such files must be
named ``hep-XXXX-Y.ext``, where "XXXX" is the HEP number, "Y" is a
serial number (starting at 1), and "ext" is replaced by the actual
file extension (e.g. "png").
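
As a sketch of how the naming rule above might be checked (again, not
part of the proposal), the following assumes that "XXXX" means a
four-digit, zero-padded HEP number and that extensions are
alphanumeric. The draft does not state either of these, and the names
used are hypothetical.

```python
import re

# Assumed reading of hep-XXXX-Y.ext: four-digit HEP number, serial
# number starting at 1 (no leading zero), alphanumeric extension.
AUX_FILE_RE = re.compile(r"hep-(\d{4})-([1-9]\d*)\.([A-Za-z0-9]+)")


def parse_aux_name(name):
    """Return (hep_number, serial, extension), or None if the name does not match."""
    match = AUX_FILE_RE.fullmatch(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3)


print(parse_aux_name("hep-0001-1.png"))
print(parse_aux_name("diagram.png"))
```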


References
==========

1. http://www.python.org/dev/peps/pep-0001

2. http://www.apache.org/foundation/bylaws.html

3. http://www.apache.org/foundation/glossary.html#MajorityApproval


Copyright
=========

This document has been placed in the public domain.

Re: HEP proposal

Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.
Eli,

Thanks for a really good proposal.
Some questions / comments:

On voting
1. Which voting rule?
http://www.apache.org/foundation/glossary.html#ConsensusApproval
http://www.apache.org/foundation/glossary.html#MajorityApproval
I think you mean MajorityApproval, as it does not have a veto rule.
So maybe it's just a matter of clarifying the reference.

2. Who can vote?
Usually PMCs have Binding Votes.
Would be good to have a sentence clarifying this.

3. How long does the vote go?
Usual 3 days may not be enough. One week is reasonable?

4. Discussion on public lists.
A HEP can evolve from a jira, then it should be counted as a public
discussion. I think it makes sense even to continue the discussion
there if so. I propose to add the following:
=========================================
idea is HEP-able by sending mail to the general, or the project-specific lists
+	(including Jiras)
if the scope of the idea is limited to the project.
=========================================

5. How is the set of editors selected?
    "The editors are apointed and removed by the PMC informally, similar to
    how the Apache Board appoints shepherds to projects."
This needs a reference. How does the Apache Board appoint shepherds?

6. The level of design details.
I think HEP should have a pretty detailed design. When people vote they
will want to be sure the design can lead to a reasonable implementation.
Should we say "implementation-ready design", rather than
"A high-level explanation of the design."
Or just
"A _detailed_ explanation of the design."

7. Typos:
successuflly, apointed, intial

Thanks,
--Konstantin
