Posted to common-dev@hadoop.apache.org by Arun C Murthy <ac...@gmail.com> on 2010/05/07 02:38:25 UTC

Minutes: Hadoop Contributor Meeting 05/06/2010

# Shared goals
  - Hadoop is HDFS & Map-Reduce in the context of this set of slides
# Priorities
   * Yahoo
     - Correctness
     - Availability: Not the same as high-availability (6 9s. etc.) i.e. SPOFs
     - API Compatibility
     - Scalability
     - Operability
     - Performance
     - Innovation
   * Cloudera
     - Test coverage, api coverage
     - APL Licensed codec (lzo replacement)
     - Security
     - Wire compatibility
     - Cluster-wide resource availability
     - New apis (FileContext, MR Context Objs.), documentation of their advantages
     - HDFS to better support non-MR use-cases
     - Cluster metrics hooks
     - MR modularity (package)
   * Facebook
     - Correctness
     - Availability, High Availability, Failover, Continuous Availability
     - Scalability
# Bar for patches/features keeps going higher as the project matures
   - Build consensus (e.g. Python Enhancement Process, JSR etc.)
   - Run/test on your own to prove the concept/feature or branch and finish
   - Early versions of libraries should be started outside of the project (github etc.) e.g. input-formats, compression-codecs etc.
     - github for all the above
     - Prune contrib
# Maven for packaging
# Tom: hadoop-0.21 (Tom - can you please post your slides? Thanks!)
# Owen: Release Manager (see slides)
# Agenda for next meeting
   - Eli: Hadoop Enhancement Process (modelled on PEP?)
   - Branching strategies: Development Models

Arun


Re: Minutes: Hadoop Contributor Meeting 05/06/2010

Posted by Eli Collins <el...@cloudera.com>.
Hey Arun,

I updated the agenda on Meetup. I was assuming the branching would
fall out of how to implement HEP but good to discuss separately as
well.

Also, I added moving contrib out of the repos since a couple of people
mentioned that at the last meetup, but we should cover it only if time
permits after the other topics.

Thanks,
Eli

On Fri, May 21, 2010 at 2:04 PM, Arun C Murthy <ac...@yahoo-inc.com> wrote:
>
> On May 6, 2010, at 5:38 PM, Arun C Murthy wrote:
>>
>> # Agenda for next meeting
>>  - Eli: Hadoop Enhancement Process (modelled on PEP?)
>>  - Branching strategies: Development Models
>
> Something to think about w.r.t branching strategies:
> http://incubator.apache.org/learn/rules-for-revolutionaries.html
>
> Arun
>

Re: Minutes: Hadoop Contributor Meeting 05/06/2010

Posted by Arun C Murthy <ac...@yahoo-inc.com>.
On May 6, 2010, at 5:38 PM, Arun C Murthy wrote:
> # Agenda for next meeting
>  - Eli: Hadoop Enhancement Process (modelled on PEP?)
>  - Branching strategies: Development Models

Something to think about w.r.t branching strategies: http://incubator.apache.org/learn/rules-for-revolutionaries.html

Arun

Re: Minutes: Hadoop Contributor Meeting 05/06/2010

Posted by Tom White <to...@cloudera.com>.
Not sure why my attachment didn't make it to the list. Anyway, I've
posted Arun's notes on the wiki at
http://wiki.apache.org/hadoop/HadoopContributorsMeeting20100506, and
included the content of my slide there. (Attachments on the wiki have
been disabled - as of today apparently, see SVN commit r775220 - so I
wasn't able to post the slide there either.)

Tom

On Fri, May 7, 2010 at 9:36 AM, Tom White <to...@cloudera.com> wrote:
> Here's my (single) slide about the 0.21 release.
>
> Tom
>
> On Thu, May 6, 2010 at 5:38 PM, Arun C Murthy <ac...@gmail.com> wrote:
>> # Shared goals
>>  - Hadoop is HDFS & Map-Reduce in the context of this set of slides
>> # Priorities
>>  * Yahoo
>>    - Correctness
>>    - Availability: Not the same as high-availability (6 9s. etc.) i.e. SPOFs
>>    - API Compatibility
>>    - Scalability
>>    - Operability
>>    - Performance
>>    - Innovation
>>  * Cloudera
>>    - Test coverage, api coverage
>>    - APL Licensed codec (lzo replacement)
>>    - Security
>>    - Wire compatibility
>>    - Cluster-wide resource availability
>>    - New apis (FileContext, MR Context Objs.), documentation of their
>> advantages
>>    - HDFS to better support non-MR use-cases
>>    - Cluster metrics hooks
>>    - MR modularity (package)
>>  * Facebook
>>    - Correctness
>>    - Availability, High Availability, Failover, Continuous Availability
>>    - Scalability
>> # Bar for patches/features keeps going higher as the project matures
>>  - Build consensus (e.g. Python Enhancement Process, JSR etc.)
>>  - Run/test on your own to prove the concept/feature or branch and finish
>>  - Early versions of libraries should be started outside of the project
>> (github etc.) e.g. input-formats, compression-codecs etc.
>>    - github for all the above
>>    - Prune contrib
>> # Maven for packaging
>> # Tom: hadoop-0.21 (Tom - can you please post your slides? Thanks!)
>> # Owen: Release Manager (see slides)
>> # Agenda for next meeting
>>  - Eli: Hadoop Enhancement Process (modelled on PEP?)
>>  - Branching strategies: Development Models
>>
>> Arun
>>
>

Re: Minutes: Hadoop Contributor Meeting 05/06/2010

Posted by Tom White <to...@cloudera.com>.
Here's my (single) slide about the 0.21 release.

Tom

On Thu, May 6, 2010 at 5:38 PM, Arun C Murthy <ac...@gmail.com> wrote:
> # Shared goals
>  - Hadoop is HDFS & Map-Reduce in the context of this set of slides
> # Priorities
>  * Yahoo
>    - Correctness
>    - Availability: Not the same as high-availability (6 9s. etc.) i.e. SPOFs
>    - API Compatibility
>    - Scalability
>    - Operability
>    - Performance
>    - Innovation
>  * Cloudera
>    - Test coverage, api coverage
>    - APL Licensed codec (lzo replacement)
>    - Security
>    - Wire compatibility
>    - Cluster-wide resource availability
>    - New apis (FileContext, MR Context Objs.), documentation of their
> advantages
>    - HDFS to better support non-MR use-cases
>    - Cluster metrics hooks
>    - MR modularity (package)
>  * Facebook
>    - Correctness
>    - Availability, High Availability, Failover, Continuous Availability
>    - Scalability
> # Bar for patches/features keeps going higher as the project matures
>  - Build consensus (e.g. Python Enhancement Process, JSR etc.)
>  - Run/test on your own to prove the concept/feature or branch and finish
>  - Early versions of libraries should be started outside of the project
> (github etc.) e.g. input-formats, compression-codecs etc.
>    - github for all the above
>    - Prune contrib
> # Maven for packaging
> # Tom: hadoop-0.21 (Tom - can you please post your slides? Thanks!)
> # Owen: Release Manager (see slides)
> # Agenda for next meeting
>  - Eli: Hadoop Enhancement Process (modelled on PEP?)
>  - Branching strategies: Development Models
>
> Arun
>
