Posted to common-dev@hadoop.apache.org by Arun C Murthy <ac...@gmail.com> on 2010/05/07 02:38:25 UTC
Minutes: Hadoop Contributor Meeting 05/06/2010
# Shared goals
- Hadoop is HDFS & Map-Reduce in the context of this set of slides
# Priorities
* Yahoo
- Correctness
- Availability: Not the same as high-availability (6 9s etc.), i.e. SPOFs
- API Compatibility
- Scalability
- Operability
- Performance
- Innovation
* Cloudera
- Test coverage, api coverage
- APL Licensed codec (lzo replacement)
- Security
- Wire compatibility
- Cluster-wide resource availability
- New apis (FileContext, MR Context Objs.), documentation of
their advantages
- HDFS to better support non-MR use-cases
- Cluster metrics hooks
- MR modularity (package)
* Facebook
- Correctness
- Availability, High Availability, Failover, Continuous
Availability
- Scalability
# Bar for patches/features keeps going higher as the project matures
- Build consensus (e.g. Python Enhancement Process, JSR etc.)
- Run/test on your own to prove the concept/feature or branch and
finish
- Early versions of libraries should be started outside of the
project (github etc.) e.g. input-formats, compression-codecs etc.
- github for all the above
- Prune contrib
# Maven for packaging
# Tom: hadoop-0.21 (Tom - can you please post your slides? Thanks!)
# Owen: Release Manager (see slides)
# Agenda for next meeting
- Eli: Hadoop Enhancement Process (modelled on PEP?)
- Branching strategies: Development Models
Arun
Re: Minutes: Hadoop Contributor Meeting 05/06/2010
Posted by Eli Collins <el...@cloudera.com>.
Hey Arun,
I updated the agenda on Meetup. I was assuming the branching would
fall out of how to implement HEP but good to discuss separately as
well.
Also, I added moving contrib out of the repos since a couple people
mentioned that at the last meetup, but we should only get to that if
time permits after the other topics.
Thanks,
Eli
On Fri, May 21, 2010 at 2:04 PM, Arun C Murthy <ac...@yahoo-inc.com> wrote:
>
> On May 6, 2010, at 5:38 PM, Arun C Murthy wrote:
>>
>> # Agenda for next meeting
>> - Eli: Hadoop Enhancement Process (modelled on PEP?)
>> - Branching strategies: Development Models
>
> Something to think about w.r.t branching strategies:
> http://incubator.apache.org/learn/rules-for-revolutionaries.html
>
> Arun
>
Re: Minutes: Hadoop Contributor Meeting 05/06/2010
Posted by Arun C Murthy <ac...@yahoo-inc.com>.
On May 6, 2010, at 5:38 PM, Arun C Murthy wrote:
> # Agenda for next meeting
> - Eli: Hadoop Enhancement Process (modelled on PEP?)
> - Branching strategies: Development Models
Something to think about w.r.t branching strategies: http://incubator.apache.org/learn/rules-for-revolutionaries.html
Arun
Re: Minutes: Hadoop Contributor Meeting 05/06/2010
Posted by Tom White <to...@cloudera.com>.
Not sure why my attachment didn't make it to the list. Anyway, I've
posted Arun's notes on the wiki at
http://wiki.apache.org/hadoop/HadoopContributorsMeeting20100506, and
included the content of my slide there. (Attachments on the wiki have
been disabled - as of today apparently, see SVN commit r775220 - so I
wasn't able to post the slide there either.)
Tom
On Fri, May 7, 2010 at 9:36 AM, Tom White <to...@cloudera.com> wrote:
> Here's my (single) slide about the 0.21 release.
>
> Tom
>
> On Thu, May 6, 2010 at 5:38 PM, Arun C Murthy <ac...@gmail.com> wrote:
>> # Shared goals
>> - Hadoop is HDFS & Map-Reduce in this context of this set of slides
>> # Priorities
>> * Yahoo
>> - Correctness
>> - Availability: Not the same as high-availability (6 9s. etc.) i.e. SPOFs
>> - API Compatibility
>> - Scalability
>> - Operability
>> - Performance
>> - Innovation
>> * Cloudera
>> - Test coverage, api coverage
>> - APL Licensed codec (lzo replacement)
>> - Security
>> - Wire compatibility
>> - Cluster-wide resource availability
>> - New apis (FileContext, MR Context Objs.), documentation of their
>> advantages
>> - HDFS to better support non-MR use-cases
>> - Cluster metrics hooks
>> - MR modularity (package)
>> * Facebook
>> - Correctness
>> - Availability, High Availability, Failover, Continuous Availability
>> - Scalability
>> # Bar for patches/features keeps going higher as the project matures
>> - Build consensus (e.g. Python Enhancement Process, JSR etc.)
>> - Run/test on your own to prove the concept/feature or branch and finish
>> - Early versions of libraries should be started outside of the project
>> (github etc.) e.g. input-formats, compression-codecs etc.
>> - github for all the above
>> - Prune contrib
>> # Maven for packaging
>> # Tom: hadoop-0.21 (Tom - can you please post your slides? Thanks!)
>> # Owen: Release Manager (see slides)
>> # Agenda for next meeting
>> - Eli: Hadoop Enhancement Process (modelled on PEP?)
>> - Branching strategies: Development Models
>>
>> Arun
Re: Minutes: Hadoop Contributor Meeting 05/06/2010
Posted by Tom White <to...@cloudera.com>.
Here's my (single) slide about the 0.21 release.
Tom
On Thu, May 6, 2010 at 5:38 PM, Arun C Murthy <ac...@gmail.com> wrote:
> # Shared goals
> - Hadoop is HDFS & Map-Reduce in this context of this set of slides
> # Priorities
> * Yahoo
> - Correctness
> - Availability: Not the same as high-availability (6 9s. etc.) i.e. SPOFs
> - API Compatibility
> - Scalability
> - Operability
> - Performance
> - Innovation
> * Cloudera
> - Test coverage, api coverage
> - APL Licensed codec (lzo replacement)
> - Security
> - Wire compatibility
> - Cluster-wide resource availability
> - New apis (FileContext, MR Context Objs.), documentation of their
> advantages
> - HDFS to better support non-MR use-cases
> - Cluster metrics hooks
> - MR modularity (package)
> * Facebook
> - Correctness
> - Availability, High Availability, Failover, Continuous Availability
> - Scalability
> # Bar for patches/features keeps going higher as the project matures
> - Build consensus (e.g. Python Enhancement Process, JSR etc.)
> - Run/test on your own to prove the concept/feature or branch and finish
> - Early versions of libraries should be started outside of the project
> (github etc.) e.g. input-formats, compression-codecs etc.
> - github for all the above
> - Prune contrib
> # Maven for packaging
> # Tom: hadoop-0.21 (Tom - can you please post your slides? Thanks!)
> # Owen: Release Manager (see slides)
> # Agenda for next meeting
> - Eli: Hadoop Enhancement Process (modelled on PEP?)
> - Branching strategies: Development Models
>
> Arun