Posted to general@hadoop.apache.org by Jeremy Davis <je...@speakeasy.net> on 2010/03/19 17:43:02 UTC

0.21 Release

Could someone clarify the position on the 0.21 release?
This thread:
http://mail-archives.apache.org/mod_mbox/hadoop-general/201002.mbox/<dfe484f01002161412h8bc953axee2a73d81a234bdf@mail.gmail.com>
kind of died out (it seems) without any resolution.

I saw two suggestions: rebase 0.21 against trunk and then try to release,
or release 0.21 as is with Stack as the release manager. Was there a final
decision on this off list?

Thanks.

Re: 0.21 Release

Posted by Stack <st...@duboce.net>.
On Fri, Mar 19, 2010 at 12:17 PM, Jeremy Davis <je...@speakeasy.net> wrote:
> So what would be your (or anyone's) advice on getting HDFS sync/flush/append
> functionality?
> You seem to indicate that the 0.21 branch as is might not be the best idea,
> with your preference being a patch set against 0.20.
>

There'll be a Hadoop release that ships with working sync/append (in our
testing, what's in the 0.21 branch + TRUNK, HDFS-265, works very nicely).
I'm just not sure when.
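
For anyone who hasn't tried it yet, the client-side API involved is small.
Below is a rough sketch of what sync/append usage looks like; the path is
made up, sync() is the 0.20 method name (hflush() is its 0.21 equivalent),
and whether the flush actually becomes visible to readers depends on the
patches under discussion (e.g. HDFS-200 on 0.20), not just on the method
existing:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SyncSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path log = new Path("/tmp/sync-sketch.log");  // hypothetical path

    FSDataOutputStream out = fs.create(log);
    out.writeBytes("entry 1\n");
    // Push the bytes written so far out to the datanodes so other readers
    // can see them without the file being closed: sync() on 0.20,
    // hflush() on 0.21.
    out.sync();

    out.writeBytes("entry 2\n");
    out.close();  // close only when the file is finished
  }
}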

St.Ack

Re: 0.21 Release

Posted by Ted Yu <yu...@gmail.com>.
I want to bring up the shuffle issue from the thread:
Shuffle In Memory OutOfMemoryError

I was expecting improved performance from MAPREDUCE-1182.

That would be one of the reasons for people to try out the 0.21 release.
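
For background, the OutOfMemoryErrors in that thread come from the
reduce-side in-memory shuffle buffer. A rough sketch of the knobs involved,
for anyone who wants to experiment while testing 0.21 (property names as in
the 0.20/0.21 mapred configuration; the values are purely illustrative):

import org.apache.hadoop.mapred.JobConf;

public class ShuffleTuningSketch {
  public static void main(String[] args) {
    JobConf conf = new JobConf();

    // Fraction of the reducer's heap used to hold map outputs in memory
    // during the copy phase; lowering it trades speed for headroom.
    conf.setFloat("mapred.job.shuffle.input.buffer.percent", 0.50f);

    // When in-memory map outputs fill this fraction of the buffer,
    // they are merged and spilled to disk.
    conf.setFloat("mapred.job.shuffle.merge.percent", 0.60f);

    // The shuffle buffer is carved out of the reduce task's heap.
    conf.set("mapred.child.java.opts", "-Xmx1024m");
  }
}

MAPREDUCE-1182 adjusts how those in-memory limits are enforced, so the same
settings may behave differently with and without the patch.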

Regards

On Fri, Mar 19, 2010 at 12:31 PM, Todd Lipcon <to...@cloudera.com> wrote:

> On Fri, Mar 19, 2010 at 12:17 PM, Jeremy Davis <jerdavis@speakeasy.net> wrote:
>
> > Thank you for your reply,
> >
> > So what would be your (or anyone's) advice on getting HDFS
> > sync/flush/append functionality?
> > You seem to indicate that the 0.21 branch as is might not be the best idea,
> > with your preference being a patch set against 0.20.
> >
> > We definitely have an application for this specific functionality, and I
> > need to provide some direction/answers in this area for my colleagues.
> >
> > For example, I might say: We will install the current 0.21 now, and develop
> > against it.. But in X time we will install CDH2 (and get all the goodness it
> > brings), and then apply a given patch set against it. Is this in line with
> > what you are thinking? If so, could I get your perceived level of effort,
> > maybe a time frame? April/May/June/ etc..
> >
> >
> FYI the 0.20 sync patches mentioned by Stack will be going into CDH3 at some
> point this spring. This is mainly for the benefit of HBase but of course
> other applications will benefit as well.
>
> Thanks
> -Todd
>
>
> > I'm sure I'm not the only one that has plans for this feature, as it's in
> > "Hadoop: The Definitive Guide", published by O'Reilly in September '09.
> >
> > Thanks,
> > -JD
> >
> >
> >
> >
> >
> > On Mar 19, 2010, at 11:51 AM, Stack wrote:
> >
> >> On Fri, Mar 19, 2010 at 9:43 AM, Jeremy Davis <je...@speakeasy.net> wrote:
> >>
> >>> ...and I also saw release 0.21 as is with stack as the release manager.
> >>> Was
> >>> there a final decision on this off list?
> >>>
> >>>
> >> On the above, I was toying with the idea of being release manager for
> >> releasing as hadoop 0.21.0 what is in current 0.21 hadoop branch but I
> >> subsequently decided against it after chatting with folks and figuring
> >> that the only group that seemed interested in driving a release of the
> >> hadoop 0.21 branch was the hbase crew.  If I were to guess, an hadoop
> >> vouched for by a couple of hbasers with their spotty hdfs and
> >> mapreduce knowledge probably wouldn't have the penetration of a
> >> release backed by, say, a Yahoo.  No one would trust their data to
> >> such a release.  If no data in hadoop 0.21 clusters, hbase wouldn't
> >> have anything to run against.  So I let it go and figured time could
> >> be spent better elsewhere; e.g. helping test the set of patches that
> >> could get us a sync/flush/append on a patched hadoop 0.20 (hdfs-200,
> >> etc.).
> >>
> >> Sorry, I should have added a note to cited thread that I'd wandered...
> >>
> >> St.Ack
> >>
> >
> >
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>

Re: 0.21 Release

Posted by Todd Lipcon <to...@cloudera.com>.
On Fri, Mar 19, 2010 at 12:17 PM, Jeremy Davis <je...@speakeasy.net> wrote:

> Thank you for your reply,
>
> So what would be your (or anyone's) advice on getting HDFS
> sync/flush/append functionality?
> You seem to indicate that the 0.21 branch as is might not be the best idea,
> with your preference being a patch set against 0.20.
>
> We definitely have an application for this specific functionality, and I
> need to provide some direction/answers in this area for my colleagues.
>
> For example, I might say: We will install the current 0.21 now, and develop
> against it.. But in X time we will install CDH2 (and get all the goodness it
> brings), and then apply a given patch set against it. Is this in line with
> what you are thinking? If so, could I get your perceived level of effort,
> maybe a time frame? April/May/June/ etc..
>
>
FYI the 0.20 sync patches mentioned by Stack will be going into CDH3 at some
point this spring. This is mainly for the benefit of HBase but of course
other applications will benefit as well.

Thanks
-Todd


> I'm sure I'm not the only one that has plans for this feature, as it's in
> "Hadoop: The Definitive Guide", published by O'Reilly in September '09.
>
> Thanks,
> -JD
>
>
>
>
>
> On Mar 19, 2010, at 11:51 AM, Stack wrote:
>
>> On Fri, Mar 19, 2010 at 9:43 AM, Jeremy Davis <je...@speakeasy.net> wrote:
>>
>>> ...and I also saw release 0.21 as is with stack as the release manager.
>>> Was
>>> there a final decision on this off list?
>>>
>>>
>> On the above, I was toying with the idea of being release manager for
>> releasing as hadoop 0.21.0 what is in current 0.21 hadoop branch but I
>> subsequently decided against it after chatting with folks and figuring
>> that the only group that seemed interested in driving a release of the
>> hadoop 0.21 branch was the hbase crew.  If I were to guess, an hadoop
>> vouched for by a couple of hbasers with their spotty hdfs and
>> mapreduce knowledge probably wouldn't have the penetration of a
>> release backed by, say, a Yahoo.  No one would trust their data to
>> such a release.  If no data in hadoop 0.21 clusters, hbase wouldn't
>> have anything to run against.  So I let it go and figured time could
>> be spent better elsewhere; e.g. helping test the set of patches that
>> could get us a sync/flush/append on a patched hadoop 0.20 (hdfs-200,
>> etc.).
>>
>> Sorry, I should have added a note to cited thread that I'd wandered...
>>
>> St.Ack
>>
>
>


-- 
Todd Lipcon
Software Engineer, Cloudera

Re: 0.21 Release

Posted by Jeremy Davis <je...@speakeasy.net>.
Thank you for your reply,

So what would be your (or anyone's) advice on getting HDFS sync/flush/append
functionality?
You seem to indicate that the 0.21 branch as is might not be the best
idea, with your preference being a patch set against 0.20.

We definitely have an application for this specific functionality, and  
I need to provide some direction/answers in this area for my colleagues.

For example, I might say: We will install the current 0.21 now, and  
develop against it.. But in X time we will install CDH2 (and get all  
the goodness it brings), and then apply a given patch set against it.  
Is this in line with what you are thinking? If so, could I get your  
perceived level of effort, maybe a time frame? April/May/June/ etc..

I'm sure I'm not the only one that has plans for this feature, as it's in
"Hadoop: The Definitive Guide", published by O'Reilly in September '09.

Thanks,
-JD




On Mar 19, 2010, at 11:51 AM, Stack wrote:

> On Fri, Mar 19, 2010 at 9:43 AM, Jeremy Davis <je...@speakeasy.net> wrote:
>> ...and I also saw release 0.21 as is with stack as the release  
>> manager. Was
>> there a final decision on this off list?
>>
>
> On the above, I was toying with the idea of being release manager for
> releasing as hadoop 0.21.0 what is in current 0.21 hadoop branch but I
> subsequently decided against it after chatting with folks and figuring
> that the only group that seemed interested in driving a release of the
> hadoop 0.21 branch was the hbase crew.  If I were to guess, an hadoop
> vouched for by a couple of hbasers with their spotty hdfs and
> mapreduce knowledge probably wouldn't have the penetration of a
> release backed by, say, a Yahoo.  No one would trust their data to
> such a release.  If no data in hadoop 0.21 clusters, hbase wouldn't
> have anything to run against.  So I let it go and figured time could
> be spent better elsewhere; e.g. helping test the set of patches that
> could get us a sync/flush/append on a patched hadoop 0.20 (hdfs-200,
> etc.).
>
> Sorry, I should have added a note to cited thread that I'd wandered...
>
> St.Ack


Re: 0.21 Release

Posted by Steve Loughran <st...@apache.org>.
Stack wrote:
> On Fri, Mar 19, 2010 at 9:43 AM, Jeremy Davis <je...@speakeasy.net> wrote:
>> ...and I also saw release 0.21 as is with stack as the release manager. Was
>> there a final decision on this off list?
>>
> 
> On the above, I was toying with the idea of being release manager for
> releasing as hadoop 0.21.0 what is in current 0.21 hadoop branch but I
> subsequently decided against it after chatting with folks and figuring
> that the only group that seemed interested in driving a release of the
> hadoop 0.21 branch was the hbase crew.  If I were to guess, an hadoop
> vouched for by a couple of hbasers with their spotty hdfs and
> mapreduce knowledge probably wouldn't have the penetration of a
> release backed by, say, a Yahoo.  No one would trust their data to
> such a release.  If no data in hadoop 0.21 clusters, hbase wouldn't
> have anything to run against.  So I let it go and figured time could
> be spent better elsewhere; e.g. helping test the set of patches that
> could get us a sync/flush/append on a patched hadoop 0.20 (hdfs-200,
> etc.).
> 
> Sorry, I should have added a note to cited thread that I'd wandered...
> 
> St.Ack

I'm not going to volunteer to help with this as my changes are still not 
in 0.22, which is what I'm trying to target, but I'd certainly approve 
of cutting a release if only because
  - it ensures the release process itself works well. When we were doing
releases every fortnight at work, you make sure everything that can be
automated is automated.
  - with 0.22 adding security, I expect its deployment will be traumatic
and take longer to roll out than anyone expects.
  - there are lots of improvements in 0.21; getting into a world of
backports is complicated. Better to keep the momentum up.

The big concern has to be the filesystem. Who out there is willing/able 
to test 0.21-based filesystems, and what size clusters do they have?
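
For anyone weighing whether to offer cluster time, a minimal read-back check
against the FileSystem API is a cheap starting point before any real testing;
a sketch (the path and data size are arbitrary):

import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RoundTripCheck {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path p = new Path("/tmp/roundtrip-check");  // hypothetical path

    byte[] written = new byte[64 * 1024];
    Arrays.fill(written, (byte) 'x');

    // Write a block of data and close the file.
    FSDataOutputStream out = fs.create(p, true);
    out.write(written);
    out.close();

    // Read it back and verify it survived intact.
    byte[] read = new byte[written.length];
    FSDataInputStream in = fs.open(p);
    in.readFully(read);
    in.close();

    System.out.println(Arrays.equals(written, read)
        ? "round trip OK" : "MISMATCH: data corrupted");
  }
}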

-steve

Re: 0.21 Release

Posted by Stack <st...@duboce.net>.
On Fri, Mar 19, 2010 at 9:43 AM, Jeremy Davis <je...@speakeasy.net> wrote:
> ...and I also saw release 0.21 as is with stack as the release manager. Was
> there a final decision on this off list?
>

On the above, I was toying with the idea of being release manager for
releasing as hadoop 0.21.0 what is in current 0.21 hadoop branch but I
subsequently decided against it after chatting with folks and figuring
that the only group that seemed interested in driving a release of the
hadoop 0.21 branch was the hbase crew.  If I were to guess, an hadoop
vouched for by a couple of hbasers with their spotty hdfs and
mapreduce knowledge probably wouldn't have the penetration of a
release backed by, say, a Yahoo.  No one would trust their data to
such a release.  If no data in hadoop 0.21 clusters, hbase wouldn't
have anything to run against.  So I let it go and figured time could
be spent better elsewhere; e.g. helping test the set of patches that
could get us a sync/flush/append on a patched hadoop 0.20 (hdfs-200,
etc.).

Sorry, I should have added a note to cited thread that I'd wandered...

St.Ack