Posted to dev@hbase.apache.org by Jean-Daniel Cryans <jd...@apache.org> on 2010/02/17 20:42:13 UTC

HBase trunk and Hadoop 0.21

Hi devs,

Yesterday Stack, Ryan, Todd (from Cloudera) and I had a meeting with the FB
team about the course of action we should take with regard to Hadoop 0.21.
Since Y! doesn't seem committed to releasing it anytime soon (or even using it),
most users will probably stick with Hadoop 0.20.

What it means for us is no HDFS-265 for months, and this is not a situation
our users should or will tolerate. Dhruba agreed to work on HDFS-200 (and some
others) to make sure we can have equivalent support for sync (at least
from the HBase point of view). This work is targeted at Hadoop 0.20, although
it probably won't ever be in an Apache release.

This means that the current HBase trunk should ideally support both
0.20+HDFS-200 and 0.21 at the same time. I opened HBASE-2233 for that. Todd
mentioned that if HDFS-200 doesn't make the rest of Hadoop unstable,
they could even package it in some sort of release of theirs and make our
users' lives easier. Could be a win-win.

Should we still name the next major HBase release as 0.21? If it becomes
common for HBase to support multiple Hadoop releases, should we still follow
their version number? Could it be time for HBase 1.0?

Let's hear everyone's opinion.

J-D

Re: HBase trunk and Hadoop 0.21

Posted by Stack <st...@duboce.net>.
On Thu, Feb 18, 2010 at 2:48 PM, Lars Francke <la...@gmail.com> wrote:
>
> How about the new Thrift interface? I wasn't overly concerned with
> timing as Hadoop 0.21 seemed to be some time off but if you decide to
> put it into a 0.20.x version I'll spend some time on it to get another
> patch out for review. It shouldn't break anything anyway.
>

Up to you, Lars.  I think the story will be cleaner if it's done in 0.21,
since what's in 0.20 basically works and since the new thrift is a
radical break.

As I see it, TRUNK (hbase 0.21) is for the big, destabilizing changes:
master rewrite, maven, new thrift, etc.  The revival of the hbase 0.20
branch is to keep those who are hunkering down on hadoop 0.20 happy (a
good few users will probably stick with hadoop 0.20 for a while given
what's going on up in the parent project -- see hadoop general for
discussion -- and even more so if there is the possibility of a
working flush/sync patch for an hdfs 0.20).  So, IMO, only backports of
bug fixes and critical features make sense on the hbase 0.20 branch,
and the thrift redo should go on in TRUNK.

Thanks,
St.Ack

Re: HBase trunk and Hadoop 0.21

Posted by Lars Francke <la...@gmail.com>.
On Thu, Feb 18, 2010 at 21:53, Stack <st...@duboce.net> wrote:
> It looks like hbase 0.20.x will be around for longer than we were
> planning on.  Let's adapt.
>
> We should have a 0.20.4 soon

How about the new Thrift interface? I wasn't overly concerned with
timing as Hadoop 0.21 seemed to be some time off but if you decide to
put it into a 0.20.x version I'll spend some time on it to get another
patch out for review. It shouldn't break anything anyway.

Lars

Re: HBase trunk and Hadoop 0.21

Posted by Stack <st...@duboce.net>.
On Thu, Feb 18, 2010 at 1:41 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:
> Sounds good. For the sake of getting "nice" version numbers, what
> about we do a bug fix-only release for 0.20.4 and then in 0.20.5 we
> break RPC compatibility and add new features.
>
> Also, we will need to backport cluster replication to 0.20.
>
Above sounds good.  Let's wrap up a 0.20.4 soon.  Let's see if we can
get another one or two "performance" tweaks in there and put it up on
hadoop 0.20.2 too since it looks imminent.

St.Ack





Re: HBase trunk and Hadoop 0.21

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Sounds good. For the sake of getting "nice" version numbers, what
about we do a bug fix-only release for 0.20.4 and then in 0.20.5 we
break RPC compatibility and add new features.

Also, we will need to backport cluster replication to 0.20.

J-D


Re: HBase trunk and Hadoop 0.21

Posted by Stack <st...@duboce.net>.
It looks like hbase 0.20.x will be around for longer than we were
planning on.  Let's adapt.

We should have a 0.20.4 soon that includes hbase-2180 and, IMO, it
would include a one-time breakage of the RPC interface requiring a
cluster shutdown to upgrade so we can get in "HBASE-2219  stop using
code mapping for method names in the RPC".  We'd do this to set
ourselves up for being more elastic going forward.  With it in place,
we can add in interface changes -- e.g. add something like the
multiput, multiget, etc. -- without breaking the ability to do a rolling
restart as we move through 0.20.5, 0.20.6, etc.  Then we should do
an aggressive rollout of new hbase 0.20.x versions with fixes that,
in particular, can accommodate the evolving state of sync/flush on
the hadoop 0.20 branch (hdfs-200+, etc.).
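
A toy illustration of the HBASE-2219 point, for readers who want to see
why code mapping gets in the way of rolling restarts.  This is only a
sketch and not HBase's actual RPC code: it assumes, hypothetically, that
each method gets a numeric code from its position in a sorted list of
method names, and every name below (assign_codes, dispatch_by_code,
dispatch_by_name, the method lists) is made up for the example.

```python
def assign_codes(method_names):
    """Hypothetical scheme: code = position of the name in sorted order."""
    return {name: code for code, name in enumerate(sorted(method_names))}


def dispatch_by_code(code, server_codes):
    """Server resolves an incoming numeric code using its own table."""
    by_code = {c: name for name, c in server_codes.items()}
    return by_code[code]


def dispatch_by_name(name, server_methods):
    """Server resolves an incoming method name directly."""
    if name not in server_methods:
        raise LookupError("unknown method: " + name)
    return name


# Interface known to a not-yet-upgraded client.
old_api = ["get", "put", "delete"]
# Interface on an upgraded server that gained two new calls.
new_api = old_api + ["multiput", "multiget"]

old_codes = assign_codes(old_api)  # delete=0, get=1, put=2
new_codes = assign_codes(new_api)  # delete=0, get=1, multiget=2, multiput=3, put=4

# The old client sends its code for "put"; the new server decodes it
# with its own table and ends up calling a different method.
sent = old_codes["put"]
print("by code:", dispatch_by_code(sent, new_codes))

# Sending the name itself is unaffected by the added methods.
print("by name:", dispatch_by_name("put", new_api))
```

Under that assumption, adding a method shifts the codes of the methods
that sort after it, so old and new nodes disagree during a rolling
restart; names stay stable when methods are added.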

In hbase 0.21 we keep on with replication and master rewrite.  We also
take on the notion that when hbase TRUNK is baked, we'll make it run
on hadoop 0.20/0.21/0.22.

I'm against an hbase 1.0.0 just now.

St.Ack



Re: HBase trunk and Hadoop 0.21

Posted by Kay Kay <ka...@gmail.com>.
On 2/17/10 11:42 AM, Jean-Daniel Cryans wrote:
> Should we still name the next major HBase release as 0.21? If it becomes
> common for HBase to support multiple Hadoop releases, should we still follow
> their version number?
It might be useful to start a separate minor version (say, 0.7.x) that
depends on 0.20.x of hdfs (of hadoop), so that hbase has the flexibility
to upgrade the minor number (say, 0.8.x) for any fundamental changes,
while both of them can depend on 0.20.x.  I believe that, depending on the
stability / evolution of the dependencies, 1.0 can present itself down
the road, possibly sooner. At the moment it is kind of hazy, though.

> Could it be time for HBase 1.0 ?


Re: HBase trunk and Hadoop 0.21

Posted by Andrew Purtell <ap...@yahoo.com>.
> Could it be time for HBase 1.0?

I think HBASE-2180 warrants a 0.20.4 release.

Beyond that, I'd have no objection to making trunk into an 0.99,
or whatever. :-)
What would be 0.99.1 could be 1.0. 

   - Andy

