Posted to dev@hbase.apache.org by Nick Dimiduk <nd...@apache.org> on 2016/11/04 16:24:30 UTC

[DISCUSS] EOL 1.1 Release Branch

Hello HBase Community!

We have a small matter to discuss.

HBase 1.2 has been formally marked as "stable" for the last couple months.
HBase 1.3.0rc0 is just around the corner. I think it's time to start a
conversation about retiring the 1.1 line. The volunteer bandwidth for
maintaining multiple branches is precious, and as we spread ourselves
thinner, the odds of decay increase.

I propose discontinuing 1.1 with a single release following 1.3.1. That'll
give us one last chance to back port any bug fixes discovered in the
diligence we're putting into the new minor release. Given the current pace
of 1.3, I estimate this will happen in January or February of 2017. It's
not a lot of time for existing deployments to get around to upgrading, but
the upgrade path is trivial and 1.2 has been available for quite some
time. This will probably put our last release from this branch at 1.1.10
or thereabouts.

Are there any objections or concerns with the above plan? Are there any
downstream communities who need our help moving onto 1.2? Please let us
know.

Thanks,
Nick

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Ted Yu <yu...@gmail.com>.
Nick:
The plan sounds good.

On Fri, Nov 4, 2016 at 9:24 AM, Nick Dimiduk <nd...@apache.org> wrote:

> Hello HBase Community!
>
> We have a small matter to discuss.
>
> HBase 1.2 has been formally marked as "stable" for the last couple months.
> HBase 1.3.0rc0 is just around the corner. I think it's time to start a
> conversation about retiring the 1.1 line. The volunteer bandwidth for
> maintaining multiple branches is precious and as we spread ourselves more
> thin, odds of decay increase.
>
> I propose discontinuing 1.1 with a single release following 1.3.1. That'll
> give us one last chance to back port any bug fixes discovered in the
> diligence we're putting into the new minor release. Given the current pace
> of 1.3, I estimate this will happen in January or February of 2017. It's
> not a lot of time for existing deployments to get around to upgrading, but
> the upgrade path is trivial and 1.2 has been available for quite some
> time. This will probably make our last release from this branch at 1.1.10
> or there abouts.
>
> Are there any objections or concerns with the above plan? Are there any
> downstream communities who need our help moving onto 1.2? Please let us
> know.
>
> Thanks,
> Nick
>

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Enis Söztutar <en...@apache.org>.
I also think that having 1.1 going for a bit longer might be helpful still,
especially if the ITBLL is failing with branch-1.2. Almost all of our
internal testing happens with a 1.1-based code base, so I cannot tell
whether 1.2 / 1.3 has the same stability or not.

Enis

On Fri, Nov 4, 2016 at 5:05 PM, Andrew Purtell <ap...@apache.org> wrote:

> Thanks. Yes I have been eyeing HBASE-16093. There might be another corner
> case there.
>
>
> On Fri, Nov 4, 2016 at 4:41 PM, Gary Helmling <gh...@gmail.com> wrote:
>
> > >
> > > > The behavior: Looks like failed split/compaction rollback: row(s) in META
> > > > without HRegionInfo, regions deployed without valid meta entries (at
> > > > first), regions on HDFS without valid meta entries (later, after RS
> > > > carrying them are killed by chaos), holes in the region chain leading to
> > > > timeouts and job failure.
> > >
> > >
> > > The empty regioninfo in meta sounds like HBASE-16093, though that fix is in
> > > 1.2.  Interested to see if there are other problems around splits though.
> > Do you have a JIRA yet for tracking?
> >
> >
> > >
> > > > You'll know you have found it when on the ITBLL console its meta scanner
> > > > starts complaining about rows in meta without serialized HRegionInfo.
> > >
> > >
> > Will keep an eye out for this in our ITBLL runs here.
> >
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Yu Li <ca...@gmail.com>.
bq. Our team is preparing to upgrade our production cluster from 0.98 to
1.1.3.
@Jacob I'd suggest picking up at least 1.1.4 because of HBASE-14460 (an
important fix for a performance regression). This is just a note unrelated
to the topic here (sorry for the disturbance, guys...)

Best Regards,
Yu

On 5 November 2016 at 14:08, Dima Spivak <di...@apache.org> wrote:

> FYI, once INFRA-12849 is done, I plan on having ITBLL (w/ 1 billion linked
> list nodes) running on a nightly basis in GCE via clusterdock. I think we'd
> trust our testing more if we didn't always have to go through the exercise
> of validating whether a failure we see is new or not and having a history
> of such for different branches should help with that.
>
> On Friday, November 4, 2016, Andrew Purtell <an...@gmail.com>
> wrote:
>
> > That wasn't my question. At all.
> >
> > > On Nov 4, 2016, at 7:27 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > >
> > > I looked at AssignmentManager#onRegionMerge() between branch-1.1
> > > and branch-1.2
> > >
> > > AFAICT, there is no obvious divergence.
> > >
> > > Later on, I plan to compare the diff between output for 'git log
> > > hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java'
> > > and see which JIRAs were unique to branch-1.2
> > >
> > > Cheers
> > >
> > >> On Fri, Nov 4, 2016 at 6:37 PM, Andrew Purtell <apurtell@apache.org> wrote:
> > >>
> > >> I'm not deeply familiar with the AssignmentManager. I see when we
> > process
> > >> split rollbacks in onRegionSplit() we only call regionOffline() on
> > >> daughters if they are known to exist. However when processing merge
> > >> rollbacks in the else case of onRegionMerge() we unconditionally call
> > >> regionOffline() on the parent-being-merged. Shouldn't that likewise be
> > >> conditional on regionStates holding a state for the
> parent-being-merged?
> > >> Pardon if I've missed something.
> > >>
> > >>
> > >> On Fri, Nov 4, 2016 at 5:05 PM, Andrew Purtell <apurtell@apache.org> wrote:
> > >>
> > >>> Thanks. Yes I have been eyeing HBASE-16093. There might be another
> > corner
> > >>> case there.
> > >>>
> > >>>
> > >>> On Fri, Nov 4, 2016 at 4:41 PM, Gary Helmling <ghelmling@gmail.com> wrote:
> > >>>
> > >>>>>
> > >>>>> The behavior: Looks like failed split/compaction rollback: row(s)
> in
> > >>>> META
> > >>>>> without HRegionInfo, regions deployed without valid meta entries
> (at
> > >>>>> first), regions on HDFS without valid meta entries (later, after RS
> > >>>>> carrying them are killed by chaos), holes in the region chain
> leading
> > >> to
> > >>>>> timeouts and job failure.
> > >>>>>
> > >>>>>
> > >>>> The empty regioninfo in meta sounds like HBASE-16093, though that
> fix
> > is
> > >>>> in
> > >>>> 1.2.  Interested to see if there are other problems around splits
> > >> though.
> > >>>> Do you have a JIRA yet for tracking?
> > >>>>
> > >>>>
> > >>>>>
> > >>>>> You'll know you have found it when on the ITBLL console its meta
> > >> scanner
> > >>>>> starts complaining about rows in meta without serialized
> HRegionInfo.
> > >>>>>
> > >>>>>
> > >>>> Will keep an eye out for this in our ITBLL runs here.
> > >>>>
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> Best regards,
> > >>>
> > >>>   - Andy
> > >>>
> > >>> Problems worthy of attack prove their worth by hitting back. - Piet
> > Hein
> > >>> (via Tom White)
> > >>>
> > >>
> > >>
> > >>
> > >> --
> > >> Best regards,
> > >>
> > >>   - Andy
> > >>
> > >> Problems worthy of attack prove their worth by hitting back. - Piet
> Hein
> > >> (via Tom White)
> > >>
> >
>
>
> --
> -Dima
>

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Dima Spivak <di...@apache.org>.
FYI, once INFRA-12849 is done, I plan on having ITBLL (w/ 1 billion linked
list nodes) running on a nightly basis in GCE via clusterdock. I think we'd
trust our testing more if we didn't always have to go through the exercise
of validating whether a failure we see is new or not, and having a history
of such failures for different branches should help with that.

On Friday, November 4, 2016, Andrew Purtell <an...@gmail.com>
wrote:

> That wasn't my question. At all.
>
> > On Nov 4, 2016, at 7:27 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > I looked at AssignmentManager#onRegionMerge() between branch-1.1
> > and branch-1.2
> >
> > AFAICT, there is no obvious divergence.
> >
> > Later on, I plan to compare the diff between output for 'git log
> > hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java'
> > and see which JIRAs were unique to branch-1.2
> >
> > Cheers
> >
> >> On Fri, Nov 4, 2016 at 6:37 PM, Andrew Purtell <apurtell@apache.org> wrote:
> >>
> >> I'm not deeply familiar with the AssignmentManager. I see when we
> process
> >> split rollbacks in onRegionSplit() we only call regionOffline() on
> >> daughters if they are known to exist. However when processing merge
> >> rollbacks in the else case of onRegionMerge() we unconditionally call
> >> regionOffline() on the parent-being-merged. Shouldn't that likewise be
> >> conditional on regionStates holding a state for the parent-being-merged?
> >> Pardon if I've missed something.
> >>
> >>
> >> On Fri, Nov 4, 2016 at 5:05 PM, Andrew Purtell <apurtell@apache.org> wrote:
> >>
> >>> Thanks. Yes I have been eyeing HBASE-16093. There might be another
> corner
> >>> case there.
> >>>
> >>>
> >>> On Fri, Nov 4, 2016 at 4:41 PM, Gary Helmling <ghelmling@gmail.com> wrote:
> >>>
> >>>>>
> >>>>> The behavior: Looks like failed split/compaction rollback: row(s) in
> >>>> META
> >>>>> without HRegionInfo, regions deployed without valid meta entries (at
> >>>>> first), regions on HDFS without valid meta entries (later, after RS
> >>>>> carrying them are killed by chaos), holes in the region chain leading
> >> to
> >>>>> timeouts and job failure.
> >>>>>
> >>>>>
> >>>> The empty regioninfo in meta sounds like HBASE-16093, though that fix
> is
> >>>> in
> >>>> 1.2.  Interested to see if there are other problems around splits
> >> though.
> >>>> Do you have a JIRA yet for tracking?
> >>>>
> >>>>
> >>>>>
> >>>>> You'll know you have found it when on the ITBLL console its meta
> >> scanner
> >>>>> starts complaining about rows in meta without serialized HRegionInfo.
> >>>>>
> >>>>>
> >>>> Will keep an eye out for this in our ITBLL runs here.
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Best regards,
> >>>
> >>>   - Andy
> >>>
> >>> Problems worthy of attack prove their worth by hitting back. - Piet
> Hein
> >>> (via Tom White)
> >>>
> >>
> >>
> >>
> >> --
> >> Best regards,
> >>
> >>   - Andy
> >>
> >> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> >> (via Tom White)
> >>
>


-- 
-Dima

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Andrew Purtell <an...@gmail.com>.
That wasn't my question. At all. 

> On Nov 4, 2016, at 7:27 PM, Ted Yu <yu...@gmail.com> wrote:
> 
> I looked at AssignmentManager#onRegionMerge() between branch-1.1
> and branch-1.2
> 
> AFAICT, there is no obvious divergence.
> 
> Later on, I plan to compare the diff between output for 'git log
> hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java'
> and see which JIRAs were unique to branch-1.2
> 
> Cheers
> 
>> On Fri, Nov 4, 2016 at 6:37 PM, Andrew Purtell <ap...@apache.org> wrote:
>> 
>> I'm not deeply familiar with the AssignmentManager. I see when we process
>> split rollbacks in onRegionSplit() we only call regionOffline() on
>> daughters if they are known to exist. However when processing merge
>> rollbacks in the else case of onRegionMerge() we unconditionally call
>> regionOffline() on the parent-being-merged. Shouldn't that likewise be
>> conditional on regionStates holding a state for the parent-being-merged?
>> Pardon if I've missed something.
>> 
>> 
>> On Fri, Nov 4, 2016 at 5:05 PM, Andrew Purtell <ap...@apache.org>
>> wrote:
>> 
>>> Thanks. Yes I have been eyeing HBASE-16093. There might be another corner
>>> case there.
>>> 
>>> 
>>> On Fri, Nov 4, 2016 at 4:41 PM, Gary Helmling <gh...@gmail.com>
>> wrote:
>>> 
>>>>> 
>>>>> The behavior: Looks like failed split/compaction rollback: row(s) in
>>>> META
>>>>> without HRegionInfo, regions deployed without valid meta entries (at
>>>>> first), regions on HDFS without valid meta entries (later, after RS
>>>>> carrying them are killed by chaos), holes in the region chain leading
>> to
>>>>> timeouts and job failure.
>>>>> 
>>>>> 
>>>> The empty regioninfo in meta sounds like HBASE-16093, though that fix is
>>>> in
>>>> 1.2.  Interested to see if there are other problems around splits
>> though.
>>>> Do you have a JIRA yet for tracking?
>>>> 
>>>> 
>>>>> 
>>>>> You'll know you have found it when on the ITBLL console its meta
>> scanner
>>>>> starts complaining about rows in meta without serialized HRegionInfo.
>>>>> 
>>>>> 
>>>> Will keep an eye out for this in our ITBLL runs here.
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Best regards,
>>> 
>>>   - Andy
>>> 
>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>> (via Tom White)
>>> 
>> 
>> 
>> 
>> --
>> Best regards,
>> 
>>   - Andy
>> 
>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> (via Tom White)
>> 

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Ted Yu <yu...@gmail.com>.
I looked at AssignmentManager#onRegionMerge() between branch-1.1
and branch-1.2

AFAICT, there is no obvious divergence.

Later on, I plan to compare the diff between output for 'git log
hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java'
and see which JIRAs were unique to branch-1.2

Cheers
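
The comparison described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the commit subject lines would come from running something like 'git log --format=%s <branch> -- <path>' once per branch, and the class name JiraDiffSketch, its methods, and the sample subjects (including the HBASE-00001 and HBASE-00002 placeholders) are all hypothetical and not real commit history.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds JIRA IDs mentioned in one branch's log but not the other's.
final class JiraDiffSketch {
    // Matches issue keys such as HBASE-16093 inside a commit subject line.
    private static final Pattern JIRA_ID = Pattern.compile("HBASE-\\d+");

    // Collects every JIRA ID that appears in the given commit subjects.
    static Set<String> jiraIds(List<String> subjects) {
        Set<String> ids = new TreeSet<>();
        for (String subject : subjects) {
            Matcher m = JIRA_ID.matcher(subject);
            while (m.find()) {
                ids.add(m.group());
            }
        }
        return ids;
    }

    // Returns the IDs present in the newer branch's log but absent from the older one's.
    static Set<String> uniqueTo(List<String> newer, List<String> older) {
        Set<String> result = jiraIds(newer);
        result.removeAll(jiraIds(older));
        return result;
    }

    public static void main(String[] args) {
        // Placeholder subjects for illustration only, not actual branch contents.
        List<String> branch12 = Arrays.asList(
            "HBASE-16093 placeholder subject",
            "HBASE-00001 placeholder subject");
        List<String> branch11 = Arrays.asList(
            "HBASE-00001 placeholder subject",
            "HBASE-00002 placeholder subject");
        System.out.println(uniqueTo(branch12, branch11));
    }
}
```

Matching on the issue key rather than the commit hash is deliberate in this sketch, since a backported change normally has a different hash on each branch.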

On Fri, Nov 4, 2016 at 6:37 PM, Andrew Purtell <ap...@apache.org> wrote:

> I'm not deeply familiar with the AssignmentManager. I see when we process
> split rollbacks in onRegionSplit() we only call regionOffline() on
> daughters if they are known to exist. However when processing merge
> rollbacks in the else case of onRegionMerge() we unconditionally call
> regionOffline() on the parent-being-merged. Shouldn't that likewise be
> conditional on regionStates holding a state for the parent-being-merged?
> Pardon if I've missed something.
>
>
> On Fri, Nov 4, 2016 at 5:05 PM, Andrew Purtell <ap...@apache.org>
> wrote:
>
> > Thanks. Yes I have been eyeing HBASE-16093. There might be another corner
> > case there.
> >
> >
> > On Fri, Nov 4, 2016 at 4:41 PM, Gary Helmling <gh...@gmail.com>
> wrote:
> >
> >> >
> >> > The behavior: Looks like failed split/compaction rollback: row(s) in
> >> META
> >> > without HRegionInfo, regions deployed without valid meta entries (at
> >> > first), regions on HDFS without valid meta entries (later, after RS
> >> > carrying them are killed by chaos), holes in the region chain leading
> to
> >> > timeouts and job failure.
> >> >
> >> >
> >> The empty regioninfo in meta sounds like HBASE-16093, though that fix is
> >> in
> >> 1.2.  Interested to see if there are other problems around splits
> though.
> >> Do you have a JIRA yet for tracking?
> >>
> >>
> >> >
> >> > You'll know you have found it when on the ITBLL console its meta
> scanner
> >> > starts complaining about rows in meta without serialized HRegionInfo.
> >> >
> >> >
> >> Will keep an eye out for this in our ITBLL runs here.
> >>
> >
> >
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
> >
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Gary Helmling <gh...@gmail.com>.
>
> I'm not deeply familiar with the AssignmentManager. I see when we process
> split rollbacks in onRegionSplit() we only call regionOffline() on
> daughters if they are known to exist. However when processing merge
> rollbacks in the else case of onRegionMerge() we unconditionally call
> regionOffline() on the parent-being-merged. Shouldn't that likewise be
> conditional on regionStates holding a state for the parent-being-merged?
> Pardon if I've missed something.
>
>
> I'm really not familiar with the merge code, but this seems plausible to
> me.  I see that onRegionSplit() has an early out at the top of the method,
> but that will fail to evaluate if rs_a and rs_b are open and rs_p is null.
> So if it's called with a code of MERGE_REVERTED, I think we could wind up
> creating an offline meta entry for rs_p with no regioninfo, similar to
> HBASE-16093.  And that entry could wind up hiding the (still online)
> daughter regions.
>

s/onRegionSplit()/onRegionMerge()/ in that comment.
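
To make the concern discussed above concrete, here is a minimal toy sketch in Java. It is not the AssignmentManager code: the class MergeRollbackSketch, the map standing in for regionStates, and the methods rollbackUnconditional and rollbackGuarded are all hypothetical, chosen only to illustrate why an unconditional regionOffline() on a merged region with no known state might leave a stray offline entry, while a guarded call would not.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only. None of these names are the real HBase AssignmentManager API.
final class MergeRollbackSketch {
    // Stand-in for the master's in-memory region states, keyed by region name.
    private final Map<String, String> regionStates = new HashMap<>();
    // Records each region for which an offline entry would have been written.
    private final List<String> offlineRowsWritten = new ArrayList<>();

    void putState(String region, String state) {
        regionStates.put(region, state);
    }

    // Unconditional rollback: offlines the merged region even if it was never tracked.
    void rollbackUnconditional(String mergedRegion) {
        regionOffline(mergedRegion);
    }

    // Guarded rollback: offlines the merged region only if a state is held for it.
    void rollbackGuarded(String mergedRegion) {
        if (regionStates.containsKey(mergedRegion)) {
            regionOffline(mergedRegion);
        }
    }

    private void regionOffline(String region) {
        regionStates.put(region, "OFFLINE");
        offlineRowsWritten.add(region);
    }

    List<String> offlineRowsWritten() {
        return offlineRowsWritten;
    }

    public static void main(String[] args) {
        // rs_a and rs_b are open; rs_p (the merged region) has no state at all.
        MergeRollbackSketch unguarded = new MergeRollbackSketch();
        unguarded.putState("rs_a", "OPEN");
        unguarded.putState("rs_b", "OPEN");
        unguarded.rollbackUnconditional("rs_p");
        System.out.println("unconditional: " + unguarded.offlineRowsWritten());

        MergeRollbackSketch guarded = new MergeRollbackSketch();
        guarded.putState("rs_a", "OPEN");
        guarded.putState("rs_b", "OPEN");
        guarded.rollbackGuarded("rs_p");
        System.out.println("guarded: " + guarded.offlineRowsWritten());
    }
}
```

In this toy model the unconditional path records an offline entry for rs_p while the guarded path records none, which mirrors the scenario described with rs_a and rs_b open and rs_p null. Whether the real code behaves this way is exactly the open question in the thread.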

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Andrew Purtell <ap...@apache.org>.
Thanks Nick!

On Tue, Nov 8, 2016 at 7:51 AM, Nick Dimiduk <nd...@apache.org> wrote:

> You're right Enis, sounds like we'll continue on with 1.1. No issues on my
> side, I'm happy to continue producing these releases as I'm able.
>
> Thanks everyone for the discussion.
>
> On Monday, November 7, 2016, Enis Söztutar <en...@apache.org> wrote:
>
> > Going back to the original discussion, the conclusion is to continue with
> > 1.1 line for now, and re-evaluate in a couple of months it seems.
> >
> > Nick do you want to keep driving the next 1.1 releases, or you want to
> let
> > Andrew do that? I can help as well if needed.
> >
> > Enis
> >
> > On Mon, Nov 7, 2016 at 12:17 PM, Andrew Purtell <andrew.purtell@gmail.com> wrote:
> >
> > > I have a patch for this and will be trying it out.
> > >
> > > On Nov 7, 2016, at 12:00 PM, Gary Helmling <ghelmling@gmail.com> wrote:
> > >
> > > >>
> > > >> I'm not deeply familiar with the AssignmentManager. I see when we
> > > process
> > > >> split rollbacks in onRegionSplit() we only call regionOffline() on
> > > >> daughters if they are known to exist. However when processing merge
> > > >> rollbacks in the else case of onRegionMerge() we unconditionally
> call
> > > >> regionOffline() on the parent-being-merged. Shouldn't that likewise
> be
> > > >> conditional on regionStates holding a state for the
> > parent-being-merged?
> > > >> Pardon if I've missed something.
> > > >>
> > > >>
> > > > I'm really not familiar with the merge code, but this seems plausible
> > to
> > > > me.  I see that onRegionSplit() has an early out at the top of the
> > > method,
> > > > but that will fail to evaluate if rs_a and rs_b are open and rs_p is
> > > null.
> > > > So if it's called with a code of MERGE_REVERTED, I think we could
> wind
> > up
> > > > creating an offline meta entry for rs_p with no regioninfo, similar
> to
> > > > HBASE-16093.  And that entry could wind up hiding the (still online)
> > > > daughter regions.
> > >
> >
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Nick Dimiduk <nd...@apache.org>.
You're right Enis, sounds like we'll continue on with 1.1. No issues on my
side, I'm happy to continue producing these releases as I'm able.

Thanks everyone for the discussion.

On Monday, November 7, 2016, Enis Söztutar <en...@apache.org> wrote:

> Going back to the original discussion, the conclusion is to continue with
> 1.1 line for now, and re-evaluate in a couple of months it seems.
>
> Nick do you want to keep driving the next 1.1 releases, or you want to let
> Andrew do that? I can help as well if needed.
>
> Enis
>
> On Mon, Nov 7, 2016 at 12:17 PM, Andrew Purtell <andrew.purtell@gmail.com> wrote:
>
> > I have a patch for this and will be trying it out.
> >
> > On Nov 7, 2016, at 12:00 PM, Gary Helmling <ghelmling@gmail.com> wrote:
> >
> > >>
> > >> I'm not deeply familiar with the AssignmentManager. I see when we
> > process
> > >> split rollbacks in onRegionSplit() we only call regionOffline() on
> > >> daughters if they are known to exist. However when processing merge
> > >> rollbacks in the else case of onRegionMerge() we unconditionally call
> > >> regionOffline() on the parent-being-merged. Shouldn't that likewise be
> > >> conditional on regionStates holding a state for the
> parent-being-merged?
> > >> Pardon if I've missed something.
> > >>
> > >>
> > > I'm really not familiar with the merge code, but this seems plausible
> to
> > > me.  I see that onRegionSplit() has an early out at the top of the
> > method,
> > > but that will fail to evaluate if rs_a and rs_b are open and rs_p is
> > null.
> > > So if it's called with a code of MERGE_REVERTED, I think we could wind
> up
> > > creating an offline meta entry for rs_p with no regioninfo, similar to
> > > HBASE-16093.  And that entry could wind up hiding the (still online)
> > > daughter regions.
> >
>

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Enis Söztutar <en...@apache.org>.
Going back to the original discussion, the conclusion seems to be to
continue with the 1.1 line for now and re-evaluate in a couple of months.

Nick, do you want to keep driving the next 1.1 releases, or do you want to
let Andrew do that? I can help as well if needed.

Enis

On Mon, Nov 7, 2016 at 12:17 PM, Andrew Purtell <an...@gmail.com>
wrote:

> I have a patch for this and will be trying it out.
>
> On Nov 7, 2016, at 12:00 PM, Gary Helmling <gh...@gmail.com> wrote:
>
> >>
> >> I'm not deeply familiar with the AssignmentManager. I see when we
> process
> >> split rollbacks in onRegionSplit() we only call regionOffline() on
> >> daughters if they are known to exist. However when processing merge
> >> rollbacks in the else case of onRegionMerge() we unconditionally call
> >> regionOffline() on the parent-being-merged. Shouldn't that likewise be
> >> conditional on regionStates holding a state for the parent-being-merged?
> >> Pardon if I've missed something.
> >>
> >>
> > I'm really not familiar with the merge code, but this seems plausible to
> > me.  I see that onRegionSplit() has an early out at the top of the
> method,
> > but that will fail to evaluate if rs_a and rs_b are open and rs_p is
> null.
> > So if it's called with a code of MERGE_REVERTED, I think we could wind up
> > creating an offline meta entry for rs_p with no regioninfo, similar to
> > HBASE-16093.  And that entry could wind up hiding the (still online)
> > daughter regions.
>

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Andrew Purtell <an...@gmail.com>.
I have a patch for this and will be trying it out. 

On Nov 7, 2016, at 12:00 PM, Gary Helmling <gh...@gmail.com> wrote:

>> 
>> I'm not deeply familiar with the AssignmentManager. I see when we process
>> split rollbacks in onRegionSplit() we only call regionOffline() on
>> daughters if they are known to exist. However when processing merge
>> rollbacks in the else case of onRegionMerge() we unconditionally call
>> regionOffline() on the parent-being-merged. Shouldn't that likewise be
>> conditional on regionStates holding a state for the parent-being-merged?
>> Pardon if I've missed something.
>> 
>> 
> I'm really not familiar with the merge code, but this seems plausible to
> me.  I see that onRegionSplit() has an early out at the top of the method,
> but that will fail to evaluate if rs_a and rs_b are open and rs_p is null.
> So if it's called with a code of MERGE_REVERTED, I think we could wind up
> creating an offline meta entry for rs_p with no regioninfo, similar to
> HBASE-16093.  And that entry could wind up hiding the (still online)
> daughter regions.

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Gary Helmling <gh...@gmail.com>.
>
> I'm not deeply familiar with the AssignmentManager. I see when we process
> split rollbacks in onRegionSplit() we only call regionOffline() on
> daughters if they are known to exist. However when processing merge
> rollbacks in the else case of onRegionMerge() we unconditionally call
> regionOffline() on the parent-being-merged. Shouldn't that likewise be
> conditional on regionStates holding a state for the parent-being-merged?
> Pardon if I've missed something.
>
>
I'm really not familiar with the merge code, but this seems plausible to
me.  I see that onRegionSplit() has an early out at the top of the method,
but that will fail to evaluate if rs_a and rs_b are open and rs_p is null.
So if it's called with a code of MERGE_REVERTED, I think we could wind up
creating an offline meta entry for rs_p with no regioninfo, similar to
HBASE-16093.  And that entry could wind up hiding the (still online)
daughter regions.

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Andrew Purtell <ap...@apache.org>.
I'm not deeply familiar with the AssignmentManager. I see when we process
split rollbacks in onRegionSplit() we only call regionOffline() on
daughters if they are known to exist. However when processing merge
rollbacks in the else case of onRegionMerge() we unconditionally call
regionOffline() on the parent-being-merged. Shouldn't that likewise be
conditional on regionStates holding a state for the parent-being-merged?
Pardon if I've missed something.
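
To make the question concrete, here is a minimal sketch of the two rollback
behaviors in a toy model. ToyRegionStates and all of its methods are
hypothetical stand-ins invented for illustration; this is not the real
AssignmentManager or RegionStates code, and "p" simply names the region
referred to above as the parent-being-merged.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for the master's in-memory region state tracking.
// NOT the real HBase API; names and states are assumptions for illustration.
class ToyRegionStates {
    enum State { OPEN, OFFLINE }

    private final Map<String, State> states = new HashMap<>();

    // Returns null when no state is tracked for the region.
    State get(String region) {
        return states.get(region);
    }

    // In this toy model, offlining a region creates an entry for it
    // even if it was never tracked before.
    void regionOffline(String region) {
        states.put(region, State.OFFLINE);
    }

    // Unconditional rollback: always offlines p, tracked or not.
    void rollbackMergeUnconditional(String a, String b, String p) {
        states.put(a, State.OPEN);
        states.put(b, State.OPEN);
        regionOffline(p);
    }

    // Guarded rollback: offlines p only if a state is already held for it.
    void rollbackMergeGuarded(String a, String b, String p) {
        states.put(a, State.OPEN);
        states.put(b, State.OPEN);
        if (states.containsKey(p)) {
            regionOffline(p);
        }
    }
}
```

In the toy model the unconditional variant leaves behind an OFFLINE entry
for a region that was never tracked, while the guarded variant leaves
nothing. Whether the real code path behaves the same way is exactly the
open question.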


On Fri, Nov 4, 2016 at 5:05 PM, Andrew Purtell <ap...@apache.org> wrote:

> Thanks. Yes I have been eyeing HBASE-16093. There might be another corner
> case there.
>
>
> On Fri, Nov 4, 2016 at 4:41 PM, Gary Helmling <gh...@gmail.com> wrote:
>
>> >
>> > The behavior: Looks like failed split/compaction rollback: row(s) in
>> META
>> > without HRegionInfo, regions deployed without valid meta entries (at
>> > first), regions on HDFS without valid meta entries (later, after RS
>> > carrying them are killed by chaos), holes in the region chain leading to
>> > timeouts and job failure.
>> >
>> >
>> The empty regioninfo in meta sounds like HBASE-16093, though that fix is
>> in
>> 1.2.  Interested to see if there are other problems around splits though.
>> Do you have a JIRA yet for tracking?
>>
>>
>> >
>> > You'll know you have found it when on the ITBLL console its meta scanner
>> > starts complaining about rows in meta without serialized HRegionInfo.
>> >
>> >
>> Will keep an eye out for this in our ITBLL runs here.
>>
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Enis Söztutar <en...@apache.org>.
I also think that keeping 1.1 going for a bit longer might still be helpful,
especially if ITBLL is failing with branch-1.2. Almost all of our
internal testing happens with a 1.1-based code base, so I cannot tell
whether 1.2 / 1.3 has the same stability or not.

Enis

On Fri, Nov 4, 2016 at 5:05 PM, Andrew Purtell <ap...@apache.org> wrote:

> Thanks. Yes I have been eyeing HBASE-16093. There might be another corner
> case there.
>
>
> On Fri, Nov 4, 2016 at 4:41 PM, Gary Helmling <gh...@gmail.com> wrote:
>
> > >
> > > The behavior: Looks like failed split/compaction rollback: row(s) in
> META
> > > without HRegionInfo, regions deployed without valid meta entries (at
> > > first), regions on HDFS without valid meta entries (later, after RS
> > > carrying them are killed by chaos), holes in the region chain leading
> to
> > > timeouts and job failure.
> > >
> > >
> > The empty regioninfo in meta sounds like HBASE-16093, though that fix is
> in
> > 1.2.  Interested to see if there are other problems around splits though.
> > Do you have a JIRA yet for tracking?
> >
> >
> > >
> > > You'll know you have found it when on the ITBLL console its meta
> scanner
> > > starts complaining about rows in meta without serialized HRegionInfo.
> > >
> > >
> > Will keep an eye out for this in our ITBLL runs here.
> >
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Andrew Purtell <ap...@apache.org>.
Thanks. Yes I have been eyeing HBASE-16093. There might be another corner
case there.


On Fri, Nov 4, 2016 at 4:41 PM, Gary Helmling <gh...@gmail.com> wrote:

> >
> > The behavior: Looks like failed split/compaction rollback: row(s) in META
> > without HRegionInfo, regions deployed without valid meta entries (at
> > first), regions on HDFS without valid meta entries (later, after RS
> > carrying them are killed by chaos), holes in the region chain leading to
> > timeouts and job failure.
> >
> >
> The empty regioninfo in meta sounds like HBASE-16093, though that fix is in
> 1.2.  Interested to see if there are other problems around splits though.
> Do you have a JIRA yet for tracking?
>
>
> >
> > You'll know you have found it when on the ITBLL console its meta scanner
> > starts complaining about rows in meta without serialized HRegionInfo.
> >
> >
> Will keep an eye out for this in our ITBLL runs here.
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Gary Helmling <gh...@gmail.com>.
>
> The behavior: Looks like failed split/compaction rollback: row(s) in META
> without HRegionInfo, regions deployed without valid meta entries (at
> first), regions on HDFS without valid meta entries (later, after RS
> carrying them are killed by chaos), holes in the region chain leading to
> timeouts and job failure.
>
>
The empty regioninfo in meta sounds like HBASE-16093, though that fix is in
1.2.  Interested to see if there are other problems around splits though.
Do you have a JIRA yet for tracking?


>
> You'll know you have found it when on the ITBLL console its meta scanner
> starts complaining about rows in meta without serialized HRegionInfo.
>
>
Will keep an eye out for this in our ITBLL runs here.

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Andrew Purtell <ap...@apache.org>.
Eh, instead of split/compaction, meant split/merge


On Fri, Nov 4, 2016 at 2:59 PM, Andrew Purtell <ap...@apache.org> wrote:

> The behavior: Looks like failed split/compaction rollback: row(s) in META
> without HRegionInfo, regions deployed without valid meta entries (at
> first), regions on HDFS without valid meta entries (later, after RS
> carrying them are killed by chaos), holes in the region chain leading to
> timeouts and job failure.
>
> The cause: Still looking
>
> You'll know you have found it when on the ITBLL console its meta scanner
> starts complaining about rows in meta without serialized HRegionInfo.
>
> I volunteer to maintain 1.1 if nobody else wants it, and like I said I
> vote -1 on EOL at this time.
>
>
>
>
> On Fri, Nov 4, 2016 at 2:31 PM, Mikhail Antonov <ol...@gmail.com>
> wrote:
>
>> On the original topic - on one hand, the fewer code lines we maintain the
>> better, as less volunteering time and efforts
>> are spent on that. Also does postponing of EOL-ing of older releases slow
>> down adoption of newer releases?
>>
>> On the other hand, if we believe ITBLL is broken / unreliable for some
>> release this is a bad state to be in.
>>
>> @Andrew - I think this is also a question of what exactly is deemed wrong
>> with ITBLL. What kind of issues do you observe
>>  on 1.2? Do you observe data loss (unref / undef keys after Verify run),
>> or
>> does job fail to complete because regions in transition / offline,
>> or is it something else? How reproducible is that for you? Could you share
>> some more details of your experiments
>> (should it be a separate thread, probably?)
>>
>> -Mikhail
>>
>> On Fri, Nov 4, 2016 at 9:57 AM, Andrew Purtell <ap...@apache.org>
>> wrote:
>>
>> > Let me add I'd switch my thinking to +1 for retiring 1.1 if, now that we
>> > have a 1.3.0RC0 shaping up, it turns out the 1.3 code line can survive
>> the
>> > same 1B ITBLL testing that 1.1 does (and 1.2 does not).
>> >
>> > On Fri, Nov 4, 2016 at 9:52 AM, Andrew Purtell <ap...@apache.org>
>> > wrote:
>> >
>> > > I'm -1 on this idea, for now.
>> > >
>> > > We have been evaluating 1.1 and 1.2 for upgrade and whereas 1.1 will
>> > > survive all testing including large scale ITBLL tests, 1.2 will not -
>> no
>> > > 1.2, from 1.2.0 on up. I've found one issue (fixed), and am now
>> trying to
>> > > nail down another.
>> > >
>> > > I would like to see two things:
>> > >
>> > > 1. Others in the community step up to evaluate the stability of 1.1.7
>> > > versus 1.2.3 (or .4) using ITBLL with at least 1B rows of data, and
>> > report
>> > > in. Is it just me?
>> > >
>> > > 2. We do not declare 1.1 EOL until 1.2 is unquestionably stable
>> according
>> > > to the most practical rigor we can throw at it with our tooling.
>> > Especially
>> > > because I still plan to resign as 0.98 RM soon, which I think will
>> > trigger
>> > > an EOL of that code line.
>> > >
>> > > I will be resigning as 0.98 RM effective January 1 2017 and at that
>> time
>> > > the community can discuss what to do with 0.98. From my point of view,
>> > I'm
>> > > done with spending time on it. Happy to take some of the time freed up
>> > and
>> > > use it to carry 1.1 forward if we are still making releases off this
>> code
>> > > line then.
>> > >
>> > >
>> > > On Fri, Nov 4, 2016 at 9:24 AM, Nick Dimiduk <nd...@apache.org>
>> > wrote:
>> > >
>> > >> Hello HBase Community!
>> > >>
>> > >> We have a small matter to discuss.
>> > >>
>> > >> HBase 1.2 has been formally marked as "stable" for the last couple
>> > months.
>> > >> HBase 1.3.0rc0 is just around the corner. I think it's time to start
>> a
>> > >> conversation about retiring the 1.1 line. The volunteer bandwidth for
>> > >> maintaining multiple branches is precious and as we spread ourselves
>> > more
>> > >> thin, odds of decay increase.
>> > >>
>> > >> I propose discontinuing 1.1 with a single release following 1.3.1.
>> > That'll
>> > >> give us one last chance to back port any bug fixes discovered in the
>> > >> diligence we're putting into the new minor release. Given the current
>> > pace
>> > >> of 1.3, I estimate this will happen in January or February of 2017.
>> It's
>> > >> not a lot of time for existing deployments to get around to
>> upgrading,
>> > but
>> > >> the upgrade path is trivial and 1.2 has been available for quite some
>> > >> time. This will probably make our last release from this branch at
>> > 1.1.10
>> > >> or there abouts.
>> > >>
>> > >> Are there any objections or concerns with the above plan? Are there
>> any
>> > >> downstream communities who need our help moving onto 1.2? Please let
>> us
>> > >> know.
>> > >>
>> > >> Thanks,
>> > >> Nick
>> > >>
>> > >
>> > >
>> > >
>> > > --
>> > > Best regards,
>> > >
>> > >    - Andy
>> > >
>> > > Problems worthy of attack prove their worth by hitting back. - Piet
>> Hein
>> > > (via Tom White)
>> > >
>> >
>> >
>> >
>> > --
>> > Best regards,
>> >
>> >    - Andy
>> >
>> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> > (via Tom White)
>> >
>>
>>
>>
>> --
>> Thanks,
>> Michael Antonov
>>
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Andrew Purtell <ap...@apache.org>.
The behavior: Looks like failed split/compaction rollback: row(s) in META
without HRegionInfo, regions deployed without valid meta entries (at
first), regions on HDFS without valid meta entries (later, after RS
carrying them are killed by chaos), holes in the region chain leading to
timeouts and job failure.
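
For anyone who wants to look for the same symptom, a "hole in the region
chain" can be sketched as follows: walk the regions in start-key order and
report every place where one region's end key is not the next region's
start key. RegionChainCheck, Range and findHoles are hypothetical names made
up for this illustration, keys are plain strings rather than byte arrays,
and overlaps are deliberately ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Toy check for gaps in a chain of key ranges. Not HBase code.
class RegionChainCheck {
    // A region's key range: [start, end). "" means open-ended.
    static final class Range {
        final String start;
        final String end;

        Range(String start, String end) {
            this.start = start;
            this.end = end;
        }
    }

    // Expects regions sorted by start key, with "" as an end key only
    // on the last region. Returns each uncovered key range as "[from, to)".
    static List<String> findHoles(List<Range> regions) {
        List<String> holes = new ArrayList<>();
        String expectedStart = "";
        for (Range r : regions) {
            if (r.start.compareTo(expectedStart) > 0) {
                holes.add("[" + expectedStart + ", " + r.start + ")");
            }
            expectedStart = r.end;
        }
        // A non-empty end key on the last region leaves the tail uncovered.
        if (!expectedStart.isEmpty()) {
            holes.add("[" + expectedStart + ", )");
        }
        return holes;
    }
}
```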

The cause: Still looking

You'll know you have found it when on the ITBLL console its meta scanner
starts complaining about rows in meta without serialized HRegionInfo.
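
As a rough illustration of that check, the sketch below walks a toy copy of
the meta rows and lists the ones that have no serialized region info. It
assumes the region info lives in the info:regioninfo column, as in
hbase:meta; MetaRowCheck and rowsMissingRegionInfo are hypothetical names,
and the in-memory map stands in for a real meta scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy check over an in-memory copy of meta rows. Not the ITBLL scanner.
class MetaRowCheck {
    // Column assumed to hold the serialized HRegionInfo.
    static final String REGIONINFO = "info:regioninfo";

    // metaRows maps row key -> (column name -> cell value).
    // Returns the keys of rows whose regioninfo cell is absent or empty.
    static List<String> rowsMissingRegionInfo(Map<String, Map<String, byte[]>> metaRows) {
        List<String> bad = new ArrayList<>();
        for (Map.Entry<String, Map<String, byte[]>> row : metaRows.entrySet()) {
            byte[] value = row.getValue().get(REGIONINFO);
            if (value == null || value.length == 0) {
                bad.add(row.getKey());
            }
        }
        return bad;
    }
}
```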

I volunteer to maintain 1.1 if nobody else wants it, and like I said I vote
-1 on EOL at this time.




On Fri, Nov 4, 2016 at 2:31 PM, Mikhail Antonov <ol...@gmail.com>
wrote:

> On the original topic - on one hand, the fewer code lines we maintain the
> better, as less volunteering time and efforts
> are spent on that. Also does postponing of EOL-ing of older releases slow
> down adoption of newer releases?
>
> On the other hand, if we believe ITBLL is broken / unreliable for some
> release this is a bad state to be in.
>
> @Andrew - I think this is also a question of what exactly is deemed wrong
> with ITBLL. What kind of issues do you observe
>  on 1.2? Do you observe data loss (unref / undef keys after Verify run), or
> does job fail to complete because regions in transition / offline,
> or is it something else? How reproducible is that for you? Could you share
> some more details of your experiments
> (should it be a separate thread, probably?)
>
> -Mikhail
>
> On Fri, Nov 4, 2016 at 9:57 AM, Andrew Purtell <ap...@apache.org>
> wrote:
>
> > Let me add I'd switch my thinking to +1 for retiring 1.1 if, now that we
> > have a 1.3.0RC0 shaping up, it turns out the 1.3 code line can survive
> the
> > same 1B ITBLL testing that 1.1 does (and 1.2 does not).
> >
> > On Fri, Nov 4, 2016 at 9:52 AM, Andrew Purtell <ap...@apache.org>
> > wrote:
> >
> > > I'm -1 on this idea, for now.
> > >
> > > We have been evaluating 1.1 and 1.2 for upgrade and whereas 1.1 will
> > > survive all testing including large scale ITBLL tests, 1.2 will not -
> no
> > > 1.2, from 1.2.0 on up. I've found one issue (fixed), and am now trying
> to
> > > nail down another.
> > >
> > > I would like to see two things:
> > >
> > > 1. Others in the community step up to evaluate the stability of 1.1.7
> > > versus 1.2.3 (or .4) using ITBLL with at least 1B rows of data, and
> > report
> > > in. Is it just me?
> > >
> > > 2. We do not declare 1.1 EOL until 1.2 is unquestionably stable
> according
> > > to the most practical rigor we can throw at it with our tooling.
> > Especially
> > > because I still plan to resign as 0.98 RM soon, which I think will
> > trigger
> > > an EOL of that code line.
> > >
> > > I will be resigning as 0.98 RM effective January 1 2017 and at that
> time
> > > the community can discuss what to do with 0.98. From my point of view,
> > I'm
> > > done with spending time on it. Happy to take some of the time freed up
> > and
> > > use it to carry 1.1 forward if we are still making releases off this
> code
> > > line then.
> > >
> > >
> > > On Fri, Nov 4, 2016 at 9:24 AM, Nick Dimiduk <nd...@apache.org>
> > wrote:
> > >
> > >> Hello HBase Community!
> > >>
> > >> We have a small matter to discuss.
> > >>
> > >> HBase 1.2 has been formally marked as "stable" for the last couple
> > months.
> > >> HBase 1.3.0rc0 is just around the corner. I think it's time to start a
> > >> conversation about retiring the 1.1 line. The volunteer bandwidth for
> > >> maintaining multiple branches is precious and as we spread ourselves
> > more
> > >> thin, odds of decay increase.
> > >>
> > >> I propose discontinuing 1.1 with a single release following 1.3.1.
> > That'll
> > >> give us one last chance to back port any bug fixes discovered in the
> > >> diligence we're putting into the new minor release. Given the current
> > pace
> > >> of 1.3, I estimate this will happen in January or February of 2017.
> It's
> > >> not a lot of time for existing deployments to get around to upgrading,
> > but
> > >> the upgrade path is trivial and 1.2 has been available for quite some
> > >> time. This will probably make our last release from this branch at
> > 1.1.10
> > >> or there abouts.
> > >>
> > >> Are there any objections or concerns with the above plan? Are there
> any
> > >> downstream communities who need our help moving onto 1.2? Please let
> us
> > >> know.
> > >>
> > >> Thanks,
> > >> Nick
> > >>
> > >
> > >
> > >
> > > --
> > > Best regards,
> > >
> > >    - Andy
> > >
> > > Problems worthy of attack prove their worth by hitting back. - Piet
> Hein
> > > (via Tom White)
> > >
> >
> >
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
> >
>
>
>
> --
> Thanks,
> Michael Antonov
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Mikhail Antonov <ol...@gmail.com>.
On the original topic - on one hand, the fewer code lines we maintain the
better, as less volunteer time and effort is spent on that. Also, does
postponing the EOL of older releases slow down adoption of newer releases?

On the other hand, if we believe ITBLL is broken / unreliable for some
release, this is a bad state to be in.

@Andrew - I think this is also a question of what exactly is deemed wrong
with ITBLL. What kind of issues do you observe on 1.2? Do you observe data
loss (unref / undef keys after Verify run), or does the job fail to complete
because regions are in transition / offline, or is it something else? How
reproducible is that for you? Could you share some more details of your
experiments (should it be a separate thread, probably?)

-Mikhail

On Fri, Nov 4, 2016 at 9:57 AM, Andrew Purtell <ap...@apache.org> wrote:

> Let me add I'd switch my thinking to +1 for retiring 1.1 if, now that we
> have a 1.3.0RC0 shaping up, it turns out the 1.3 code line can survive the
> same 1B ITBLL testing that 1.1 does (and 1.2 does not).
>
> On Fri, Nov 4, 2016 at 9:52 AM, Andrew Purtell <ap...@apache.org>
> wrote:
>
> > I'm -1 on this idea, for now.
> >
> > We have been evaluating 1.1 and 1.2 for upgrade and whereas 1.1 will
> > survive all testing including large scale ITBLL tests, 1.2 will not - no
> > 1.2, from 1.2.0 on up. I've found one issue (fixed), and am now trying to
> > nail down another.
> >
> > I would like to see two things:
> >
> > 1. Others in the community step up to evaluate the stability of 1.1.7
> > versus 1.2.3 (or .4) using ITBLL with at least 1B rows of data, and
> report
> > in. Is it just me?
> >
> > 2. We do not declare 1.1 EOL until 1.2 is unquestionably stable according
> > to the most practical rigor we can throw at it with our tooling.
> Especially
> > because I still plan to resign as 0.98 RM soon, which I think will
> trigger
> > an EOL of that code line.
> >
> > I will be resigning as 0.98 RM effective January 1 2017 and at that time
> > the community can discuss what to do with 0.98. From my point of view,
> I'm
> > done with spending time on it. Happy to take some of the time freed up
> and
> > use it to carry 1.1 forward if we are still making releases off this code
> > line then.
> >
> >
> > On Fri, Nov 4, 2016 at 9:24 AM, Nick Dimiduk <nd...@apache.org>
> wrote:
> >
> >> Hello HBase Community!
> >>
> >> We have a small matter to discuss.
> >>
> >> HBase 1.2 has been formally marked as "stable" for the last couple
> months.
> >> HBase 1.3.0rc0 is just around the corner. I think it's time to start a
> >> conversation about retiring the 1.1 line. The volunteer bandwidth for
> >> maintaining multiple branches is precious and as we spread ourselves
> more
> >> thin, odds of decay increase.
> >>
> >> I propose discontinuing 1.1 with a single release following 1.3.1.
> That'll
> >> give us one last chance to back port any bug fixes discovered in the
> >> diligence we're putting into the new minor release. Given the current
> pace
> >> of 1.3, I estimate this will happen in January or February of 2017. It's
> >> not a lot of time for existing deployments to get around to upgrading,
> but
> >> the upgrade path is trivial and 1.2 has been available for quite some
> >> time. This will probably make our last release from this branch at
> 1.1.10
> >> or there abouts.
> >>
> >> Are there any objections or concerns with the above plan? Are there any
> >> downstream communities who need our help moving onto 1.2? Please let us
> >> know.
> >>
> >> Thanks,
> >> Nick
> >>
> >
> >
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
> >
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>



-- 
Thanks,
Michael Antonov

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Andrew Purtell <ap...@apache.org>.
Let me add I'd switch my thinking to +1 for retiring 1.1 if, now that we
have a 1.3.0RC0 shaping up, it turns out the 1.3 code line can survive the
same 1B ITBLL testing that 1.1 does (and 1.2 does not).

On Fri, Nov 4, 2016 at 9:52 AM, Andrew Purtell <ap...@apache.org> wrote:

> I'm -1 on this idea, for now.
>
> We have been evaluating 1.1 and 1.2 for upgrade and whereas 1.1 will
> survive all testing including large scale ITBLL tests, 1.2 will not - no
> 1.2, from 1.2.0 on up. I've found one issue (fixed), and am now trying to
> nail down another.
>
> I would like to see two things:
>
> 1. Others in the community step up to evaluate the stability of 1.1.7
> versus 1.2.3 (or .4) using ITBLL with at least 1B rows of data, and report
> in. Is it just me?
>
> 2. We do not declare 1.1 EOL until 1.2 is unquestionably stable according
> to the most practical rigor we can throw at it with our tooling. Especially
> because I still plan to resign as 0.98 RM soon, which I think will trigger
> an EOL of that code line.
>
> I will be resigning as 0.98 RM effective January 1 2017 and at that time
> the community can discuss what to do with 0.98. From my point of view, I'm
> done with spending time on it. Happy to take some of the time freed up and
> use it to carry 1.1 forward if we are still making releases off this code
> line then.
>
>
> On Fri, Nov 4, 2016 at 9:24 AM, Nick Dimiduk <nd...@apache.org> wrote:
>
>> Hello HBase Community!
>>
>> We have a small matter to discuss.
>>
>> HBase 1.2 has been formally marked as "stable" for the last couple months.
>> HBase 1.3.0rc0 is just around the corner. I think it's time to start a
>> conversation about retiring the 1.1 line. The volunteer bandwidth for
>> maintaining multiple branches is precious and as we spread ourselves more
>> thin, odds of decay increase.
>>
>> I propose discontinuing 1.1 with a single release following 1.3.1. That'll
>> give us one last chance to back port any bug fixes discovered in the
>> diligence we're putting into the new minor release. Given the current pace
>> of 1.3, I estimate this will happen in January or February of 2017. It's
>> not a lot of time for existing deployments to get around to upgrading, but
>> the upgrade path is trivial and 1.2 has been available for quite some
>> time. This will probably make our last release from this branch at 1.1.10
>> or there abouts.
>>
>> Are there any objections or concerns with the above plan? Are there any
>> downstream communities who need our help moving onto 1.2? Please let us
>> know.
>>
>> Thanks,
>> Nick
>>
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: [DISCUSS] EOL 1.1 Release Branch

Posted by Andrew Purtell <ap...@apache.org>.
I'm -1 on this idea, for now.

We have been evaluating 1.1 and 1.2 for upgrade and whereas 1.1 will
survive all testing including large scale ITBLL tests, 1.2 will not - no
1.2, from 1.2.0 on up. I've found one issue (fixed), and am now trying to
nail down another.

I would like to see two things:

1. Others in the community step up to evaluate the stability of 1.1.7
versus 1.2.3 (or .4) using ITBLL with at least 1B rows of data, and report
in. Is it just me?

2. We do not declare 1.1 EOL until 1.2 is unquestionably stable according
to the most practical rigor we can throw at it with our tooling. Especially
because I still plan to resign as 0.98 RM soon, which I think will trigger
an EOL of that code line.

I will be resigning as 0.98 RM effective January 1 2017 and at that time
the community can discuss what to do with 0.98. From my point of view, I'm
done with spending time on it. Happy to take some of the time freed up and
use it to carry 1.1 forward if we are still making releases off this code
line then.


On Fri, Nov 4, 2016 at 9:24 AM, Nick Dimiduk <nd...@apache.org> wrote:

> Hello HBase Community!
>
> We have a small matter to discuss.
>
> HBase 1.2 has been formally marked as "stable" for the last couple months.
> HBase 1.3.0rc0 is just around the corner. I think it's time to start a
> conversation about retiring the 1.1 line. The volunteer bandwidth for
> maintaining multiple branches is precious and as we spread ourselves more
> thin, odds of decay increase.
>
> I propose discontinuing 1.1 with a single release following 1.3.1. That'll
> give us one last chance to back port any bug fixes discovered in the
> diligence we're putting into the new minor release. Given the current pace
> of 1.3, I estimate this will happen in January or February of 2017. It's
> not a lot of time for existing deployments to get around to upgrading, but
> the upgrade path is trivial and 1.2 has been available for quite some
> time. This will probably make our last release from this branch at 1.1.10
> or there abouts.
>
> Are there any objections or concerns with the above plan? Are there any
> downstream communities who need our help moving onto 1.2? Please let us
> know.
>
> Thanks,
> Nick
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

RE: [DISCUSS] EOL 1.1 Release Branch

Posted by "LeBlanc, Jacob" <ja...@hpe.com>.
Our team is preparing to upgrade our production cluster from 0.98 to 1.1.3. We chose 1.1.3 because that is the latest in bigtop.bom and we deploy using RPMs built with Bigtop. Just mentioning, in case an update to Bigtop is a gating factor for EOL...

--Jacob LeBlanc

-----Original Message-----
From: Nick Dimiduk [mailto:ndimiduk@apache.org] 
Sent: Friday, November 04, 2016 12:25 PM
To: hbase-user; hbase-dev
Subject: [DISCUSS] EOL 1.1 Release Branch

Hello HBase Community!

We have a small matter to discuss.

HBase 1.2 has been formally marked as "stable" for the last couple months.
HBase 1.3.0rc0 is just around the corner. I think it's time to start a conversation about retiring the 1.1 line. The volunteer bandwidth for maintaining multiple branches is precious and as we spread ourselves more thin, odds of decay increase.

I propose discontinuing 1.1 with a single release following 1.3.1. That'll give us one last chance to back port any bug fixes discovered in the diligence we're putting into the new minor release. Given the current pace of 1.3, I estimate this will happen in January or February of 2017. It's not a lot of time for existing deployments to get around to upgrading, but the upgrade path is trivial and 1.2 has been available for quite some time. This will probably make our last release from this branch at 1.1.10 or thereabouts.

Are there any objections or concerns with the above plan? Are there any downstream communities who need our help moving onto 1.2? Please let us know.

Thanks,
Nick
