Posted to dev@hbase.apache.org by Stack <st...@duboce.net> on 2013/10/04 23:20:41 UTC

HEADSUP: Working on new 0.96.0RC

Waiting on the HBASE-9612 Jenkins build, but starting in on making a new RC.
It takes a few hours if all goes well.  Please no commits on the 0.96 branch
till the all clear is sounded.  Thanks.

St.Ack

Re: HEADSUP: Working on new 0.96.0RC

Posted by Steve Loughran <st...@hortonworks.com>.
On 6 October 2013 02:30, Stack <st...@duboce.net> wrote:

> On Sat, Oct 5, 2013 at 6:01 PM, Enis Söztutar <en...@gmail.com> wrote:
>
> > Thanks Stack for doing this. We had a lot of churn between RC3 and 4 (new
> > modules etc). Agreed that we should go easy on the risky patches even if
> > this RC fails.
> >
>
> Yeah.  Sorry about that.  We put out a bunch of development releases but
> downstreamers only seemed to have started paying attention now we are in RC
> state.


That's because if you have data you care about you don't want to start
playing with it until the project team says "this is what you are about to
get".

Non-storage-related projects can get faster pickup, but even there it's
late beta before you get the flood of bug reports.



> The addition of the new test module was to make hbase have hadoop's
> form so downstreamers could depend on our test tools/cluster explicitly and
> all dependencies would get pulled in (mvn resolve is wonky for the *-test
> jars).
>
> St.Ack
>

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: HEADSUP: Working on new 0.96.0RC

Posted by Stack <st...@duboce.net>.
On Sat, Oct 5, 2013 at 6:01 PM, Enis Söztutar <en...@gmail.com> wrote:

> Thanks Stack for doing this. We had a lot of churn between RC3 and 4 (new
> modules etc). Agreed that we should go easy on the risky patches even if this
> RC fails.
>

Yeah.  Sorry about that.  We put out a bunch of development releases but
downstreamers only seemed to have started paying attention now we are in RC
state.  The addition of the new test module was to make hbase have hadoop's
form so downstreamers could depend on our test tools/cluster explicitly and
all dependencies would get pulled in (mvn resolve is wonky for the *-test
jars).

St.Ack

Re: HEADSUP: Working on new 0.96.0RC

Posted by Enis Söztutar <en...@gmail.com>.
Thanks Stack for doing this. We had a lot of churn between RC3 and 4 (new
modules etc). Agreed that we should go easy on the risky patches even if this
RC fails.

Enis


On Sat, Oct 5, 2013 at 5:00 PM, Stack <st...@duboce.net> wrote:

> On Fri, Oct 4, 2013 at 2:20 PM, Stack <st...@duboce.net> wrote:
>
> > Waiting on HBASE-9612 jenkins build but starting in making a new RC.  It
> > takes a few hours if all goes well.  Please no commits on 0.96 branch
> till
> > the all clear is sounded.  Thanks.
> >
> >
> All clear, but please only important bug fixes for 0.96 branch; nothing
> that might destabilize.  If you do commit one, mark it fixed in version
> 0.96.1.
> Thanks,
> St.Ack
>

Re: HEADSUP: Working on new 0.96.0RC

Posted by Sergey Shelukhin <se...@hortonworks.com>.
It would be really nice to avoid committing large changes to 96 henceforth
before we have the RC5. Otherwise it would never stabilize at this rate.


On Wed, Oct 9, 2013 at 10:45 AM, Enis Söztutar <en...@apache.org> wrote:

> I just committed https://issues.apache.org/jira/browse/HBASE-9730 for
> this.
> Time for another RC, what do you think?
>
> You know what they say, sixth time is the charm.
> Enis
>
>
> On Tue, Oct 8, 2013 at 2:43 PM, Enis Söztutar <en...@apache.org> wrote:
>
> > HEADS UP:
> > I think we may have to sink this one. Our tests ITBLL and ITLAV with CM
> > fail consistently, and we suspect a problem with HBASE-9612 (although
> > not confirmed yet).
> >
> > More details are coming soon after more digging into logs.
> > Enis
> >
> >
> > On Mon, Oct 7, 2013 at 1:46 PM, Stack <st...@duboce.net> wrote:
> >
> >> On Mon, Oct 7, 2013 at 10:05 AM, Steve Loughran <stevel@hortonworks.com
> >> >wrote:
> >>
> >> > go
> >> >
> >> > well, those are the .gz files, not the JARs, I'll have to download and
> >> > check...
> >> >
> >> >
> >> Give me list of jars you want a sha for and I'll run them for you
> against
> >> the build RC (and publish it)
> >>
> >>
> >>
> >> > BTW, there's a new Hadoop RC out in staging: 2.2.0
> >> >
> >> >
> >> > The RC is available at:
> >> > http://people.apache.org/~acmurthy/hadoop-2.2.0-rc0
> >> > The RC tag in svn is here:
> >> > http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.2.0-rc0
> >>
> >>
> >> Yeah.  Hopefully our RC works against it (I didn't try it).
> >>
> >> St.Ack
> >>
> >
> >
>


Re: HEADSUP: Working on new 0.96.0RC

Posted by Stack <st...@duboce.net>.
On Wed, Oct 9, 2013 at 12:42 PM, Sergey Shelukhin <se...@hortonworks.com>wrote:

> 9696 looks a little bit scary... did you guys test it on your rig?
>
>
As per above, it is being tested now.
St.Ack

Re: HEADSUP: Working on new 0.96.0RC

Posted by Stack <st...@duboce.net>.
On Fri, Oct 11, 2013 at 7:12 PM, liushaohui <li...@xiaomi.com> wrote:

> mr related tests may fail for
>
> https://issues.apache.org/jira/browse/HBASE-8324
>
> - liushaohui
>
>
That fix should be in this RC and tests seemed fine w/ the 2.2 RC.  You
have a different experience liushaohui?

Thanks,
St.Ack

Re: HEADSUP: Working on new 0.96.0RC

Posted by liushaohui <li...@xiaomi.com>.
mr related tests may fail for

https://issues.apache.org/jira/browse/HBASE-8324

- liushaohui

On 10/12/2013 05:03 AM, Stack wrote:
> Anyone tried the 2.2 hadoop that is up for vote at the moment?  I tried our
> unit tests and got these failures:
>
> Failed tests:
>   testCopyTable(org.apache.hadoop.hbase.mapreduce.TestCopyTable): expected:<0> but was:<1>
>   testStartStopRow(org.apache.hadoop.hbase.mapreduce.TestCopyTable): expected:<0> but was:<1>
>   testMultithreadedTableMapper(org.apache.hadoop.hbase.mapreduce.TestMultithreadedTableMapper)
>   testSimpleCase(org.apache.hadoop.hbase.mapreduce.TestImportExport)
>   testMetaExport(org.apache.hadoop.hbase.mapreduce.TestImportExport)
>   testExportScannerBatching(org.apache.hadoop.hbase.mapreduce.TestImportExport)
>   testWithFilter(org.apache.hadoop.hbase.mapreduce.TestImportExport)
>   testWithDeletes(org.apache.hadoop.hbase.mapreduce.TestImportExport)
>   testExcludeMinorCompaction(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat)
>   testMRIncrementalLoad(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat)
>   testMRIncrementalLoadWithSplit(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat)
>   testMROnTable(org.apache.hadoop.hbase.mapreduce.TestImportTsv): expected:<0> but was:<1>
>   testMROnTableWithCustomMapper(org.apache.hadoop.hbase.mapreduce.TestImportTsv): expected:<0> but was:<1>
>   testBulkOutputWithTsvImporterTextMapper(org.apache.hadoop.hbase.mapreduce.TestImportTsv): expected:<0> but was:<1>
>   testBulkOutputWithAnExistingTable(org.apache.hadoop.hbase.mapreduce.TestImportTsv): expected:<0> but was:<1>
>   testMROnTableWithTimestamp(org.apache.hadoop.hbase.mapreduce.TestImportTsv): expected:<0> but was:<1>
>   testBulkOutputWithoutAnExistingTable(org.apache.hadoop.hbase.mapreduce.TestImportTsv): expected:<0> but was:<1>
>   testRowCounterNoColumn(org.apache.hadoop.hbase.mapreduce.TestRowCounter)
>   testRowCounterHiddenColumn(org.apache.hadoop.hbase.mapreduce.TestRowCounter)
>   testRowCounterExclusiveColumn(org.apache.hadoop.hbase.mapreduce.TestRowCounter)
>   testCombiner(org.apache.hadoop.hbase.mapreduce.TestTableMapReduce)
>   testMultiRegionTable(org.apache.hadoop.hbase.mapreduce.TestTableMapReduce)
>   testScanEmptyToAPP(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1)
>   testScanEmptyToBBA(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1)
>   testScanEmptyToBBB(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1)
>   testScanEmptyToOPP(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1)
>   testScanEmptyToEmpty(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1)
>   testWALPlayer(org.apache.hadoop.hbase.mapreduce.TestWALPlayer): expected:<0> but was:<1>
>   testScanYZYToEmpty(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
>   testScanOPPToEmpty(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
>   testScanYYXToEmpty(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
>   testScanOBBToOPP(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
>   testScanOBBToQPP(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
>   testScanFromConfiguration(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
>   testScanYYYToEmpty(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
>   testExportFileSystemState(org.apache.hadoop.hbase.snapshot.TestExportSnapshot): expected:<0> but was:<1>
>   testSnapshotWithRefsExportFileSystemState(org.apache.hadoop.hbase.snapshot.TestExportSnapshot): expected:<0> but was:<1>
>
>
> Anyone else seeing this?
> St.Ack
>
>
>
> On Fri, Oct 11, 2013 at 10:05 AM, Stack <st...@duboce.net> wrote:
>
>> On Thu, Oct 10, 2013 at 6:39 PM, Sergey Shelukhin <se...@hortonworks.com>wrote:
>>
>>>> Can we agree if the IT tests are green for a certain number of runs in a
>>>> row, then it's stable?
>>>
>>> What do you mean by IT tests are green? Ours are mostly green lately
>>> (except for recently fixed bugs).
>>> Can you please share some investigation details? Maybe file bugs with
>>> description of symptoms, like logs and stuff; are you sure you are hitting
>>> 9696 in particular?
>>>
>> We've been trying to keep up HBASE-9696 w/ ongoing notes.  We should do
>> better for sure but big picture is that we have evidence that what is in
>> HBASE-9696 is an improvement over what we have now having had two sustained
>> runs w/o data loss.   The fix is needed so we can do long-running hbase-it
>> suites; w/o it we were just crash-landing a few hours in.
>>
>>
>>> 9696 is a very big patch too, it can introduce more bugs and will require
>>> more fixing.
>>> We do need to have some deadline where large/risky changes cannot go imho.
>>>
>>>
>>>
>> Agree but after reviews, I do not know how to avoid it (see 9696 and its
>> RB)
>>
>> I suggest we commit HBASE-9696 as is since it is an incompatible change with
>> its introduction of two new states, states that we do not seem to be able
>> to do without.  Then I cut an RC.  If there are further issues in 9696, we
>> can fine-tune/bug-fix post release.
>>
>> On another note, a rig run that has been going for almost 24 hours has
>> gone further than any run of the last few weeks.  That is good.
>>
>> Let us know if you need any more info/insight.  Almost there.
>> St.Ack
>>


Re: HEADSUP: Working on new 0.96.0RC

Posted by Enis Söztutar <en...@apache.org>.
> We're not releasing a new rc tonight, it seems really weird to hold up a
> bug fix to try and hit some unknown, and un-agreed upon, deadline.
There is no deadline, but we cannot hold the RC for non-blocker patches
especially once the release candidate process has begun. We are not going
to solve every bug in existence (look at HBASE-9721 for example) in HBase.
In the usual case for a release, once an RC is cut, you do not want to
destabilize by adding risky patches and continue this kind of cat-and-mouse
game. Note that 9696 did not hold up the cut for the previous RC, so why
should it hold it up now?

> To me it feels like HBASE-9724 should go into 0.96
I am fine with committing HBASE-9724. I did not commit that yesterday since
I thought we should be doing an RC and I did not want last-minute fixes to
AM.

Let's get 9724 and a solution for 9563 in. Would that work?

Enis


On Wed, Oct 9, 2013 at 9:22 PM, Elliott Clark <ec...@apache.org> wrote:

> To me it feels like HBASE-9724 should go into 0.96.  We're not releasing a
> new rc tonight, it seems really weird to hold up a bug fix to try and hit
> some unknown, and un-agreed upon, deadline.
>
>
> On Wed, Oct 9, 2013 at 8:08 PM, Stack <st...@duboce.net> wrote:
>
>> On Wed, Oct 9, 2013 at 7:14 PM, Enis Söztutar <en...@apache.org> wrote:
>>
>>> > Anyways, if you fellas can't wait anymore, just say and we'll figure
>>> out
>>> something.
>>> As I see it, HBASE-9563 is committed,
>>
>>
>> It is still open and committed with qualification "Stack:...Was going to
>> try this first but likely needs more..."  and "Elliott: +1 I think it's
>> an improvement even if it doesn't 100% fix the master issue."
>>
>>
>>
>>> and HBASE-9696 is not a blocker
>>> against 0.96. But if you argue that 9696 is indeed a blocker, let's raise
>>> it as such.
>>
>>
>>
>> Agree.
>>
>>
>>
>>> There is no point in creating an RC, and immediately sinking it
>>> if we cannot verify the RC for a +1. We don't run into data loss issues
>>> anymore which is why I still think we can release 0.96 even without 9696
>>> and 9724. Nothing is preventing us from releasing 0.96.1, with this and
>>> more fixes, in let's say a couple of weeks or months.
>>>
>>> I guess let's wait for tomorrow to see whether there is any progress on
>>> 9563 and 9696.
>>>
>>
>> Yes.  Let's take this up tomorrow.  Elliott and I are on the master issue,
>> HBASE-9563, this evening.
>>
>> Thanks Enis,
>> St.Ack
>>
>
>

Re: HEADSUP: Working on new 0.96.0RC

Posted by Stack <st...@duboce.net>.
Ok.  HBASE-9696 is in.  Let me start the RC build.  No commits to 0.96
please.  Thanks.
St.Ack


On Fri, Oct 11, 2013 at 2:19 PM, Stack <st...@duboce.net> wrote:

> False alarm.  Local DNS issue.  Seems to work on a real box.
> Thanks,
> St.Ack
>
>
> On Fri, Oct 11, 2013 at 2:09 PM, Devaraj Das <dd...@hortonworks.com> wrote:
>
>> Likewise, no failures with hadoop-2.2
>>
>>
>> On Fri, Oct 11, 2013 at 2:07 PM, Ted Yu <yu...@gmail.com> wrote:
>>
>> > Can you provide some detail about the test failure ?
>> >
>> > I ran test suite for trunk on hadoop 2.2 and didn't see such failure.
>> >
>> > Cheers
>> >
>> >

Re: HEADSUP: Working on new 0.96.0RC

Posted by Stack <st...@duboce.net>.
False alarm.  Local DNS issue.  Seems to work on a real box.
Thanks,
St.Ack


On Fri, Oct 11, 2013 at 2:09 PM, Devaraj Das <dd...@hortonworks.com> wrote:

> Likewise, no failures with hadoop-2.2
>
>
> On Fri, Oct 11, 2013 at 2:07 PM, Ted Yu <yu...@gmail.com> wrote:
>
> > Can you provide some detail about the test failure ?
> >
> > I ran test suite for trunk on hadoop 2.2 and didn't see such failure.
> >
> > Cheers
> >
> >

Re: HEADSUP: Working on new 0.96.0RC

Posted by Devaraj Das <dd...@hortonworks.com>.
Likewise, no failures with hadoop-2.2


On Fri, Oct 11, 2013 at 2:07 PM, Ted Yu <yu...@gmail.com> wrote:

> Can you provide some detail about the test failure ?
>
> I ran test suite for trunk on hadoop 2.2 and didn't see such failure.
>
> Cheers
>
>
> > On Fri, Oct 11, 2013 at 10:05 AM, Stack <st...@duboce.net> wrote:
> >
> > > On Thu, Oct 10, 2013 at 6:39 PM, Sergey Shelukhin <sergey@hortonworks.com> wrote:
> > >
> > >> >Can we agree if the IT tests are green for a certain number of runs
> > >> in a row, then it's stable?
> > >>
> > >> What do you mean by IT tests are green? Ours are mostly green lately
> > >> (except for recently fixed bugs).
> > >> Can you please share some investigation details? Maybe file bugs with
> > >> description of symptoms, like logs and stuff; are you sure you are
> > >> hitting 9696 in particular?
> > >>
> > >
> > > We've been trying to keep up HBASE-9696 w/ ongoing notes.  We should do
> > > better for sure but big picture is that we have evidence that what is
> > > in HBASE-9696 is an improvement over what we have now having had two
> > > sustained runs w/o data loss.   The fix is needed so we can do
> > > long-running hbase-it suites; w/o it we were just crash-landing a few
> > > hours in.
> > >
> > >
> > >> 9696 is a very big patch too, it can introduce more bugs and will
> > >> require more fixing.
> > >> We do need to have some deadline where large/risky changes cannot go
> > >> imho.
> > >>
> > >>
> > >>
> > > Agree but after reviews, I do not know how to avoid it (see 9696 and
> > > its RB)
> > >
> > > I suggest we commit hbase-9696 as is since it an incompatible change
> > > with its introduction of two new states, states that we do not seem to
> > > be able to do without.  Then I cut an RC.  If further issue in 9696, we
> > > can fine tune/bug-fix post release.
> > >
> > > On another note, a rig run that has been going for almost 24 hours has
> > > gone further than any run of the last few weeks.  That is good.
> > >
> > > Let us know if need any more info/insight.  Almost there.
> > > St.Ack
> > >
> >
>


Re: HEADSUP: Working on new 0.96.0RC

Posted by Ted Yu <yu...@gmail.com>.
Can you provide some detail about the test failures?

I ran the test suite for trunk on Hadoop 2.2 and didn't see such failures.

Cheers


On Fri, Oct 11, 2013 at 2:03 PM, Stack <st...@duboce.net> wrote:

> Anyone tried the 2.2 hadoop that is up for vote at the moment?  I tried our
> unit tests and got these failures:
>
> Failed tests:
>   testCopyTable(org.apache.hadoop.hbase.mapreduce.TestCopyTable): expected:<0> but was:<1>
>   testStartStopRow(org.apache.hadoop.hbase.mapreduce.TestCopyTable): expected:<0> but was:<1>
>   testMultithreadedTableMapper(org.apache.hadoop.hbase.mapreduce.TestMultithreadedTableMapper)
>   testSimpleCase(org.apache.hadoop.hbase.mapreduce.TestImportExport)
>   testMetaExport(org.apache.hadoop.hbase.mapreduce.TestImportExport)
>   testExportScannerBatching(org.apache.hadoop.hbase.mapreduce.TestImportExport)
>   testWithFilter(org.apache.hadoop.hbase.mapreduce.TestImportExport)
>   testWithDeletes(org.apache.hadoop.hbase.mapreduce.TestImportExport)
>   testExcludeMinorCompaction(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat)
>   testMRIncrementalLoad(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat)
>   testMRIncrementalLoadWithSplit(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat)
>   testMROnTable(org.apache.hadoop.hbase.mapreduce.TestImportTsv): expected:<0> but was:<1>
>   testMROnTableWithCustomMapper(org.apache.hadoop.hbase.mapreduce.TestImportTsv): expected:<0> but was:<1>
>   testBulkOutputWithTsvImporterTextMapper(org.apache.hadoop.hbase.mapreduce.TestImportTsv): expected:<0> but was:<1>
>   testBulkOutputWithAnExistingTable(org.apache.hadoop.hbase.mapreduce.TestImportTsv): expected:<0> but was:<1>
>   testMROnTableWithTimestamp(org.apache.hadoop.hbase.mapreduce.TestImportTsv): expected:<0> but was:<1>
>   testBulkOutputWithoutAnExistingTable(org.apache.hadoop.hbase.mapreduce.TestImportTsv): expected:<0> but was:<1>
>   testRowCounterNoColumn(org.apache.hadoop.hbase.mapreduce.TestRowCounter)
>   testRowCounterHiddenColumn(org.apache.hadoop.hbase.mapreduce.TestRowCounter)
>   testRowCounterExclusiveColumn(org.apache.hadoop.hbase.mapreduce.TestRowCounter)
>   testCombiner(org.apache.hadoop.hbase.mapreduce.TestTableMapReduce)
>   testMultiRegionTable(org.apache.hadoop.hbase.mapreduce.TestTableMapReduce)
>   testScanEmptyToAPP(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1)
>   testScanEmptyToBBA(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1)
>   testScanEmptyToBBB(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1)
>   testScanEmptyToOPP(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1)
>   testScanEmptyToEmpty(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1)
>   testWALPlayer(org.apache.hadoop.hbase.mapreduce.TestWALPlayer): expected:<0> but was:<1>
>   testScanYZYToEmpty(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
>   testScanOPPToEmpty(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
>   testScanYYXToEmpty(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
>   testScanOBBToOPP(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
>   testScanOBBToQPP(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
>   testScanFromConfiguration(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
>   testScanYYYToEmpty(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
>   testExportFileSystemState(org.apache.hadoop.hbase.snapshot.TestExportSnapshot): expected:<0> but was:<1>
>   testSnapshotWithRefsExportFileSystemState(org.apache.hadoop.hbase.snapshot.TestExportSnapshot): expected:<0> but was:<1>
>
>
> Anyone else seeing this?
> St.Ack
>
>
>
> On Fri, Oct 11, 2013 at 10:05 AM, Stack <st...@duboce.net> wrote:
>
> > On Thu, Oct 10, 2013 at 6:39 PM, Sergey Shelukhin <sergey@hortonworks.com> wrote:
> >
> >> >Can we agree if the IT tests are green for a certain number of runs in
> >> a row, then it's stable?
> >>
> >> What do you mean by IT tests are green? Ours are mostly green lately
> >> (except for recently fixed bugs).
> >> Can you please share some investigation details? Maybe file bugs with
> >> description of symptoms, like logs and stuff; are you sure you are
> >> hitting 9696 in particular?
> >>
> >
> > We've been trying to keep up HBASE-9696 w/ ongoing notes.  We should do
> > better for sure but big picture is that we have evidence that what is in
> > HBASE-9696 is an improvement over what we have now having had two
> > sustained runs w/o data loss.   The fix is needed so we can do
> > long-running hbase-it suites; w/o it we were just crash-landing a few
> > hours in.
> >
> >
> >> 9696 is a very big patch too, it can introduce more bugs and will
> >> require more fixing.
> >> We do need to have some deadline where large/risky changes cannot go
> >> imho.
> >>
> >>
> >>
> > Agree but after reviews, I do not know how to avoid it (see 9696 and its
> > RB)
> >
> > I suggest we commit hbase-9696 as is since it an incompatible change with
> > its introduction of two new states, states that we do not seem to be able
> > to do without.  Then I cut an RC.  If further issue in 9696, we can fine
> > tune/bug-fix post release.
> >
> > On another note, a rig run that has been going for almost 24 hours has
> > gone further than any run of the last few weeks.  That is good.
> >
> > Let us know if need any more info/insight.  Almost there.
> > St.Ack
> >
>

Re: HEADSUP: Working on new 0.96.0RC

Posted by Stack <st...@duboce.net>.
Anyone tried the 2.2 hadoop that is up for vote at the moment?  I tried our
unit tests and got these failures:

Failed tests:
  testCopyTable(org.apache.hadoop.hbase.mapreduce.TestCopyTable): expected:<0> but was:<1>
  testStartStopRow(org.apache.hadoop.hbase.mapreduce.TestCopyTable): expected:<0> but was:<1>
  testMultithreadedTableMapper(org.apache.hadoop.hbase.mapreduce.TestMultithreadedTableMapper)
  testSimpleCase(org.apache.hadoop.hbase.mapreduce.TestImportExport)
  testMetaExport(org.apache.hadoop.hbase.mapreduce.TestImportExport)
  testExportScannerBatching(org.apache.hadoop.hbase.mapreduce.TestImportExport)
  testWithFilter(org.apache.hadoop.hbase.mapreduce.TestImportExport)
  testWithDeletes(org.apache.hadoop.hbase.mapreduce.TestImportExport)
  testExcludeMinorCompaction(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat)
  testMRIncrementalLoad(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat)
  testMRIncrementalLoadWithSplit(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat)
  testMROnTable(org.apache.hadoop.hbase.mapreduce.TestImportTsv): expected:<0> but was:<1>
  testMROnTableWithCustomMapper(org.apache.hadoop.hbase.mapreduce.TestImportTsv): expected:<0> but was:<1>
  testBulkOutputWithTsvImporterTextMapper(org.apache.hadoop.hbase.mapreduce.TestImportTsv): expected:<0> but was:<1>
  testBulkOutputWithAnExistingTable(org.apache.hadoop.hbase.mapreduce.TestImportTsv): expected:<0> but was:<1>
  testMROnTableWithTimestamp(org.apache.hadoop.hbase.mapreduce.TestImportTsv): expected:<0> but was:<1>
  testBulkOutputWithoutAnExistingTable(org.apache.hadoop.hbase.mapreduce.TestImportTsv): expected:<0> but was:<1>
  testRowCounterNoColumn(org.apache.hadoop.hbase.mapreduce.TestRowCounter)
  testRowCounterHiddenColumn(org.apache.hadoop.hbase.mapreduce.TestRowCounter)
  testRowCounterExclusiveColumn(org.apache.hadoop.hbase.mapreduce.TestRowCounter)
  testCombiner(org.apache.hadoop.hbase.mapreduce.TestTableMapReduce)
  testMultiRegionTable(org.apache.hadoop.hbase.mapreduce.TestTableMapReduce)
  testScanEmptyToAPP(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1)
  testScanEmptyToBBA(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1)
  testScanEmptyToBBB(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1)
  testScanEmptyToOPP(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1)
  testScanEmptyToEmpty(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1)
  testWALPlayer(org.apache.hadoop.hbase.mapreduce.TestWALPlayer): expected:<0> but was:<1>
  testScanYZYToEmpty(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
  testScanOPPToEmpty(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
  testScanYYXToEmpty(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
  testScanOBBToOPP(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
  testScanOBBToQPP(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
  testScanFromConfiguration(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
  testScanYYYToEmpty(org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2)
  testExportFileSystemState(org.apache.hadoop.hbase.snapshot.TestExportSnapshot): expected:<0> but was:<1>
  testSnapshotWithRefsExportFileSystemState(org.apache.hadoop.hbase.snapshot.TestExportSnapshot): expected:<0> but was:<1>


Anyone else seeing this?
St.Ack



On Fri, Oct 11, 2013 at 10:05 AM, Stack <st...@duboce.net> wrote:

> On Thu, Oct 10, 2013 at 6:39 PM, Sergey Shelukhin <se...@hortonworks.com>wrote:
>
>> >Can we agree if the IT tests are green for a certain number of runs in a
>> row, then it's stable?
>>
>> What do you mean by IT tests are green? Ours are mostly green lately
>> (except for recently fixed bugs).
>> Can you please share some investigation details? Maybe file bugs with
>> description of symptoms, like logs and stuff; are you sure you are hitting
>> 9696 in particular?
>>
>
> We've been trying to keep up HBASE-9696 w/ ongoing notes.  We should do
> better for sure but big picture is that we have evidence that what is in
> HBASE-9696 is an improvement over what we have now having had two sustained
> runs w/o data loss.   The fix is needed so we can do long-running hbase-it
> suites; w/o it we were just crash-landing a few hours in.
>
>
>> 9696 is a very big patch too, it can introduce more bugs and will require
>> more fixing.
>> We do need to have some deadline where large/risky changes cannot go imho.
>>
>>
>>
> Agree but after reviews, I do not know how to avoid it (see 9696 and its
> RB)
>
> I suggest we commit hbase-9696 as is since it an incompatible change with
> its introduction of two new states, states that we do not seem to be able
> to do without.  Then I cut an RC.  If further issue in 9696, we can fine
> tune/bug-fix post release.
>
> On another note, a rig run that has been going for almost 24 hours has
> gone further than any run of the last few weeks.  That is good.
>
> Let us know if need any more info/insight.  Almost there.
> St.Ack
>

Re: HEADSUP: Working on new 0.96.0RC

Posted by Stack <st...@duboce.net>.
On Thu, Oct 10, 2013 at 6:39 PM, Sergey Shelukhin <se...@hortonworks.com>wrote:

> >Can we agree if the IT tests are green for a certain number of runs in a
> row, then it's stable?
>
> What do you mean by IT tests are green? Ours are mostly green lately
> (except for recently fixed bugs).
> Can you please share some investigation details? Maybe file bugs with
> description of symptoms, like logs and stuff; are you sure you are hitting
> 9696 in particular?
>

We've been trying to keep up HBASE-9696 w/ ongoing notes.  We should do
better for sure but big picture is that we have evidence that what is in
HBASE-9696 is an improvement over what we have now having had two sustained
runs w/o data loss.   The fix is needed so we can do long-running hbase-it
suites; w/o it we were just crash-landing a few hours in.


> 9696 is a very big patch too, it can introduce more bugs and will require
> more fixing.
> We do need to have some deadline where large/risky changes cannot go imho.
>
>
>
Agree but after reviews, I do not know how to avoid it (see 9696 and its RB)

I suggest we commit HBASE-9696 as is since it is an incompatible change with
its introduction of two new states, states that we do not seem to be able
to do without.  Then I cut an RC.  If further issues turn up in 9696, we can
fine-tune/bug-fix post release.

On another note, a rig run that has been going for almost 24 hours has gone
further than any run of the last few weeks.  That is good.

Let us know if you need any more info/insight.  Almost there.
St.Ack

Re: HEADSUP: Working on new 0.96.0RC

Posted by Sergey Shelukhin <se...@hortonworks.com>.
> Can we agree if the IT tests are green for a certain number of runs in a
> row, then it's stable?

What do you mean by IT tests are green? Ours are mostly green lately
(except for recently fixed bugs).
Can you please share some investigation details? Maybe file bugs with
description of symptoms, like logs and stuff; are you sure you are hitting
9696 in particular?
9696 is a very big patch too, it can introduce more bugs and will require
more fixing.
We do need to have some deadline after which large/risky changes cannot go
in, IMHO.





On Thu, Oct 10, 2013 at 10:14 AM, Jimmy Xiang <jx...@cloudera.com> wrote:

> Can we agree if the IT tests are green for a certain number of runs in a
> row, then it's stable?
>
>
> On Thu, Oct 10, 2013 at 10:08 AM, Sergey Shelukhin
> <se...@hortonworks.com>wrote:
>
> > It looks like HBASE-9724 got committed. Was it the final patch for it?
> > It's a small and hopefully safe.
> >
> > If patch is large and risky, and the feature it fixes is
> > semi-experimental, like HBASE-9696, IMHO it should not be blocker for
> > the release.
> > The concern is that we keep making large changes to AM that fix some bugs
> > but may introduce more bugs (like it happened with the last one), so it's
> > hard to tell when it will stabilize at all.
> >
> >
> >
> > On Wed, Oct 9, 2013 at 9:22 PM, Elliott Clark <ec...@apache.org> wrote:
> >
> > > To me it feels like HBASE-9724 should go into 0.96.  We're not
> > > releasing a new rc tonight, it seems really weird to hold up a bug fix
> > > to try and hit some unknown, and un-agreed upon, deadline.
> > >
> > >
> > > On Wed, Oct 9, 2013 at 8:08 PM, Stack <st...@duboce.net> wrote:
> > >
> > > > On Wed, Oct 9, 2013 at 7:14 PM, Enis Söztutar <en...@apache.org> wrote:
> > > >
> > > >> > Anyways, if you fellas can't wait anymore, just say and we'll figure
> > > >> out something.
> > > >> As I see it, HBASE-9563 is committed,
> > > >
> > > >
> > > > It is still open and committed with qualification "Stack:...Was going
> > > > to try this first but likely needs more..."  and "Elliott: +1 I think
> > > > it's an improvement even if it doesn't 100% fix the master issue."
> > > >
> > > >
> > > >
> > > >> and HBASE-9696 is not a blocker
> > > >> against 0.96. But if you argue that 9696 is indeed a blocker, let's
> > > >> raise it as such.
> > > >
> > > >
> > > >
> > > > Agree.
> > > >
> > > >
> > > >
> > > >> There is no point in creating an RC, an immediately sinking it
> > > >> if we cannot verify the RC for a +1. We don't run into data loss
> > > >> issues anymore which is why I still think we can release 0.96 even
> > > >> without 9696 and 9724. Nothing is preventing us to release 0.96.1,
> > > >> with this and more fixes in let's say a couple of weeks or months.
> > > >>
> > > >> I guess let's wait for tomorrow to see whether there is any progress
> > > >> on 9563 and 9696.
> > > >>
> > > >
> > > > Yes.  Lets take this up tomorrow.  Elliott and I are on the master
> > > > issue, HBASE-9563, this evening.
> > > >
> > > > Thanks Enis,
> > > > St.Ack
> > > >
> > >
> >
>


Re: HEADSUP: Working on new 0.96.0RC

Posted by Jimmy Xiang <jx...@cloudera.com>.
Can we agree if the IT tests are green for a certain number of runs in a
row, then it's stable?


On Thu, Oct 10, 2013 at 10:08 AM, Sergey Shelukhin
<se...@hortonworks.com>wrote:

> It looks like HBASE-9724 got committed. Was it the final patch for it? It's
> a small and hopefully safe.
>
> If patch is large and risky, and the feature it fixes is semi-experimental,
> like HBASE-9696, IMHO it should not be blocker for the release.
> The concern is that we keep making large changes to AM that fix some bugs
> but may introduce more bugs (like it happened with the last one), so it's
> hard to tell when it will stabilize at all.
>
>
>
> On Wed, Oct 9, 2013 at 9:22 PM, Elliott Clark <ec...@apache.org> wrote:
>
> > To me it feels like HBASE-9724 should go into 0.96.  We're not releasing
> > a new rc tonight, it seems really weird to hold up a bug fix to try and
> > hit some unknown, and un-agreed upon, deadline.
> >
> >
> > On Wed, Oct 9, 2013 at 8:08 PM, Stack <st...@duboce.net> wrote:
> >
> > > On Wed, Oct 9, 2013 at 7:14 PM, Enis Söztutar <en...@apache.org> wrote:
> > >
> > >> > Anyways, if you fellas can't wait anymore, just say and we'll figure
> > >> out something.
> > >> As I see it, HBASE-9563 is committed,
> > >
> > >
> > > It is still open and committed with qualification "Stack:...Was going
> > > to try this first but likely needs more..."  and "Elliott: +1 I think
> > > it's an improvement even if it doesn't 100% fix the master issue."
> > >
> > >
> > >
> > >> and HBASE-9696 is not a blocker
> > >> against 0.96. But if you argue that 9696 is indeed a blocker, let's
> > >> raise it as such.
> > >
> > >
> > >
> > > Agree.
> > >
> > >
> > >
> > >> There is no point in creating an RC, an immediately sinking it
> > >> if we cannot verify the RC for a +1. We don't run into data loss
> > >> issues anymore which is why I still think we can release 0.96 even
> > >> without 9696 and 9724. Nothing is preventing us to release 0.96.1,
> > >> with this and more fixes in let's say a couple of weeks or months.
> > >>
> > >> I guess let's wait for tomorrow to see whether there is any progress
> > >> on 9563 and 9696.
> > >>
> > >
> > > Yes.  Lets take this up tomorrow.  Elliott and I are on the master
> > > issue, HBASE-9563, this evening.
> > >
> > > Thanks Enis,
> > > St.Ack
> > >
> >
>

Re: HEADSUP: Working on new 0.96.0RC

Posted by Sergey Shelukhin <se...@hortonworks.com>.
It looks like HBASE-9724 got committed. Was it the final patch for it? It's
small and hopefully safe.

If a patch is large and risky, and the feature it fixes is semi-experimental,
like HBASE-9696, IMHO it should not be a blocker for the release.
The concern is that we keep making large changes to AM that fix some bugs
but may introduce more bugs (like it happened with the last one), so it's
hard to tell when it will stabilize at all.



On Wed, Oct 9, 2013 at 9:22 PM, Elliott Clark <ec...@apache.org> wrote:

> To me it feels like HBASE-9724 should go into 0.96.  We're not releasing a
> new rc tonight, it seems really weird to hold up a bug fix to try and hit
> some unknown, and un-agreed upon, deadline.
>
>
> On Wed, Oct 9, 2013 at 8:08 PM, Stack <st...@duboce.net> wrote:
>
> > On Wed, Oct 9, 2013 at 7:14 PM, Enis Söztutar <en...@apache.org> wrote:
> >
> >> > Anyways, if you fellas can't wait anymore, just say and we'll figure
> >> out something.
> >> As I see it, HBASE-9563 is committed,
> >
> >
> > It is still open and committed with qualification "Stack:...Was going to
> > try this first but likely needs more..."  and "Elliott: +1 I think it's
> > an improvement even if it doesn't 100% fix the master issue."
> >
> >
> >
> >> and HBASE-9696 is not a blocker
> >> against 0.96. But if you argue that 9696 is indeed a blocker, let's
> >> raise it as such.
> >
> >
> >
> > Agree.
> >
> >
> >
> >> There is no point in creating an RC, an immediately sinking it
> >> if we cannot verify the RC for a +1. We don't run into data loss issues
> >> anymore which is why I still think we can release 0.96 even without 9696
> >> and 9724. Nothing is preventing us to release 0.96.1, with this and more
> >> fixes in let's say a couple of weeks or months.
> >>
> >> I guess let's wait for tomorrow to see whether there is any progress on
> >> 9563 and 9696.
> >>
> >
> > Yes.  Lets take this up tomorrow.  Elliott and I are on the master issue,
> > HBASE-9563, this evening.
> >
> > Thanks Enis,
> > St.Ack
> >
>


Re: HEADSUP: Working on new 0.96.0RC

Posted by Elliott Clark <ec...@apache.org>.
To me it feels like HBASE-9724 should go into 0.96.  We're not releasing a
new rc tonight, it seems really weird to hold up a bug fix to try and hit
some unknown, and un-agreed upon, deadline.


On Wed, Oct 9, 2013 at 8:08 PM, Stack <st...@duboce.net> wrote:

> On Wed, Oct 9, 2013 at 7:14 PM, Enis Söztutar <en...@apache.org> wrote:
>
>> > Anyways, if you fellas can't wait anymore, just say and we'll figure out
>> something.
>> As I see it, HBASE-9563 is committed,
>
>
> It is still open and committed with qualification "Stack:...Was going to
> try this first but likely needs more..."  and "Elliott: +1 I think it's
> an improvement even if it doesn't 100% fix the master issue."
>
>
>
>> and HBASE-9696 is not a blocker
>> against 0.96. But if you argue that 9696 is indeed a blocker, let's raise
>> it as such.
>
>
>
> Agree.
>
>
>
>> There is no point in creating an RC, an immediately sinking it
>> if we cannot verify the RC for a +1. We don't run into data loss issues
>> anymore which is why I still think we can release 0.96 even without 9696
>> and 9724. Nothing is preventing us to release 0.96.1, with this and more
>> fixes in let's say a couple of weeks or months.
>>
>> I guess let's wait for tomorrow to see whether there is any progress on
>> 9563 and 9696.
>>
>
> Yes.  Lets take this up tomorrow.  Elliott and I are on the master issue,
> HBASE-9563, this evening.
>
> Thanks Enis,
> St.Ack
>

Re: HEADSUP: Working on new 0.96.0RC

Posted by Stack <st...@duboce.net>.
On Wed, Oct 9, 2013 at 7:14 PM, Enis Söztutar <en...@apache.org> wrote:

> > Anyways, if you fellas can't wait anymore, just say and we'll figure out
> something.
> As I see it, HBASE-9563 is committed,


It is still open and committed with qualification "Stack:...Was going to
try this first but likely needs more..."  and "Elliott: +1 I think it's an
improvement even if it doesn't 100% fix the master issue."



> and HBASE-9696 is not a blocker
> against 0.96. But if you argue that 9696 is indeed a blocker, let's raise
> it as such.



Agree.



> There is no point in creating an RC, an immediately sinking it
> if we cannot verify the RC for a +1. We don't run into data loss issues
> anymore which is why I still think we can release 0.96 even without 9696
> and 9724. Nothing is preventing us to release 0.96.1, with this and more
> fixes in let's say a couple of weeks or months.
>
> I guess let's wait for tomorrow to see whether there is any progress on
> 9563 and 9696.
>

Yes.  Let's take this up tomorrow.  Elliott and I are on the master issue,
HBASE-9563, this evening.

Thanks Enis,
St.Ack

Re: HEADSUP: Working on new 0.96.0RC

Posted by Enis Söztutar <en...@apache.org>.
> Anyways, if you fellas can't wait anymore, just say and we'll figure out
> something.
As I see it, HBASE-9563 is committed, and HBASE-9696 is not a blocker
against 0.96. But if you argue that 9696 is indeed a blocker, let's raise
it as such. There is no point in creating an RC and immediately sinking it
if we cannot verify the RC for a +1. We don't run into data loss issues
anymore, which is why I still think we can release 0.96 even without 9696
and 9724. Nothing is preventing us from releasing 0.96.1, with this and more
fixes, in, let's say, a couple of weeks or months.

I guess let's wait for tomorrow to see whether there is any progress on
9563 and 9696.

Enis


On Wed, Oct 9, 2013 at 5:55 PM, Stack <st...@duboce.net> wrote:

> On Wed, Oct 9, 2013 at 5:30 PM, Enis Söztutar <en...@apache.org> wrote:
>
>>  HBASE-9563 is trivial enough and it is already in 0.96. We may have run
>> that into some point, but not lately. Do you see your tests succeeding
>> with HBASE-9563 and HBASE-9696?
>>
>>
> Both are under test in independent rigs.  For HBASE-9563, we are trying to
> repro the clash of the masters to see if the patch helped.  We've also
> instrumented the rig so we can get more data when we hit the hang again.
>
> Anyways, if you fellas can't wait anymore, just say and we'll figure out
> something.
>

Re: HEADSUP: Working on new 0.96.0RC

Posted by Stack <st...@duboce.net>.
On Wed, Oct 9, 2013 at 5:30 PM, Enis Söztutar <en...@apache.org> wrote:

>  HBASE-9563 is trivial enough and it is already in 0.96. We may have run
> that into some point, but not lately. Do you see your tests succeeding with
> HBASE-9563 and HBASE-9696?
>
>
Both are under test in independent rigs.  For HBASE-9563, we are trying to
repro the clash of the masters to see if the patch helped.  We've also
instrumented the rig so we can get more data when we hit the hang again.

Anyways, if you fellas can't wait anymore, just say and we'll figure out
something.

Re: HEADSUP: Working on new 0.96.0RC

Posted by Enis Söztutar <en...@apache.org>.
HBASE-9563 is trivial enough and it is already in 0.96. We may have run
into that at some point, but not lately. Do you see your tests succeeding with
HBASE-9563 and HBASE-9696?


On Wed, Oct 9, 2013 at 5:13 PM, Stack <st...@duboce.net> wrote:

> On Wed, Oct 9, 2013 at 4:51 PM, Enis Söztutar <en...@apache.org> wrote:
>
> > HBASE-9563 is already committed to 0.96. That leaves only HBASE-9696 and
> > HBASE-9724 under discussion. I am holding off on committing 9724 for the
> > time being. Are there any more issues that might be a blocker against this
> > release?
> >
> >
> As mentioned above, HBASE-9563 makes it so our hbase-it suite does not
> complete.  We've not had a successful run in weeks on our end.  This
> issue is our current stumbling block.  Let me designate it a blocker while
> we are digging and discussing.  You fellas are not running into this?
>
> Thanks,
> St.Ack
>

Re: HEADSUP: Working on new 0.96.0RC

Posted by Stack <st...@duboce.net>.
On Wed, Oct 9, 2013 at 4:51 PM, Enis Söztutar <en...@apache.org> wrote:

> HBASE-9563 is already committed to 0.96. That leaves only HBASE-9696 and
> HBASE-9724 under discussion. I am holding off on committing 9724 for the
> time being. Are there any more issues that might be a blocker against this
> release?
>
>
As mentioned above, HBASE-9563 makes it so our hbase-it suite does not
complete.  We've not had a successful run in weeks on our end.  This
issue is our current stumbling block.  Let me designate it a blocker while
we are digging and discussing.  You fellas are not running into this?

Thanks,
St.Ack

Re: HEADSUP: Working on new 0.96.0RC

Posted by Enis Söztutar <en...@apache.org>.
HBASE-9563 is already committed to 0.96. That leaves only HBASE-9696 and
HBASE-9724 under discussion. I am holding off on committing 9724 for the time
being. Are there any more issues that might be a blocker against this
release?

After 1.5 years without a major release, and the RC process nearing 40
days, I think we should only accept absolute blockers at this point. As far
as I am concerned, neither 9724 nor 9696 is a blocker against 0.96. Merge
is a new feature, and nothing critical depends on it. We can release saying
that merge is experimental (which was how it was originally introduced,
AFAIK) and disable merge in CM for now if it makes tests flaky. We did not
identify a root cause that would point to 9696, although we have been
running tests with CM for some time. We can still fix the merge and do a
quick 0.96.1, in the release train model that proved to be so successful
for 0.94. We do not have to delay 0.96 another month just to fix a corner
case for a new feature.

As for our testing, we have been testing the 95 and 96 branches for a
couple of months. We still see some sporadic failures for CM tests, but no
blockers at this point. Most of the issues have been fixed so far. Our
nightlies run ITBLL, ITLAV, both with and without CM running for ~3 hours,
ITMTTR, and many other ITs. My manual runs for longer intervals also
succeed for now. Remember that none of these ITs would run even once for
earlier versions of 0.94 or before.

Elliott, what are the root causes for the failures you are seeing? There
are no blockers raised as far as I can see. Let's decide whether HBASE-9696
is a blocker, and do the new candidate based on that unless there are more
blockers.

Enis


On Wed, Oct 9, 2013 at 2:52 PM, Elliott Clark <ec...@apache.org> wrote:

> On Wed, Oct 9, 2013 at 2:33 PM, Devaraj Das <dd...@hortonworks.com> wrote:
> >
> > For the 0.96.0 version, can we not say that "merge" should be used
> > with caution.
>
>
> I would feel very uncomfortable with that.  Telling people to just
> hope that the servers don't crash while a merge is going on seems like
> an unwise strategy.  Crashes and power failures are completely beyond
> users' control.  Since we have a proposed fix, it seems better to me
> that we hold off on this.  Get the tests done.  Then get the patch in,
> and start another round of testing.
>
> Also, the master not coming back up, while not a known data loss issue
> like 9696, is very concerning.  We should get to the bottom of this.
> It's making TestMTTR fail, along with others sporadically.
>
> We've taken > 1.5 years on this release and we're on the home stretch.
> We should make sure this is a really stable, quality release and
> not try to rush it.  Right now we're failing IT tests left and right.
> We can't even pass an ingest test that lasts 4 hours.  That's
> something I can't see myself recommending to anyone in its current
> state.  So that seems to me something that we shouldn't release.  And
> if we put up an RC now then we just know that it's going to fail IT
> tests and so will probably be a failed RC.
>
> I want this release out as badly as anyone else but I'd rather we have
> something that people can really and truly trust and not just
> something we have rushed.
>

Re: HEADSUP: Working on new 0.96.0RC

Posted by Elliott Clark <ec...@apache.org>.
On Wed, Oct 9, 2013 at 2:33 PM, Devaraj Das <dd...@hortonworks.com> wrote:
>
> For the 0.96.0 version, can we not say that "merge" should be used
> with caution.


I would feel very uncomfortable with that.  Telling people to just
hope that the servers don't crash while a merge is going on seems like
an unwise strategy.  Crashes and power failures are completely beyond
users' control.  Since we have a proposed fix, it seems better to me
that we hold off on this.  Get the tests done.  Then get the patch in,
and start another round of testing.

Also, the master not coming back up, while not a known data loss issue
like 9696, is very concerning.  We should get to the bottom of this.
It's making TestMTTR fail, along with others sporadically.

We've taken > 1.5 years on this release and we're on the home stretch.
We should make sure this is a really stable, quality release and
not try to rush it.  Right now we're failing IT tests left and right.
We can't even pass an ingest test that lasts 4 hours.  That's
something I can't see myself recommending to anyone in its current
state.  So that seems to me something that we shouldn't release.  And
if we put up an RC now then we just know that it's going to fail IT
tests and so will probably be a failed RC.

I want this release out as badly as anyone else but I'd rather we have
something that people can really and truly trust and not just
something we have rushed.

Re: HEADSUP: Working on new 0.96.0RC

Posted by Jimmy Xiang <jx...@cloudera.com>.
Not the move call, the HMaster#unassign() call.  It reads the region from
meta.  It is used in MoveRandomRegionOfTableAction.

For this call, we don't want to check AM region states map because it is
used by hbck in case AM region states is stale.
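
To make the distinction concrete, here is a rough toy model in Java. It is
only a sketch: RegionLookupSketch and all of its methods are hypothetical
names made up for illustration, not HBase classes, and the two maps merely
stand in for meta and for the master's in-memory region states. It shows a
move-style call rejecting a daughter region the master has not yet heard
about, while an unassign-style call that consults only meta accepts it.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model only: none of these names are real HBase classes or methods. */
public class RegionLookupSketch {
  // Stand-in for the meta table: region name -> hosting server.
  private final Map<String, String> meta = new HashMap<>();
  // Stand-in for the master's in-memory region states.
  private final Set<String> masterRegionStates = new HashSet<>();

  /** A split writes a daughter region to meta before the master knows of it. */
  public void writeDaughterToMeta(String region, String server) {
    meta.put(region, server);
  }

  /** Later, the master records the region in its in-memory state. */
  public void masterLearnsRegion(String region) {
    masterRegionStates.add(region);
  }

  /** Mirrors the move() excerpt: a region missing from master state is rejected. */
  public void move(String region) {
    if (!masterRegionStates.contains(region)) {
      throw new IllegalStateException("Unknown region: " + region);
    }
  }

  /** Mirrors the unassign path as described: only meta is consulted. */
  public boolean unassign(String region) {
    return meta.containsKey(region);
  }

  public static void main(String[] args) {
    RegionLookupSketch cluster = new RegionLookupSketch();
    cluster.writeDaughterToMeta("daughterA", "rs1");
    System.out.println("unassign accepted: " + cluster.unassign("daughterA"));
    try {
      cluster.move("daughterA");
      System.out.println("move accepted");
    } catch (IllegalStateException e) {
      System.out.println("move rejected: " + e.getMessage());
    }
  }
}
```

In this model the window between writeDaughterToMeta and masterLearnsRegion
is where a meta-based call can act on a region the master does not yet track.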


On Wed, Oct 9, 2013 at 4:05 PM, Enis Söztutar <en...@apache.org> wrote:

> On Wed, Oct 9, 2013 at 2:49 PM, Jimmy Xiang <jx...@cloudera.com> wrote:
>
> > I prefer to have 9696 in. It's not just about merging.  I am also trying
> > to make sure splitting is good.  Currently, if a region is splitting, the
> > two daughters are written to meta first. CM could move them around before
> > the master knows about these two new regions, so they could be
> > double-assigned for a short while.  That could be one reason why ITBLL
> > still shows data loss somewhere.
> >
>
> Is this really the case? If client learns the daughter regions from meta,
> before master learns about the split, even if they call
> HMaster.moveRegion(), they would get UnknownRegionException, no?
>
>   void move(final byte[] encodedRegionName,
>       final byte[] destServerName) throws HBaseIOException {
>     RegionState regionState = assignmentManager.getRegionStates().
>       getRegionState(Bytes.toString(encodedRegionName));
>     if (regionState == null) {
>       throw new UnknownRegionException(Bytes.toStringBinary(encodedRegionName));
>     }
>
>
>
> >
> > I think we should make sure ITBLL runs well with no data loss before we
> > release 0.96.0.  Data loss is a big concern to me.
> >
> >
> > On Wed, Oct 9, 2013 at 2:33 PM, Devaraj Das <dd...@hortonworks.com>
> wrote:
> >
> >> I am not sure I agree with this though. The reason being - HBASE-9696
> was
> >> raised on Saturday and we have cut an RC after that. So why not another
> >> one
> >> now? For the 0.96.0 version, can we not say that "merge" should be used
> >> with caution. Also, it is not guaranteed that we will not face any new
> IT
> >> issues after 9696 goes in, right?
> >> Let's cut 0.96.0 now and fix remaining issues in 0.96.1. Thoughts?
> >>
> >>
> >> On Wed, Oct 9, 2013 at 2:17 PM, Elliott Clark <ec...@apache.org>
> wrote:
> >>
> >> > At this point I think that we should have real clean IT test runs
> before
> >> > cutting another release.  And we can't really get that until the
> master
> >> > always comes back up (The issue stack was working on yesterday) and
> >> until
> >> > merging is stable.  I would like to see those two things fixed before
> >> 0.96
> >> >
> >> >
> >> > On Wed, Oct 9, 2013 at 1:38 PM, Devaraj Das <dd...@hortonworks.com>
> >> wrote:
> >> >
> >> > > I'd say we cut an RC now (without any more fixes).
> >> > >
> >> > >
> >> > > On Wed, Oct 9, 2013 at 12:45 PM, Jimmy Xiang <jx...@cloudera.com>
> >> > wrote:
> >> > >
> >> > > > It's testing now. :)
> >> > > >
> >> > > >
> >> > > > On Wed, Oct 9, 2013 at 12:42 PM, Sergey Shelukhin <
> >> > > sergey@hortonworks.com
> >> > > > >wrote:
> >> > > >
> >> > > > > 9696 looks a little bit scary... did you guys test it on your
> rig?
> >> > > > >
> >> > > > >
> >> > > > > On Wed, Oct 9, 2013 at 11:54 AM, Stack <st...@duboce.net>
> wrote:
> >> > > > >
> >> > > > > > On Wed, Oct 9, 2013 at 10:45 AM, Enis Söztutar <
> enis@apache.org
> >> >
> >> > > > wrote:
> >> > > > > >
> >> > > > > > > I just committed
> >> > > https://issues.apache.org/jira/browse/HBASE-9730 for
> >> > > > > > > this.
> >> > > > > > > Time for another RC, what do you think?
> >> > > > > > >
> >> > > > > > > You know what they say, sixth time is the charm.
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > I can cut one no problem.  Just say.
> >> > > > > >
> >> > > > > > Does your test rig pass?  Ours hasn't yet because of
> HBASE-9563;
> >> > > master
> >> > > > > is
> >> > > > > > killed and won't come back though restarted and tests fail.
> >> > > > > >
> >> > > > > > Do we want HBASE-9696 in there?  It is currently under test.
> >> > > > > >
> >> > > > > > And HBASE-9724 Failed region split is not handled correctly by
> >> AM?
> >> > > > > >
> >> > > > > > But if you fellas need me to put up a new one, just say.  Just
> >> > takes
> >> > > a
> >> > > > > few
> >> > > > > > hours.
> >> > > > > >
> >> > > > > > St.Ack
> >> > > > > >
> >> > > > >
> >> > > > >
> >> > > >
> >> > >
> >> > >
> >> >
> >>
> >>
> >
> >
>

Re: HEADSUP: Working on new 0.96.0RC

Posted by Enis Söztutar <en...@apache.org>.
On Wed, Oct 9, 2013 at 2:49 PM, Jimmy Xiang <jx...@cloudera.com> wrote:

> I prefer to have 9696 in. It's not just about merging.  I am also trying
> to make sure splitting is good.  Currently, if a region is splitting, the
> two daughters are written to meta first. CM could move them around before
> the master knows about these two new regions, so they could be
> double-assigned for a short while.  That could be one reason why ITBLL
> still shows data loss somewhere.
>

Is this really the case? If client learns the daughter regions from meta,
before master learns about the split, even if they call
HMaster.moveRegion(), they would get UnknownRegionException, no?

  void move(final byte[] encodedRegionName,
      final byte[] destServerName) throws HBaseIOException {
    RegionState regionState = assignmentManager.getRegionStates().
      getRegionState(Bytes.toString(encodedRegionName));
    if (regionState == null) {
      throw new UnknownRegionException(Bytes.toStringBinary(encodedRegionName));
    }



>
> I think we should make sure ITBLL runs well with no data loss before we
> release 0.96.0.  Data loss is a big concern to me.
>
>
> On Wed, Oct 9, 2013 at 2:33 PM, Devaraj Das <dd...@hortonworks.com> wrote:
>
>> I am not sure I agree with this though. The reason being - HBASE-9696 was
>> raised on Saturday and we have cut an RC after that. So why not another
>> one
>> now? For the 0.96.0 version, can we not say that "merge" should be used
>> with caution. Also, it is not guaranteed that we will not face any new IT
>> issues after 9696 goes in, right?
>> Let's cut 0.96.0 now and fix remaining issues in 0.96.1. Thoughts?
>>
>>
>> On Wed, Oct 9, 2013 at 2:17 PM, Elliott Clark <ec...@apache.org> wrote:
>>
>> > At this point I think that we should have real clean IT test runs before
>> > cutting another release.  And we can't really get that until the master
>> > always comes back up (The issue stack was working on yesterday) and
>> until
>> > merging is stable.  I would like to see those two things fixed before
>> 0.96
>> >
>> >
>> > On Wed, Oct 9, 2013 at 1:38 PM, Devaraj Das <dd...@hortonworks.com>
>> wrote:
>> >
>> > > I'd say we cut an RC now (without any more fixes).
>> > >
>> > >
>> > > On Wed, Oct 9, 2013 at 12:45 PM, Jimmy Xiang <jx...@cloudera.com>
>> > wrote:
>> > >
>> > > > It's testing now. :)
>> > > >
>> > > >
>> > > > On Wed, Oct 9, 2013 at 12:42 PM, Sergey Shelukhin <
>> > > sergey@hortonworks.com
>> > > > >wrote:
>> > > >
>> > > > > 9696 looks a little bit scary... did you guys test it on your rig?
>> > > > >
>> > > > >
>> > > > > On Wed, Oct 9, 2013 at 11:54 AM, Stack <st...@duboce.net> wrote:
>> > > > >
>> > > > > > On Wed, Oct 9, 2013 at 10:45 AM, Enis Söztutar <enis@apache.org
>> >
>> > > > wrote:
>> > > > > >
>> > > > > > > I just committed
>> > > https://issues.apache.org/jira/browse/HBASE-9730 for
>> > > > > > > this.
>> > > > > > > Time for another RC, what do you think?
>> > > > > > >
>> > > > > > > You know what they say, sixth time is the charm.
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > I can cut one no problem.  Just say.
>> > > > > >
>> > > > > > Does your test rig pass?  Ours hasn't yet because of HBASE-9563;
>> > > master
>> > > > > is
>> > > > > > killed and won't come back though restarted and tests fail.
>> > > > > >
>> > > > > > Do we want HBASE-9696 in there?  It is currently under test.
>> > > > > >
>> > > > > > And HBASE-9724 Failed region split is not handled correctly by
>> AM?
>> > > > > >
>> > > > > > But if you fellas need me to put up a new one, just say.  Just
>> > takes
>> > > a
>> > > > > few
>> > > > > > hours.
>> > > > > >
>> > > > > > St.Ack
>> > > > > >
>> > > > >
>> > > > >
>> > > >
>> > >
>> > >
>> >
>>
>>
>
>

Re: HEADSUP: Working on new 0.96.0RC

Posted by Jimmy Xiang <jx...@cloudera.com>.
I prefer to have 9696 in. It's not just about merging.  I am also trying to
make sure splitting is good.  Currently, if a region is splitting, the two
daughters are written to meta first. CM could move them around before the
master knows about these two new regions, so they could be double-assigned
for a short while.  That could be one reason why ITBLL still shows data loss
somewhere.

I think we should make sure ITBLL runs well with no data loss before we
release 0.96.0.  Data loss is a big concern to me.


On Wed, Oct 9, 2013 at 2:33 PM, Devaraj Das <dd...@hortonworks.com> wrote:

> I am not sure I agree with this though. The reason being - HBASE-9696 was
> raised on Saturday and we have cut an RC after that. So why not another one
> now? For the 0.96.0 version, can we not say that "merge" should be used
> with caution. Also, it is not guaranteed that we will not face any new IT
> issues after 9696 goes in, right?
> Let's cut 0.96.0 now and fix remaining issues in 0.96.1. Thoughts?
>
>
> On Wed, Oct 9, 2013 at 2:17 PM, Elliott Clark <ec...@apache.org> wrote:
>
> > At this point I think that we should have real clean IT test runs before
> > cutting another release.  And we can't really get that until the master
> > always comes back up (The issue stack was working on yesterday) and until
> > merging is stable.  I would like to see those two things fixed before
> 0.96
> >
> >
> > On Wed, Oct 9, 2013 at 1:38 PM, Devaraj Das <dd...@hortonworks.com>
> wrote:
> >
> > > I'd say we cut an RC now (without any more fixes).
> > >
> > >
> > > On Wed, Oct 9, 2013 at 12:45 PM, Jimmy Xiang <jx...@cloudera.com>
> > wrote:
> > >
> > > > It's testing now. :)
> > > >
> > > >
> > > > On Wed, Oct 9, 2013 at 12:42 PM, Sergey Shelukhin <
> > > sergey@hortonworks.com
> > > > >wrote:
> > > >
> > > > > 9696 looks a little bit scary... did you guys test it on your rig?
> > > > >
> > > > >
> > > > > On Wed, Oct 9, 2013 at 11:54 AM, Stack <st...@duboce.net> wrote:
> > > > >
> > > > > > On Wed, Oct 9, 2013 at 10:45 AM, Enis Söztutar <en...@apache.org>
> > > > wrote:
> > > > > >
> > > > > > > I just committed
> > > > https://issues.apache.org/jira/browse/HBASE-9730 for
> > > > > > > this.
> > > > > > > Time for another RC, what do you think?
> > > > > > >
> > > > > > > You know what they say, sixth time is the charm.
> > > > > >
> > > > > >
> > > > > >
> > > > > > I can cut one no problem.  Just say.
> > > > > >
> > > > > > Does your test rig pass?  Ours hasn't yet because of HBASE-9563;
> > > master
> > > > > is
> > > > > > killed and won't come back though restarted and tests fail.
> > > > > >
> > > > > > Do we want HBASE-9696 in there?  It is currently under test.
> > > > > >
> > > > > > And HBASE-9724 Failed region split is not handled correctly by
> AM?
> > > > > >
> > > > > > But if you fellas need me to put up a new one, just say.  Just
> > takes
> > > a
> > > > > few
> > > > > > hours.
> > > > > >
> > > > > > St.Ack
> > > > > >
> > > > >
> > > > > --
> > > > > CONFIDENTIALITY NOTICE
> > > > > NOTICE: This message is intended for the use of the individual or
> > > entity
> > > > to
> > > > > which it is addressed and may contain information that is
> > confidential,
> > > > > privileged and exempt from disclosure under applicable law. If the
> > > reader
> > > > > of this message is not the intended recipient, you are hereby
> > notified
> > > > that
> > > > > any printing, copying, dissemination, distribution, disclosure or
> > > > > forwarding of this communication is strictly prohibited. If you
> have
> > > > > received this communication in error, please contact the sender
> > > > immediately
> > > > > and delete it from your system. Thank You.
> > > > >
> > > >
> > >
> > > --
> > > CONFIDENTIALITY NOTICE
> > > NOTICE: This message is intended for the use of the individual or
> entity
> > to
> > > which it is addressed and may contain information that is confidential,
> > > privileged and exempt from disclosure under applicable law. If the
> reader
> > > of this message is not the intended recipient, you are hereby notified
> > that
> > > any printing, copying, dissemination, distribution, disclosure or
> > > forwarding of this communication is strictly prohibited. If you have
> > > received this communication in error, please contact the sender
> > immediately
> > > and delete it from your system. Thank You.
> > >
> >
>
>

Re: HEADSUP: Working on new 0.96.0RC

Posted by Devaraj Das <dd...@hortonworks.com>.
I am not sure I agree with this though. The reason being - HBASE-9696 was
raised on Saturday and we have cut an RC after that. So why not another one
now? For the 0.96.0 version, can we not say that "merge" should be used
with caution? Also, it is not guaranteed that we will not face any new IT
issues after 9696 goes in, right?
Let's cut 0.96.0 now and fix remaining issues in 0.96.1. Thoughts?


On Wed, Oct 9, 2013 at 2:17 PM, Elliott Clark <ec...@apache.org> wrote:

> At this point I think that we should have real clean IT test runs before
> cutting another release.  And we can't really get that until the master
> always comes back up (The issue stack was working on yesterday) and until
> merging is stable.  I would like to see those two things fixed before 0.96
>
>
> On Wed, Oct 9, 2013 at 1:38 PM, Devaraj Das <dd...@hortonworks.com> wrote:
>
> > I'd say we cut an RC now (without any more fixes).
> >
> >
> > On Wed, Oct 9, 2013 at 12:45 PM, Jimmy Xiang <jx...@cloudera.com>
> wrote:
> >
> > > It's testing now. :)
> > >
> > >
> > > On Wed, Oct 9, 2013 at 12:42 PM, Sergey Shelukhin <
> > sergey@hortonworks.com
> > > >wrote:
> > >
> > > > 9696 looks a little bit scary... did you guys test it on your rig?
> > > >
> > > >
> > > > On Wed, Oct 9, 2013 at 11:54 AM, Stack <st...@duboce.net> wrote:
> > > >
> > > > > On Wed, Oct 9, 2013 at 10:45 AM, Enis Söztutar <en...@apache.org>
> > > wrote:
> > > > >
> > > > > > I just committed
> > > https://issues.apache.org/jira/browse/HBASE-9730 for
> > > > > > this.
> > > > > > Time for another RC, what do you think?
> > > > > >
> > > > > > You know what they say, sixth time is the charm.
> > > > >
> > > > >
> > > > >
> > > > > I can cut one no problem.  Just say.
> > > > >
> > > > > Does your test rig pass?  Ours hasn't yet because of HBASE-9563;
> > master
> > > > is
> > > > > killed and won't come back though restarted and tests fail.
> > > > >
> > > > > Do we want HBASE-9696 in there?  It is currently under test.
> > > > >
> > > > > And HBASE-9724 Failed region split is not handled correctly by AM?
> > > > >
> > > > > But if you fellas need me to put up a new one, just say.  Just
> takes
> > a
> > > > few
> > > > > hours.
> > > > >
> > > > > St.Ack
> > > > >
> > > >
> > > > --
> > > > CONFIDENTIALITY NOTICE
> > > > NOTICE: This message is intended for the use of the individual or
> > entity
> > > to
> > > > which it is addressed and may contain information that is
> confidential,
> > > > privileged and exempt from disclosure under applicable law. If the
> > reader
> > > > of this message is not the intended recipient, you are hereby
> notified
> > > that
> > > > any printing, copying, dissemination, distribution, disclosure or
> > > > forwarding of this communication is strictly prohibited. If you have
> > > > received this communication in error, please contact the sender
> > > immediately
> > > > and delete it from your system. Thank You.
> > > >
> > >
> >
> > --
> > CONFIDENTIALITY NOTICE
> > NOTICE: This message is intended for the use of the individual or entity
> to
> > which it is addressed and may contain information that is confidential,
> > privileged and exempt from disclosure under applicable law. If the reader
> > of this message is not the intended recipient, you are hereby notified
> that
> > any printing, copying, dissemination, distribution, disclosure or
> > forwarding of this communication is strictly prohibited. If you have
> > received this communication in error, please contact the sender
> immediately
> > and delete it from your system. Thank You.
> >
>

Re: HEADSUP: Working on new 0.96.0RC

Posted by Elliott Clark <ec...@apache.org>.
At this point I think that we should have really clean IT test runs before
cutting another release.  And we can't really get that until the master
always comes back up (the issue Stack was working on yesterday) and until
merging is stable.  I would like to see those two things fixed before 0.96.


On Wed, Oct 9, 2013 at 1:38 PM, Devaraj Das <dd...@hortonworks.com> wrote:

> I'd say we cut an RC now (without any more fixes).
>
>
> On Wed, Oct 9, 2013 at 12:45 PM, Jimmy Xiang <jx...@cloudera.com> wrote:
>
> > It's testing now. :)
> >
> >
> > On Wed, Oct 9, 2013 at 12:42 PM, Sergey Shelukhin <
> sergey@hortonworks.com
> > >wrote:
> >
> > > 9696 looks a little bit scary... did you guys test it on your rig?
> > >
> > >
> > > On Wed, Oct 9, 2013 at 11:54 AM, Stack <st...@duboce.net> wrote:
> > >
> > > > On Wed, Oct 9, 2013 at 10:45 AM, Enis Söztutar <en...@apache.org>
> > wrote:
> > > >
> > > > > I just committed
> https://issues.apache.org/jira/browse/HBASE-9730 for
> > > > > this.
> > > > > Time for another RC, what do you think?
> > > > >
> > > > > You know what they say, sixth time is the charm.
> > > >
> > > >
> > > >
> > > > I can cut one no problem.  Just say.
> > > >
> > > > Does your test rig pass?  Ours hasn't yet because of HBASE-9563;
> master
> > > is
> > > > killed and won't come back though restarted and tests fail.
> > > >
> > > > Do we want HBASE-9696 in there?  It is currently under test.
> > > >
> > > > And HBASE-9724 Failed region split is not handled correctly by AM?
> > > >
> > > > But if you fellas need me to put up a new one, just say.  Just takes
> a
> > > few
> > > > hours.
> > > >
> > > > St.Ack
> > > >
> > >
> > > --
> > > CONFIDENTIALITY NOTICE
> > > NOTICE: This message is intended for the use of the individual or
> entity
> > to
> > > which it is addressed and may contain information that is confidential,
> > > privileged and exempt from disclosure under applicable law. If the
> reader
> > > of this message is not the intended recipient, you are hereby notified
> > that
> > > any printing, copying, dissemination, distribution, disclosure or
> > > forwarding of this communication is strictly prohibited. If you have
> > > received this communication in error, please contact the sender
> > immediately
> > > and delete it from your system. Thank You.
> > >
> >
>
>

Re: HEADSUP: Working on new 0.96.0RC

Posted by Devaraj Das <dd...@hortonworks.com>.
I'd say we cut an RC now (without any more fixes).


On Wed, Oct 9, 2013 at 12:45 PM, Jimmy Xiang <jx...@cloudera.com> wrote:

> It's testing now. :)
>
>
> On Wed, Oct 9, 2013 at 12:42 PM, Sergey Shelukhin <sergey@hortonworks.com
> >wrote:
>
> > 9696 looks a little bit scary... did you guys test it on your rig?
> >
> >
> > On Wed, Oct 9, 2013 at 11:54 AM, Stack <st...@duboce.net> wrote:
> >
> > > On Wed, Oct 9, 2013 at 10:45 AM, Enis Söztutar <en...@apache.org>
> wrote:
> > >
> > > > I just committed https://issues.apache.org/jira/browse/HBASE-9730 for
> > > > this.
> > > > Time for another RC, what do you think?
> > > >
> > > > You know what they say, sixth time is the charm.
> > >
> > >
> > >
> > > I can cut one no problem.  Just say.
> > >
> > > Does your test rig pass?  Ours hasn't yet because of HBASE-9563; master
> > is
> > > killed and won't come back though restarted and tests fail.
> > >
> > > Do we want HBASE-9696 in there?  It is currently under test.
> > >
> > > And HBASE-9724 Failed region split is not handled correctly by AM?
> > >
> > > But if you fellas need me to put up a new one, just say.  Just takes a
> > few
> > > hours.
> > >
> > > St.Ack
> > >
> >
> >
>


Re: HEADSUP: Working on new 0.96.0RC

Posted by Jimmy Xiang <jx...@cloudera.com>.
It's testing now. :)


On Wed, Oct 9, 2013 at 12:42 PM, Sergey Shelukhin <se...@hortonworks.com> wrote:

> 9696 looks a little bit scary... did you guys test it on your rig?
>
>
> On Wed, Oct 9, 2013 at 11:54 AM, Stack <st...@duboce.net> wrote:
>
> > On Wed, Oct 9, 2013 at 10:45 AM, Enis Söztutar <en...@apache.org> wrote:
> >
> > > I just committed https://issues.apache.org/jira/browse/HBASE-9730 for
> > > this.
> > > Time for another RC, what do you think?
> > >
> > > You know what they say, sixth time is the charm.
> >
> >
> >
> > I can cut one no problem.  Just say.
> >
> > Does your test rig pass?  Ours hasn't yet because of HBASE-9563; master is
> > killed and won't come back though restarted and tests fail.
> >
> > Do we want HBASE-9696 in there?  It is currently under test.
> >
> > And HBASE-9724 Failed region split is not handled correctly by AM?
> >
> > But if you fellas need me to put up a new one, just say.  Just takes a few
> > hours.
> >
> > St.Ack
> >
>

Re: HEADSUP: Working on new 0.96.0RC

Posted by Sergey Shelukhin <se...@hortonworks.com>.
9696 looks a little bit scary... did you guys test it on your rig?


On Wed, Oct 9, 2013 at 11:54 AM, Stack <st...@duboce.net> wrote:

> On Wed, Oct 9, 2013 at 10:45 AM, Enis Söztutar <en...@apache.org> wrote:
>
> > I just committed https://issues.apache.org/jira/browse/HBASE-9730 for
> > this.
> > Time for another RC, what do you think?
> >
> > You know what they say, sixth time is the charm.
>
>
>
> I can cut one no problem.  Just say.
>
> Does your test rig pass?  Ours hasn't yet because of HBASE-9563; master is
> killed and won't come back though restarted and tests fail.
>
> Do we want HBASE-9696 in there?  It is currently under test.
>
> And HBASE-9724 Failed region split is not handled correctly by AM?
>
> But if you fellas need me to put up a new one, just say.  Just takes a few
> hours.
>
> St.Ack
>


Re: HEADSUP: Working on new 0.96.0RC

Posted by Stack <st...@duboce.net>.
On Wed, Oct 9, 2013 at 10:45 AM, Enis Söztutar <en...@apache.org> wrote:

> I just committed https://issues.apache.org/jira/browse/HBASE-9730 for
> this.
> Time for another RC, what do you think?
>
> You know what they say, sixth time is the charm.



I can cut one no problem.  Just say.

Does your test rig pass?  Ours hasn't yet because of HBASE-9563; master is
killed and won't come back though restarted and tests fail.

Do we want HBASE-9696 in there?  It is currently under test.

And HBASE-9724 Failed region split is not handled correctly by AM?

But if you fellas need me to put up a new one, just say.  Just takes a few
hours.

St.Ack

Re: HEADSUP: Working on new 0.96.0RC

Posted by Enis Söztutar <en...@apache.org>.
I just committed https://issues.apache.org/jira/browse/HBASE-9730 for this.
Time for another RC, what do you think?

You know what they say, sixth time is the charm.
Enis


On Tue, Oct 8, 2013 at 2:43 PM, Enis Söztutar <en...@apache.org> wrote:

> HEADS UP:
> I think we may have to sink this one. Our tests ITBLL and ITLAV with CM
> fail consistently, and we suspect a problem with HBASE-9612 (although not
> confirmed yet)
>
> More details are coming soon after more digging into logs.
> Enis
>
>
> On Mon, Oct 7, 2013 at 1:46 PM, Stack <st...@duboce.net> wrote:
>
>> On Mon, Oct 7, 2013 at 10:05 AM, Steve Loughran <stevel@hortonworks.com>
>> wrote:
>>
>> > go
>> >
>> > well, those are the .gz files, not the JARs, I'll have to download and
>> > check...
>> >
>> >
>> Give me a list of the jars you want a SHA for and I'll run them for you
>> against the build RC (and publish it)
>>
>>
>>
>> > BTW, there's a new Hadoop RC out in staging: 2.2.0
>> >
>> >
>> > The RC is available at:
>> > http://people.apache.org/~acmurthy/hadoop-2.2.0-rc0
>> > The RC tag in svn is here:
>> > http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.2.0-rc0
>>
>>
>> Yeah.  Hopefully our RC works against it (I didn't try it).
>>
>> St.Ack
>>
>
>

Re: HEADSUP: Working on new 0.96.0RC

Posted by Enis Söztutar <en...@apache.org>.
HEADS UP:
I think we may have to sink this one. Our tests ITBLL and ITLAV with CM
fail consistently, and we suspect a problem with HBASE-9612 (although not
confirmed yet)

More details are coming soon after more digging into logs.
Enis


On Mon, Oct 7, 2013 at 1:46 PM, Stack <st...@duboce.net> wrote:

> On Mon, Oct 7, 2013 at 10:05 AM, Steve Loughran <stevel@hortonworks.com>
> wrote:
>
> > go
> >
> > well, those are the .gz files, not the JARs, I'll have to download and
> > check...
> >
> >
> Give me a list of the jars you want a SHA for and I'll run them for you
> against the build RC (and publish it)
>
>
>
> > BTW, there's a new Hadoop RC out in staging: 2.2.0
> >
> >
> > The RC is available at:
> > http://people.apache.org/~acmurthy/hadoop-2.2.0-rc0
> > The RC tag in svn is here:
> > http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.2.0-rc0
>
>
> Yeah.  Hopefully our RC works against it (I didn't try it).
>
> St.Ack
>

Re: HEADSUP: Working on new 0.96.0RC

Posted by Stack <st...@duboce.net>.
On Mon, Oct 7, 2013 at 10:05 AM, Steve Loughran <st...@hortonworks.com> wrote:

> go
>
> well, those are the .gz files, not the JARs, I'll have to download and
> check...
>
>
Give me a list of the jars you want a SHA for and I'll run them for you
against the build RC (and publish it)



> BTW, there's a new Hadoop RC out in staging: 2.2.0
>
>
> The RC is available at:
> http://people.apache.org/~acmurthy/hadoop-2.2.0-rc0
> The RC tag in svn is here:
> http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.2.0-rc0


Yeah.  Hopefully our RC works against it (I didn't try it).

St.Ack

Re: HEADSUP: Working on new 0.96.0RC

Posted by Steve Loughran <st...@hortonworks.com>.
go

well, those are the .gz files, not the JARs, I'll have to download and
check...

BTW, there's a new Hadoop RC out in staging: 2.2.0


The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.2.0-rc0
The RC tag in svn is here:
http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.2.0-rc0

-steve


Re: HEADSUP: Working on new 0.96.0RC

Posted by Stack <st...@duboce.net>.
On Mon, Oct 7, 2013 at 1:58 AM, Steve Loughran <st...@hortonworks.com> wrote:

> just a check to make sure i am pulling down the right version from staging:
> what is the sha1 of the latest RC?
>
>
>
Did you see the *.mds files up at
http://people.apache.org/~stack/hbase-0.96.0RC4/ ?  The SHA1 is in them.  (I
just compared them to what I have down on the build box here.)
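
For anyone who wants to script that comparison, here is a minimal sketch in
Python.  It assumes you have already downloaded the artifact and copied the
SHA1 value out of the matching *.mds file by hand; the function names are
made up for illustration, and the assumption that the published digest may be
uppercase hex split into space-separated groups should be checked against the
actual file.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large tarballs are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def normalize_digest(text):
    """Drop all whitespace and lowercase, so grouped uppercase hex compares equal."""
    return "".join(text.split()).lower()


def digest_matches(path, published_sha1):
    """True if the file's SHA-1 equals the published value (hypothetical helper)."""
    return sha1_of_file(path) == normalize_digest(published_sha1)
```

Calling digest_matches() with the local path of the download and the SHA1
string pasted from the *.mds file returns True only when the two agree.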

Does that answer your question mighty Steve?

St.Ack



> On 6 October 2013 01:00, Stack <st...@duboce.net> wrote:
>
> > On Fri, Oct 4, 2013 at 2:20 PM, Stack <st...@duboce.net> wrote:
> >
> > > Waiting on HBASE-9612 jenkins build but starting in making a new RC.  It
> > > takes a few hours if all goes well.  Please no commits on 0.96 branch till
> > > the all clear is sounded.  Thanks.
> > >
> > >
> > All clear, but please only important bug fixes for 0.96 branch; nothing
> > that might destabilize.  If you do commit one, mark it fixed in version
> > 0.96.1.
> > Thanks,
> > St.Ack
> >
>

Re: HEADSUP: Working on new 0.96.0RC

Posted by Steve Loughran <st...@hortonworks.com>.
Just a check to make sure I am pulling down the right version from staging:
what is the SHA1 of the latest RC?


On 6 October 2013 01:00, Stack <st...@duboce.net> wrote:

> On Fri, Oct 4, 2013 at 2:20 PM, Stack <st...@duboce.net> wrote:
>
> > Waiting on HBASE-9612 jenkins build but starting in making a new RC.  It
> > takes a few hours if all goes well.  Please no commits on 0.96 branch till
> > the all clear is sounded.  Thanks.
> >
> >
> All clear, but please only important bug fixes for 0.96 branch; nothing
> that might destabilize.  If you do commit one, mark it fixed in version
> 0.96.1.
> Thanks,
> St.Ack
>


Re: HEADSUP: Working on new 0.96.0RC

Posted by Stack <st...@duboce.net>.
On Fri, Oct 4, 2013 at 2:20 PM, Stack <st...@duboce.net> wrote:

> Waiting on HBASE-9612 jenkins build but starting in making a new RC.  It
> takes a few hours if all goes well.  Please no commits on 0.96 branch till
> the all clear is sounded.  Thanks.
>
>
All clear, but please only important bug fixes for 0.96 branch; nothing
that might destabilize.  If you do commit one, mark it fixed in version
0.96.1.
Thanks,
St.Ack