Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2015/11/25 18:02:11 UTC

[jira] [Created] (HBASE-14883) TestSplitTransactionOnCluster#testFailedSplit flakey

stack created HBASE-14883:
-----------------------------

             Summary: TestSplitTransactionOnCluster#testFailedSplit flakey
                 Key: HBASE-14883
                 URL: https://issues.apache.org/jira/browse/HBASE-14883
             Project: HBase
          Issue Type: Sub-task
          Components: flakey, test
    Affects Versions: 1.2.0, 1.3.0
            Reporter: stack


Only fails in branch-1 and branch-1.2.

Failures look like this:

https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.3/jdk=latest1.8,label=Hadoop/397/

TEST-org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.xml.<init>


If I look in the xml, I see this:

{code}
  <testcase name="testFailedSplit" classname="org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster" time="8.275">
    <flakyFailure type="java.lang.AssertionError:">java.lang.AssertionError: null
	at org.junit.Assert.fail(Assert.java:86)
	at org.junit.Assert.assertTrue(Assert.java:41)
	at org.junit.Assert.assertTrue(Assert.java:52)
	at org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster.testFailedSplit(TestSplitTransactionOnCluster.java:1339)

      <system-err><![CDATA[
{code}

... the xml is cut off.

testFailedSplit seems to be the culprit.

If I look in the -output.txt I see:

{code}
....

2015-11-25 09:00:37,816 DEBUG [asf905.gq1.ygridcore.net,48894,1448441976062_ChoreService_1] balancer.BaseLoadBalancer$Cluster(838):  Lowest locality region index is 0 and its region server contains 3 regions
2015-11-25 09:00:37,816 DEBUG [asf905.gq1.ygridcore.net,48894,1448441976062_ChoreService_1] balancer.BaseLoadBalancer$Cluster(813): Lowest locality region server with non zero regions is asf905.gq1.ygridcore.net with locality 0.0
2015-11-25 09:00:37,816 DEBUG [asf905.gq1.ygridcore.net,48894,1448441976062_ChoreService_1] balancer.BaseLoadBalancer$Cluster(838):  Lowest locality region index is 0 and its region server contains 3 regions
2015-11-25 09:00:37,817 DEBUG [asf905.gq1.ygridcore.net,48894,1448441976062_ChoreService_1] balancer.BaseLoadBalancer$Cluster(813): Lowest locality region server with non zero regions is asf905.gq1.ygridcore.net with locality 0.0

...
{code}

spewing...

This test was added here:

{code}
kalashnikov:hbase.git.commit stack$ git log -S testFailedSplit  ./hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java
commit 871444cb0a733b82af843952253b4545a407979a
Author: Andrew Purtell <ap...@apache.org>
Date:   Mon Dec 15 17:31:33 2014 -0800

    HBASE-12686 Failures in split before PONR not clearing the daughter regions from regions in transition during rollback (Vandana Ayyalasomayajula)
{code}

admin.balancer() is not returning true (the assert at line #1339 fails with a null message, per the stack trace above):


{code}
...
1337       regions = TESTING_UTIL.getHBaseAdmin().getTableRegions(tableName);
1338       assertTrue(regions.size() == 1);
1339       assertTrue(admin.balancer());
...
{code}
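If the false return from the balancer turns out to be transient (an assumption; nothing above establishes the cause yet), one common way to make an assertion like the one at line #1339 less timing-sensitive is to poll the condition for a bounded time instead of checking it once. The sketch below is illustrative only, not the fix for this issue; `BalancerRetry` and `waitFor` are hypothetical names, not part of HBase or its test utilities.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of HBase: polls a condition until it
// returns true or the timeout elapses.
class BalancerRetry {

    /**
     * Evaluates the condition repeatedly, sleeping intervalMs between
     * attempts. Returns true as soon as the condition holds, or false
     * once timeoutMs has elapsed without it ever holding.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long intervalMs)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                return false;
            }
            Thread.sleep(intervalMs);
        }
    }
}
```

In the test, the condition would wrap the admin.balancer() call (with any checked exceptions handled inside the lambda), and the assert would take a message string so that a failure reports something more useful than null.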


Line #1339 was not in the original test. It was added later:

{code}
commit 46f993b19fa11d1a8880d08045be43e38017b46a
Author: Virag Kothari <vi...@yahoo-inc.com>
Date:   Wed Jan 7 10:58:32 2015 -0800

    HBASE-12694 testTableExistsIfTheSpecifiedTableRegionIsSplitParent in TestSplitTransactionOnCluster class leaves regions in transition (Vandana Ayyalasomayajula)

{code}

We are having trouble achieving a balance.... Let me see.



