Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2015/11/19 21:47:11 UTC

[jira] [Reopened] (HBASE-14777) Fix Inter Cluster Replication Future ordering issues

     [ https://issues.apache.org/jira/browse/HBASE-14777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reopened HBASE-14777:
---------------------------

Since this patch went in, replication-related tests have been failing on the JDK 1.7 build:

See https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.3/379/

and

https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.3/380/jdk=latest1.7,label=Hadoop/testReport/

Clicking on the tests does not show anything useful (unfortunately; TODO), but if you go to the artifacts and dig out the reports for these tests, you see:


-------------------------------------------------------------------------------
Test set: org.apache.hadoop.hbase.replication.TestReplicationEndpoint
-------------------------------------------------------------------------------
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 283.994 sec <<< FAILURE! - in org.apache.hadoop.hbase.replication.TestReplicationEndpoint
testInterClusterReplication(org.apache.hadoop.hbase.replication.TestReplicationEndpoint)  Time elapsed: 120.72 sec  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 120000 milliseconds
	at java.lang.Thread.start0(Native Method)
	at java.lang.Thread.start(Thread.java:714)
	at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1360)
	at org.apache.hadoop.hbase.client.ResultBoundedCompletionService.submit(ResultBoundedCompletionService.java:146)
	at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.addCallsForCurrentReplica(ScannerCallableWithReplicas.java:279)
	at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:166)
	at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
	at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.loadCache(ClientSmallReversedScanner.java:212)
	at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.next(ClientSmallReversedScanner.java:186)
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1279)
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1185)
	at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:370)
	at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:321)
	at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:238)
	at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:190)
	at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1430)
	at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1021)
	at org.apache.hadoop.hbase.replication.TestReplicationEndpoint.doPut(TestReplicationEndpoint.java:265)
	at org.apache.hadoop.hbase.replication.TestReplicationEndpoint.doPut(TestReplicationEndpoint.java:257)
	at org.apache.hadoop.hbase.replication.TestReplicationEndpoint.testInterClusterReplication(TestReplicationEndpoint.java:202)

Here is output: https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.3/380/jdk=latest1.7,label=Hadoop/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.replication.TestReplicationEndpoint-output.txt

Please take a look. I've started a new build of branch-1.3 in the meantime to try to get more data. Resolve if you don't think it is this patch.




> Fix Inter Cluster Replication Future ordering issues
> ----------------------------------------------------
>
>                 Key: HBASE-14777
>                 URL: https://issues.apache.org/jira/browse/HBASE-14777
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>            Reporter: Bhupendra Kumar Jain
>            Assignee: Ashu Pachauri
>            Priority: Critical
>             Fix For: 2.0.0, 1.2.0, 1.3.0
>
>         Attachments: HBASE-14777-1.patch, HBASE-14777-2.patch, HBASE-14777-3.patch, HBASE-14777-4.patch, HBASE-14777-5.patch, HBASE-14777-6.patch, HBASE-14777.patch
>
>
> Replication fails with IndexOutOfBoundsException 
> {code}
> regionserver.ReplicationSource$ReplicationSourceWorkerThread(939): org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint threw unknown exception:java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
> 	at java.util.ArrayList.rangeCheck(Unknown Source)
> 	at java.util.ArrayList.remove(Unknown Source)
> 	at org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint.replicate(HBaseInterClusterReplicationEndpoint.java:222)
> {code}
> It's happening due to incorrect removal of entries from the replication entries list.
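
The "Index: 1, Size: 1" failure in ArrayList.remove can be reproduced outside HBase. What follows is a minimal sketch, not the HBase code and not any of the attached patches. It assumes that batches are removed from an ArrayList by their original index as tasks complete; the class and method names (BatchRemovalSketch, removeByOriginalIndex, removeDescending) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BatchRemovalSketch {

    // Two single-entry batches, standing in for a list of replication entry lists.
    static List<List<String>> twoBatches() {
        List<List<String>> batches = new ArrayList<>();
        batches.add(new ArrayList<>(Arrays.asList("edit-a")));
        batches.add(new ArrayList<>(Arrays.asList("edit-b")));
        return batches;
    }

    // Removes each completed batch by its ORIGINAL index, in the order given.
    // Each removal shifts later elements left, so a later index can be stale.
    static void removeByOriginalIndex(List<List<String>> batches, List<Integer> completed) {
        for (int idx : completed) {
            batches.remove(idx); // int argument, so this is remove(int index)
        }
    }

    // One possible alternative (an assumption, not the committed fix):
    // remove the highest index first so the remaining indices stay valid.
    static void removeDescending(List<List<String>> batches, List<Integer> completed) {
        List<Integer> sorted = new ArrayList<>(completed);
        Collections.sort(sorted, Collections.reverseOrder());
        for (int idx : sorted) {
            batches.remove(idx);
        }
    }

    public static void main(String[] args) {
        List<Integer> completed = Arrays.asList(0, 1);

        try {
            // After index 0 is removed the list has size 1, so index 1 is out of range.
            removeByOriginalIndex(twoBatches(), completed);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("removal by original index threw IndexOutOfBoundsException");
        }

        List<List<String>> batches = twoBatches();
        removeDescending(batches, completed);
        System.out.println("batches left after descending removal: " + batches.size());
    }
}
```

Removing in descending index order is only one way to avoid stale indices; collecting the unfinished batches into a new list would work as well.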


