Posted to dev@hbase.apache.org by Andrew Purtell <ap...@apache.org> on 2010/12/16 20:19:04 UTC

recent replication changes failing tests?

The previous four builds of our 0.90-based variant were ok, but recent changes to replication seem to be problematic. The major difference between our version and upstream is our use of secure RPC.

We also see a possibly related problem on the 0.90 branch on Hudson (https://hudson.apache.org/hudson/job/HBase-0.90):
>>>
java.lang.AssertionError: Waited too much time for queueFailover replication
	at org.junit.Assert.fail(Assert.java:91)
	at org.apache.hadoop.hbase.replication.TestReplication.queueFailover(TestReplication.java:560)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
<<<

From our Hudson:

Changes
  HBASE-3360  ReplicationLogCleaner is enabled by default in 0.90 -- causes NPE (detail)
  HBASE-3363  ReplicationSink should batch delete (detail)
  HBASE-3365  EOFE contacting crashed RS causes Master abort (detail)
  Adding a fix for this test that was missing (detail)

(Prior to this change set, the previous four runs were ok.)

First result:

>>>
Running org.apache.hadoop.hbase.replication.TestReplication
killed.
[HUDSON] Recording test results
[INFO] ------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] Error while executing forked tests.; nested exception is org.apache.maven.surefire.booter.shade.org.codehaus.plexus.util.cli.CommandLineException: Error while executing external command, process killed.

Process timeout out after 900 seconds
<<<

Next:
>>>
-------------------------------------------------------------------------------
Test set: org.apache.hadoop.hbase.replication.TestReplication
-------------------------------------------------------------------------------
Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 272.887 sec <<< FAILURE!
testAddAndRemoveClusters(org.apache.hadoop.hbase.replication.TestReplication)  Time elapsed: 24.792 sec  <<< FAILURE!
java.lang.AssertionError: Waited too much time for put replication
	at org.junit.Assert.fail(Assert.java:91)
	at org.apache.hadoop.hbase.replication.TestReplication.testAddAndRemoveClusters(TestReplication.java:390)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
...
<<<

Best regards,

    - Andy



      

Re: recent replication changes failing tests?

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Two reasons I saw that test failing:

https://issues.apache.org/jira/browse/HBASE-3366
https://issues.apache.org/jira/browse/HBASE-3367

I discovered the second while testing the fix for the first one.

J-D
