Posted to dev@hbase.apache.org by Andrew Purtell <ap...@apache.org> on 2010/12/16 20:19:04 UTC
recent replication changes failing tests?
The previous four builds of our 0.90-based variant were OK, but recent changes to replication seem to be problematic. The major difference between our version and upstream is our use of secure RPC.
We also see a possibly related problem on the 0.90 branch on Hudson (https://hudson.apache.org/hudson/job/HBase-0.90):
>>>
java.lang.AssertionError: Waited too much time for queueFailover replication
at org.junit.Assert.fail(Assert.java:91)
at org.apache.hadoop.hbase.replication.TestReplication.queueFailover(TestReplication.java:560)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
<<<
From our Hudson:
Changes
HBASE-3360 ReplicationLogCleaner is enabled by default in 0.90 -- causes NPE (detail)
HBASE-3363 ReplicationSink should batch delete (detail)
HBASE-3365 EOFE contacting crashed RS causes Master abort (detail)
Adding a fix for this test that was missing (detail)
(Prior to this change set, the previous four runs were OK.)
First result:
>>>
Running org.apache.hadoop.hbase.replication.TestReplication
killed.
[HUDSON] Recording test results
[INFO] ------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] Error while executing forked tests.; nested exception is org.apache.maven.surefire.booter.shade.org.codehaus.plexus.util.cli.CommandLineException: Error while executing external command, process killed.
Process timeout out after 900 seconds
<<<
Next:
>>>
-------------------------------------------------------------------------------
Test set: org.apache.hadoop.hbase.replication.TestReplication
-------------------------------------------------------------------------------
Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 272.887 sec <<< FAILURE!
testAddAndRemoveClusters(org.apache.hadoop.hbase.replication.TestReplication) Time elapsed: 24.792 sec <<< FAILURE!
java.lang.AssertionError: Waited too much time for put replication
at org.junit.Assert.fail(Assert.java:91)
at org.apache.hadoop.hbase.replication.TestReplication.testAddAndRemoveClusters(TestReplication.java:390)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
...
<<<
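For readers unfamiliar with this kind of failure: both assertion messages above have the shape "Waited too much time for X", which suggests a test that polls for replicated data and gives up after a bounded number of attempts. The following is only a minimal sketch of that pattern, not the actual TestReplication code; the class name, method name, and parameters are all hypothetical.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper illustrating a bounded polling wait.
// It is NOT taken from TestReplication; all names here are assumptions.
final class ReplicationWaitSketch {
    private ReplicationWaitSketch() {}

    /**
     * Checks the condition up to `retries` times, sleeping `sleepMs`
     * milliseconds after each failed check. Returns the zero-based attempt
     * on which the condition first held, or throws AssertionError if it
     * never held within the allowed attempts.
     */
    static int waitFor(String what, int retries, long sleepMs, BooleanSupplier condition) {
        for (int attempt = 0; attempt < retries; attempt++) {
            if (condition.getAsBoolean()) {
                return attempt;
            }
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and fail the wait.
                Thread.currentThread().interrupt();
                throw new AssertionError("Interrupted while waiting for " + what);
            }
        }
        throw new AssertionError("Waited too much time for " + what);
    }
}
```

With a wait of this shape, a slow or stalled replication queue surfaces as an assertion failure once the retries run out, and a wait with no effective bound would instead run until the build's own process timeout kills it.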
Best regards,
- Andy
Re: recent replication changes failing tests?
Posted by Jean-Daniel Cryans <jd...@apache.org>.
I saw two reasons for that test failing:
https://issues.apache.org/jira/browse/HBASE-3366
https://issues.apache.org/jira/browse/HBASE-3367
I discovered the second while testing the fix for the first one.
J-D