Posted to dev@accumulo.apache.org by Josh Elser <jo...@gmail.com> on 2013/04/20 03:27:14 UTC

org.apache.accumulo.test.TestAccumuloSplitRecovery 1.5 hangs indefinitely

Is anyone else seeing this? I didn't have this happening early this week 
(Wednesday, maybe?).

It gets stuck trying to get the Connector:

     at org.apache.accumulo.core.util.UtilWaitThread.sleep(UtilWaitThread.java:26)
     at org.apache.accumulo.core.client.impl.ServerClient.executeRaw(ServerClient.java:112)
     at org.apache.accumulo.core.client.impl.ServerClient.execute(ServerClient.java:71)
     at org.apache.accumulo.core.client.impl.ConnectorImpl.<init>(ConnectorImpl.java:64)
     at org.apache.accumulo.core.client.ZooKeeperInstance.getConnector(ZooKeeperInstance.java:227)
     at org.apache.accumulo.core.client.ZooKeeperInstance.getConnector(ZooKeeperInstance.java:222)
     at org.apache.accumulo.test.TestAccumuloSplitRecovery.test(TestAccumuloSplitRecovery.java:87)

The ZKMain was running, as was the Master; I'm not quite sure how to 
debug it. I grabbed stacktraces from each process when it happened, and 
it appears that I can reliably reproduce it (about 3/3 so far).
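For anyone trying to reproduce this, stack traces like the one above are usually captured from outside the JVM (for example with the JDK's jstack tool). As a rough in-process alternative, the sketch below uses only the standard Thread.getAllStackTraces() call to print every live thread. The class and method names (ThreadDump, dumpAllThreads) are made up for illustration and are not part of Accumulo.

```java
import java.util.Map;

// Hypothetical helper, not part of Accumulo: renders all live threads
// of the current JVM in a jstack-like layout.
final class ThreadDump {

    /** Returns one block per live thread: name, state, then its frames. */
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            out.append('"').append(t.getName()).append('"')
               .append(" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

A thread parked in a retry loop would show up here with a sleep frame at the top of its stack, much like the UtilWaitThread.sleep frame in the trace above.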

Re: org.apache.accumulo.test.TestAccumuloSplitRecovery 1.5 hangs indefinitely

Posted by Josh Elser <jo...@gmail.com>.
Keith,

r1470734 seems to have resolved things for me. Thanks for the fix.

On 04/22/2013 12:36 PM, Keith Turner wrote:
> On Fri, Apr 19, 2013 at 9:47 PM, Josh Elser <jo...@gmail.com> wrote:
>
>> Thought about it more, and remembered about the JUnit temp dir. Found that
>> the two TServers both lost their ZK lock.
>>
>> Perhaps the configuration is just a little too constrained?
>
> Possibly.  I noticed the test did not have a timeout set, so I added one.
>
>
>>
>> On 04/19/2013 09:27 PM, Josh Elser wrote:
>>
>>> Is anyone else seeing this? I didn't have this happening early this week
>>> (Wednesday, maybe?).
>>>
>>> It gets stuck trying to get the Connector:
>>>
>>>      at org.apache.accumulo.core.util.UtilWaitThread.sleep(UtilWaitThread.java:26)
>>>      at org.apache.accumulo.core.client.impl.ServerClient.executeRaw(ServerClient.java:112)
>>>      at org.apache.accumulo.core.client.impl.ServerClient.execute(ServerClient.java:71)
>>>      at org.apache.accumulo.core.client.impl.ConnectorImpl.<init>(ConnectorImpl.java:64)
>>>      at org.apache.accumulo.core.client.ZooKeeperInstance.getConnector(ZooKeeperInstance.java:227)
>>>      at org.apache.accumulo.core.client.ZooKeeperInstance.getConnector(ZooKeeperInstance.java:222)
>>>      at org.apache.accumulo.test.TestAccumuloSplitRecovery.test(TestAccumuloSplitRecovery.java:87)
>>>
>>> The ZKMain was running, as was the Master; I'm not quite sure how to
>>> debug it. I grabbed stacktraces from each process when it happened, and it
>>> appears that I can reliably reproduce it (about 3/3 so far).
>>>
>>


Re: org.apache.accumulo.test.TestAccumuloSplitRecovery 1.5 hangs indefinitely

Posted by Keith Turner <ke...@deenlo.com>.
On Fri, Apr 19, 2013 at 9:47 PM, Josh Elser <jo...@gmail.com> wrote:

> Thought about it more, and remembered about the JUnit temp dir. Found that
> the two TServers both lost their ZK lock.
>
> Perhaps the configuration is just a little too constrained?


Possibly.  I noticed the test did not have a timeout set, so I added one.
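
For readers following along: the thread does not show what the timeout change looked like. In JUnit 4 a test timeout is commonly declared as @Test(timeout = ...), but whether that is what was committed here is an assumption. The sketch below shows the same idea using only the JDK: run a call that may block forever on a worker thread and give up after a deadline, so a hang becomes a failure instead of an indefinite wait. BoundedCall and callWithTimeout are hypothetical names, and the sleeping task merely stands in for a blocked getConnector call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Accumulo or JUnit.
final class BoundedCall {

    /**
     * Runs the task on a daemon worker thread and waits at most
     * timeoutMillis for its result. Throws IllegalStateException if
     * the deadline passes or the task itself fails.
     */
    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-call");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<T> future = pool.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker if it is sleeping
            throw new IllegalStateException(
                "timed out after " + timeoutMillis + " ms");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        try {
            // Stand-in for a call that never returns.
            callWithTimeout(() -> { Thread.sleep(60_000); return "never"; }, 200);
        } catch (IllegalStateException e) {
            System.out.println("gave up: " + e.getMessage());
        }
    }
}
```

A timeout like this only turns the hang into a visible failure; it does not address why the TServers lost their ZK locks.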


>
>
> On 04/19/2013 09:27 PM, Josh Elser wrote:
>
>> Is anyone else seeing this? I didn't have this happening early this week
>> (Wednesday, maybe?).
>>
>> It gets stuck trying to get the Connector:
>>
>>     at org.apache.accumulo.core.util.UtilWaitThread.sleep(UtilWaitThread.java:26)
>>     at org.apache.accumulo.core.client.impl.ServerClient.executeRaw(ServerClient.java:112)
>>     at org.apache.accumulo.core.client.impl.ServerClient.execute(ServerClient.java:71)
>>     at org.apache.accumulo.core.client.impl.ConnectorImpl.<init>(ConnectorImpl.java:64)
>>     at org.apache.accumulo.core.client.ZooKeeperInstance.getConnector(ZooKeeperInstance.java:227)
>>     at org.apache.accumulo.core.client.ZooKeeperInstance.getConnector(ZooKeeperInstance.java:222)
>>     at org.apache.accumulo.test.TestAccumuloSplitRecovery.test(TestAccumuloSplitRecovery.java:87)
>>
>> The ZKMain was running, as was the Master; I'm not quite sure how to
>> debug it. I grabbed stacktraces from each process when it happened, and it
>> appears that I can reliably reproduce it (about 3/3 so far).
>>
>
>

Re: org.apache.accumulo.test.TestAccumuloSplitRecovery 1.5 hangs indefinitely

Posted by Josh Elser <jo...@gmail.com>.
Thought about it more, and remembered about the JUnit temp dir. Found 
that the two TServers both lost their ZK lock.

Perhaps the configuration is just a little too constrained?

On 04/19/2013 09:27 PM, Josh Elser wrote:
> Is anyone else seeing this? I didn't have this happening early this 
> week (Wednesday, maybe?).
>
> It gets stuck trying to get the Connector:
>
>     at org.apache.accumulo.core.util.UtilWaitThread.sleep(UtilWaitThread.java:26)
>     at org.apache.accumulo.core.client.impl.ServerClient.executeRaw(ServerClient.java:112)
>     at org.apache.accumulo.core.client.impl.ServerClient.execute(ServerClient.java:71)
>     at org.apache.accumulo.core.client.impl.ConnectorImpl.<init>(ConnectorImpl.java:64)
>     at org.apache.accumulo.core.client.ZooKeeperInstance.getConnector(ZooKeeperInstance.java:227)
>     at org.apache.accumulo.core.client.ZooKeeperInstance.getConnector(ZooKeeperInstance.java:222)
>     at org.apache.accumulo.test.TestAccumuloSplitRecovery.test(TestAccumuloSplitRecovery.java:87)
>
> The ZKMain was running, as was the Master; I'm not quite sure how to 
> debug it. I grabbed stacktraces from each process when it happened, 
> and it appears that I can reliably reproduce it (about 3/3 so far).