Posted to notifications@accumulo.apache.org by "Josh Elser (JIRA)" <ji...@apache.org> on 2014/12/15 18:07:13 UTC

[jira] [Commented] (ACCUMULO-3421) DistributedTrace.enable will eat exceptions about failing to connect to ZK

    [ https://issues.apache.org/jira/browse/ACCUMULO-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246849#comment-14246849 ] 

Josh Elser commented on ACCUMULO-3421:
--------------------------------------

[~billie.rinaldi] brought up a good point to me in chat: we might not want the exception to propagate back up to the client. If there is a spurious ZK exception, or ZK is just unavailable at the moment, it's probably undesirable to tank the application. However, the client currently never recovers from such a failure, because we rely on a Watcher to update ZooTraceClient, and that Watcher isn't set if we fail to talk to ZK during initialization.

Perhaps we need to watch for the exception and start some timer thread to re-attempt the connection to ZK. After we connect successfully (get the hosts, if any, the first time) the Watcher should be sufficient.
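
A rough sketch of that timer-based retry, to make the idea concrete. This is not the actual ZooTraceClient code: the class name RetryingHostLookup, the Callable standing in for the initial ZK read (the one that would also register the Watcher), and the retry interval are all hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch: retry the first ZK read on a timer instead of failing the caller. */
public class RetryingHostLookup {
  // Assumption: stands in for the initial ZK read that also sets the Watcher.
  private final Callable<List<String>> lookup;
  private final long retryMillis;
  private final AtomicReference<List<String>> hosts = new AtomicReference<>();

  // Single daemon thread, so a pending retry never keeps the JVM alive.
  private final ScheduledExecutorService timer =
      Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "trace-zk-retry");
        t.setDaemon(true);
        return t;
      });

  public RetryingHostLookup(Callable<List<String>> lookup, long retryMillis) {
    this.lookup = lookup;
    this.retryMillis = retryMillis;
  }

  /** Makes the first attempt on the calling thread; never throws on lookup failure. */
  public void start() {
    attempt();
  }

  private void attempt() {
    try {
      hosts.set(lookup.call());
      // First read succeeded: from here on the Watcher would keep the hosts current.
      timer.shutdown();
    } catch (Exception e) {
      // Report the failure rather than eating it silently, then try again later.
      System.err.println("Unable to read trace hosts, retrying in " + retryMillis + " ms: " + e);
      timer.schedule(this::attempt, retryMillis, TimeUnit.MILLISECONDS);
    }
  }

  /** Returns null until the first successful lookup. */
  public List<String> getHosts() {
    return hosts.get();
  }
}
```

With something like this, the client keeps running while ZK is unavailable, and the failure is still logged instead of disappearing.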

> DistributedTrace.enable will eat exceptions about failing to connect to ZK
> --------------------------------------------------------------------------
>
>                 Key: ACCUMULO-3421
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3421
>             Project: Accumulo
>          Issue Type: Bug
>          Components: test
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>             Fix For: 1.7.0
>
>
> From a failed TracerRecoversAfterOfflineTableIT
> {noformat}
> java.lang.RuntimeException: Failed to connect to zookeeper (localhost:2181) within 2x zookeeper timeout period 30000
> 	at org.apache.accumulo.fate.zookeeper.ZooSession.connect(ZooSession.java:118)
> 	at org.apache.accumulo.fate.zookeeper.ZooSession.getSession(ZooSession.java:163)
> 	at org.apache.accumulo.fate.zookeeper.ZooReader.getSession(ZooReader.java:39)
> 	at org.apache.accumulo.fate.zookeeper.ZooReader.getZooKeeper(ZooReader.java:43)
> 	at org.apache.accumulo.fate.zookeeper.ZooReader.exists(ZooReader.java:166)
> 	at org.apache.accumulo.tracer.ZooTraceClient.process(ZooTraceClient.java:82)
> 	at org.apache.accumulo.tracer.ZooTraceClient.configure(ZooTraceClient.java:75)
> 	at org.apache.accumulo.core.trace.DistributedTrace.loadInstance(DistributedTrace.java:184)
> 	at org.apache.accumulo.core.trace.DistributedTrace.loadSpanReceivers(DistributedTrace.java:166)
> 	at org.apache.accumulo.core.trace.DistributedTrace.enableTracing(DistributedTrace.java:143)
> 	at org.apache.accumulo.core.trace.DistributedTrace.enable(DistributedTrace.java:101)
> 	at org.apache.accumulo.core.trace.DistributedTrace.enable(DistributedTrace.java:86)
> 	at org.apache.accumulo.test.TracerRecoversAfterOfflineTableIT.test(TracerRecoversAfterOfflineTableIT.java:77)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 	at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}
> The problem is that {{org.apache.accumulo.tracer.ZooTraceClient.process(ZooTraceClient.java:82)}} eats the Exception, so it doesn't propagate back up the stack through {{org.apache.accumulo.tracer.ZooTraceClient.configure(ZooTraceClient.java:75)}}. Thus, the test saw a "successful" call to DistributedTrace.enable and went on to run, but ultimately failed because tracing wasn't actually enabled.
> I think we need to make sure that such an exception propagates back to the caller.
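
For reference, the difference between eating the exception and propagating it can be sketched as below. This is only an illustration under assumed names (SwallowVsPropagate, HostSource, configureSwallowing, configurePropagating); it is not the ZooTraceClient source.

```java
import java.util.List;

/** Hypothetical illustration of swallowing vs. propagating a ZK failure. */
public class SwallowVsPropagate {
  /** Assumption: stands in for the ZK read done while configuring the trace client. */
  public interface HostSource {
    List<String> readHosts() throws Exception;
  }

  /** Swallows the failure: the caller sees a normal return even though nothing was configured. */
  public static void configureSwallowing(HostSource zk) {
    try {
      zk.readHosts();
    } catch (Exception e) {
      System.err.println("Ignoring failure: " + e);
    }
  }

  /** Propagates the failure: the caller learns that configuration did not happen. */
  public static void configurePropagating(HostSource zk) {
    try {
      zk.readHosts();
    } catch (Exception e) {
      throw new IllegalStateException("Unable to configure tracing", e);
    }
  }
}
```

A test calling the second variant would fail at the enable step, with the ZK connection error as the cause, rather than later for a less obvious reason.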



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)