Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2018/10/25 07:41:27 UTC

[GitHub] clintropolis opened a new pull request #6516: fix exception in Supervisor.start causing overlord unable to become leader

URL: https://github.com/apache/incubator-druid/pull/6516
 
 
   This PR fixes an issue where an exception thrown by a `Supervisor.start()` implementation can wreck the leadership lifecycle start of `SupervisorManager`, which in turn wrecks `TaskMaster` start, preventing any overlord from obtaining leadership.
   
   Observed in a test cluster with a custom kinesis indexing extension, which was broken by recent changes to core druid that pull in `aws-java-sdk` dependencies and bump their version. As a result, the custom extension expected some jars to be provided that no longer are, and others at a different version. Anyway, its failure to start caused the cluster to be without any functioning overlord, which doesn't seem the most chill behavior. After this patch, failing supervisor starts will be logged at error level, but the `SupervisorManager` will still attempt to start any remaining supervisors, allowing the overlord to continue functioning in a partially degraded state instead of not at all. Similar in spirit to the issue and fix of #6512.
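
The behavior described above could be sketched roughly as follows. This is a minimal illustration and not the actual patch: `SupervisorStartSketch`, its nested `Supervisor` interface, and `startAll` are hypothetical stand-ins, and logging goes to `System.err` rather than Druid's logger. The sketch assumes that linkage errors such as the `NoClassDefFoundError` in the logs should be contained along with ordinary exceptions; the real fix may catch a different set of throwables.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SupervisorStartSketch {

    /** Hypothetical stand-in for a supervisor; only start() matters here. */
    public interface Supervisor {
        void start();
    }

    /**
     * Starts every supervisor, isolating failures so that one broken
     * supervisor does not abort the whole start sequence.
     * Returns the ids of the supervisors that failed to start.
     */
    public static List<String> startAll(Map<String, Supervisor> supervisors) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, Supervisor> entry : supervisors.entrySet()) {
            try {
                entry.getValue().start();
            } catch (Exception | LinkageError e) {
                // Assumption: NoClassDefFoundError is a LinkageError (an Error,
                // not an Exception), so catching Exception alone would miss it.
                System.err.println("ERROR: failed to start supervisor ["
                        + entry.getKey() + "]: " + e);
                failed.add(entry.getKey());
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        Map<String, Supervisor> supervisors = new LinkedHashMap<>();
        // Simulates an extension whose dependency is missing at runtime.
        supervisors.put("broken-kinesis", () -> {
            throw new NoClassDefFoundError("com/amazonaws/transform/JsonErrorUnmarshallerV2");
        });
        supervisors.put("healthy", () -> System.out.println("healthy supervisor started"));

        List<String> failed = startAll(supervisors);
        System.out.println("supervisors that failed to start: " + failed);
    }
}
```

The point of the design is that the caller of `startAll` returns normally even when a supervisor fails, so a surrounding lifecycle start (and with it leader election) is not aborted by a single bad extension.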
   
   Relevant logs:
   ```
   2018-10-25T04:34:43,537 ERROR [LeaderSelector[/demo/overlord/_OVERLORD]] org.apache.druid.curator.discovery.CuratorDruidLeaderSelector - listener becomeLeader() failed. Unable to become leader: {class=org.apache.druid.curator.discovery.CuratorDruidLeaderSelector, exceptionType=class java.lang.RuntimeException, exceptionMessage=java.lang.reflect.InvocationTargetException}
   java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
   	at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
   	at org.apache.druid.indexing.overlord.TaskMaster$1.becomeLeader(TaskMaster.java:153) ~[druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
   	at org.apache.druid.curator.discovery.CuratorDruidLeaderSelector$1.isLeader(CuratorDruidLeaderSelector.java:98) [druid-server-0.13.0-incubating.jar:0.13.0-incubating]
   	at org.apache.curator.framework.recipes.leader.LeaderLatch$9.apply(LeaderLatch.java:665) [curator-recipes-4.0.0.jar:4.0.0]
   	at org.apache.curator.framework.recipes.leader.LeaderLatch$9.apply(LeaderLatch.java:661) [curator-recipes-4.0.0.jar:4.0.0]
   	at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:93) [curator-framework-4.0.0.jar:4.0.0]
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
   	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
   Caused by: java.lang.reflect.InvocationTargetException
   	at sun.reflect.GeneratedMethodAccessor34.invoke(Unknown Source) ~[?:?]
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_181]
   	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181]
   	at org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler.start(Lifecycle.java:412) ~[java-util-0.13.0-incubating.jar:0.13.0-incubating]
   	at org.apache.druid.java.util.common.lifecycle.Lifecycle.start(Lifecycle.java:311) ~[java-util-0.13.0-incubating.jar:0.13.0-incubating]
   	at org.apache.druid.indexing.overlord.TaskMaster$1.becomeLeader(TaskMaster.java:150) ~[druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
   	... 7 more
   Caused by: java.lang.NoClassDefFoundError: com/amazonaws/transform/JsonErrorUnmarshallerV2
   	at com.amazonaws.services.kinesis.AmazonKinesisClient.init(AmazonKinesisClient.java:226) ~[?:?]
   	at com.amazonaws.services.kinesis.AmazonKinesisClient.<init>(AmazonKinesisClient.java:222) ~[?:?]
   	at com.amazonaws.services.kinesis.AmazonKinesisClient.<init>(AmazonKinesisClient.java:196) ~[?:?]
   	at org.apache.druid.indexing.kinesis.KinesisRecordSupplier.<init>(KinesisRecordSupplier.java:264) ~[?:?]
   	at org.apache.druid.indexing.kinesis.supervisor.KinesisSupervisor.setupRecordSupplier(KinesisSupervisor.java:800) ~[?:?]
   	at org.apache.druid.indexing.kinesis.supervisor.KinesisSupervisor.start(KinesisSupervisor.java:340) ~[?:?]
   	at org.apache.druid.indexing.overlord.supervisor.SupervisorManager.createAndStartSupervisorInternal(SupervisorManager.java:290) ~[druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
   	at org.apache.druid.indexing.overlord.supervisor.SupervisorManager.start(SupervisorManager.java:136) ~[druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
   	at sun.reflect.GeneratedMethodAccessor34.invoke(Unknown Source) ~[?:?]
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_181]
   	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181]
   	at org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler.start(Lifecycle.java:412) ~[java-util-0.13.0-incubating.jar:0.13.0-incubating]
   	at org.apache.druid.java.util.common.lifecycle.Lifecycle.start(Lifecycle.java:311) ~[java-util-0.13.0-incubating.jar:0.13.0-incubating]
   	at org.apache.druid.indexing.overlord.TaskMaster$1.becomeLeader(TaskMaster.java:150) ~[druid-indexing-service-0.13.0-incubating.jar:0.13.0-incubating]
   	... 7 more
   ```
