Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/04/04 23:24:11 UTC

[GitHub] [druid] afire007 opened a new issue #9619: Druid Router unable to Discover Broker Services

afire007 opened a new issue #9619: Druid Router unable to Discover Broker Services
URL: https://github.com/apache/druid/issues/9619
 
 
   **Description:**  When I host the Druid router service in a Docker Swarm cluster and connect it to my 3-node ZooKeeper quorum, the router is unable to discover any brokers on the druid/broker service path in ZooKeeper.  I've validated that the coordinator service is able to discover all services; the router, however, fails to identify the brokers or any other service.  
   
   I've triple-checked that there is no firewall between the hosts, and I am able to telnet to ZooKeeper, the brokers, and the coordinator from the router's Docker container within the swarm cluster on the appropriate ports.  
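
The telnet check above can be scripted so it covers every address in one pass. This is a minimal sketch, not part of the original report: the function names `parse_targets`, `can_connect`, and `report` are hypothetical, and the only thing it verifies is that a TCP port accepts connections.

```python
import socket


def parse_targets(spec: str) -> list:
    """Split a 'host:port,host:port' string into (host, port) tuples."""
    targets = []
    for entry in spec.split(","):
        host, _, port = entry.strip().rpartition(":")
        targets.append((host, int(port)))
    return targets


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def report(spec: str) -> dict:
    """Map each 'host:port' in spec to whether it accepted a connection."""
    return {f"{h}:{p}": can_connect(h, p) for h, p in parse_targets(spec)}
```

Calling `report("zookeeper1:2181,zookeeper2:2181,zookeeper3:2181")` from inside the router container would exercise the same addresses as `druid.zk.service.host`. A successful connect only shows that the port is open; it says nothing about the data that server returns.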
   
   It's strange that the coordinator correctly identifies the brokers and all other services while the router is unable to.  The coordinator and router endpoint responses below show the cluster state each one reports.  
   
   **/druid/coordinator/v1/cluster**
   `{"coordinator":[{"host":"10.0.7.30","service":"druid/coordinator","plaintextPort":8081}],"overlord":[{"host":"10.0.7.30","service":"druid/coordinator","plaintextPort":8081}],"broker":[{"host":"10.0.7.32","service":"druid/broker","plaintextPort":8082}],"historical":[{"host":"10.0.7.41","service":"druid/historical","plaintextPort":8083},{"host":"10.0.7.42","service":"druid/historical","plaintextPort":8083}]}`
   
   **/druid/router/v1/brokers**
   `{"druid/broker":[]}`
   
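
The mismatch between the two responses can be checked mechanically. The sketch below is an illustration under stated assumptions: `broker_counts` is a hypothetical helper, the coordinator payload is trimmed to its broker entry, and the code only counts entries because the shape of a non-empty router response is not shown in this report.

```python
import json


def broker_counts(coordinator_json: str, router_json: str,
                  service: str = "druid/broker") -> tuple:
    """Return (brokers seen by the coordinator, brokers seen by the router)."""
    coordinator = json.loads(coordinator_json)
    router = json.loads(router_json)
    seen_by_coordinator = [
        node for node in coordinator.get("broker", [])
        if node.get("service") == service
    ]
    seen_by_router = router.get(service, [])
    return len(seen_by_coordinator), len(seen_by_router)


# Payloads copied from the responses above (coordinator trimmed to "broker").
COORDINATOR = '{"broker":[{"host":"10.0.7.32","service":"druid/broker","plaintextPort":8082}]}'
ROUTER = '{"druid/broker":[]}'

c, r = broker_counts(COORDINATOR, ROUTER)
print(f"coordinator sees {c} broker(s), router sees {r}")
# prints: coordinator sees 1 broker(s), router sees 0
```
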
   **common.runtime.properties**
   ```
   druid.extensions.loadList=["postgresql-metadata-storage","druid-s3-extensions", "druid-kafka-indexing-service", "druid-datasketches"]
   druid.host=<THIS IS REPLACED WITH THE IP OF THE DOCKER CONTAINER AT STARTUP>
   druid.startup.logging.logProperties=true
   druid.zk.service.host=zookeeper1:2181,zookeeper2:2181,zookeeper3:2181
   druid.zk.paths.base=/druid
   druid.metadata.storage.type=postgresql
   druid.metadata.storage.connector.connectURI=jdbc:postgresql://postgres-db:5432/druid
   druid.metadata.storage.connector.user=druid
   druid.metadata.storage.connector.password=druid
   druid.storage.type=s3
   druid.storage.bucket=druid-storage
   druid.storage.baseKey=druid/segments
   druid.s3.accessKey=SAMPLE_ACCESS_KEY
   druid.s3.secretKey=SAMPLE_ACCESS_SECRET_KEY
   druid.indexer.logs.type=s3
   druid.indexer.logs.s3Bucket=druid-logs
   druid.indexer.logs.s3Prefix=druid/indexing-logs
   druid.selectors.indexing.serviceName=druid/overlord
   druid.selectors.coordinator.serviceName=druid/coordinator
   druid.monitoring.monitors=["org.apache.druid.java.util.metrics.JvmMonitor"]
   druid.emitter=noop
   druid.emitter.logging.logLevel=info
   druid.indexing.doubleStorage=double
   druid.server.hiddenProperties=["druid.s3.accessKey","druid.s3.secretKey","druid.metadata.storage.connector.password"]
   druid.sql.enable=true
   druid.lookup.enableLookupSyncOnStartup=false
   
   ```
   **router runtime.properties**
   ```
   druid.service=druid/router
   druid.plaintextPort=8888
   druid.router.http.numConnections=50
   druid.router.http.readTimeout=PT5M
   druid.router.http.numMaxThreads=100
   druid.server.http.numThreads=100
   druid.router.defaultBrokerServiceName=druid/broker
   druid.router.coordinatorServiceName=druid/coordinator
   druid.router.managementProxy.enabled=true
   ```
   
   ### Affected Version
   
   Version: 0.16.1
   
   ### Description
   
   Please include as much detailed information about the problem as possible.
   - Cluster size: 6 Nodes
   - Configurations in use: Zookeeper 3 node quorum
   - Steps to reproduce the problem: Run router service with 0.16.1
   - The error message or stack traces encountered:
   ```
   2020-04-04T23:21:26,212 INFO [main] org.eclipse.jetty.server.Server - jetty-9.4.10.v20180503; built: 2018-05-03T15:56:21.710Z; git: daa59876e6f384329b122929e70a80934569428c; jvm 1.8.0_212-b04
   2020-04-04T23:21:26,213 ERROR [CoordinatorRuleManager-Exec--0] org.apache.druid.curator.discovery.ServerDiscoverySelector - No server instance found for [druid/coordinator]
   2020-04-04T23:21:26,226 INFO [NodeTypeWatcher[COORDINATOR]] org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider$NodeTypeWatcher - Received INITIALIZED in node watcher.
   2020-04-04T23:21:26,227 INFO [NodeTypeWatcher[BROKER]] org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider$NodeTypeWatcher - Received INITIALIZED in node watcher.
   2020-04-04T23:21:26,227 ERROR [CoordinatorRuleManager-Exec--0] org.apache.druid.server.router.CoordinatorRuleManager - Exception while polling for rules
   org.apache.druid.java.util.common.IOE: No known server
           at org.apache.druid.discovery.DruidLeaderClient.getCurrentKnownLeader(DruidLeaderClient.java:297) ~[druid-server-0.16.1-incubating.jar:0.16.1-incubating]
           at org.apache.druid.discovery.DruidLeaderClient.makeRequest(DruidLeaderClient.java:132) ~[druid-server-0.16.1-incubating.jar:0.16.1-incubating]
           at org.apache.druid.discovery.DruidLeaderClient.makeRequest(DruidLeaderClient.java:140) ~[druid-server-0.16.1-incubating.jar:0.16.1-incubating]
           at org.apache.druid.server.router.CoordinatorRuleManager.poll(CoordinatorRuleManager.java:141) [druid-server-0.16.1-incubating.jar:0.16.1-incubating]
           at org.apache.druid.server.router.CoordinatorRuleManager$1.run(CoordinatorRuleManager.java:106) [druid-server-0.16.1-incubating.jar:0.16.1-incubating]
           at org.apache.druid.java.util.common.concurrent.ScheduledExecutors$1.call(ScheduledExecutors.java:55) [druid-core-0.16.1-incubating.jar:0.16.1-incubating]
           at org.apache.druid.java.util.common.concurrent.ScheduledExecutors$1.call(ScheduledExecutors.java:51) [druid-core-0.16.1-incubating.jar:0.16.1-incubating]
           at org.apache.druid.java.util.common.concurrent.ScheduledExecutors$2.run(ScheduledExecutors.java:92) [druid-core-0.16.1-incubating.jar:0.16.1-incubating]
           at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_212]
           at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_212]
           at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_212]
           at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_212]
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_212]
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_212]
           at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]
   ```
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] afire007 edited a comment on issue #9619: Druid Router unable to Discover Broker Services

Posted by GitBox <gi...@apache.org>.
afire007 edited a comment on issue #9619: Druid Router unable to Discover Broker Services
URL: https://github.com/apache/druid/issues/9619#issuecomment-609161091
 
 
   The root cause of this issue was a problem with ZooKeeper syncing across the quorum.  Some of the ZK nodes appeared not to sync with the leader when running in Docker Swarm, for reasons I did not determine.  When I configured Druid to connect only to the first ZK node, it was able to read all the other service paths correctly.  
   
   To fix the problem I moved the ZK quorum into its own Docker stack and deployed it separately from the Druid cluster.  
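
One way to spot a quorum member that is out of sync is to send each ZooKeeper server the `srvr` four-letter command and compare the `Mode` and `Zxid` lines it returns. The sketch below assumes `srvr` is enabled on the servers (ZooKeeper 3.5 and later restrict four-letter commands through `4lw.commands.whitelist`); the helper names are hypothetical and this is not the procedure used in this issue.

```python
import socket


def four_letter_word(host: str, port: int, command: str = "srvr",
                     timeout: float = 3.0) -> str:
    """Send a four-letter command to a ZooKeeper server and return its reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_srvr(text: str) -> dict:
    """Turn 'Key: value' lines of an srvr reply into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def summarize(responses: dict) -> dict:
    """Map server name -> (mode, zxid as int); None where a field is missing."""
    summary = {}
    for server, text in responses.items():
        fields = parse_srvr(text)
        zxid = fields.get("Zxid")
        summary[server] = (fields.get("Mode"), int(zxid, 16) if zxid else None)
    return summary
```

Feeding `summarize` the replies from `zookeeper1`, `zookeeper2`, and `zookeeper3` on port 2181 gives one line per server to compare. A server with no `Mode`, or with a `Zxid` that stays well behind the leader's, is worth investigating, though a small gap can be transient under write load.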



[GitHub] [druid] afire007 closed issue #9619: Druid Router unable to Discover Broker Services

Posted by GitBox <gi...@apache.org>.
afire007 closed issue #9619: Druid Router unable to Discover Broker Services
URL: https://github.com/apache/druid/issues/9619
 
 
   



