You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2020/06/03 07:21:34 UTC

[GitHub] [pulsar] hmoragrega commented on issue #7140: Zombie producers shown as connected after a broker instance crashed

hmoragrega commented on issue #7140:
URL: https://github.com/apache/pulsar/issues/7140#issuecomment-638009352


   thanks for checking out:
   * We are running a docker image: `apachepulsar/pulsar-all:2.5.1`
   * The broker process was killed by k8s:
   ```
   2020-06-01 20:31:26.000 UTC
   Memory cgroup out of memory: Kill process 281696 (java) score 1862 or sacrifice child Killed process 281696 (java) total-vm:6202804kB, anon-rss:2484912kB, file-rss:24084kB, shmem-rss:0kB
   
   2020-06-01 20:31:26.152 UTC
   Killed process 281696 (java) total-vm:6202804kB, anon-rss:2484912kB, file-rss:24084kB, shmem-rss:0kB}] Conditions:[{Type:KernelDeadlock Status:False Transition:2020-05-10 23:28:18.17330381 +0000 UTC m=+5.104380376 Reason:KernelHasNoDeadlock Message:kernel has no deadlock} {Type:ReadonlyFilesystem Status:False Transition:2020-05-10 23:28:18.173304129 +0000 UTC m=+5.104380498 Reason:FilesystemIsNotReadOnly Message:Filesystem is not read-only}]}
   
   ```
   I've checked the pulsar logs around the event but I haven't found any outstanding message, just the application being restarted 
   ```
   I 2020-06-01T20:31:19.772967095Z 20:31:19.772 [pulsar-client-io-42-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://pulsar/pulsar/10.1.2.182:8080/healthcheck][reader-98d2f9bedb] Subscribed to topic on 10.1.2.182/10.1.2.182:6650 -- consumer: 3380
   I 2020-06-01T20:31:19.785581683Z 20:31:19.785 [pulsar-io-22-3] INFO  org.apache.pulsar.broker.service.ServerCnx - [PersistentTopic{topic=persistent://pulsar/pulsar/10.1.2.182:8080/healthcheck}][pulsar-20-3385] Closing producer on cnx /10.1.2.182:33968
   I 2020-06-01T20:31:19.785694992Z 20:31:19.785 [pulsar-io-22-3] INFO  org.apache.pulsar.broker.service.ServerCnx - [PersistentTopic{topic=persistent://pulsar/pulsar/10.1.2.182:8080/healthcheck}][pulsar-20-3385] Closed producer on cnx /10.1.2.182:33968
   I 2020-06-01T20:31:19.785802934Z 20:31:19.785 [pulsar-client-io-42-1] INFO  org.apache.pulsar.client.impl.ProducerImpl - [persistent://pulsar/pulsar/10.1.2.182:8080/healthcheck] [pulsar-20-3385] Closed Producer
   I 2020-06-01T20:31:19.785898196Z 20:31:19.785 [pulsar-io-22-3] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.1.2.182:33968] Closing consumer: 3380
   I 2020-06-01T20:31:19.786763101Z 20:31:19.786 [pulsar-io-22-3] INFO  org.apache.pulsar.broker.service.AbstractDispatcherSingleActiveConsumer - Removing consumer Consumer{subscription=PersistentSubscription{topic=persistent://pulsar/pulsar/10.1.2.182:8080/healthcheck, name=reader-98d2f9bedb}, consumerId=3380, consumerName=9574f, address=/10.1.2.182:33968}
   I 2020-06-01T20:31:19.787468692Z 20:31:19.786 [pulsar-io-22-3] INFO  org.apache.pulsar.broker.service.persistent.PersistentSubscription - [persistent://pulsar/pulsar/10.1.2.182:8080/healthcheck][reader-98d2f9bedb] Successfully closed subscription [NonDurableCursorImpl{ledger=pulsar/pulsar/10.1.2.182:8080/persistent/healthcheck, ackPos=38739:397, readPos=38739:398}]
   I 2020-06-01T20:31:19.787644898Z 20:31:19.787 [pulsar-io-22-3] INFO  org.apache.pulsar.broker.service.persistent.PersistentSubscription - [persistent://pulsar/pulsar/10.1.2.182:8080/healthcheck][reader-98d2f9bedb] Successfully closed dispatcher for reader
   I 2020-06-01T20:31:19.787662374Z 20:31:19.787 [pulsar-external-listener-43-1] INFO  org.eclipse.jetty.server.RequestLog - 127.0.0.1 - - [01/Jun/2020:20:31:19 +0000] "GET /admin/v2/brokers/health HTTP/1.1" 200 2 "-" "curl/7.64.0" 26
   I 2020-06-01T20:31:19.787773157Z 20:31:19.787 [pulsar-io-22-3] INFO  org.apache.pulsar.broker.service.ServerCnx - [/10.1.2.182:33968] Closed consumer Consumer{subscription=PersistentSubscription{topic=persistent://pulsar/pulsar/10.1.2.182:8080/healthcheck, name=reader-98d2f9bedb}, consumerId=3380, consumerName=9574f, address=/10.1.2.182:33968}
   I 2020-06-01T20:31:19.787804660Z 20:31:19.787 [pulsar-client-io-42-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://pulsar/pulsar/10.1.2.182:8080/healthcheck] [reader-98d2f9bedb] Closed consumer
   I 2020-06-01T20:31:23.906091867Z 20:31:23.905 [pulsar-ordered-OrderedExecutor-3-0-EventThread] INFO  org.apache.pulsar.zookeeper.ZooKeeperDataCache - [State:CONNECTED Timeout:30000 sessionid:0x300369d33e0001b local:/10.1.2.182:33466 remoteserver:pulsar-zookeeper-2.pulsar-zookeeper/10.1.8.80:2181 lastZxid:25770466537 xid:107048 sent:107048 recv:124047 queuedpkts:0 pendingresp:0 queuedevents:0] Received ZooKeeper watch event: WatchedEvent state:SyncConnected type:NodeDataChanged path:/loadbalance/brokers/10.1.41.80:8080
   I 2020-06-01T20:31:24.719776356Z 20:31:24.719 [pulsar-load-manager-3-1] INFO  org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Writing local data to ZooKeeper because maximum change 11.40127182006836% exceeded threshold 10%; time since last report written is 5.0 seconds
   I 2020-06-01T20:31:24.725426343Z 20:31:24.725 [pulsar-ordered-OrderedExecutor-3-0-EventThread] INFO  org.apache.pulsar.zookeeper.ZooKeeperDataCache - [State:CONNECTED Timeout:30000 sessionid:0x300369d33e0001b local:/10.1.2.182:33466 remoteserver:pulsar-zookeeper-2.pulsar-zookeeper/10.1.8.80:2181 lastZxid:25770466539 xid:107051 sent:107051 recv:124051 queuedpkts:0 pendingresp:0 queuedevents:0] Received ZooKeeper watch event: WatchedEvent state:SyncConnected type:NodeDataChanged path:/loadbalance/brokers/10.1.2.182:8080
   
   // Restart
   
   I 2020-06-01T20:31:27.095166566Z [conf/pulsar_env.sh] Applying config PULSAR_EXTRA_OPTS = -Dpulsar.log.root.level=info
   I 2020-06-01T20:31:27.095312356Z [conf/pulsar_env.sh] Applying config PULSAR_GC = "-XX:+UseG1GC -XX:MaxGCPauseMillis=10"
   I 2020-06-01T20:31:27.095324965Z [conf/pulsar_env.sh] Applying config PULSAR_MEM = "-Xms2g -Xmx2g -XX:MaxDirectMemorySize=2g -Dio.netty.leakDetectionLevel=disabled -Dio.netty.recycler.linkCapacity=1024 -XX:+ParallelRefProcEnabled -XX:+UnlockExperimentalVMOptions -XX:+AggressiveOpts -XX:+DoEscapeAnalysis -XX:ParallelGCThreads=32 -XX:ConcGCThreads=32 -XX:G1NewSizePercent=50 -XX:+DisableExplicitGC -XX:-ResizePLAB -XX:+ExitOnOutOfMemoryError -XX:+PerfDisableSharedMem"
   I 2020-06-01T20:31:27.095333038Z [conf/broker.conf] Applying config advertisedAddress = 10.1.2.182
   ...
   I 2020-06-01T20:31:27.095428042Z [conf/broker.conf] Applying config zookeeperServers = pulsar-zookeeper-0.pulsar-zookeeper,pulsar-zookeeper-1.pulsar-zookeeper,pulsar-zookeeper-2.pulsar-zookeeper
   I 2020-06-01T20:31:27.118097760Z [conf/pulsar_env.sh] Applying config PULSAR_EXTRA_OPTS = -Dpulsar.log.root.level=info
   I 2020-06-01T20:31:27.118143135Z [conf/pulsar_env.sh] Applying config PULSAR_GC = "-XX:+UseG1GC -XX:MaxGCPauseMillis=10"
   I 2020-06-01T20:31:27.118149618Z [conf/pulsar_env.sh] Applying config PULSAR_MEM = "-Xms2g -Xmx2g -XX:MaxDirectMemorySize=2g -Dio.netty.leakDetectionLevel=disabled -Dio.netty.recycler.linkCapacity=1024 -XX:+ParallelRefProcEnabled -XX:+UnlockExperimentalVMOptions -XX:+AggressiveOpts -XX:+DoEscapeAnalysis -XX:ParallelGCThreads=32 -XX:ConcGCThreads=32 -XX:G1NewSizePercent=50 -XX:+DisableExplicitGC -XX:-ResizePLAB -XX:+ExitOnOutOfMemoryError -XX:+PerfDisableSharedMem"
   I 2020-06-01T20:31:27.118155877Z [conf/pulsar_env.sh] Applying config PULSAR_EXTRA_OPTS = -Dpulsar.log.root.level=info
   I 2020-06-01T20:31:27.118160698Z [conf/pulsar_env.sh] Applying config PULSAR_GC = "-XX:+UseG1GC -XX:MaxGCPauseMillis=10"
   I 2020-06-01T20:31:27.118165936Z [conf/pulsar_env.sh] Applying config PULSAR_MEM = "-Xms2g -Xmx2g -XX:MaxDirectMemorySize=2g -Dio.netty.leakDetectionLevel=disabled -Dio.netty.recycler.linkCapacity=1024 -XX:+ParallelRefProcEnabled -XX:+UnlockExperimentalVMOptions -XX:+AggressiveOpts -XX:+DoEscapeAnalysis -XX:ParallelGCThreads=32 -XX:ConcGCThreads=32 -XX:G1NewSizePercent=50 -XX:+DisableExplicitGC -XX:-ResizePLAB -XX:+ExitOnOutOfMemoryError -XX:+PerfDisableSharedMem"
   I 2020-06-01T20:31:27.141939511Z [conf/pulsar_env.sh] Applying config PULSAR_EXTRA_OPTS = -Dpulsar.log.root.level=info
   I 2020-06-01T20:31:27.141971873Z [conf/pulsar_env.sh] Applying config PULSAR_GC = "-XX:+UseG1GC -XX:MaxGCPauseMillis=10"
   I 2020-06-01T20:31:27.142012430Z [conf/pulsar_env.sh] Applying config PULSAR_MEM = "-Xms2g -Xmx2g -XX:MaxDirectMemorySize=2g -Dio.netty.leakDetectionLevel=disabled -Dio.netty.recycler.linkCapacity=1024 -XX:+ParallelRefProcEnabled -XX:+UnlockExperimentalVMOptions -XX:+AggressiveOpts -XX:+DoEscapeAnalysis -XX:ParallelGCThreads=32 -XX:ConcGCThreads=32 -XX:G1NewSizePercent=50 -XX:+DisableExplicitGC -XX:-ResizePLAB -XX:+ExitOnOutOfMemoryError -XX:+PerfDisableSharedMem"
   E 2020-06-01T20:31:27.934534415Z [AppClassLoader@18b4aac2] info AspectJ Weaver Version 1.9.2 built on Wednesday Oct 24, 2018 at 15:43:33 GMT
   E 2020-06-01T20:31:27.934578455Z [AppClassLoader@18b4aac2] info register classloader sun.misc.Launcher$AppClassLoader@18b4aac2
   E 2020-06-01T20:31:27.934621976Z [AppClassLoader@18b4aac2] info using configuration file:/pulsar/lib/org.apache.pulsar-pulsar-zookeeper-utils-2.5.1.jar!/META-INF/aop.xml
   E 2020-06-01T20:31:27.934630474Z [AppClassLoader@18b4aac2] info using configuration file:/pulsar/lib/org.apache.pulsar-pulsar-zookeeper-2.5.1.jar!/META-INF/aop.xml
   E 2020-06-01T20:31:27.940125061Z [AppClassLoader@18b4aac2] info register aspect org.apache.pulsar.broker.zookeeper.aspectj.ClientCnxnAspect
   E 2020-06-01T20:31:28.218736757Z [AppClassLoader@18b4aac2] info register aspect org.apache.pulsar.zookeeper.FinalRequestProcessorAspect
   E 2020-06-01T20:31:28.223107367Z [AppClassLoader@18b4aac2] info register aspect org.apache.pulsar.zookeeper.ZooKeeperServerAspect
   I 2020-06-01T20:31:30.492087527Z 20:31:30.488 [main] INFO  org.apache.pulsar.broker.PulsarService - Starting Pulsar Broker service; version: '2.5.1'
   I 2020-06-01T20:31:30.494428749Z 20:31:30.494 [main] INFO  org.apache.pulsar.broker.PulsarService - Git Revision c36170c3acc0bd432036baf7d9471a4c34320395
   I 2020-06-01T20:31:30.494766089Z 20:31:30.494 [main] INFO  org.apache.pulsar.broker.PulsarService - Built by guangning <gu...@apache.org> on ubuntu-s-4vcpu-8gb-sfo1-01 at 2020-04-15T08:53:35+0000


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org