You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/02/02 16:58:15 UTC

[GitHub] [pulsar] fmiguelez opened a new issue #9428: Sink instances die (and do not restart automatically) due to "Bookie handle is not available" error

fmiguelez opened a new issue #9428:
URL: https://github.com/apache/pulsar/issues/9428


   **Describe the bug**
   We have a cluster with 3 brokers and 3 bookies. We are facing issues with our sink that writes to an external DB. This sink reads from 36 topics using 3 instances. Each topic data goes to a DB table.
   
   The thing is that now and then our instances die with error below (that does not reach our sink code). Eventually all three instances die. 
   
   > 	20:00:37.699 [pulsar-client-io-1-2] WARN  org.apache.pulsar.client.impl.BinaryProtoLookupService - [persistent://dbus/tdf/measure-route-15min] failed to get schema : org.apache.pulsar.client.api.PulsarClientException: java.io.IOException: Bookie handle is not available -  ledger=47 - operation=Failed to read entry - entry=0
   > 	java.util.concurrent.CompletionException: org.apache.pulsar.client.api.PulsarClientException: java.io.IOException: Bookie handle is not available -  ledger=47 - operation=Failed to read entry - entry=0
   > 			at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326) ~[?:1.8.0_275]
   > 			at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338) ~[?:1.8.0_275]
   > 			at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:925) ~[?:1.8.0_275]
   > 			at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:967) ~[?:1.8.0_275]
   > 			at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:940) ~[?:1.8.0_275]
   > 			at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) [?:1.8.0_275]
   > 			at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) [?:1.8.0_275]
   > 			at org.apache.pulsar.client.impl.ClientCnx.handleGetSchemaResponse(ClientCnx.java:736) [org.apache.pulsar-pulsar-client-original-2.7.0.jar:2.7.0]
   > 			at org.apache.pulsar.common.protocol.PulsarDecoder.channelRead(PulsarDecoder.java:373) [org.apache.pulsar-pulsar-common-2.7.0.jar:2.7.0]
   > 			at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) [io.netty-netty-codec-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) [io.netty-netty-codec-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:792) [io.netty-netty-transport-native-epoll-4.1.51.Final-linux-x86_64.jar:4.1.51.Final]
   > 			at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:475) [io.netty-netty-transport-native-epoll-4.1.51.Final-linux-x86_64.jar:4.1.51.Final]
   > 			at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) [io.netty-netty-transport-native-epoll-4.1.51.Final-linux-x86_64.jar:4.1.51.Final]
   > 			at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
   > 			at java.lang.Thread.run(Thread.java:748) [?:1.8.0_275]
   > 	Caused by: org.apache.pulsar.client.api.PulsarClientException: java.io.IOException: Bookie handle is not available -  ledger=47 - operation=Failed to read entry - entry=0
   > 			at org.apache.pulsar.client.impl.ClientCnx.getPulsarClientException(ClientCnx.java:1031) ~[org.apache.pulsar-pulsar-client-original-2.7.0.jar:2.7.0]
   > 			at org.apache.pulsar.client.impl.ClientCnx.lambda$sendGetSchema$13(ClientCnx.java:803) ~[org.apache.pulsar-pulsar-client-original-2.7.0.jar:2.7.0]
   > 			at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:966) ~[?:1.8.0_275]
   > 			... 24 more
   > 	20:00:37.707 [dbus/tdf/histsink-vertica1-1] ERROR org.apache.pulsar.functions.instance.JavaInstanceRunnable - [dbus/tdf/histsink-vertica1:1] Uncaught exception in Java Instance
   > 	com.google.common.util.concurrent.UncheckedExecutionException: org.apache.commons.lang3.SerializationException: Failed at fetching schema info for 0
   > 			at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2051) ~[com.google.guava-guava-30.0-jre.jar:?]
   > 			at com.google.common.cache.LocalCache.get(LocalCache.java:3951) ~[com.google.guava-guava-30.0-jre.jar:?]
   > 			at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) ~[com.google.guava-guava-30.0-jre.jar:?]
   > 			at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4935) ~[com.google.guava-guava-30.0-jre.jar:?]
   > 			at org.apache.pulsar.client.impl.schema.reader.AbstractMultiVersionReader.read(AbstractMultiVersionReader.java:86) ~[org.apache.pulsar-pulsar-client-original-2.7.0.jar:2.7.0]
   > 			at org.apache.pulsar.client.impl.schema.AbstractStructSchema.decode(AbstractStructSchema.java:60) ~[org.apache.pulsar-pulsar-client-original-2.7.0.jar:2.7.0]
   > 			at org.apache.pulsar.client.impl.schema.AutoConsumeSchema.decode(AutoConsumeSchema.java:97) ~[org.apache.pulsar-pulsar-client-original-2.7.0.jar:2.7.0]
   > 			at org.apache.pulsar.client.impl.schema.AutoConsumeSchema.decode(AutoConsumeSchema.java:40) ~[org.apache.pulsar-pulsar-client-original-2.7.0.jar:2.7.0]
   > 			at org.apache.pulsar.client.impl.MessageImpl.getValue(MessageImpl.java:301) ~[org.apache.pulsar-pulsar-client-original-2.7.0.jar:2.7.0]
   > 			at org.apache.pulsar.functions.source.PulsarRecord.getValue(PulsarRecord.java:74) ~[org.apache.pulsar-pulsar-functions-instance-2.7.0.jar:2.7.0]
   > 			at org.apache.pulsar.functions.instance.JavaInstanceRunnable.readInput(JavaInstanceRunnable.java:400) ~[org.apache.pulsar-pulsar-functions-instance-2.7.0.jar:?]
   > 			at org.apache.pulsar.functions.instance.JavaInstanceRunnable.run(JavaInstanceRunnable.java:237) [org.apache.pulsar-pulsar-functions-instance-2.7.0.jar:?]
   > 			at java.lang.Thread.run(Thread.java:748) [?:1.8.0_275]
   > 	Caused by: org.apache.commons.lang3.SerializationException: Failed at fetching schema info for 0
   > 			at org.apache.pulsar.client.impl.schema.reader.AbstractMultiVersionReader.getSchemaInfoByVersion(AbstractMultiVersionReader.java:125) ~[org.apache.pulsar-pulsar-client-original-2.7.0.jar:2.7.0]
   > 			at org.apache.pulsar.client.impl.schema.generic.MultiVersionGenericAvroReader.loadReader(MultiVersionGenericAvroReader.java:43) ~[org.apache.pulsar-pulsar-client-original-2.7.0.jar:2.7.0]
   > 			at org.apache.pulsar.client.impl.schema.reader.AbstractMultiVersionReader$1.load(AbstractMultiVersionReader.java:52) ~[org.apache.pulsar-pulsar-client-original-2.7.0.jar:2.7.0]
   > 			at org.apache.pulsar.client.impl.schema.reader.AbstractMultiVersionReader$1.load(AbstractMultiVersionReader.java:49) ~[org.apache.pulsar-pulsar-client-original-2.7.0.jar:2.7.0]
   > 			at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3529) ~[com.google.guava-guava-30.0-jre.jar:?]
   > 			at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2278) ~[com.google.guava-guava-30.0-jre.jar:?]
   > 			at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2155) ~[com.google.guava-guava-30.0-jre.jar:?]
   > 			at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2045) ~[com.google.guava-guava-30.0-jre.jar:?]
   > 			... 12 more
   > 	Caused by: org.apache.pulsar.client.api.PulsarClientException: java.io.IOException: Bookie handle is not available -  ledger=47 - operation=Failed to read entry - entry=0
   > 			at org.apache.pulsar.client.impl.ClientCnx.getPulsarClientException(ClientCnx.java:1031) ~[org.apache.pulsar-pulsar-client-original-2.7.0.jar:2.7.0]
   > 			at org.apache.pulsar.client.impl.ClientCnx.lambda$sendGetSchema$13(ClientCnx.java:803) ~[org.apache.pulsar-pulsar-client-original-2.7.0.jar:2.7.0]
   > 			at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:966) ~[?:1.8.0_275]
   > 			at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:940) ~[?:1.8.0_275]
   > 			at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_275]
   > 			at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) ~[?:1.8.0_275]
   > 			at org.apache.pulsar.client.impl.ClientCnx.handleGetSchemaResponse(ClientCnx.java:736) ~[org.apache.pulsar-pulsar-client-original-2.7.0.jar:2.7.0]
   > 			at org.apache.pulsar.common.protocol.PulsarDecoder.channelRead(PulsarDecoder.java:373) ~[org.apache.pulsar-pulsar-common-2.7.0.jar:2.7.0]
   > 			at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) ~[io.netty-netty-codec-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) ~[io.netty-netty-codec-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:792) ~[io.netty-netty-transport-native-epoll-4.1.51.Final-linux-x86_64.jar:4.1.51.Final]
   > 			at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:475) ~[io.netty-netty-transport-native-epoll-4.1.51.Final-linux-x86_64.jar:4.1.51.Final]
   > 			at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) ~[io.netty-netty-transport-native-epoll-4.1.51.Final-linux-x86_64.jar:4.1.51.Final]
   > 			at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) ~[io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
   > 			at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
   > 			... 1 more
   > 	20:00:37.712 [dbus/tdf/histsink-vertica1-1] INFO  org.apache.pulsar.functions.instance.JavaInstanceRunnable - Closing instance
   > 	20:00:37.792 [pulsar-client-io-1-7] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-det-15min] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.793 [pulsar-client-io-1-2] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-det-1min] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.795 [pulsar-client-io-1-2] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-det-30min] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.796 [pulsar-client-io-1-7] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-det-5min] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.797 [pulsar-client-io-1-5] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-det-day] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.798 [pulsar-client-io-1-7] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-det-hour] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.799 [pulsar-client-io-1-2] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-det-month] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.800 [pulsar-client-io-1-2] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-det-raw] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.800 [pulsar-client-io-1-7] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-det-year] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.801 [pulsar-client-io-1-7] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-link-15min] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.802 [pulsar-client-io-1-7] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-link-1min] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.803 [pulsar-client-io-1-5] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-link-30min] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.804 [pulsar-client-io-1-2] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-link-5min] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.804 [pulsar-client-io-1-2] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-link-day] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.806 [pulsar-client-io-1-5] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-link-hour] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.807 [pulsar-client-io-1-5] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-link-month] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.808 [pulsar-client-io-1-5] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-link-raw] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.808 [pulsar-client-io-1-2] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-link-year] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.809 [pulsar-client-io-1-5] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-point-15min] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.810 [pulsar-client-io-1-7] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-point-1min] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.811 [pulsar-client-io-1-7] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-point-30min] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.812 [pulsar-client-io-1-2] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-point-5min] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.813 [pulsar-client-io-1-5] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-point-day] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.813 [pulsar-client-io-1-5] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-point-hour] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.814 [pulsar-client-io-1-7] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-point-month] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.815 [pulsar-client-io-1-2] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-point-raw] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.816 [pulsar-client-io-1-2] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-point-year] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.817 [pulsar-client-io-1-5] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-route-15min] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.818 [pulsar-client-io-1-7] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-route-1min] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.819 [pulsar-client-io-1-7] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-route-30min] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.820 [pulsar-client-io-1-2] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-route-5min] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.821 [pulsar-client-io-1-7] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-route-day] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.823 [pulsar-client-io-1-5] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-route-hour] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.824 [pulsar-client-io-1-7] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-route-month] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.825 [pulsar-client-io-1-7] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-route-raw] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.826 [pulsar-client-io-1-2] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://dbus/tdf/measure-route-year] [sub-tdf-histsink-vertica1] Closed consumer
   > 	20:00:37.835 [dbus/tdf/histsink-vertica1-1] INFO  org.apache.pulsar.functions.instance.JavaInstanceRunnable - Unloading JAR files for function InstanceConfig(instanceId=1, functionId=5f18d3e6-530c-49a3-a128-ff01a8069b98, functionVersion=2ef762e6-ed0d-44d4-aca3-39ec09d176e0, functionDetails=tenant: "dbus"
   > 	namespace: "tdf"
   > 	name: "histsink-vertica1"
   > 	className: "org.apache.pulsar.functions.api.utils.IdentityFunction"
   > 	parallelism: 3
   > 	source {
   > 	  typeClassName: "org.apache.pulsar.client.api.schema.GenericRecord"
   > 	  subscriptionName: "sub-tdf-histsink-vertica1"
   > 	  inputSpecs {
   > 		key: "persistent://dbus/tdf/measure-det-15min"
   > 		value {
   > 		}
   > 	  }
   > 	...
   > 	componentType: SINK
   > 	customRuntimeOptions: "{ \"jobNamespace\" : \"dbus\" }"
   > 	, maxBufferedTuples=1024, functionAuthenticationSpec=null, port=40545, clusterName=databus, maxPendingAsyncRequests=1000)
   > 	20:00:37.836 [main] INFO  org.apache.pulsar.functions.runtime.JavaInstanceStarter - RuntimeSpawner quit, shutting down JavaInstance
   > 	20:00:37.837 [main] INFO  org.apache.pulsar.client.impl.PulsarClientImpl - Client closing. URL: pulsar://pulsar2:6650
   > 	20:00:37.847 [pulsar-client-io-1-7] INFO  org.apache.pulsar.client.impl.ClientCnx - [id: 0x7b07901f, L:/10.0.4.61:46984 ! R:pulsar1/10.0.4.64:6650] Disconnected
   > 	20:00:37.848 [pulsar-client-io-1-2] INFO  org.apache.pulsar.client.impl.ClientCnx - [id: 0x5d65a905, L:/10.0.4.61:48236 ! R:pulsar2/10.0.4.60:6650] Disconnected
   > 	20:00:37.851 [pulsar-client-io-1-5] INFO  org.apache.pulsar.client.impl.ClientCnx - [id: 0x053f4733, L:/10.0.4.61:56168 ! R:pulsar3/10.0.4.62:6650] Disconnected
   > 
   
   
   
   **To Reproduce**
   We are producing data to all these topics with up to 4000 messages per topic every 5 minutes. After some hours (or even 
   minutes) this issue arises.
   
   We need to restart the sink. Sometimes they fail immediately and others they continue processing data. 
   
   We have option **autoSkipNonRecoverableData**  enabled on all broker instances.
   
   **Expected behavior**
   
   Option **autoSkipNonRecoverableData** should let pulsar client feeding sink continue processing messages, even if any fail. 
   
   Ideally these "Bookie handle is not available" errors should not happen. 
   
   In any case the sink instances should be able to restart automatically. 
   
   **Screenshots**
   
   **Desktop (please complete the following information):**
    - OS: CentOS 7
    - Running on Docker Swarm cluster
   
   **Additional context**
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] codelipenghui commented on issue #9428: Sink instances die (and do not restart automatically) due to "Bookie handle is not available" error

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #9428:
URL: https://github.com/apache/pulsar/issues/9428#issuecomment-1058893749


   The issue had no activity for 30 days, mark with Stale label.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] codelipenghui commented on issue #9428: Sink instances die (and do not restart automatically) due to "Bookie handle is not available" error

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #9428:
URL: https://github.com/apache/pulsar/issues/9428#issuecomment-788937594


   @fmiguelez The `autoSkipNonRecoverableData` only works for the ledger does not exist, does not work for some IO exception, this will avoid losing data in some unknown circumstances.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org