Posted to users@kafka.apache.org by Tim Ward <ti...@origamienergy.com.INVALID> on 2019/06/04 14:59:37 UTC

Java consumer error handling on DNS lookup failure

I have a Kafka client written in Java running in Kubernetes, and Kafka running in Kubernetes.

When the client is running but no Kafka nodes are running, it appears from the exception below that the DNS lookup fails, then something catches the exception, logs it, and retries, apparently without returning or throwing from poll().

This would all be fair enough ... except that the retry happens every few milliseconds, causing a large stack trace to be logged every few milliseconds, which, if this keeps going for a few days, eats up an awful lot of space in the cloud logging system. And it *can* keep happening for days or weeks in a development environment, because a developer working on another part of the system may not care, or even know, that this part is broken.

What can I do to reduce the volume of logging data? Some combination of the following interventions might be helpful:

  *   Retry less quickly than every few milliseconds
  *   Retry a finite number of times before giving up altogether
  *   Cause poll() to throw rather than retry
  *   Not include the stack trace in the log messages

The general approach to K8s applications seems to be that if a dependency doesn't exist, the client application should simply crash out so that Kubernetes' backoff and retry mechanism will do what's wanted. In that case, might some way of getting poll() to throw rather than swallow this exception be the answer?
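For the "retry less quickly" option, one thing that might be worth trying is raising the consumer's backoff settings. The sketch below only builds the properties. The keys reconnect.backoff.ms, reconnect.backoff.max.ms and retry.backoff.ms are standard consumer configuration names, but the class name, the method name and the values are hypothetical, and whether these settings actually govern the DNS-failure path in kafka-clients 2.0.0 is an assumption that has not been verified here.

```java
import java.util.Properties;

// Hypothetical helper: the class and method names are illustrative only.
final class ConsumerBackoffSettings {

    private ConsumerBackoffSettings() {}

    /**
     * Returns a copy of the given consumer properties with slower reconnect
     * and retry backoff. The values are placeholders, not recommendations.
     */
    static Properties withSlowerBackoff(Properties base) {
        Properties props = new Properties();
        props.putAll(base);
        // Initial wait before reconnecting to a broker after a failed attempt.
        props.setProperty("reconnect.backoff.ms", "1000");
        // Upper bound for the exponentially growing reconnect wait.
        props.setProperty("reconnect.backoff.max.ms", "60000");
        // Wait before retrying a failed request.
        props.setProperty("retry.backoff.ms", "1000");
        return props;
    }
}
```

The returned Properties object would then be passed to the KafkaConsumer constructor in place of the original one. This addresses only the retry rate; it does not make poll() throw.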

Error connecting to node confluent-0.confluent.mynamespace.svc.cluster.local:9091 (id: 0 rack: null)
java.io.IOException: Can't resolve address: confluent-0.confluent.mynamespace.svc.cluster.local:9091
              at org.apache.kafka.common.network.Selector.doConnect(Selector.java:235) ~[kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.common.network.Selector.connect(Selector.java:214) ~[kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:864) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.NetworkClient.access$700(NetworkClient.java:64) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.maybeUpdate(NetworkClient.java:1035) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.maybeUpdate(NetworkClient.java:920) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:508) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:271) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:242) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:161) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:243) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:314) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1218) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1181) [kafka-clients-2.0.0.jar:?]
              at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1115) [kafka-clients-2.0.0.jar:?]
              at com.origamienergy.etpu.nodes.md.MetrologyWriteWorker.run(MetrologyWriteWorker.java:51) [tiger-v2.5-0-g0fd0b8f.jar:?]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_212]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_212]
              at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]
Caused by: java.nio.channels.UnresolvedAddressException
              at sun.nio.ch.Net.checkAddress(Net.java:101) ~[?:1.8.0_212]
              at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:622) ~[?:1.8.0_212]
              at org.apache.kafka.common.network.Selector.doConnect(Selector.java:233) ~[kafka-clients-2.0.0.jar:?]
              ... 19 more
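
The root cause in the trace above is an UnresolvedAddressException, raised when a socket connect is attempted on an address whose host name could not be resolved. If crashing out is the preferred behaviour, one possible workaround is for the application to check resolution itself before calling poll() and throw if it fails. This is a minimal sketch using only the JDK. BrokerDnsCheck, parse and requireResolvable are hypothetical names, and this is an application-level check, not something kafka-clients does.

```java
import java.net.InetSocketAddress;

// Hypothetical pre-flight check: not part of kafka-clients.
final class BrokerDnsCheck {

    private BrokerDnsCheck() {}

    /** Splits "host:port" into an address object without doing a DNS lookup. */
    static InetSocketAddress parse(String hostAndPort) {
        int colon = hostAndPort.lastIndexOf(':');
        if (colon <= 0 || colon == hostAndPort.length() - 1) {
            throw new IllegalArgumentException("Expected host:port but got: " + hostAndPort);
        }
        String host = hostAndPort.substring(0, colon);
        int port = Integer.parseInt(hostAndPort.substring(colon + 1));
        return InetSocketAddress.createUnresolved(host, port);
    }

    /** Throws IllegalStateException if the host name cannot be resolved. */
    static void requireResolvable(String hostAndPort) {
        InetSocketAddress parsed = parse(hostAndPort);
        // This constructor attempts the lookup; on failure it does not throw
        // but marks the resulting address as unresolved.
        InetSocketAddress resolved =
                new InetSocketAddress(parsed.getHostString(), parsed.getPort());
        if (resolved.isUnresolved()) {
            throw new IllegalStateException("Can't resolve address: " + hostAndPort);
        }
    }
}
```

Calling BrokerDnsCheck.requireResolvable(...) with the bootstrap address before each poll(), and letting the exception end the process, would hand the retry timing over to Kubernetes. The check covers only the address the application knows about, so a broker address learned later from cluster metadata could still fail inside poll().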

Tim Ward
