Posted to users@kafka.apache.org by Nishant Verma <ni...@gmail.com> on 2017/08/25 08:36:01 UTC
Error on Kafka Connect side || HDFS Connector
Hello All
We are using Kafka Connect to pull records from our Kafka topics to HDFS.
We have a working setup that gave accurate results last week and before,
with no data loss across 8-9 billion records. Our source produces some 18
million records per hour.
At times we see the error below in the connector logs, which leads to heavy
data loss.
org.apache.kafka.common.errors.WakeupException
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.maybeTriggerWakeup(ConsumerNetworkClient.java:411)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:239)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:188)
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:578)
    at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1125)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.doCommitSync(WorkerSinkTask.java:255)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.doCommit(WorkerSinkTask.java:274)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.commitOffsets(WorkerSinkTask.java:348)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.closePartitions(WorkerSinkTask.java:480)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:152)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:139)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:182)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:744)
2017-08-24 21:52:41,390 [pool-1-thread-196] ERROR (WorkerTask.java:142) -
Task is being killed and will not recover until manually restarted
*AND:*
2017-08-24 21:52:40,605 [pool-1-thread-191] ERROR (WorkerTask.java:141) -
Task HDFS-Dev2-03-34 threw an uncaught and unrecoverable exception
java.lang.NullPointerException
    at io.confluent.connect.hdfs.DataWriter.close(DataWriter.java:296)
    at io.confluent.connect.hdfs.HdfsSinkTask.close(HdfsSinkTask.java:121)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.commitOffsets(WorkerSinkTask.java:317)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.closePartitions(WorkerSinkTask.java:480)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:152)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:139)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:182)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:744)
2017-08-24 21:52:40,605 [pool-1-thread-191] ERROR (WorkerTask.java:142) -
Task is being killed and will not recover until manually restarted
Although the log says the task is being killed, when we print the status of
all the task threads, all of them are in "RUNNING" state.
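For context, this is roughly how we inspect the task state, by parsing the
JSON returned from the Connect REST status endpoint
(GET /connectors/<connector-name>/status). A minimal sketch; the connector
name, worker IDs, and response excerpt below are illustrative, not our real
values:

```python
import json

# Illustrative excerpt of a Kafka Connect REST status response
# (GET /connectors/<connector-name>/status); all values are hypothetical.
status_json = """
{
  "name": "HDFS-Dev2-03",
  "connector": {"state": "RUNNING", "worker_id": "10.0.0.1:8083"},
  "tasks": [
    {"id": 0, "state": "RUNNING", "worker_id": "10.0.0.1:8083"},
    {"id": 1, "state": "RUNNING", "worker_id": "10.0.0.2:8083"}
  ]
}
"""

status = json.loads(status_json)
# Map each task id to its reported state.
task_states = {t["id"]: t["state"] for t in status["tasks"]}
print(task_states)  # every task reports RUNNING, even after the error above
```

This is what makes the situation confusing: the worker log says the task is
killed, yet the status endpoint keeps reporting RUNNING for every task.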
What could be the root cause of these errors, and how can we overcome them?
Thanks
Nishant