Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2018/08/16 02:39:55 UTC

[GitHub] huang2012chao edited a comment on issue #5343: Kafka indexing tasks occasionally fail with 'Stream closed'

URL: https://github.com/apache/incubator-druid/issues/5343#issuecomment-413404880
 
 
   @jihoonson 
   If I insert some data into Kafka, the task succeeds, as with task id [index_kafka_kafka_ins_de42264b8a929b1_cpgcjopj].
   Many tasks fail in the KIS (Kafka indexing service), like this:
   
   2018-08-15T12:04:05,611 INFO [qtp1310871029-50] io.druid.indexing.overlord.ForkingTaskRunner - Killing process for task: index_kafka_kafka_ins_a2b40069b9da776_hkjphola
   2018-08-15T12:04:05,664 INFO [forking-task-runner-0] io.druid.storage.hdfs.tasklog.HdfsTaskLogs - Writing task log to: hdfs://master:9000/druid/indexing-logs/index_kafka_kafka_ins_a2b40069b9da776_hkjphola
   2018-08-15T12:04:07,929 INFO [forking-task-runner-0] io.druid.storage.hdfs.tasklog.HdfsTaskLogs - Wrote task log to: hdfs://master:9000/druid/indexing-logs/index_kafka_kafka_ins_a2b40069b9da776_hkjphola
   2018-08-15T12:04:07,929 INFO [forking-task-runner-0] io.druid.indexing.overlord.ForkingTaskRunner - Exception caught during execution
   java.io.IOException: Stream closed
   	at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:170) ~[?:1.8.0_181]
   	at java.io.BufferedInputStream.read1(BufferedInputStream.java:291) ~[?:1.8.0_181]
   	at java.io.BufferedInputStream.read(BufferedInputStream.java:345) ~[?:1.8.0_181]
   	at java.io.FilterInputStream.read(FilterInputStream.java:107) ~[?:1.8.0_181]
   	at com.google.common.io.ByteStreams.copy(ByteStreams.java:175) ~[guava-16.0.1.jar:?]
   	at io.druid.indexing.overlord.ForkingTaskRunner$1.call(ForkingTaskRunner.java:452) [druid-indexing-service-0.12.1.jar:0.12.1]
   	at io.druid.indexing.overlord.ForkingTaskRunner$1.call(ForkingTaskRunner.java:224) [druid-indexing-service-0.12.1.jar:0.12.1]
   	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181]
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
   	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
   2018-08-15T12:04:08,279 INFO [forking-task-runner-0] io.druid.indexing.overlord.ForkingTaskRunner - Removing task directory: var/druid/task/index_kafka_kafka_ins_a2b40069b9da776_hkjphola
   2018-08-15T12:04:08,783 INFO [WorkerTaskMonitor] io.druid.indexing.worker.WorkerTaskMonitor - Job's finished. Completed [index_kafka_kafka_ins_a2b40069b9da776_hkjphola] with status [FAILED]
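
   For reference, the top frames of the trace (`BufferedInputStream.getBufIfOpen`) are where the JDK raises "Stream closed" when a `BufferedInputStream` is read after `close()` has been called on it. The sketch below is a minimal standalone reproduction of that message only. It is not Druid code, the class and method names (`StreamClosedDemo`, `readAfterClose`) are hypothetical, and it does not establish why or where the stream was closed in `ForkingTaskRunner`.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; not part of Druid.
public class StreamClosedDemo {

    // Closes a BufferedInputStream, then tries to read from it.
    // Returns the IOException message, or null if the read unexpectedly succeeds.
    static String readAfterClose() {
        InputStream in = new BufferedInputStream(
                new ByteArrayInputStream(new byte[] {1, 2, 3}));
        try {
            in.close();             // stream is closed before the reader is done with it
            in.read(new byte[8]);   // goes through FilterInputStream.read(byte[]) as in the trace
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(readAfterClose());
    }
}
```

   In the log above, "Killing process for task" is logged about two seconds before the exception, so one possible reading (an assumption, not confirmed here) is that the copy of the task's output was still in progress when the process was shut down.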
   
   I guess it happens because no messages are being pushed into Kafka?
   I have used Druid 0.10.2.0, Kafka 2.11-1.0.0, and ZooKeeper 3.4.10.
   If you find any errors in my configuration, please tell me!
   ![image](https://user-images.githubusercontent.com/13690781/44183638-650a0580-a13e-11e8-9888-4251164e0467.png)
   The log of the successful task [index_kafka_kafka_ins_de42264b8a929b1_cpgcjopj]:
   [index_kafka_kafka_ins_de42264b8a929b1_cpgcjopj.txt](https://github.com/apache/incubator-druid/files/2292232/index_kafka_kafka_ins_de42264b8a929b1_cpgcjopj.txt)
   
   The overlord log for the failed task:
   2018-08-15T12:04:06,040 INFO [KafkaSupervisor-kafka_ins] io.druid.indexing.overlord.MetadataTaskStorage - Updating task index_kafka_kafka_ins_a2b40069b9da776_hkjphola to status: TaskStatus{id=index_kafka_kafka_ins_a2b40069b9da776_hkjphola, status=FAILED, duration=-1}
   2018-08-15T12:04:07,560 WARN [KafkaSupervisor-kafka_disk-Reporting-0] io.druid.indexing.kafka.supervisor.KafkaSupervisor - Lag metric: Kafka partitions [0, 1, 2] do not match task partitions []
   2018-08-15T12:04:07,560 INFO [KafkaSupervisor-kafka_disk-Reporting-0] io.druid.java.util.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2018-08-15T12:04:07.560Z","service":"druid/overlord","host":"master:8090","version":"0.12.1","metric":"ingest/kafka/lag","value":0,"dataSource":"kafka_disk"}]
   2018-08-15T12:04:07,569 WARN [KafkaSupervisor-kafka_ins-Reporting-0] io.druid.indexing.kafka.supervisor.KafkaSupervisor - Lag metric: Kafka partitions [0, 1, 2] do not match task partitions []
   2018-08-15T12:04:07,569 INFO [KafkaSupervisor-kafka_ins-Reporting-0] io.druid.java.util.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2018-08-15T12:04:07.569Z","service":"druid/overlord","host":"master:8090","version":"0.12.1","metric":"ingest/kafka/lag","value":0,"dataSource":"kafka_ins"}]
   2018-08-15T12:04:08,587 INFO [Curator-PathChildrenCache-3] io.druid.indexing.overlord.RemoteTaskRunner - Worker[slave2:8091] wrote FAILED status for task [index_kafka_kafka_ins_a2b40069b9da776_hkjphola] on [TaskLocation{host='slave2', port=8102, tlsPort=-1}]
   2018-08-15T12:04:08,587 INFO [Curator-PathChildrenCache-3] io.druid.indexing.overlord.RemoteTaskRunner - Worker[slave2:8091] completed task[index_kafka_kafka_ins_a2b40069b9da776_hkjphola] with status[FAILED]
   2018-08-15T12:04:08,587 INFO [Curator-PathChildrenCache-3] io.druid.indexing.overlord.TaskQueue - Received FAILED status for task: index_kafka_kafka_ins_a2b40069b9da776_hkjphola
   2018-08-15T12:04:08,954 INFO [KafkaSupervisor-kafka_ins] io.druid.indexing.overlord.TaskQueue - Task done: KafkaIndexTask{id=index_kafka_kafka_ins_a2b40069b9da776_hkjphola, type=index_kafka, dataSource=kafka_ins}
   2018-08-15T12:04:08,955 INFO [Curator-PathChildrenCache-3] io.druid.indexing.overlord.RemoteTaskRunner - Cleaning up task[index_kafka_kafka_ins_a2b40069b9da776_hkjphola] on worker[slave2:8091]
   2018-08-15T12:04:08,978 WARN [Curator-PathChildrenCache-3] io.druid.indexing.overlord.TaskQueue - Unknown task completed: index_kafka_kafka_ins_a2b40069b9da776_hkjphola
   2018-08-15T12:04:08,978 INFO [Curator-PathChildrenCache-3] io.druid.java.util.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2018-08-15T12:04:08.978Z","service":"druid/overlord","host":"master:8090","version":"0.12.1","metric":"task/run/time","value":103077,"dataSource":"kafka_ins","taskStatus":"FAILED","taskType":"index_kafka"}]
   2018-08-15T12:04:08,978 INFO [Curator-PathChildrenCache-3] io.druid.indexing.overlord.TaskQueue - Task FAILED: KafkaIndexTask{id=index_kafka_kafka_ins_a2b40069b9da776_hkjphola, type=index_kafka, dataSource=kafka_ins} (103077 run duration)
   2018-08-15T12:04:08,978 INFO [Curator-PathChildrenCache-3] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_kafka_kafka_ins_a2b40069b9da776_hkjphola] status changed to [FAILED].
   2018-08-15T12:04:08,979 INFO [Curator-PathChildrenCache-3] io.druid.indexing.overlord.RemoteTaskRunner - Task[index_kafka_kafka_ins_a2b40069b9da776_hkjphola] went bye bye.
   2018-08-15T12:04:08,990 INFO [KafkaIndexTaskClient-kafka_disk-0] io.druid.indexing.kafka.KafkaIndexTaskClient - No TaskLocation available for task [index_kafka_kafka_disk_8258b8d8c8b7341_fgmoceab], this task may not have been assigned to a worker yet or may have already completed
   2018-08-15T12:04:08,992 INFO [KafkaSupervisor-kafka_ins] io.druid.indexing.kafka.supervisor.KafkaSupervisor - {id='kafka_ins', generationTime=2018-08-15T12:04:08.992Z, payload={dataSource='kafka_ins', topic='ins', partitions=3, replicas=1, durationSeconds=3600, active=[{id='index_kafka_kafka_ins_a2b40069b9da776_hkjphola', startTime=null, remainingSeconds=null}], publishing=[{id='index_kafka_kafka_ins_de42264b8a929b1_cpgcjopj', startTime=2018-08-15T10:49:33.227Z, remainingSeconds=925}]}}
   2018-08-15T12:04:09,001 INFO [KafkaSupervisor-kafka_disk] io.druid.indexing.kafka.supervisor.KafkaSupervisor - {id='kafka_disk', generationTime=2018-08-15T12:04:09.001Z, payload={dataSource='kafka_disk', topic='disk', partitions=3, replicas=1, durationSeconds=3600, active=[{id='index_kafka_kafka_disk_8258b8d8c8b7341_fgmoceab', startTime=null, remainingSeconds=null}], publishing=[{id='index_kafka_kafka_disk_8258b8d8c8b7341_fpojhnlo', startTime=2018-08-15T10:59:25.500Z, remainingSeconds=1650}]}}
   2018-08-15T12:04:09,005 WARN [KafkaSupervisor-kafka_ins] io.druid.indexing.kafka.supervisor.KafkaSupervisor - Task [index_kafka_kafka_ins_a2b40069b9da776_hkjphola] failed to return start time, killing task
   2018-08-15T12:04:09,317 INFO [KafkaSupervisor-kafka_ins] io.druid.indexing.kafka.supervisor.KafkaSupervisor - Number of tasks [0] does not match configured numReplicas [1] in task group [0], creating more tasks
   2018-08-15T12:04:09,318 INFO [KafkaSupervisor-kafka_ins] io.druid.indexing.overlord.MetadataTaskStorage - Inserting task index_kafka_kafka_ins_a2b40069b9da776_epkmnncf with status: TaskStatus{id=index_kafka_kafka_ins_a2b40069b9da776_epkmnncf, status=RUNNING, duration=-1}
   2018-08-15T12:04:09,983 INFO [KafkaSupervisor-kafka_ins] io.druid.indexing.overlord.TaskLockbox - Adding task[index_kafka_kafka_ins_a2b40069b9da776_epkmnncf] to activeTasks
   
   
   
