Posted to dev@kafka.apache.org by "Guozhang Wang (JIRA)" <ji...@apache.org> on 2017/06/14 19:42:00 UTC

[jira] [Commented] (KAFKA-5440) Kafka Streams report state RUNNING even if all threads are dead

    [ https://issues.apache.org/jira/browse/KAFKA-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16049574#comment-16049574 ] 

Guozhang Wang commented on KAFKA-5440:
--------------------------------------

This is a good find, and maybe it is related to some sub-tasks of https://issues.apache.org/jira/browse/KAFKA-5156. CC [~enothereska]

> Kafka Streams report state RUNNING even if all threads are dead
> ---------------------------------------------------------------
>
>                 Key: KAFKA-5440
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5440
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.11.0.0, 0.10.2.1
>            Reporter: Matthias J. Sax
>
> From the mailing list:
> {quote}
> Hi All,
> We recently implemented a health check for a Kafka Streams based application. The health check simply checks the state of Kafka Streams by calling KafkaStreams.state(). It reports healthy if the state is neither PENDING_SHUTDOWN nor NOT_RUNNING.
> We truly appreciate being able to check the state of Kafka Streams so easily, but to our surprise we noticed that KafkaStreams.state() returns RUNNING even though all StreamThreads have crashed and reached the NOT_RUNNING state. Is this intended behaviour or is it a bug? Semantically it seems weird to me that KafkaStreams would say it’s RUNNING when it is in fact not consuming anything, since all underlying worker threads have crashed.
> If this is intended behaviour, I would appreciate an explanation of why that is the case. Also, in that case, how could I determine whether the consumption from Kafka has crashed?
> If this is not intended behaviour, how fast could I expect it to be fixed? I wouldn’t mind fixing it myself, but I’m not sure if this is considered trivial or big enough to require a JIRA. Also, if I were to implement a fix, I’d like your input on what would be a reasonable solution. From just inspecting the code I have an idea, but I’m not sure I understand all the implications, so I’d be happy to hear your thoughts first.
> {quote}
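
For illustration only, here is a minimal sketch of the health check described in the quoted mail, plus one possible application-side workaround. It deliberately does not use the Kafka Streams library: the State enum is a hypothetical, simplified stand-in for KafkaStreams.State that contains only the states named above, and the class and method names (StreamsHealthCheck, naiveHealthy, healthy, threadDeathHandler) are made up for this example. The workaround assumes the application counts crashed threads itself through a Thread.UncaughtExceptionHandler, for instance one registered via KafkaStreams#setUncaughtExceptionHandler. It is not the fix proposed for this ticket.

```java
import java.util.concurrent.atomic.AtomicInteger;

class StreamsHealthCheck {

    // Hypothetical stand-in for KafkaStreams.State; only the states
    // mentioned in the report are modelled here.
    enum State { RUNNING, PENDING_SHUTDOWN, NOT_RUNNING }

    // Number of stream threads the application believes are still alive.
    private final AtomicInteger liveThreads;

    StreamsHealthCheck(int configuredThreads) {
        this.liveThreads = new AtomicInteger(configuredThreads);
    }

    // Handler the application would register so that every thread that
    // dies from an uncaught exception is counted.
    Thread.UncaughtExceptionHandler threadDeathHandler() {
        return (thread, error) -> liveThreads.decrementAndGet();
    }

    // The check from the mail: healthy unless shutting down or stopped.
    static boolean naiveHealthy(State reported) {
        return reported != State.PENDING_SHUTDOWN && reported != State.NOT_RUNNING;
    }

    // Stricter check: additionally require at least one live thread.
    boolean healthy(State reported) {
        return naiveHealthy(reported) && liveThreads.get() > 0;
    }

    public static void main(String[] args) {
        StreamsHealthCheck check = new StreamsHealthCheck(2);
        Thread.UncaughtExceptionHandler handler = check.threadDeathHandler();

        // Simulate both stream threads crashing while the reported
        // state stays RUNNING, as described in the report.
        handler.uncaughtException(Thread.currentThread(), new RuntimeException("simulated crash"));
        handler.uncaughtException(Thread.currentThread(), new RuntimeException("simulated crash"));

        System.out.println(naiveHealthy(State.RUNNING));  // true
        System.out.println(check.healthy(State.RUNNING)); // false
    }
}
```

With both simulated threads dead, the naive check still reports healthy while the stricter check does not, which is the gap the reporter ran into.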


