Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/05/06 21:04:04 UTC

[GitHub] [incubator-druid] clintropolis commented on a change in pull request #7428: Add errors and state to stream supervisor status API endpoint

URL: https://github.com/apache/incubator-druid/pull/7428#discussion_r281344967
 
 

 ##########
 File path: docs/content/development/extensions-core/kafka-ingestion.md
 ##########
 @@ -214,6 +214,26 @@ offsets as reported by Kafka, the consumer lag per partition, as well as the agg
 consumer lag per partition may be reported as negative values if the supervisor has not received a recent latest offset
 response from Kafka. The aggregate lag value will always be >= 0.
 
+The status report also contains the supervisor's state and a list of recently thrown exceptions (whose max size can be 
+controlled using the `druid.supervisor.stream.maxStoredExceptionEvents` config parameter).  The list of states is as
+follows:
+
+|State|Description|
+|-----|-----------|
+|UNHEALTHY_SUPERVISOR|The supervisor has encountered errors on the past `druid.supervisor.stream.unhealthinessThreshold` iterations|
+|UNHEALTHY_TASKS|The last `druid.supervisor.stream.taskUnhealthinessThreshold` tasks have all failed|
+|UNABLE_TO_CONNECT_TO_STREAM|The supervisor is encountering connectivity issues with Kafka and has not successfully connected in the past|
+|LOST_CONTACT_WITH_STREAM|The supervisor is encountering connectivity issues with Kafka but has successfully connected in the past|
+|WAITING_TO_RUN (first iteration only)|The supervisor has been initialized and hasn't started connecting to the stream|
+|CONNECTING_TO_STREAM (first iteration only)|The supervisor is trying to connect to the stream and update partition data|
+|DISCOVERING_INITIAL_TASKS (first iteration only)|The supervisor is discovering already-running tasks|
+|CREATING_TASKS (first iteration only)|The supervisor is creating tasks and discovering state|
+|RUNNING|The supervisor has started tasks and is waiting for taskDuration to elapse|
+|SUSPENDED|The supervisor has been suspended|
+|SHUTTING_DOWN|Shutdown has been called but the supervisor hasn’t fully shutdown yet|
 
 Review comment:
   I think this state should be `STOPPING` instead of `SHUTTING_DOWN`. It is tied to the supervisor 'stop' method and, importantly, the rename avoids confusion with the deprecated supervisor 'shutdown' API call, which is now called 'terminate' and tombstones the supervisor.
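
   To make the suggestion concrete, here is a rough sketch of how the state list from the table above might look with the rename applied. This is not the enum from the PR: the class name `SupervisorStateSketch`, the nested `State` enum, and the `isFirstIterationOnly` helper are all hypothetical and only illustrate the naming.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch only; the class, enum, and helper names are assumptions,
// not the actual code in this pull request.
public class SupervisorStateSketch {

    // States as listed in the docs table, with SHUTTING_DOWN renamed to STOPPING.
    enum State {
        UNHEALTHY_SUPERVISOR,
        UNHEALTHY_TASKS,
        UNABLE_TO_CONNECT_TO_STREAM,
        LOST_CONTACT_WITH_STREAM,
        WAITING_TO_RUN,
        CONNECTING_TO_STREAM,
        DISCOVERING_INITIAL_TASKS,
        CREATING_TASKS,
        RUNNING,
        SUSPENDED,
        STOPPING
    }

    // States the docs table marks as "(first iteration only)".
    static final Set<State> FIRST_ITERATION_ONLY = EnumSet.of(
        State.WAITING_TO_RUN,
        State.CONNECTING_TO_STREAM,
        State.DISCOVERING_INITIAL_TASKS,
        State.CREATING_TASKS
    );

    static boolean isFirstIterationOnly(State state) {
        return FIRST_ITERATION_ONLY.contains(state);
    }

    public static void main(String[] args) {
        for (State state : State.values()) {
            System.out.println(state + " firstIterationOnly=" + isFirstIterationOnly(state));
        }
    }
}
```

   With this naming, `STOPPING` lines up with the supervisor 'stop' method, and nothing in the state list echoes the old 'shutdown' API name.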
