Posted to dev@storm.apache.org by "Jungtaek Lim (JIRA)" <ji...@apache.org> on 2016/05/12 00:36:12 UTC

[jira] [Commented] (STORM-1136) Provide a bin script to check consumer lag from KafkaSpout to Kafka topic offsets

    [ https://issues.apache.org/jira/browse/STORM-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281056#comment-15281056 ] 

Jungtaek Lim commented on STORM-1136:
-------------------------------------

I didn't notice this had been filed as a JIRA issue, but I have been thinking about the same thing (on the UI).

Btw, I just talked with Priyank about this issue, and I think it would be better to share my thoughts on it.

- I guess Kafka is the de facto standard data source for Storm, but it is still not a first-class citizen. So storm-core shouldn't be coupled with the Kafka client, or at least that coupling should be discussed first.
- This means the Kafka spout should provide this information to Nimbus so that the UI can query Nimbus via RPC.
- I was thinking about including partition information in the spout task heartbeat, but I guess it's rather big for a heartbeat message.
-- If it doesn't affect performance or ZK load, it would be the easiest way to implement this.
- When providing partition information, the data structure should be general so that Nimbus can parse it without being coupled with the Kafka client.
-- Spark introduced a similar feature, and it stores offset information in a generalized data structure (StreamInputInfo): https://github.com/apache/spark/pull/7081
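
To make the last point concrete, here is a minimal sketch of what such a general structure might look like. Everything in it is hypothetical: SourcePartitionInfo, its field names, and the toMap/fromMap helpers are illustrative only and are neither existing Storm classes nor Spark's StreamInputInfo. The idea is that the spout flattens its progress into plain strings and numbers, so Nimbus could read it without the Kafka client on its classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, source-agnostic progress record (not an existing Storm class).
// It holds only strings and longs, so the reader needs no Kafka types.
final class SourcePartitionInfo {
    final String source;        // e.g. a topic name
    final String partition;     // e.g. a partition id, kept as a string
    final long committedOffset; // how far the spout has read
    final long latestOffset;    // newest position available in the source

    SourcePartitionInfo(String source, String partition,
                        long committedOffset, long latestOffset) {
        this.source = source;
        this.partition = partition;
        this.committedOffset = committedOffset;
        this.latestOffset = latestOffset;
    }

    // Flatten to a plain map that any RPC or JSON layer can carry.
    Map<String, Object> toMap() {
        Map<String, Object> m = new HashMap<>();
        m.put("source", source);
        m.put("partition", partition);
        m.put("committedOffset", committedOffset);
        m.put("latestOffset", latestOffset);
        return m;
    }

    // Rebuild the record on the receiving side from the plain map.
    static SourcePartitionInfo fromMap(Map<String, Object> m) {
        return new SourcePartitionInfo(
                (String) m.get("source"),
                (String) m.get("partition"),
                ((Number) m.get("committedOffset")).longValue(),
                ((Number) m.get("latestOffset")).longValue());
    }
}
```

Whether this travels in the heartbeat or over a separate RPC is left open here; the sketch only illustrates the decoupling.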

> Provide a bin script to check consumer lag from KafkaSpout to Kafka topic offsets
> ---------------------------------------------------------------------------------
>
>                 Key: STORM-1136
>                 URL: https://issues.apache.org/jira/browse/STORM-1136
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-kafka
>            Reporter: Sriharsha Chintalapani
>            Assignee: Priyank Shah
>
> We store KafkaSpout offsets under the zkroot + id path in ZooKeeper. Kafka provides a utility and a protocol request to fetch the latest offsets for a topic
> {code}
> example:
> bin/kafka-run-class.sh kafka.tools.GetOffsetShell
> {code}
> We should provide a way for the user to check how far the Kafka spout has read into the topic and what the lag is. If we can expose this via the UI, even better.
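
For reference, the lag described in the issue is just the latest topic offset minus the spout's committed offset, per partition. A minimal sketch follows, assuming the two offset maps have already been fetched (one from ZooKeeper, one from Kafka). ConsumerLag and perPartition are hypothetical names used only for illustration, and skipping partitions that have no committed offset is a choice made for this sketch, not something the issue specifies.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; not part of storm-kafka.
final class ConsumerLag {
    // Both maps are keyed by partition id and hold offsets.
    static Map<Integer, Long> perPartition(Map<Integer, Long> latest,
                                           Map<Integer, Long> committed) {
        Map<Integer, Long> lag = new TreeMap<>();
        for (Map.Entry<Integer, Long> e : latest.entrySet()) {
            Long c = committed.get(e.getKey());
            if (c == null) {
                continue; // nothing committed yet, so the lag is unknown
            }
            // Clamp at zero in case the committed offset was read after
            // the latest offset and is already ahead of it.
            lag.put(e.getKey(), Math.max(0L, e.getValue() - c));
        }
        return lag;
    }
}
```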


