Posted to commits@samza.apache.org by "Roger Hoover (JIRA)" <ji...@apache.org> on 2015/01/06 07:28:34 UTC

[jira] [Updated] (SAMZA-503) Lag gauge very slow to update for slow jobs

     [ https://issues.apache.org/jira/browse/SAMZA-503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roger Hoover updated SAMZA-503:
-------------------------------
    Description: 
For slow jobs, the KafkaSystemConsumerMetrics %s-%s-messages-behind-high-watermark gauge does not get updated very often.

Here's my test setup: I created a job that processes a single message and then sleeps for 5 seconds. In another shell, a separate process loads 1000 messages every second into the input topic.

To reproduce:
* Create a job that processes one message and sleeps for 5 seconds
* Create its input topic but do not populate it yet
* Start the job
* Load thousands of messages into its input topic. You can keep adding messages with "watch -n 1 <kafka console producer command>"

What happens:
* Run jconsole to view the JMX metrics
* The %s-%s-messages-behind-high-watermark gauge will stay at 0 for a long time (roughly 10 minutes) before finally updating.

What should happen:
* The gauge should get updated at a reasonable interval (at least every few seconds)

I think what's happening is that the BrokerProxy only updates the high watermark when a consumer is ready for more messages. When the job is this slow, that rarely happens, so the metric doesn't get updated.
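The hypothesis above can be sketched as a toy simulation (class, method, and field names here are purely illustrative, not the actual Samza internals): if the gauge is refreshed only when the consumer pulls a message, a consumer that sleeps 5 seconds per message reports a stale value no matter how fast the producer runs.

```python
# Toy model of the suspected behavior: the lag gauge is refreshed only
# when the consumer asks the broker proxy for another message, so it can
# read 0 long after the real lag has grown large. Names are hypothetical.

class BrokerProxySim:
    def __init__(self):
        self.high_watermark = 0   # latest offset available on the broker
        self.consumer_offset = 0  # last offset handed to the consumer
        self.lag_gauge = 0        # the metric; refreshed only on fetch

    def produce(self, n):
        # The producer appends messages regardless of consumer progress.
        self.high_watermark += n

    def fetch_one(self):
        # Only when the consumer is ready for another message does the
        # proxy observe the new high watermark and refresh the gauge.
        if self.consumer_offset < self.high_watermark:
            self.consumer_offset += 1
        self.lag_gauge = self.high_watermark - self.consumer_offset

sim = BrokerProxySim()
sim.fetch_one()            # initial fetch while the topic is empty
for _ in range(60):
    sim.produce(1000)      # one minute of production at 1000 msgs/sec
actual_lag = sim.high_watermark - sim.consumer_offset
print(sim.lag_gauge, actual_lag)   # gauge still 0, actual lag 60000
```

In this sketch, the gauge only catches up once the slow consumer finally calls fetch_one() again, which matches the symptom of the gauge sitting at 0 for minutes.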


> Lag gauge very slow to update for slow jobs
> -------------------------------------------
>
>                 Key: SAMZA-503
>                 URL: https://issues.apache.org/jira/browse/SAMZA-503
>             Project: Samza
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 0.8.0
>         Environment: Mac OS X, Oracle Java 7, ProcessJobFactory
>            Reporter: Roger Hoover
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)