You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@samza.apache.org by "Chris Riccomini (JIRA)" <ji...@apache.org> on 2014/07/21 19:16:45 UTC

[jira] [Commented] (SAMZA-349) MetricsSnapshotReporter does not refresh metrics every 60 seconds

    [ https://issues.apache.org/jira/browse/SAMZA-349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068819#comment-14068819 ] 

Chris Riccomini commented on SAMZA-349:
---------------------------------------

What should be done here is dependent on the downstream consumer. If you have a running count, the downstream system can simply keep track of the last report it got, and subtract the last report from the current report to get the difference. If you have just an incremental count, then the downstream system can keep the running total.

Given that both implementations allow you to derive the same overall information, I agree that it makes more sense to just send incremental updates.

I see a problem with the patch you've provided, though. We fetch the value in one line, and then reset back to 0 in the next. This leads to a race condition where we might fetch the value, then another thread updates the count (reporting happens on a separate thread from SamzaContainer's main thread), then we call clear. In this scenario, there is data loss since we reset the counter after it's been updated again, but before it's been reported.

To solve this problem, I think we need to use getAndSet in Counter.set, and return the old value. Then we can call clear() and atomically get the old value while updating the new value back to 0.

> MetricsSnapshotReporter does not refresh metrics every 60 seconds
> -----------------------------------------------------------------
>
>                 Key: SAMZA-349
>                 URL: https://issues.apache.org/jira/browse/SAMZA-349
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Yan Fang
>            Assignee: Yan Fang
>         Attachments: SAMZA-349.patch
>
>
> If my understanding is correct, the metrics we provide are for every 60 seconds and all counters will be reset every 60 seconds. Current the MetricsSnapshotReporter seems missing this implementation. It sends out the metrics every 60 seconds but does not reset the counter value.
> {code}
>  registry.getGroup(group).foreach {
>           case (name, metric) =>
>             metric.visit(new MetricsVisitor {
>               def counter(counter: Counter) = groupMsg.put(name, counter.getCount: java.lang.Long)
>               def gauge[T](gauge: Gauge[T]) = groupMsg.put(name, gauge.getValue.asInstanceOf[Object])
>             })
>         }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)