Posted to commits@samza.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/06/30 21:10:00 UTC

[jira] [Commented] (SAMZA-1356) Improve monitoring for state restore

    [ https://issues.apache.org/jira/browse/SAMZA-1356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070729#comment-16070729 ] 

ASF GitHub Bot commented on SAMZA-1356:
---------------------------------------

GitHub user jmakes opened a pull request:

    https://github.com/apache/samza/pull/241

    SAMZA-1356: Improve monitoring for state restore

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jmakes/samza samza-1356

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/samza/pull/241.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #241
    
----
commit b44bfbb8317bf8a7e4212eb970d795fa196fd53c
Author: Jacob Maes <jm...@linkedin.com>
Date:   2017-06-30T21:08:42Z

    SAMZA-1356: Improve monitoring for state restore

----


> Improve monitoring for state restore
> ------------------------------------
>
>                 Key: SAMZA-1356
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1356
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Jake Maes
>            Assignee: Jake Maes
>             Fix For: 0.13.1
>
>
> There are a couple of problems that affect our ability to troubleshoot state restore from the changelog.
> 1. KeyValueStorageEngine logs a message for every 1M messages restored, but it prints nothing for stores smaller than that. We should add a message that reports the final number of entries restored.
> 2. The "restore-time" metric is a gauge, but the KeyValueStorageEngineMetrics "messages-restored" and "messages-bytes" metrics are both counters. Counters are often reported as deltas, so the value disappears after one data point. Since these values only matter at the beginning of the job, we should switch them to gauges so that the value is retained for later monitoring.
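
The two points above can be illustrated with a minimal, self-contained Java sketch. It is not Samza code: the class names (RestoreMetricsSketch, DeltaReportedCounter, Gauge), the restore() helper, and the log wording are all hypothetical stand-ins, and the delta-reporting behavior is an assumption about how a metrics reporter might treat counters. The sketch shows a restore loop that logs a final total even when the periodic threshold is never reached, and it contrasts a delta-reported counter with a gauge across several reporting intervals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative sketch only; these are hypothetical stand-ins, not Samza's metric classes. */
public class RestoreMetricsSketch {

    /** A counter viewed through a reporter that emits only the change since the last report (assumed behavior). */
    static final class DeltaReportedCounter {
        private long count = 0;
        private long lastReported = 0;

        void inc(long n) {
            count += n;
        }

        long report() {
            long delta = count - lastReported;
            lastReported = count;
            return delta;
        }
    }

    /** A gauge reports its current value every time it is read. */
    static final class Gauge {
        private final AtomicLong value = new AtomicLong();

        void set(long v) {
            value.set(v);
        }

        long report() {
            return value.get();
        }
    }

    /**
     * Simulates restoring `total` entries. Logs progress every `logInterval`
     * entries and always logs the final total, so small stores are covered too.
     */
    static List<String> restore(long total, long logInterval,
                                DeltaReportedCounter counter, Gauge gauge) {
        List<String> log = new ArrayList<>();
        long restored = 0;
        for (long i = 0; i < total; i++) {
            restored++;
            counter.inc(1);
            gauge.set(restored);
            if (restored % logInterval == 0) {
                log.add(restored + " entries restored...");
            }
        }
        // Final summary line: emitted regardless of store size.
        log.add("Restore complete. " + restored + " entries restored.");
        return log;
    }

    public static void main(String[] args) {
        DeltaReportedCounter counter = new DeltaReportedCounter();
        Gauge gauge = new Gauge();

        for (String line : restore(2500, 1000, counter, gauge)) {
            System.out.println(line);
        }

        // Restore happens once at startup; reporting continues for the life of the job.
        for (int interval = 1; interval <= 3; interval++) {
            System.out.println("report " + interval
                + ": counter delta=" + counter.report()
                + ", gauge=" + gauge.report());
        }
    }
}
```

In this sketch the counter's delta is 2500 in the first report and 0 in every later one, while the gauge keeps reporting 2500, which is the retention property the issue asks for.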


