Posted to commits@samza.apache.org by "Branislav Cogic (JIRA)" <ji...@apache.org> on 2016/05/24 08:10:12 UTC

[jira] [Commented] (SAMZA-856) Optionally automatically commit based on number of processed messages

    [ https://issues.apache.org/jira/browse/SAMZA-856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297872#comment-15297872 ] 

Branislav Cogic commented on SAMZA-856:
---------------------------------------

The patch is attached.
Here is the Review Board link: https://reviews.apache.org/r/47761

> Optionally automatically commit based on number of processed messages
> ---------------------------------------------------------------------
>
>                 Key: SAMZA-856
>                 URL: https://issues.apache.org/jira/browse/SAMZA-856
>             Project: Samza
>          Issue Type: Improvement
>          Components: container
>            Reporter: Elias Levy
>            Assignee: Branislav Cogic
>         Attachments: SAMZA-856.0.patch
>
>
> Currently Samza supports automatic checkpoint commits based on time, via the task.commit.ms property. The number of messages processed during any time window varies with the throughput of the system, so time-based checkpointing alone can't guarantee an upper bound on the number of messages reprocessed when recovering after a failure.
> I propose adding an option that automatically commits checkpoints after a configurable number of messages have been processed. The messages could be counted per container, per task, or per stream. The properties could be named task.commit.msg.container.cnt, task.commit.msg.task.cnt, and/or task.commit.msg.stream.cnt.
> Alternatively, a per-stream count limit could use different values for different streams, e.g. task.commit.msg.stream.<some_stream>.cnt=1000 and task.commit.msg.stream.<some_other_stream>.cnt=200.
> A message-count auto commit would be orthogonal to the existing time-based auto commit, and the two could be used at the same time.
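
To illustrate how a count-based trigger could coexist with the time-based one described above, here is a minimal sketch in plain Java. It is not taken from the attached patch: the class name CommitPolicy, its constructor parameters, and the convention that a non-positive limit disables a trigger are all assumptions made for the example. It models a single counter (for instance, one per task) and reports that a commit is due when either limit is reached.

```java
/**
 * Hypothetical sketch, not the SAMZA-856 patch: decides when a checkpoint
 * commit is due, based on a message count limit, a time limit, or both.
 */
final class CommitPolicy {
    private final long maxMessages;   // assumed convention: <= 0 disables the count trigger
    private final long maxIntervalMs; // assumed convention: <= 0 disables the time trigger
    private long messagesSinceCommit = 0;
    private long lastCommitMs;

    CommitPolicy(long maxMessages, long maxIntervalMs, long nowMs) {
        this.maxMessages = maxMessages;
        this.maxIntervalMs = maxIntervalMs;
        this.lastCommitMs = nowMs;
    }

    /** Records one processed message and returns true if a commit is due now. */
    boolean onMessage(long nowMs) {
        messagesSinceCommit++;
        boolean countDue = maxMessages > 0 && messagesSinceCommit >= maxMessages;
        boolean timeDue = maxIntervalMs > 0 && nowMs - lastCommitMs >= maxIntervalMs;
        if (countDue || timeDue) {
            // Either trigger resets both the counter and the timer,
            // so the two limits stay independent of each other.
            messagesSinceCommit = 0;
            lastCommitMs = nowMs;
            return true;
        }
        return false;
    }
}

final class CommitPolicyDemo {
    public static void main(String[] args) {
        // Commit every 3 messages or every 60 seconds, whichever comes first.
        CommitPolicy policy = new CommitPolicy(3, 60_000, 0L);
        for (int i = 1; i <= 7; i++) {
            long nowMs = i * 10L; // simulated clock, 10 ms per message
            if (policy.onMessage(nowMs)) {
                System.out.println("commit after message " + i);
            }
        }
    }
}
```

With a limit of N messages, at most N messages would need to be reprocessed after a failure, regardless of throughput. Inside a Samza task, the point where onMessage returns true is where one might request a commit through the TaskCoordinator passed to process(); how the actual patch wires this into the container is not shown here.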



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)