Posted to commits@samza.apache.org by "Chris Riccomini (JIRA)" <ji...@apache.org> on 2014/04/30 19:12:16 UTC

[jira] [Commented] (SAMZA-23) TaskInstance commits for all TaskCoordinator.commit calls

    [ https://issues.apache.org/jira/browse/SAMZA-23?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985759#comment-13985759 ] 

Chris Riccomini commented on SAMZA-23:
--------------------------------------

Work has been done on SAMZA-253 to handle a similar usage pattern for shutdown (rather than commit). We should take a look at SAMZA-253 when working on this ticket.

> TaskInstance commits for all TaskCoordinator.commit calls
> ---------------------------------------------------------
>
>                 Key: SAMZA-23
>                 URL: https://issues.apache.org/jira/browse/SAMZA-23
>             Project: Samza
>          Issue Type: Bug
>          Components: container
>    Affects Versions: 0.6.0
>            Reporter: Chris Riccomini
>
> If a StreamTask calls TaskCoordinator.commit, every TaskInstance in the container commits its SystemProducers, TaskStorageManager, and CheckpointManager. For example, if a SamzaContainer has 400 partitions and each one calls TaskCoordinator.commit once per second, every TaskInstance ends up committing 400 times per second. This is incorrect behavior. A TaskInstance should commit only when its task.commit.ms window has expired, or when TaskCoordinator.commit was called by its own StreamTask partition.
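
The commit rule proposed in the description can be illustrated with a small model. This is only a sketch in Python, not Samza code: the class TaskInstanceModel, its methods, and the run_loop_tick helper are hypothetical names invented for illustration, and the real commit work (flushing SystemProducers, TaskStorageManager, and CheckpointManager) is reduced to a counter.

```python
class TaskInstanceModel:
    """Hypothetical, simplified stand-in for a TaskInstance (not the Samza class)."""

    def __init__(self, name, commit_ms, now_ms=0):
        self.name = name
        self.commit_ms = commit_ms        # plays the role of task.commit.ms
        self.last_commit_ms = now_ms
        self.commit_requested = False     # set only by this instance's own task
        self.commit_count = 0

    def request_commit(self):
        # Models this instance's StreamTask calling TaskCoordinator.commit.
        self.commit_requested = True

    def maybe_commit(self, now_ms):
        # Commit only if the window expired or this instance's own task asked.
        window_expired = now_ms - self.last_commit_ms >= self.commit_ms
        if not (window_expired or self.commit_requested):
            return False
        # A real implementation would flush producers, stores, and checkpoints here.
        self.commit_count += 1
        self.last_commit_ms = now_ms
        self.commit_requested = False
        return True


def run_loop_tick(tasks, requesting, now_ms):
    """One pass over all instances; `requesting` holds names of tasks that asked to commit."""
    for task in tasks:
        if task.name in requesting:
            task.request_commit()
    return [task.name for task in tasks if task.maybe_commit(now_ms)]
```

With 400 instances and a 60000 ms window, a request from a single partition causes exactly one instance to commit in that pass, instead of all 400. The remaining instances commit only once their own window expires.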


