Posted to commits@samza.apache.org by "Martin Kleppmann (JIRA)" <ji...@apache.org> on 2014/06/16 21:02:02 UTC

[jira] [Resolved] (SAMZA-23) TaskInstance commits for all TaskCoordinator.commit calls

     [ https://issues.apache.org/jira/browse/SAMZA-23?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martin Kleppmann resolved SAMZA-23.
-----------------------------------

    Resolution: Fixed
      Assignee: Martin Kleppmann

This was implemented as part of SAMZA-253: when you request a commit, you can now specify whether to commit only the current partition or all partitions in the container. I'm resolving this issue.
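
To illustrate the difference between the two commit scopes, here is a minimal toy model in Python. It is not Samza code: `ToyContainer`, `request_commit`, `commits_per_instance`, and the scope names `CURRENT_TASK` and `ALL_TASKS_IN_CONTAINER` are assumptions made for this sketch and are not taken from this message. It only counts how many commits a single task instance performs when every task in the container requests one commit.

```python
from enum import Enum


class RequestScope(Enum):
    """Hypothetical scope values for a commit request (names assumed)."""
    CURRENT_TASK = "current task"
    ALL_TASKS_IN_CONTAINER = "all tasks in container"


class ToyContainer:
    """Toy stand-in for a container; counts commits per task instance."""

    def __init__(self, num_tasks):
        self.commits = [0] * num_tasks

    def request_commit(self, task_id, scope):
        # Container-wide scope commits every task instance;
        # task scope commits only the requesting task instance.
        if scope is RequestScope.ALL_TASKS_IN_CONTAINER:
            targets = range(len(self.commits))
        else:
            targets = [task_id]
        for target in targets:
            self.commits[target] += 1


def commits_per_instance(num_tasks, scope):
    """Each task requests one commit; return the commit count of task 0."""
    container = ToyContainer(num_tasks)
    for task_id in range(num_tasks):
        container.request_commit(task_id, scope)
    return container.commits[0]
```

With 400 tasks, `commits_per_instance(400, RequestScope.ALL_TASKS_IN_CONTAINER)` models the fan-out described in the issue, while `RequestScope.CURRENT_TASK` keeps each request local to the task that made it.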

> TaskInstance commits for all TaskCoordinator.commit calls
> ---------------------------------------------------------
>
>                 Key: SAMZA-23
>                 URL: https://issues.apache.org/jira/browse/SAMZA-23
>             Project: Samza
>          Issue Type: Bug
>          Components: container
>    Affects Versions: 0.6.0
>            Reporter: Chris Riccomini
>            Assignee: Martin Kleppmann
>             Fix For: 0.7.0
>
>
> If a StreamTask calls TaskCoordinator.commit, all TaskInstances commit their SystemProducers, TaskStorageManager, and CheckpointManager. The problem is that if you have, for example, 400 partitions in a SamzaContainer and each calls TaskCoordinator.commit once per second, you get 400 commits per TaskInstance per second. This is incorrect behavior. A TaskInstance should commit only when its task.commit.ms window has expired, or when its own StreamTask has called TaskCoordinator.commit.
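
The commit rule proposed in the issue can be sketched as a small predicate. This is an illustrative Python sketch under stated assumptions, not the actual container logic: `should_commit` and all of its parameters are hypothetical names, with `commit_interval_ms` standing in for the `task.commit.ms` setting and `requesters` for the set of tasks whose StreamTask asked for a commit.

```python
def should_commit(task_id, now_ms, last_commit_ms, commit_interval_ms, requesters):
    """Decide whether one task instance should commit right now.

    A task commits when its commit window has expired, or when its own
    StreamTask requested a commit. Requests made by other tasks in the
    same container are ignored.
    """
    window_expired = now_ms - last_commit_ms >= commit_interval_ms
    requested_by_own_task = task_id in requesters
    return window_expired or requested_by_own_task
```

For example, if only task 0 requested a commit one second into a 60-second window, `should_commit(0, 1000, 0, 60000, {0})` is true while `should_commit(1, 1000, 0, 60000, {0})` is false.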


