You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@samza.apache.org by "Guozhang Wang (JIRA)" <ji...@apache.org> on 2015/06/18 02:44:01 UTC

[jira] [Resolved] (SAMZA-662) Samza auto-creates changelog stream without sufficient partitions when container number > 1

     [ https://issues.apache.org/jira/browse/SAMZA-662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guozhang Wang resolved SAMZA-662.
---------------------------------
    Resolution: Fixed

> Samza auto-creates changelog stream without sufficient partitions when container number > 1
> -------------------------------------------------------------------------------------------
>
>                 Key: SAMZA-662
>                 URL: https://issues.apache.org/jira/browse/SAMZA-662
>             Project: Samza
>          Issue Type: Bug
>          Components: container
>    Affects Versions: 0.9.0
>            Reporter: Yan Fang
>            Assignee: Guozhang Wang
>             Fix For: 0.9.1
>
>         Attachments: SAMZA-662-0.9.1-branch.patch, SAMZA-662.v1.patch
>
>
> We currently provide auto-create for changelog streams. However, when there are more than 1 containers, it is possible that Samza creates a changelog stream with insufficient partitions. 
> Reason:
> assume we have an input stream with 3 partitions and then we assign 3 containers for this job. According to the [JobCoordinator|https://github.com/apache/samza/blob/master/samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala], we will get:
> ||Container(Model) || InputStream Partition || Changelog Partition ||
> |0 | 0 | 0|
> |1 | 1 | 1|
> |2 | 2 | 2|
> If Container 0 is brought up first, it calls 
> {code}
>     val maxChangeLogStreamPartitions = containerModel.getTasks.values
>             .max(Ordering.by { task:TaskModel => task.getChangelogPartition.getPartitionId })
>             .getChangelogPartition.getPartitionId + 1
> {code}
> in [SamzaContainer|https://github.com/apache/samza/blob/master/samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala].
> The maxChangeLogStreamPartition is 1. So we will auto-create a changelog stream with only 1 partitions.
> Similarly, if the Container 2 is brought up first, we will get a stream with 2 partitions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)