You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@samza.apache.org by "Yan Fang (JIRA)" <ji...@apache.org> on 2015/04/28 23:02:07 UTC
[jira] [Created] (SAMZA-662) Samza auto-creates changelog stream
without sufficient partitions when container number > 1
Yan Fang created SAMZA-662:
------------------------------
Summary: Samza auto-creates changelog stream without sufficient partitions when container number > 1
Key: SAMZA-662
URL: https://issues.apache.org/jira/browse/SAMZA-662
Project: Samza
Issue Type: Bug
Components: container
Affects Versions: 0.9.0
Reporter: Yan Fang
We currently provide auto-create for changelog streams. However, when there are more than 1 containers, it is possible that Samza creates a changelog stream with insufficient partitions.
Reason:
assume we have an input stream with 3 partitions and then we assign 3 containers for this job. According to the [JobCoordinator|https://github.com/apache/samza/blob/master/samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala], we will get:
||Container(Model) || InputStream Partition || Changelog Partition ||
|0 | 0 | 0|
|1 | 1 | 1|
|2 | 2 | 2|
If Container 0 is brought up first, it calls
{code}
val maxChangeLogStreamPartitions = containerModel.getTasks.values
.max(Ordering.by { task:TaskModel => task.getChangelogPartition.getPartitionId })
.getChangelogPartition.getPartitionId + 1
{code}
in [SamzaContainer|https://github.com/apache/samza/blob/master/samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala].
The maxChangeLogStreamPartition is 1. So we will auto-create a changelog stream with only 1 partitions.
Similarly, if the Container 2 is brought up first, we will get a stream with 2 partitions.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)