Posted to commits@samza.apache.org by "Ken (JIRA)" <ji...@apache.org> on 2016/03/04 12:44:40 UTC

[jira] [Commented] (SAMZA-882) Detect partition count changes in input streams

    [ https://issues.apache.org/jira/browse/SAMZA-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179775#comment-15179775 ] 

Ken commented on SAMZA-882:
---------------------------

SAMZA-882 covers new partitions and repartitioning. In this comment I would like to extend the conversation to partitions that become dormant or obsolete. If there is a better venue for this conversation, please direct me to it.

In the Samza examples, the partition key is often user id. It is reasonable to expect users to enter and exit an organisation. Thus new partitions will be created and existing partitions will become dormant. Real world scenarios are 'employee leaves company' or 'customer closes account'.

A Samza container creates one StreamTask instance and one Thread instance per partition (for simplicity, assume a Samza job with a single input stream). Thus every dormant partition of the input stream will still be polled for new messages, even though none will arrive.

What is the opinion of the Samza architects on the resources allocated to dormant partitions? The resources are processing (polling for new messages and switching threads) and heap memory (probably not much). Load balancing via YARN could also be uneven (e.g. one node could be assigned a disproportionate number of dormant partitions).
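
To make the uneven case concrete, here is a rough sketch. It assumes, purely for illustration, that partition p is assigned to container p % container_count; the function name and the assignment rule are hypothetical and are not taken from Samza.

```python
def dormant_per_container(container_count, dormant_partitions):
    """Count dormant partitions per container.

    Assumption (not Samza's confirmed behaviour): partition p is
    assigned to container p % container_count.
    """
    counts = {c: 0 for c in range(container_count)}
    for p in dormant_partitions:
        counts[p % container_count] += 1
    return counts


# Partitions 0, 4, 8 and 12 all map to container 0 under this rule.
result = dormant_per_container(4, [0, 4, 8, 12, 5])
print(result)
```

Under that assumed rule, container 0 would hold four of the five dormant partitions while containers 2 and 3 hold none.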

If I have misunderstood any concept please inform me (I am a Samza newbie).


> Detect partition count changes in input streams
> -----------------------------------------------
>
>                 Key: SAMZA-882
>                 URL: https://issues.apache.org/jira/browse/SAMZA-882
>             Project: Samza
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Navina Ramesh
>            Assignee: Navina Ramesh
>             Fix For: 0.10.1
>
>
> This is a known issue: any change in the upstream partition count affects the Samza job, which then needs to be restarted. In such scenarios, we experience data loss or incorrect processing because the application logic depends on the partitioning strategy. It is worsened by the fact that we don't even have a good mechanism to detect such a change.
> As a first step towards detection, I propose that we modify the stream metadata cache maintained in Samza such that when there is a change in partition count, we increment a gauge metric. This way we can at least attach a hook to monitor when this happens and take necessary actions.
> However, in the long-term, we need to come up with a better strategy for handling this. 
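
The detection step proposed in the issue description above can be sketched roughly as follows. This is a minimal, self-contained illustration, not Samza code: the class name PartitionCountMonitor, its refresh method, and the stream name "page-views" are all hypothetical, and the plain integer attribute only stands in for a real gauge metric.

```python
from typing import Dict


class PartitionCountMonitor:
    """Hypothetical helper (not part of Samza): caches partition counts per stream."""

    def __init__(self) -> None:
        self._cached_counts: Dict[str, int] = {}
        # Stand-in for a gauge metric: number of partition-count changes seen.
        self.partition_count_changes = 0

    def refresh(self, stream: str, current_count: int) -> bool:
        """Store the latest count; return True if it differs from the cached one."""
        previous = self._cached_counts.get(stream)
        self._cached_counts[stream] = current_count
        if previous is not None and previous != current_count:
            self.partition_count_changes += 1
            return True
        return False


monitor = PartitionCountMonitor()
monitor.refresh("page-views", 4)  # first observation, nothing to compare against
monitor.refresh("page-views", 4)  # unchanged
changed = monitor.refresh("page-views", 8)  # upstream was repartitioned
```

A monitoring hook could then alert whenever the stand-in gauge becomes non-zero, which is the kind of signal the description asks for.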



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)