You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@samza.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/03/30 01:09:41 UTC

[jira] [Commented] (SAMZA-1126) Semantics of ProcessorId in Samza

    [ https://issues.apache.org/jira/browse/SAMZA-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948191#comment-15948191 ] 

ASF GitHub Bot commented on SAMZA-1126:
---------------------------------------

GitHub user navina opened a pull request:

    https://github.com/apache/samza/pull/103

    SAMZA-1126 - Semantics of processorId in Samza

    Implementation based on [SEP-1](https://cwiki.apache.org/confluence/display/SAMZA/SEP-1%3A+Semantics+of+ProcessorId+in+Samza)
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/navina/samza SAMZA-1126

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/samza/pull/103.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #103
    
----
commit b0b31979ab5da4d83878152058f446c42dbce82f
Author: navina <na...@apache.org>
Date:   2017-03-29T22:09:05Z

    Removing CoordinationService from JobCoordinatorFactory interface

commit d7e649368b9b05ad307368581fc28d396289bfb7
Author: navina <na...@apache.org>
Date:   2017-03-23T21:22:58Z

    Added ProcessorIdGenerator. StreamProcessor tests fail due to containerId type mismatch in Containermodel

commit edfac1163cdf7191d83feaf3c11bc27dfd41d3c2
Author: navina <na...@apache.org>
Date:   2017-03-24T18:41:11Z

    First pass at changing to string; ContainerModel has both processorId and containerId

commit 94e000127d79a7106bcd3e9c7d9363f0596a3c16
Author: navina <na...@apache.org>
Date:   2017-03-24T20:49:49Z

    Added jobmodel deserialization for compatibility between old and new models

commit 9290483466f4504306201d78522de23a03753855
Author: navina <na...@apache.org>
Date:   2017-03-24T21:21:49Z

    Adding custom deserializer for ContainerModel

commit fea84abe7707f93af0a133d7d30be7dfa4311911
Author: navina <na...@apache.org>
Date:   2017-03-24T23:50:12Z

    updating code in TestSamzaContainer.scala

commit f8b85a52731039996cf9357c7c85adc36e98ae5a
Author: navina <na...@apache.org>
Date:   2017-03-25T01:57:30Z

    Mostly builds. TaskProxy stuff needs to be fixed

commit 17d905ff920c70e4c95d5fd3b5c23ff94c549f25
Author: navina <na...@apache.org>
Date:   2017-03-27T21:26:06Z

    Removed static method from ProcessorIdGenerator and made it a ClassLoader util

commit da4566201b834da8aa1dcead594ace3a98332132
Author: navina <na...@apache.org>
Date:   2017-03-28T18:15:46Z

    Changing containerId to string in YarnClusterResourceManager state

commit 987f2ffdb1c581eadc7907fd1050a36d9dacd7cc
Author: navina <na...@apache.org>
Date:   2017-03-28T18:29:27Z

    Fixing some documents

commit 96a104394add60bea554ce6766a4ec2459c8745a
Author: navina <na...@apache.org>
Date:   2017-03-29T21:23:11Z

    Fixed breaking changes after rebase

commit 8d32878d5d29b39c4a5363f9407ef1473efa3fd4
Author: navina <na...@apache.org>
Date:   2017-03-30T01:06:29Z

    Commenting a flaky test and rebasing

----


> Semantics of ProcessorId in Samza 
> ----------------------------------
>
>                 Key: SAMZA-1126
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1126
>             Project: Samza
>          Issue Type: Sub-task
>            Reporter: Navina Ramesh
>            Assignee: Navina Ramesh
>             Fix For: 0.13.0
>
>
> Until today, we have been using "processorId" to be synonymous to the logical "containerId", assigned by Samza. 
> It is easy for Samza to generate a unique set of containerIds per job because the number of containers is expected to be fixed/constant throughout the job's lifecycle. However, with the new Zookeeper based model, we allow the number of processors to be changed while the job is executing. In other words, we want to make a Samza job "elastic" in nature. 
> The proposal in SAMZA-1084 expects the user to assign a unique processorId to each StreamProcessor associated with the job. This is tedious on the user since the processors are going to be distributed across one or more machines and the user should coordinate among these machines for guaranteeing uniqueness of processorId within a job. 
> The goal of this JIRA is to understand and define the semantics of processorId and investigate a solution which does not impose this requirement on the user. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)