Posted to commits@samza.apache.org by "Alex Buck (JIRA)" <ji...@apache.org> on 2016/03/31 15:23:25 UTC

[jira] [Commented] (SAMZA-348) Configure Samza jobs through a stream

    [ https://issues.apache.org/jira/browse/SAMZA-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15219855#comment-15219855 ] 

Alex Buck commented on SAMZA-348:
---------------------------------

Hi [~criccomini], I am interested in using explicit restarts as described in the design document. 

I have had a go at implementing this locally by updating the ConfigManager to listen for restart messages, and I would love to contribute it. If this is OK, could you please create a subtask for me to do this?

I have also found a bug in the JSON deserialisation in the YarnUtil class, which ConfigManager uses to query the YARN web app API for all running applications, and I would like to contribute a fix for that as well. Should I email the dev mailing list about this, or could you create a JIRA for that too?

Thank you.

> Configure Samza jobs through a stream
> -------------------------------------
>
>                 Key: SAMZA-348
>                 URL: https://issues.apache.org/jira/browse/SAMZA-348
>             Project: Samza
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Chris Riccomini
>            Assignee: Chris Riccomini
>              Labels: design, project
>         Attachments: DESIGN-SAMZA-348-0.md, DESIGN-SAMZA-348-0.pdf, DESIGN-SAMZA-348-1.md, DESIGN-SAMZA-348-1.pdf
>
>
> Samza's existing config setup is problematic for a number of reasons:
> # It's completely immutable once a job starts. This prevents any dynamic reconfiguration and auto-scaling. It is debatable whether we want these features or not, but our existing implementation actively prevents them. See SAMZA-334 for discussion.
> # We pass existing configuration through environment variables. YARN exports environment variables in a shell script, which limits the size to the varargs length on the machine. This is usually ~128KB. See SAMZA-333 and SAMZA-337 for details.
> # User-defined configuration (the Config object) and programmatic configuration (checkpoints and TaskName:State mappings (see SAMZA-123)) are handled differently. It's debatable whether this makes sense.
> In SAMZA-123, [~jghoman] and I propose implementing a ConfigLog. This log would replace both the checkpoint topic and the existing config environment variables in SamzaContainer and Samza's YARN AM.
> I'd like to keep this ticket's scope limited to just the implementation of the ConfigLog, and not re-designing how Samza's config is used in the code (SAMZA-40). We should, however, discuss how this feature would affect dynamic reconfiguration/auto-scaling.


