Posted to commits@samza.apache.org by "Navina Ramesh (JIRA)" <ji...@apache.org> on 2015/04/21 20:35:00 UTC

[jira] [Updated] (SAMZA-617) YARN host affinity in Samza

     [ https://issues.apache.org/jira/browse/SAMZA-617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navina Ramesh updated SAMZA-617:
--------------------------------
    Attachment: DESIGN-SAMZA-617-2.pdf
                DESIGN-SAMZA-617-2.md

Mostly corrected the document based on feedback. 
One notable change in the design: how to specify the path for persisting the store when it is outside of YARN's working directory.
In the earlier design, the location of the store defaulted to java.io.tmpdir if STATE_ROOT_DIR was not specified. It now defaults to YARN's current working directory, which disables local state re-use.
This simplifies the logic of the clean-up script and maintains backward compatibility. 
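
A minimal sketch of the new default described above, for illustration only. It assumes STATE_ROOT_DIR reaches the container as a plain string (how it is actually supplied is not stated here), and the class and method names are hypothetical, not Samza APIs.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Samza: illustrates the defaulting rule only.
final class StoreDirResolver {

    // True when STATE_ROOT_DIR was given a non-blank value.
    static boolean isStateRootDirSet(String stateRootDir) {
        return stateRootDir != null && !stateRootDir.trim().isEmpty();
    }

    // stateRootDir: value of STATE_ROOT_DIR, or null/blank if unspecified (assumed input).
    // yarnWorkingDir: YARN's current working directory for the container.
    static Path resolveStoreBaseDir(String stateRootDir, String yarnWorkingDir) {
        if (isStateRootDirSet(stateRootDir)) {
            // Store is persisted outside YARN's working directory.
            return Paths.get(stateRootDir);
        }
        // New default: fall back to YARN's working directory.
        return Paths.get(yarnWorkingDir);
    }

    // Per the design note, falling back to YARN's working directory
    // disables local state re-use.
    static boolean localStateReuseEnabled(String stateRootDir) {
        return isStateRootDirSet(stateRootDir);
    }

    public static void main(String[] args) {
        // Example paths are made up for the demonstration.
        System.out.println(resolveStoreBaseDir(null, "/yarn/container/cwd"));
        System.out.println(resolveStoreBaseDir("/data/samza/state", "/yarn/container/cwd"));
        System.out.println(localStateReuseEnabled(null));
    }
}
```

With this shape, a job that never sets STATE_ROOT_DIR keeps its stores under YARN's working directory, which matches the backward-compatibility point above.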

> YARN host affinity in Samza
> ---------------------------
>
>                 Key: SAMZA-617
>                 URL: https://issues.apache.org/jira/browse/SAMZA-617
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Navina Ramesh
>            Assignee: Navina Ramesh
>         Attachments: DESIGN-SAMZA-617-0.md, DESIGN-SAMZA-617-0.pdf, DESIGN-SAMZA-617-1.md, DESIGN-SAMZA-617-1.pdf, DESIGN-SAMZA-617-2.md, DESIGN-SAMZA-617-2.pdf
>
>
> Today in Samza we do not guarantee that a container gets deployed on the same machine upon a job upgrade/restart. Hence, the co-located data needs to be restored every time a container restarts. Restoring data each time can be expensive, especially for applications that have a large data set.
> If we can enable restarting containers on the same machine, we can re-use available local state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)