Posted to commits@samza.apache.org by "Chris Riccomini (JIRA)" <ji...@apache.org> on 2013/08/21 20:13:54 UTC

[jira] [Created] (SAMZA-42) Add a job setup phase to Samza

Chris Riccomini created SAMZA-42:
------------------------------------

             Summary: Add a job setup phase to Samza
                 Key: SAMZA-42
                 URL: https://issues.apache.org/jira/browse/SAMZA-42
             Project: Samza
          Issue Type: Bug
          Components: container
    Affects Versions: 0.6.0
            Reporter: Chris Riccomini


We have several use cases for doing things once at the beginning of a Samza job's execution (before containers start). Examples:

* Validate or create checkpoint topic (if using KafkaCheckpointManager)
* Validate or create state topic (if using LoggedStore)

Right now, we have to do this in the container. On YARN this introduces a race condition, because every container tries to create the same topic when it starts.
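To make the race concrete, here is a minimal sketch in Python. It is not Samza code: FakeTopicRegistry, TopicExistsError, container_start and the topic name "checkpoint-topic" are all hypothetical stand-ins for whatever system holds the topic metadata. It only models the situation where several containers start at once and each tries to create the same topic.

```python
import threading


class TopicExistsError(Exception):
    """Raised by the fake registry when a topic is created twice."""


class FakeTopicRegistry:
    """In-memory stand-in for the system that owns topic metadata (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._topics = set()
        self.create_attempts = 0

    def create_topic(self, name):
        # The lock makes each individual create atomic, as a real broker would,
        # but it does not stop several callers from attempting the create.
        with self._lock:
            self.create_attempts += 1
            if name in self._topics:
                raise TopicExistsError(name)
            self._topics.add(name)

    def has_topic(self, name):
        with self._lock:
            return name in self._topics


def container_start(registry, topic, failures):
    """Models a container that creates the topic itself at startup."""
    try:
        registry.create_topic(topic)
    except TopicExistsError:
        # Every container except the winner ends up here.
        failures.append(topic)


def run_containers(registry, topic, count):
    """Start `count` containers concurrently and collect the losers of the race."""
    failures = []
    threads = [
        threading.Thread(target=container_start, args=(registry, topic, failures))
        for _ in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return failures


registry = FakeTopicRegistry()
failures = run_containers(registry, "checkpoint-topic", 4)
print(registry.create_attempts, len(failures))
```

With four containers, one create succeeds and the other three hit the "already exists" path, so every container needs error handling for a step that only has to happen once per job.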

Initially, I thought this logic could be put in the YARN AM, but then we'd have to put corresponding logic in the LocalJobFactory. This gets problematic if we implement SAMZA-41, since there would no longer be a central place to do a "before job starts" operation with the LocalJobFactory. If we don't do SAMZA-41, then we should be fine putting this logic in the YARN AM and LocalJobFactory.

Alternatively, we could put this logic in JobRunner. One downside is that the JobRunner would need full access to the grid it is trying to execute on (not just the RM), so that it could talk to Kafka or ZooKeeper, for example. I think this is actually fine, since we always execute our jobs from a location that has access to the full grid.
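As a rough illustration of the JobRunner option, the sketch below (Python, not Samza's actual API) runs a list of setup steps exactly once and only then starts containers. The names run_job, ensure_topic, the topic names and the in-memory topic set are assumptions made for the example. A real setup step would talk to Kafka or ZooKeeper instead.

```python
from typing import Callable, List


def run_job(
    setup_steps: List[Callable[[], None]],
    start_container: Callable[[int], None],
    container_count: int,
) -> None:
    """Run every setup step once, then start the containers."""
    for step in setup_steps:
        # An exception here aborts the job before any container starts.
        step()
    for container_id in range(container_count):
        start_container(container_id)


events: List[str] = []   # records the order in which things happen
existing_topics = set()  # stand-in for topic metadata held elsewhere


def ensure_topic(name: str) -> Callable[[], None]:
    """Build a setup step that creates the topic if missing, else validates it."""

    def step() -> None:
        if name in existing_topics:
            events.append(f"validated {name}")
        else:
            existing_topics.add(name)
            events.append(f"created {name}")

    return step


run_job(
    setup_steps=[ensure_topic("checkpoint-topic"), ensure_topic("state-topic")],
    start_container=lambda i: events.append(f"container {i} started"),
    container_count=2,
)
print(events)
```

Because the steps run in a single place before any container exists, there is nothing to race against, and the containers themselves no longer need topic-creation logic.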

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira