Posted to commits@samza.apache.org by "Yan Fang (JIRA)" <ji...@apache.org> on 2014/06/27 07:50:25 UTC

[jira] [Updated] (SAMZA-307) Simplify YARN deploy procedure

     [ https://issues.apache.org/jira/browse/SAMZA-307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Fang updated SAMZA-307:
---------------------------

    Summary: Simplify YARN deploy procedure   (was: Simplify deploy procedure )

> Simplify YARN deploy procedure 
> -------------------------------
>
>                 Key: SAMZA-307
>                 URL: https://issues.apache.org/jira/browse/SAMZA-307
>             Project: Samza
>          Issue Type: Improvement
>            Reporter: Yan Fang
>
> Currently, we have two ways of deploying a Samza job to a YARN cluster: from [HDFS|https://samza.incubator.apache.org/learn/tutorials/0.7.0/deploy-samza-job-from-hdfs.html] and from [HTTP|https://samza.incubator.apache.org/learn/tutorials/0.7.0/run-in-multi-node-yarn.html]. Neither of them works out of the box. Users have to go through the tutorial, add dependencies, recompile, put the job package on HDFS or an HTTP server, and only then run the job.
> I find this a little cumbersome at times. We may be able to provide a simpler way to deploy the job.
> 1. When users have YARN and HDFS in the same cluster (such as CDH5), we can provide a job-submit script that takes the cluster configuration and then calls some Java code to upload the assembly (all the jars Samza needs, already compiled) along with the user's job jar (which changes frequently) to HDFS, and then runs the job as usual. (Yes, I learned this from [Spark's YARN deploy|http://spark.apache.org/docs/latest/running-on-yarn.html].)
> 2.  
>  
>  
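
For illustration only, here is a rough sketch of the flow proposed in item 1: upload the prebuilt assembly and the user's job jar to HDFS, then launch the job as usual. It is written in Python as a dry-run planner rather than the Java code the issue mentions, and it is not part of Samza. The function name plan_submit, its parameters, the staging directory layout, and the deploy/samza default are all assumptions made for this example. The only real commands it relies on are hdfs dfs -mkdir -p, hdfs dfs -put -f, and the run-job.sh invocation shown in the Samza tutorials. How YARN would localize the two uploaded artifacts (for example through yarn.package.path in the job config) is left open in the issue and is not addressed here.

```python
import os
import posixpath
import shlex


def plan_submit(assembly, job_jar, staging_dir, config_path,
                samza_home="deploy/samza"):
    """Return the commands a hypothetical job-submit script would run.

    assembly     local path to the prebuilt Samza assembly (hypothetical name)
    job_jar      local path to the user's job jar
    staging_dir  HDFS directory to upload into (assumed layout)
    config_path  URI of the job's properties file
    samza_home   local Samza install that contains bin/run-job.sh (assumed)
    """
    local_files = (assembly, job_jar)
    # HDFS paths always use forward slashes, so join them with posixpath.
    remote_files = [
        posixpath.join(staging_dir, os.path.basename(path))
        for path in local_files
    ]

    # Create the staging directory; -p makes this a no-op if it exists.
    commands = [["hdfs", "dfs", "-mkdir", "-p", staging_dir]]

    # Upload both artifacts; -f overwrites a previously uploaded copy,
    # which matters for the frequently changing job jar.
    for local, remote in zip(local_files, remote_files):
        commands.append(["hdfs", "dfs", "-put", "-f", local, remote])

    # Launch the job the same way the existing tutorials do.
    commands.append([
        posixpath.join(samza_home, "bin", "run-job.sh"),
        "--config-factory="
        "org.apache.samza.config.factories.PropertiesConfigFactory",
        "--config-path=" + config_path,
    ])
    return commands


if __name__ == "__main__":
    # Print the plan instead of executing it (dry run).
    for cmd in plan_submit(
        "target/samza-assembly.tgz",
        "target/my-job.jar",
        "hdfs://namenode:8020/user/me/samza",
        "file:///tmp/my-job.properties",
    ):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Returning the commands as argument lists keeps the sketch testable without a cluster; a real script would pass each list to subprocess.run(cmd, check=True) or do the upload through the Hadoop FileSystem API from Java.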



--
This message was sent by Atlassian JIRA
(v6.2#6252)