
Posted to dev@oozie.apache.org by "Robert Kanter (JIRA)" <ji...@apache.org> on 2015/03/14 00:27:39 UTC

[jira] [Updated] (OOZIE-2170) Oozie should automatically set configs to make Spark jobs show up in the Spark History Server

     [ https://issues.apache.org/jira/browse/OOZIE-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kanter updated OOZIE-2170:
---------------------------------
    Summary: Oozie should automatically set configs to make Spark jobs show up in the Spark History Server  (was: Oozie should automatically sets configs to make Spark jobs show up in the Spark History Server)

> Oozie should automatically set configs to make Spark jobs show up in the Spark History Server
> ---------------------------------------------------------------------------------------------
>
>                 Key: OOZIE-2170
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2170
>             Project: Oozie
>          Issue Type: Improvement
>          Components: action
>    Affects Versions: trunk
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>
> If you use "yarn-cluster" for the Spark action's master, the Spark jobs don't show up in the Spark History Server or properly link to it from the Spark AM.
> The user needs to set this in their Spark action in the workflow.xml:
> {code:xml}
> <spark-opts>--conf spark.yarn.historyServer.address=http://SPH:18088 --conf spark.eventLog.dir=hdfs://NN:8020/user/spark/applicationHistory --conf spark.eventLog.enabled=true</spark-opts>
> {code}
> It would be nice if Oozie did this automatically via some oozie-site.xml config(s).  We could do something similar to how the Hadoop configs are set up, where Oozie loads a Spark .conf file from a directory based on the RM specified in the <job-tracker>.
> While we're at it, it might be good to document how to use Spark on YARN:
> # Include the spark-assembly jar with your workflow (this is unfortunately not published in Maven)
> # Specify "yarn-cluster" as the master
> Also, the Spark example should delete the output dir in {{<prepare>}}.
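> A minimal sketch of a Spark action that combines the points above, assuming the {{uri:oozie:spark-action:0.1}} schema. The action name, class, jar path, output path, and the {{SPH}} and {{NN}} host names are placeholders for illustration, not values confirmed by this issue:
> {code:xml}
> <action name="spark-node">
>     <spark xmlns="uri:oozie:spark-action:0.1">
>         <job-tracker>${jobTracker}</job-tracker>
>         <name-node>${nameNode}</name-node>
>         <!-- Delete the output dir first so a rerun does not fail on an existing path (hypothetical path) -->
>         <prepare>
>             <delete path="${nameNode}/user/${wf:user()}/spark-output"/>
>         </prepare>
>         <!-- Run the Spark driver on YARN -->
>         <master>yarn-cluster</master>
>         <name>Spark-Example</name>
>         <!-- Hypothetical main class and application jar -->
>         <class>org.example.MySparkApp</class>
>         <jar>${nameNode}/user/${wf:user()}/apps/spark/lib/my-spark-app.jar</jar>
>         <!-- The settings that currently have to be supplied by hand -->
>         <spark-opts>--conf spark.yarn.historyServer.address=http://SPH:18088 --conf spark.eventLog.dir=hdfs://NN:8020/user/spark/applicationHistory --conf spark.eventLog.enabled=true</spark-opts>
>         <arg>${nameNode}/user/${wf:user()}/spark-output</arg>
>     </spark>
>     <ok to="end"/>
>     <error to="fail"/>
> </action>
> {code}
> The spark-assembly jar would additionally need to be placed in the workflow's {{lib/}} directory or otherwise made available to the action.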


