You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@oozie.apache.org by "Satish Subhashrao Saley (JIRA)" <ji...@apache.org> on 2017/09/22 22:15:00 UTC
[jira] [Comment Edited] (OOZIE-3062) Set HADOOP_CONF_DIR for spark action

    [ https://issues.apache.org/jira/browse/OOZIE-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16177242#comment-16177242 ] 

Satish Subhashrao Saley edited comment on OOZIE-3062 at 9/22/17 10:14 PM:
--------------------------------------------------------------------------

bq.env setting is usually separated by , in hadoop configs and not File.pathSeparator. Can you verify?
Yes, Hadoop uses comma (,) as separator. 
https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobConf.java#L289
It will be confusing for users to follow different conventions. Created OOZIE-3068


was (Author: satishsaley):
bq.env setting is usually separated by , in hadoop configs and not File.pathSeparator. Can you verify?
Yes, Hadoop uses comma (,) as separator. 
https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobConf.java#L289
It will be confusing for users to follow different conventions.

> Set HADOOP_CONF_DIR for spark action
> ------------------------------------
>
>                 Key: OOZIE-3062
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3062
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Satish Subhashrao Saley
>            Assignee: Satish Subhashrao Saley
>         Attachments: OOZIE-3062-1.patch
>
>
> OOZIE-2569 created core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml for spark action. But for spark to consider these files ([spark code ref|https://github.com/apache/spark/blob/83fe3b5e10f8dc62245ea37143abb96be0f39805/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala#L284]), we need to set HADOOP_CONF_DIR environmental variable, otherwise it will point to default HADOOP_CONF_DIR configured on the node. Therefore, Oozie should set HADOOP_CONF_DIR.
> Due to a bug YARN-4727, yarn was not considering user specified value for whiltelisted environment variables including HADOOP_CONF_DIR. So, this will work on specified hadoop fix versions in YARN-4727.
> This fix in Oozie will be no-op on Hadoop without YARN-4727 fix and Oozie will behave as it is.
> To verify this on one node machine. 
> 1. In hadoop mapred-site.xml, set 
> {code}
> <property>
>    <name>mapreduce.fileoutputcommitter.marksuccessfuljobs</name>
>    <value>false</value>
> </property>
> {code}
> 2. Run any spark file copy example from Oozie. Check output directory, _SUCCESS flag is missing. It should be there because we explicitly set     conf.set("mapreduce.fileoutputcommitter.marksuccessfuljobs", "true"); in https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java#L293



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)