Posted to dev@oozie.apache.org by "Satish Subhashrao Saley (JIRA)" <ji...@apache.org> on 2016/04/08 20:31:25 UTC
[jira] [Commented] (OOZIE-2479) SparkContext Not Using Yarn Config

    [ https://issues.apache.org/jira/browse/OOZIE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232673#comment-15232673 ] 

Satish Subhashrao Saley commented on OOZIE-2479:
------------------------------------------------

You may need to set the {{oozie.service.HadoopAccessorService.hadoop.configurations}} property so that it points to the directory containing Hadoop's *-site.xml files.
[https://oozie.apache.org/docs/4.2.0/oozie-default.xml#oozie.service.HadoopAccessorService.hadoop.configurations]
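
As a rough sketch, the entry in oozie-site.xml could look like the following. The path {{/etc/hadoop/conf}} is an assumption (a common location on HDP installs), so substitute the directory that actually holds your yarn-site.xml and core-site.xml. The {{*}} key is a wildcard that applies that directory to every cluster authority (host:port).
{code}
<!-- oozie-site.xml: map cluster authorities to a Hadoop configuration directory. -->
<!-- "*" matches any authority; /etc/hadoop/conf is an assumed path, adjust as needed. -->
<property>
    <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
    <value>*=/etc/hadoop/conf</value>
</property>
{code}
The 0.0.0.0:8032 in the error is the stock default for {{yarn.resourcemanager.address}}, which suggests that your yarn-site.xml was not picked up. A change to oozie-site.xml will most likely need an Oozie server restart before it takes effect.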

> SparkContext Not Using Yarn Config
> ----------------------------------
>
>                 Key: OOZIE-2479
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2479
>             Project: Oozie
>          Issue Type: Bug
>          Components: workflow
>    Affects Versions: 4.2.0
>         Environment: Oozie 4.2.0.2.3.4.0-3485
> Spark 1.4.1
> Scala 2.10.5
> HDP 2.3
>            Reporter: Breandán Mac Parland
>            Assignee: Satish Subhashrao Saley
>
> The Spark action does not appear to use the jobTracker setting in job.properties (or in the YARN config) when creating the SparkContext. When the jobTracker property is set to myDomain:8050 (to match the yarn.resourcemanager.address setting), the Oozie UI (click on job > action > action configuration) shows that myDomain:8050 is being submitted. However, when I drill down into the Hadoop job history logs, I see an error indicating that a default of 0.0.0.0:8032 is being used:
> *job.properties*
> {code}
> nameNode=hdfs://myDomain:8020
> jobTracker=myOtherDomain:8050
> queueName=default
> master=yarn # have also tried yarn-cluster and yarn-client
>  
> oozie.use.system.libpath=true
> oozie.wf.application.path=${nameNode}/bmp/
> oozie.action.sharelib.for.spark=spark2 # I've added the updated spark libs I need in here
> {code}
>  
> *workflow*
> {code}
> <workflow-app xmlns='uri:oozie:workflow:0.5' name='MyWorkflow'>
>     <start to='spark-node' />
>     <action name='spark-node'>
>         <spark xmlns="uri:oozie:spark-action:0.1">
>             <job-tracker>${jobTracker}</job-tracker>
>             <name-node>${nameNode}</name-node>
>             <prepare>
>                 <delete path="${nameNode}/bmp/output"/>
>             </prepare>
>             <master>${master}</master>
>             <name>My Workflow</name>
>             <class>uk.co.bmp.drivers.MyDriver</class>
>             <jar>${nameNode}/bmp/lib/bmp.spark-assembly-1.0.jar</jar>
>             <spark-opts>--conf spark.yarn.historyServer.address=http://myDomain:18088 --conf spark.eventLog.dir=hdfs://myDomain/user/spark/applicationHistory --conf spark.eventLog.enabled=true</spark-opts>
>             <arg>${nameNode}/bmp/input/input_file.csv</arg>
>         </spark>
>         <ok to="end" />
>         <error to="fail" />
>     </action>
>     <kill name="fail">
>         <message>Workflow failed, error
>             message[${wf:errorMessage(wf:lastErrorNode())}]
>         </message>
>     </kill>
>     <end name='end' />
> </workflow-app>
> {code}
> *Error*
> {code}
> Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception,Call From myDomain/ipAddress to 0.0.0.0:8032 failed on connection exception: java.net.ConnectException: Connection refused. For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
> ...
> at org.apache.spark.SparkContext.<init>(SparkContext.scala:497)
> ...
> {code}
> Where is it pulling 8032 from? Why does it not use the port configured in the job.properties?


