Posted to dev@oozie.apache.org by "Satish Subhashrao Saley (JIRA)" <ji...@apache.org> on 2016/04/08 20:31:25 UTC
[jira] [Commented] (OOZIE-2479) SparkContext Not Using Yarn Config
[ https://issues.apache.org/jira/browse/OOZIE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232673#comment-15232673 ]
Satish Subhashrao Saley commented on OOZIE-2479:
------------------------------------------------
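You may need to set the {{oozie.service.HadoopAccessorService.hadoop.configurations}} property so that it points to the directory containing Hadoop's *-site.xml files. If those files are not picked up, the YARN client would fall back to its default ResourceManager address, 0.0.0.0:8032, which matches the address in the reported error.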
[https://oozie.apache.org/docs/4.2.0/oozie-default.xml#oozie.service.HadoopAccessorService.hadoop.configurations]
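For illustration, a minimal sketch of how this property might look in oozie-site.xml. The value is a comma-separated list of AUTHORITY=HADOOP_CONF_DIR pairs, and the wildcard * is used for any authority without an exact match. The path /etc/hadoop/conf is an assumption about where the cluster keeps its *-site.xml files; substitute the actual directory.
{code}
<!-- oozie-site.xml (sketch; the directory below is an assumed location) -->
<property>
  <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
  <!-- "*" applies to every authority (host:port) that has no exact match -->
  <value>*=/etc/hadoop/conf</value>
</property>
{code}
The directory should contain the yarn-site.xml that sets yarn.resourcemanager.address, so that the value from the cluster configuration is used rather than the built-in default.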
> SparkContext Not Using Yarn Config
> ----------------------------------
>
> Key: OOZIE-2479
> URL: https://issues.apache.org/jira/browse/OOZIE-2479
> Project: Oozie
> Issue Type: Bug
> Components: workflow
> Affects Versions: 4.2.0
> Environment: Oozie 4.2.0.2.3.4.0-3485
> Spark 1.4.1
> Scala 2.10.5
> HDP 2.3
> Reporter: Breandán Mac Parland
> Assignee: Satish Subhashrao Saley
>
> The Spark action does not appear to use the jobTracker setting in job.properties (or in the YARN config) when creating the SparkContext. When the jobTracker property is set to myDomain:8050 (to match the yarn.resourcemanager.address setting), the Oozie UI (click on job > action > action configuration) shows that myDomain:8050 is being submitted. However, when I drill down into the Hadoop job history logs, I see an error indicating that the default 0.0.0.0:8032 is being used:
> *job.properties*
> {code}
> nameNode=hdfs://myDomain:8020
> jobTracker=myOtherDomain:8050
> queueName=default
> # have also tried yarn-cluster and yarn-client
> master=yarn
>
> oozie.use.system.libpath=true
> oozie.wf.application.path=${nameNode}/bmp/
> # I've added the updated spark libs I need in here
> oozie.action.sharelib.for.spark=spark2
> {code}
>
> *workflow*
> {code}
> <workflow-app xmlns='uri:oozie:workflow:0.5' name='MyWorkflow'>
>     <start to='spark-node' />
>     <action name='spark-node'>
>         <spark xmlns="uri:oozie:spark-action:0.1">
>             <job-tracker>${jobTracker}</job-tracker>
>             <name-node>${nameNode}</name-node>
>             <prepare>
>                 <delete path="${nameNode}/bmp/output"/>
>             </prepare>
>             <master>${master}</master>
>             <name>My Workflow</name>
>             <class>uk.co.bmp.drivers.MyDriver</class>
>             <jar>${nameNode}/bmp/lib/bmp.spark-assembly-1.0.jar</jar>
>             <spark-opts>--conf spark.yarn.historyServer.address=http://myDomain:18088 --conf spark.eventLog.dir=hdfs://myDomain/user/spark/applicationHistory --conf spark.eventLog.enabled=true</spark-opts>
>             <arg>${nameNode}/bmp/input/input_file.csv</arg>
>         </spark>
>         <ok to="end" />
>         <error to="fail" />
>     </action>
>     <kill name="fail">
>         <message>Workflow failed, error
>             message[${wf:errorMessage(wf:lastErrorNode())}]
>         </message>
>     </kill>
>     <end name='end' />
> </workflow-app>
> {code}
> *Error*
> {code}
> Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception,Call From myDomain/ipAddress to 0.0.0.0:8032 failed on connection exception: java.net.ConnectException: Connection refused. For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> ...
> at org.apache.spark.SparkContext.<init>(SparkContext.scala:497)
> ...
> {code}
> Where is it pulling 8032 from? Why does it not use the port configured in the job.properties?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)