Posted to user@oozie.apache.org by Liping Zhang <zl...@gmail.com> on 2016/02/04 20:57:25 UTC

Fwd: gridx spark job failed with oozie

Dear Oozie users and devs,

We have a Spark job that needs to run as a workflow in Oozie.


1. The Spark job runs successfully when submitted from the command line, as
below:

spark-submit --master spark://ip-10-0-4-248.us-west-1.compute.internal:7077
--class com.gridx.spark.MeterReadingLoader --name 'smud_test1'
--driver-class-path
/opt/cloudera/parcels/CDH/jars/guava-16.0.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/jars/jets3t-0.9.0.jar
 --conf
spark.executor.extraClassPath=/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/jars/jets3t-0.9.0.jar
~/spark-all.jar -i s3n://meter-data/batch_data_phase1/smud_phase1_10.csv -k
smud_stage -h 10.0.4.243 -t 60 -z America/Los_Angeles -l smud_test1 -g SMUD


2. However, when we use the Oozie REST API or Hue-Oozie in CDH to submit the
same Spark job with the REST call below, it launches an Oozie launcher
job
"oozie:launcher:T=spark:W=meter_reading_loader:A=spark-17c0:ID=0000027-160202081901924-oozie-oozi-W",
which fails with OOM and PermGen exceptions.

BTW, our gridx jar "spark-all.jar" is 88M in size.

Here are the screenshots; the Oozie workflow is attached.

curl -X POST -H "Content-Type: application/xml" -d @config.xml
http://localhost:11000/oozie/v2/jobs?action=start


Oozie parameters:

[image: Inline image 4]


Oozie job in the CDH ResourceManager UI (port 8088):

[image: Inline image 2]



Exceptions and logs:

[image: Inline image 1]

[image: Inline image 3]



I also tried increasing the maximum PermGen size and the memory, but still
had no luck. Can you help out? Thanks very much!



-- 
Cheers,
-----
Big Data - Big Wisdom - Big Value
--------------
Michelle Zhang (张莉苹)

Re: gridx spark job failed with oozie

Posted by Serega Sheypak <se...@gmail.com>.
You probably need to increase the memory for the Oozie launcher itself?
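
A minimal sketch of what that could look like, assuming the Spark action in
the workflow accepts a <configuration> block and that the launcher runs as a
single map task on YARN. Oozie passes properties prefixed with
"oozie.launcher." to the launcher job. The values below are illustrative
guesses, not tested settings for this cluster, and would need tuning:

```xml
<!-- Fragment to place inside the Spark action of the workflow. -->
<!-- All values are assumptions chosen for illustration only. -->
<configuration>
    <!-- YARN container size for the launcher map task, in MB -->
    <property>
        <name>oozie.launcher.mapreduce.map.memory.mb</name>
        <value>4096</value>
    </property>
    <!-- JVM options for the launcher: heap kept below the container size, -->
    <!-- plus a larger PermGen, since the failure mentions PermGen -->
    <property>
        <name>oozie.launcher.mapreduce.map.java.opts</name>
        <value>-Xmx3072m -XX:MaxPermSize=512m</value>
    </property>
</configuration>
```

The reasoning behind this: when the Spark master is not in cluster mode, the
Spark driver most likely runs inside the launcher JVM, so the launcher needs
enough heap and PermGen for the driver and the 88M jar, not just for Oozie's
own code.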
