Posted to user@oozie.apache.org by "Oehmichen, Axel" <ax...@imperial.ac.uk> on 2015/11/11 12:09:32 UTC

Spark action using python file as JAR

Hello,

I am trying to use Oozie to run some Python workflows. I have installed Oozie and Spark using MapR 5.0, which ships with Oozie 4.2 and Spark 1.4.1.
No matter what I do, I get this error message: "java.lang.ClassNotFoundException: Class org.apache.oozie.action.hadoop.SparkMain not found" with error code JA018.

I was able to reproduce this with the wordcount.py example. (https://github.com/apache/spark/blob/master/examples/src/main/python/wordcount.py)
(The idea of running wordcount comes from Nitin Kumar's message.)

The command I run is:
$ /opt/mapr/oozie/oozie-4.2.0/bin/oozie job -oozie="http://localhost:11000/oozie" -config job.properties -run
I have tried through the Java API as well and end up with the same result.

My job.properties contains:
nameNode=maprfs:///
jobTracker=spark-master:8032
oozie.wf.application.path=maprfs:/user/mapr/

my workflow.xml:

<workflow-app xmlns='uri:oozie:workflow:0.5' name='Test'>
    <start to='spark-node' />

    <action name='spark-node'>
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>yarn-client</master>
            <mode>client</mode>
            <name>wordcount</name>
            <jar>wordcount.py</jar>
            <spark-opts>--num-executors 2 --driver-memory 1024m --executor-memory 512m --executor-cores 1</spark-opts>
        </spark>
        <ok to="end" />
        <error to="fail" />
    </action>

    <kill name="fail">
        <message>Workflow failed, error
            message[${wf:errorMessage(wf:lastErrorNode())}]
        </message>
    </kill>
    <end name='end' />
</workflow-app>

I have tried changing oozie.wf.application.path, specifying the jar path explicitly, removing or adding different fields in the XML, putting wordcount.py in various locations, and some other things, but nothing changed...

I welcome any suggestions, and please point out any errors I have made.

Many thanks.

Axel


Re: Spark action using python file as JAR

Posted by Robert Kanter <rk...@cloudera.com>.
I don't know if MapR makes any changes to how the sharelib works, so you
might try asking on their mailing list or forums to see if anyone can help
you there.  The information I shared about the sharelib in my previous
email was with my "Apache hat" on, and should apply to Oozie 4.2, assuming
MapR didn't change anything.

As for CDH, while the version number doesn't say 4.2 (it's currently 4.1),
we do backport a number of patches on top of 4.1, including a lot of 4.2
and patches not in an Apache release yet.  This is true for most, if not
all, of the components we ship.

- Robert


RE: Spark action using python file as JAR

Posted by "Oehmichen, Axel" <ax...@imperial.ac.uk>.
Hello Robert,

I used MapR to install my Hadoop stack (I prefer Cloudera, but CDH doesn't yet support Oozie 4.2), so I believed that the sharelib installation was correct.

Before I change it manually: I got the Java example running, so I guess that the sharelib is properly built?

Best,
--
Axel Oehmichen
Research Assistant
Data Science Institute
+44 (0) 7 842 734 702 
Axelfrancois.oehmichen11@imperial.ac.uk



Re: Spark action using python file as JAR

Posted by Robert Kanter <rk...@cloudera.com>.
Hi Axel,

The sharelib is not properly installed.  With Oozie 4.x, there's now an
extra directory.  Instead of /oozie/share/lib/spark it should be
/oozie/share/lib/lib_<timestamp>/spark.

If you use the oozie-setup script, it will properly install the sharelib
for you.
http://oozie.apache.org/docs/4.2.0/AG_Install.html#Oozie_Server_Setup
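For illustration, a sketch of what installing and checking the sharelib could look like on this setup (the exact paths, tarball name, and filesystem URI are assumptions based on the MapR install described in this thread):

```shell
# Install (or re-create) the sharelib on HDFS with the bundled setup script.
# The Oozie home directory and maprfs URI are assumptions for a MapR 5.0 install.
/opt/mapr/oozie/oozie-4.2.0/bin/oozie-setup.sh sharelib create \
    -fs maprfs:/// \
    -locallib /opt/mapr/oozie/oozie-4.2.0/oozie-sharelib-*.tar.gz

# Verify the Oozie 4.x layout: note the extra lib_<timestamp> level,
# e.g. /oozie/share/lib/lib_20151111120000/spark
hadoop fs -ls /oozie/share/lib/

# Ask the running Oozie server which spark sharelib jars it actually sees.
oozie admin -oozie http://localhost:11000/oozie -shareliblist spark
```

If the last command lists no jars, the server is not picking up the sharelib even though files exist on HDFS.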

- Robert



RE: Spark action using python file as JAR

Posted by "Oehmichen, Axel" <ax...@imperial.ac.uk>.
Tried again and it yielded the same error.

Many thanks.

Best,
--
Axel Oehmichen
Research Assistant
Data Science Institute
+44 (0) 7 842 734 702 
Axelfrancois.oehmichen11@imperial.ac.uk



RE: Spark action using python file as JAR

Posted by Oussama Chougna <ou...@hotmail.com>.
OK,
now include the following in your job.properties:
oozie.use.system.libpath=true

This tells Oozie to use that sharelib.
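Combined with the properties from the original message, the resulting job.properties would look roughly like this:

```properties
nameNode=maprfs:///
jobTracker=spark-master:8032
oozie.wf.application.path=maprfs:/user/mapr/
# Put the system sharelib (which provides SparkMain) on the action's classpath
oozie.use.system.libpath=true
```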
Cheers,
Oussama Chougna


RE: Spark action using python file as JAR

Posted by "Oehmichen, Axel" <ax...@imperial.ac.uk>.
Hello Oussama,

Thanks for the response. The sharelib folder does exist on HDFS under /oozie/share/lib/spark

Best,
Axel


RE: Spark action using python file as JAR

Posted by Oussama Chougna <ou...@hotmail.com>.
Hi Axel,
Did you also install the Oozie sharelib? It sounds like you are missing it; it is installed on HDFS. See the Oozie/MapR docs for a how-to.
Cheers,

Oussama Chougna
