Posted to user@oozie.apache.org by Tim Chan <ti...@chan.net> on 2012/07/04 08:09:28 UTC

Running python script using Oozie

I would like to use Oozie to run a python script on a worker node.

I've been looking at the documentation located here:

https://github.com/yahoo/oozie/wiki/Oozie-WF-use-cases

under the heading: Java-Main Action with Script support

Is ReadErrorStream some custom class? It is not a part of the Java IO API.

Is there updated documentation on running scripts (ruby, python, perl,
etc) using Oozie?

Re: Running python script using Oozie

Posted by Tim Chan <ti...@chan.net>.
Running the hadoop fs -put command as a Java action seems to work much
better. I'll stick to that for now. Thanks for the help.
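
For readers wondering what "hadoop fs -put as a Java action" might look like, here is a minimal sketch. The thread does not say which main class was actually used; this assumes org.apache.hadoop.fs.FsShell, the class behind the "hadoop fs" command line. The action name and both paths are placeholders, not values from the thread.

```xml
<!-- Hypothetical sketch: action name, paths and main class are assumptions. -->
<action name="hdfs-put-java">
    <java>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <name>mapred.job.queue.name</name>
                <value>${queueName}</value>
            </property>
        </configuration>
        <!-- FsShell is the class that implements the "hadoop fs" commands -->
        <main-class>org.apache.hadoop.fs.FsShell</main-class>
        <arg>-put</arg>
        <!-- placeholder source: must be readable on the node that runs the action -->
        <arg>fileName</arg>
        <!-- placeholder HDFS destination -->
        <arg>${nameNode}/tmp/fileName</arg>
    </java>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

Because the command runs inside the launcher JVM, no second JVM has to be forked, which may be why the heap error goes away. Note that FsShell's main method ends with a System.exit call, and how a java action treats that may depend on the Oozie version, so this would need testing on the cluster in question.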

On Fri, Jul 6, 2012 at 12:00 AM, Harsh J <ha...@cloudera.com> wrote:
> In addition to what Mohammad has already suggested, you may also try
> to bump the mapred.child.ulimit value, since this script runs as a
> forked process, if I am right:
>
> Set oozie.launcher.mapred.child.ulimit to 2.5 GB in KB, so '2621440'.
>

-- 
  Tim Chan   //  tim@chan.net   //   213.784.2523

Re: Running python script using Oozie

Posted by Harsh J <ha...@cloudera.com>.
In addition to what Mohammad has already suggested, you may also try
to bump the mapred.child.ulimit value, since this script runs as a
forked process, if I am right:

Set oozie.launcher.mapred.child.ulimit to 2.5 GB in KB, so '2621440'.
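
As a sketch of the setting described above, the property could sit next to the existing oozie.launcher.mapred.child.java.opts entry in the shell action's configuration block. The placement is an assumption based on the action posted in this thread; the value is 2.5 GB expressed in KB (2.5 * 1024 * 1024 = 2621440).

```xml
<!-- Sketch only: assumed to go inside the <configuration> element of the shell action. -->
<property>
    <name>oozie.launcher.mapred.child.ulimit</name>
    <!-- 2.5 GB in KB: 2.5 * 1024 * 1024 = 2621440 -->
    <value>2621440</value>
</property>
```

The limit needs to be comfortably larger than the heap requested through -Xmx, since it caps the virtual memory of the launcher and of any process it forks.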

On Fri, Jul 6, 2012 at 12:21 PM, Mohammad Islam <mi...@yahoo.com> wrote:
>
>
> What about this?
>
>
> <property>
>   <name>oozie.launcher.mapred.child.java.opts</name>
>   <value>-server -Xmx1G -Djava.net.preferIPv4Stack=true</value>
>   <description>setting memory usage to 1024MB</description>
> </property>
>
> More details can be found at:
> http://incubator.apache.org/oozie/pig-cookbook.html
>
>
> Also can you try the command line "hadoop fs -ls"?
>
>

-- 
Harsh J

Re: Running python script using Oozie

Posted by Mohammad Islam <mi...@yahoo.com>.

What about this?


<property>
  <name>oozie.launcher.mapred.child.java.opts</name>
  <value>-server -Xmx1G -Djava.net.preferIPv4Stack=true</value>
  <description>setting memory usage to 1024MB</description>
</property>

More details can be found at:
http://incubator.apache.org/oozie/pig-cookbook.html


Also can you try the command line "hadoop fs -ls"?


----- Original Message -----
From: Tim Chan <ti...@chan.net>
To: oozie-users@incubator.apache.org; Mohammad Islam <mi...@yahoo.com>
Cc: 
Sent: Thursday, July 5, 2012 9:47 PM
Subject: Re: Running python script using Oozie

Hi Mohammad,

That didn't seem to help.

Here is my action:

   <action name="hdfs-put">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
                <property>
                    <name>oozie.launcher.mapred.child.java.opts</name>
                    <value>-Xmx1G</value>
                </property>
            </configuration>

            <exec>/usr/bin/hadoop</exec>
            <argument>fs</argument>
            <argument>-ls</argument>

            <capture-output/>
        </shell>




Re: Running python script using Oozie

Posted by Tim Chan <ti...@chan.net>.
Hi Mohammad,

That didn't seem to help.

Here is my action:

   <action name="hdfs-put">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
                <property>
                    <name>oozie.launcher.mapred.child.java.opts</name>
                    <value>-Xmx1G</value>
                </property>
            </configuration>

            <exec>/usr/bin/hadoop</exec>
            <argument>fs</argument>
            <argument>-ls</argument>

            <capture-output/>
        </shell>



On Thu, Jul 5, 2012 at 8:18 PM, Mohammad Islam <mi...@yahoo.com> wrote:
> Hi Tim,
> Could you try adding this to the shell action definition:
> <property>
>   <name>oozie.launcher.mapred.child.java.opts</name>
>   <value>-Xmx1G</value>
> </property>
>
> Regards,
> Mohammad
>
>

-- 
  Tim Chan   //  tim@chan.net   //   213.784.2523

Re: Running python script using Oozie

Posted by Mohammad Islam <mi...@yahoo.com>.
Hi Tim,
Could you try adding this to the shell action definition:
<property>
  <name>oozie.launcher.mapred.child.java.opts</name>
  <value>-Xmx1G</value>
</property>

Regards,
Mohammad


----- Original Message -----
From: Tim Chan <ti...@chan.net>
To: oozie-users@incubator.apache.org
Cc: 
Sent: Thursday, July 5, 2012 8:04 PM
Subject: Re: Running python script using Oozie

I can run my python script now.


But I am trying to use the shell action to run:

hadoop fs -put fileName


I get the following error in the logs:


Error occurred during initialization of VM
Could not reserve enough space for object heap
Exit code of the Shell command 1

What might be the problem?




Re: Running python script using Oozie

Posted by Tim Chan <ti...@chan.net>.
I can run my python script now.


But I am trying to use the shell action to run:

hadoop fs -put fileName


I get the following error in the logs:


Error occurred during initialization of VM
Could not reserve enough space for object heap
Exit code of the Shell command 1

What might be the problem?



On Thu, Jul 5, 2012 at 3:21 PM, Harish Krishnan
<ha...@gmail.com> wrote:
> Hi Tim,
>
> Is <shell xmlns="uri:oozie:shell-action:0.1"> correct?
> I thought this should be <shell xmlns="uri:oozie:workflow:0.2">
>
> Thanks & Regards,
> Harish.T.K
>
>

-- 
  Tim Chan   //  tim@chan.net   //   213.784.2523

Re: Running python script using Oozie

Posted by Harish Krishnan <ha...@gmail.com>.
Hi Tim,

Is <shell xmlns="uri:oozie:shell-action:0.1"> correct?
I thought this should be <shell xmlns="uri:oozie:workflow:0.2">

Thanks & Regards,
Harish.T.K


On Thu, Jul 5, 2012 at 2:53 PM, Tim Chan <ti...@chan.net> wrote:

> Alejandro,
>
> We're running Oozie server 2.3.2-cdh3u4.
> The shell action appears to be supported based on the documentation,
> but when I run my workflow, I get the following error in the oozie
> logs:
>
>
>  E0701: XML schema error, cvc-complex-type.2.4.c: The matching
> wildcard is strict, but no declaration can be found for element
> 'shell'.
>
> When I use xmlns="uri:oozie:workflow:0.3" I get the following error:
>
>  XException, org.apache.oozie.command.CommandException: E0701: XML
> schema error, cvc-elt.1: Cannot find the declaration of element
> 'workflow-app'.
> org.apache.oozie.command.CommandException: E0701: XML schema error,
> cvc-elt.1: Cannot find the declaration of element 'workflow-app'.
>
>
> Here is my workflow.xml:
>
> <workflow-app xmlns="uri:oozie:workflow:0.2" name="dlx-mapping-processor-main">
>
>     <start to="shell-test"/>
>
>     <action name="shell-test">
>         <shell xmlns="uri:oozie:shell-action:0.1">
>             <job-tracker>${jobTracker}</job-tracker>
>             <name-node>${nameNode}</name-node>
>             <configuration>
>                 <property>
>                     <name>mapred.job.queue.name</name>
>                     <value>${queueName}</value>
>                 </property>
>             </configuration>
>
>             <exec>pwd</exec>
>
>             <capture-output/>
>
>         </shell>
>
>         <ok to="end"/>
>         <error to="fail"/>
>     </action>
>
>     <kill name="fail">
>         <message>Node failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
>     </kill>
>
>     <end name="end"/>
> </workflow-app>
>
>
> On Thu, Jul 5, 2012 at 9:59 AM, Alejandro Abdelnur <tu...@cloudera.com> wrote:
> > Hi Tim,
> >
> > I think the Shell action would be better suited to run a Python script.
> > Also keep in mind that Python and all the libraries you need should be
> > available on all nodes in the cluster.
> >
> > Thanks
> >
> > Alejandro
> >
> > On Tue, Jul 3, 2012 at 11:09 PM, Tim Chan <ti...@chan.net> wrote:
> >
> >> I would like to use Oozie to run a python script on a worker node.
> >>
> >> I've been looking at the documentation located here:
> >>
> >> https://github.com/yahoo/oozie/wiki/Oozie-WF-use-cases
> >>
> >> under the heading: Java-Main Action with Script support
> >>
> >> Is ReadErrorStream some custom class? It is not a part of the Java IO API.
> >>
> >> Is there updated documentation on running scripts (ruby, python, perl,
> >> etc) using Oozie?
> >>
> >
> >
> >
> > --
> > Alejandro
>
>
>
> --
>   Tim Chan   //  tim@chan.net   //   213.784.2523
>

Re: Running python script using Oozie

Posted by Tim Chan <ti...@chan.net>.
Alejandro,

We're running Oozie server 2.3.2-cdh3u4.
The shell action appears to be supported according to the documentation,
but when I run my workflow, I get the following error in the Oozie
logs:


 E0701: XML schema error, cvc-complex-type.2.4.c: The matching
wildcard is strict, but no declaration can be found for element
'shell'.

When I use xmlns="uri:oozie:workflow:0.3" I get the following error:

 XException, org.apache.oozie.command.CommandException: E0701: XML
schema error, cvc-elt.1: Cannot find the declaration of element
'workflow-app'.
org.apache.oozie.command.CommandException: E0701: XML schema error,
cvc-elt.1: Cannot find the declaration of element 'workflow-app'.


Here is my workflow.xml:

<workflow-app xmlns="uri:oozie:workflow:0.2" name="dlx-mapping-processor-main">

    <start to="shell-test"/>

    <action name="shell-test">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>

            <exec>pwd</exec>

            <capture-output/>

        </shell>

        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Node failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>

    <end name="end"/>
</workflow-app>
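
For context, the first E0701 error above typically means that the server's schema validator has no shell action schema registered. The fragment below is a hedged sketch of the oozie-site.xml entries that usually enable it. Whether this Oozie build (2.3.2-cdh3u4) actually ships the ShellActionExecutor class and shell-action-0.1.xsd is an assumption that this thread does not confirm, and any values already present in these two properties should be kept alongside the new ones.

```xml
<!-- oozie-site.xml (sketch only; assumes the shell action class and schema are bundled with the server) -->
<property>
    <!-- Registers the executor that handles <shell> actions -->
    <name>oozie.service.ActionService.executor.ext.classes</name>
    <value>org.apache.oozie.action.hadoop.ShellActionExecutor</value>
</property>
<property>
    <!-- Registers the schema so that xmlns="uri:oozie:shell-action:0.1" validates -->
    <name>oozie.service.SchemaService.wf.ext.schemas</name>
    <value>shell-action-0.1.xsd</value>
</property>
```

The Oozie server would need a restart for such a change to take effect.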

On Thu, Jul 5, 2012 at 9:59 AM, Alejandro Abdelnur <tu...@cloudera.com> wrote:
> Hi Tim,
>
> I think the Shell action would be better suited to run a Python script. Also
> keep in mind that Python and all the libraries you need should be available
> on all nodes in the cluster.
>
> Thanks
>
> Alejandro
>
> On Tue, Jul 3, 2012 at 11:09 PM, Tim Chan <ti...@chan.net> wrote:
>
>> I would like to use Oozie to run a python script on a worker node.
>>
>> I've been looking at the documentation located here:
>>
>> https://github.com/yahoo/oozie/wiki/Oozie-WF-use-cases
>>
>> under the heading: Java-Main Action with Script support
>>
>> Is ReadErrorStream some custom class? It is not a part of the Java IO API.
>>
>> Is there updated documentation on running scripts (ruby, python, perl,
>> etc) using Oozie?
>>
>
>
>
> --
> Alejandro



-- 
  Tim Chan   //  tim@chan.net   //   213.784.2523

Re: Running python script using Oozie

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
Hi Tim,

I think the Shell action would be better suited to run a Python script. Also
keep in mind that Python and all the libraries you need should be available
on all nodes in the cluster.
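
As a rough illustration of this suggestion, a Shell action that runs a Python script might look like the sketch below. The action name python-test and the script name my_script.py are hypothetical and do not come from this thread. The sketch assumes the script sits in the workflow application directory on HDFS, that a python interpreter is on the PATH of every node, and that the server accepts the shell-action:0.1 schema.

```xml
<!-- Sketch only: python-test and my_script.py are hypothetical names -->
<action name="python-test">
    <shell xmlns="uri:oozie:shell-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- Run the interpreter installed on the node, passing the script as its argument -->
        <exec>python</exec>
        <argument>my_script.py</argument>
        <!-- Copy the script from the workflow application directory into the task's working directory -->
        <file>my_script.py#my_script.py</file>
        <!-- Optional: capture key=value lines printed to stdout for use by later actions -->
        <capture-output/>
    </shell>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

Shipping the script with the file element avoids having to install it on every node, although the interpreter and any libraries it imports still have to be present cluster-wide.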

Thanks

Alejandro

On Tue, Jul 3, 2012 at 11:09 PM, Tim Chan <ti...@chan.net> wrote:

> I would like to use Oozie to run a python script on a worker node.
>
> I've been looking at the documentation located here:
>
> https://github.com/yahoo/oozie/wiki/Oozie-WF-use-cases
>
> under the heading: Java-Main Action with Script support
>
> Is ReadErrorStream some custom class? It is not a part of the Java IO API.
>
> Is there updated documentation on running scripts (ruby, python, perl,
> etc) using Oozie?
>



-- 
Alejandro