Posted to user@oozie.apache.org by Garry Turkington <g....@improvedigital.com> on 2014/06/09 15:00:20 UTC

Using HCat within a Pig action

Hi,

I've got some Pig scripts that access data via HCat. They run fine on the command line, but if I try to execute them as part of an Oozie action they fail, unfortunately with very little in the way of detailed error messages.

So before I go into the specifics can I clarify what is needed to get Pig/HCat integration working with Oozie?

I'm running this on CDH5 and the output of "oozie admin -listsharelib" includes Pig and Hcatalog. Within my Pig scripts I am referring to HCat tables by name alone, i.e. no hcat:// URI. The hive-site.xml that works for Hive actions is available. I have other non-HCat workflows running fine, including Pig and Hive actions.

When I run Pig scripts that use HCatalog from the CLI I need to specify -useHCatalog and have hcat.bin defined; should I be passing values for these to the Pig script within <argument> elements in the action definition? (I've tried both with and without.)
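For reference, the CLI invocation that does work is roughly the following sketch (script name taken from the workflow below; exact setup may vary by install):

```shell
# -useHCatalog puts the HCatalog jars on Pig's classpath
pig -useHCatalog simple.pig
```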

Anything else that is required for this to work? Or pointers to any documentation with examples/specs for what's needed? I found different parts of the picture spread around but no definitive spec or full examples.

I cut my script down to the following; note the commented-out second statement: we don't even get as far as trying to read the table (which exists and contains data):

REGISTER  /opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/pig/piggybank.jar

mydata = LOAD 'testtable_hcat' USING org.apache.hcatalog.pig.HCatLoader();
-- store mydata into '/tmp/zz.out' using PigStorage();

I commented out the store because the only error I get includes what appears to be the code JA018, and some searching suggested this may be permission related. Anything I need to consider here? The cluster isn't using any external security provider, only basic authentication:

job_1402172905909_0054 FAILED/KILLED JA018

Here's the workflow.xml:
<workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
    <start to="pig-node"/>
    <action name="pig-node">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <job-xml>${workflowRoot}/hive-site.xml</job-xml>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <script>${workflowRoot}/pig/simple.pig</script>
        </pig>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Pig action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
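As the replies further down this thread establish, the missing piece for HCat-from-Pig is telling Oozie to include the hcatalog sharelib for the action. A minimal sketch of the extra property (everything else in the workflow above unchanged):

```xml
<!-- Added inside the pig action's <configuration> block -->
<property>
    <name>oozie.action.sharelib.for.pig</name>
    <value>pig,hcatalog</value>
</property>
```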

Thanks
Garry



Re: Using HCat within a Pig action

Posted by Mona Chitnis <ch...@yahoo-inc.com.INVALID>.
Executing SQL commands to create a new table/database through the Oozie pig
action is something I haven't tried myself, and most likely it will not
work. Adding and deleting partitions of an existing table is executed via
an HCatClient singleton running in the same process as your Oozie job.

On 6/18/14, 3:15 PM, "Garry Turkington" <g....@improvedigital.com>
wrote:

>Robert, Mona,
>
>Thanks for the pointers. It was the sharelib config element that I was
>missing. I did have it set in my job.properties file but I was specifying
>action.sharelib.for.pig instead of oozie.action.sharelib.for.pig. Oops!
>
>I can now use the HCatLoader and HCatStorer within my Pig scripts which
>is cool. I'm now wondering though if its supported to use the 'sql'
>command within Pig which calls out to the hcat script. On the CLI one
>would specify a value for hcat.bin pointing to this script and all is
>well. But after trying various combinations within my workflow definition
>it's not apparent how to set things up for this  to work in an action.
>Unfortunately my CDH5 VM doesn't seem to be logging the output of the
>failing task and just giving the very generic job killed message.
>
>Looking at the Pig source code it does seem to be executing hcat in an
>external process, does this sort of thing play well within an Oozie
>action?
>
>Thanks
>Garry
>
>-----Original Message-----
>From: Robert Kanter [mailto:rkanter@cloudera.com]
>Sent: 17 June 2014 05:34
>To: user@oozie.apache.org
>Subject: Re: Using HCat within a Pig action
>
>I was able to run the HCatalog example, which uses HCatLoader in Pig,
>with CDH 5.something.
>I did have to modify the example to include the "hcatalog" sharelib by
>adding this to the pig action's <configuration> section:
><property>
>     <name>oozie.action.sharelib.for.pig</name>
>     <value>pig,hcatalog</value>
></property>
>Did you try adding this?
>
>Can you look at the output from the launcher job's map task (the MR job
>that succeeded)?  It will likely have an error message from Pig in it
>that should give us an idea of what's wrong.
>
>
>On Mon, Jun 16, 2014 at 5:33 PM, Garry Turkington <
>g.turkington@improvedigital.com> wrote:
>
>> Hi Mona,
>>
>> Thanks for the input. Some more info below.
>>
>> If I look at my workflow in the Oozie web ui it still looks a bit odd.
>> Looking at the task for the Pig node I see
>>
>> Status: ERROR
>> Error code: JA018
>> Error message: Main class [org.apache.oozie.action.hadoop.PigMain],
>> exit code [2] External id: <job id> External Status: FAILED/KILLED
>> Clicking on the child job link gives "n/a" as the child job.
>> The <job id> listed above shows as successful in the MapReduce logs
>> though.  Am I driving the interface wrong? :)
>>
>> I looked at that example -- thanks for the pointer -- and on first cut
>> it seems to be showing the same behaviour. I simplified it a bit to
>> remove the coordinator job and just run the script directly --
>> modifying job.properties to have all the right variables and the same
>>result as above.
>>
>> Re hive-site.xml I tried setting permissions explicitly but no effect.
>> Plus as mentioned I can get a Hive action to work so in other contexts
>> the file appears to be read correctly.
>>
>> I did also notice the readme for the example mentions HCatalog 0.4.1;
>> that is a quite old version. It's obviously hard to compare directly
>> since HCatalog versions are now aligned with Hive but 0.5 came out in
>> 2013 and the version I'm running with is 0.12. This is using CDH5. Has
>> HCatalog support been seen to work since the 0.4.1 release?
>>
>> Any other thoughts?
>>
>> Thanks
>> Garry
>>
>>
>> -----Original Message-----
>> From: Mona Chitnis [mailto:chitnis@yahoo-inc.com.INVALID]
>> Sent: 16 June 2014 20:05
>> To: user@oozie.apache.org
>> Subject: Re: Using HCat within a Pig action
>>
>> Hello Garry,
>>
>> Oozie first launches a "launcher" job which then launches the actual
>> Pig/Hive job. The 'SUCCEEDED' job that you see on Yarn is this
>> launcher job, which succeeded in its job of launching the actual pig
>> job, but actual pig job failed. Oozie console will show child jobs
>> under tab 'Child Jobs' pointing to the child jobs that the launcher
>> spawned, so you can go there directly.
>>
>> What is your error exactly? If it says 'hive-site.xml Permission
>> Denied', I've encountered that error before and for some reason it was
>> because the hive-site.xml unix permissions were changed from 600
>> (rw-------). This is the hive-site.xml that you are including in your
>> oozie workflow application directory. I don't know the root cause of
>> it yet but if you keep the permissions as 600 it should work.
>>
>> If it's not the above error, could you elaborate on it, and copy the
>> stack trace here?
>>
>> Hive updated versions have indeed changed the package hierarchy for
>> HCatLoader and HCatStorer so the changes you made here are correct.
>>
>> You will find an example (examples/apps/hcatalog) in the examples
>> folder in Oozie and includes a README. I have created JIRA
>> (https://issues.apache.org/jira/browse/OOZIE-1881) to append the above
>> info as an FAQ to the HCatalog integration doc as well as a
>> walkthrough of this Pig+HCat example.
>>
>> Regards,
>> --
>> Mona
>>
>> On 6/15/14, 2:47 PM, "Garry Turkington"
>> <g....@improvedigital.com>
>> wrote:
>>
>> >Hi,
>> >
>> >
>> >
>> >Digging into this some more I have a little more information about
>> >the problem.
>> >
>> >
>> >
>> >The simple Pig script is as below, i.e.:
>> >
>> >
>> >
>> >REGISTER
>> >/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/pig/piggybank.ja
>> >r
>> >
>> >
>> >
>> >mydata = LOAD 'testtable_hcat' USING
>> >org.apache.hive.hcatalog.pig.HCatLoader();
>> >
>> >This fails with the following message in the Oozie CLI and UI:
>> >
>> >0000003-140615021945919-oozie-oozi-W@pig-node
>> >    ERROR     job_1402823993415_0004 FAILED/KILLED JA018
>> >
>> >
>> >
>> >Though that particular MR job id is marked as successful in the MR
>> >and YARN logs. Which is I think why it's proving difficult to find
>> >any more logging.
>> >
>> >
>> >
>> >What does work:
>> >
>> >* Hive actions within Oozie
>> >
>> >* Other Pig actions (that don't use HCatalog) within Oozie
>> >
>> >* This Pig script run from the CLI as either the submitting or yarn
>> >user
>> >
>> >
>> >
>> >I did change 2 things: the package name for the HCatLoader, as
>> >org.apache.hcatalog.* is now deprecated in favour of
>> >org.apache.hive.hcatalog.*, and the /user/yarn directory was not
>> >present. But neither made an impact.
>> >
>> >
>> >
>> >I think the JA018 -- referred to in oozie-defaults.xml as being due
>> >to the output dir already existing -- is actually referring to
>> >something else. Possibly a missing library.
>> >
>> >
>> >
>> >To run the script from the command line I add the -useHCatalog
>> >argument to Pig which explicitly adds jars to the classpath. Though
>> >many of these would be for the hcat binary etc which I'm not using.
>> >The HCatalog adaptor for Pig though does appear to be in the Oozie
>>sharelib:
>> >
>> >
>> >
>> >[cloudera@localhost ~]$ oozie admin -shareliblist hcatalog | grep -i
>> >pig
>> >
>> >
>> >hdfs://localhost.localdomain:8020/user/oozie/share/lib/lib_2014040411
>> >28 20/ hcatalog/hive-hcatalog-pig-adapter-0.12.0-cdh5.0.0.jar
>> >
>> >
>> >
>> >Any insight in any of the above from anyone? The fact I can't find
>> >any examples of this Oozie/Pig/Hcat combo working isn't filling me
>> >with confidence.
>> >
>> >
>> >
>> >One thing that would help -- if Pig is dropping an error log file is
>> >there any way of capturing that/making it available? I tried doing
>> >the equivalent of a "pig -l > <destination>" in the workflow.xml but
>> >that didn't seem to work either.
>> >
>> >
>> >
>> >Or any thoughts on when things would be failing in such a way that
>> >the MapReduce job is logged as successful but Oozie sees the action
>> >as failed/killed?
>> >
>> >
>> >
>> >Any pointers well received,
>> >
>> >Garry
>> >
>> >
>> >
>> >
>> >
>> >-----Original Message-----
>> >From: Garry Turkington [mailto:g.turkington@improvedigital.com]
>> >Sent: 11 June 2014 00:11
>> >To: user@oozie.apache.org
>> >Subject: RE: Using HCat within a Pig action
>> >
>> >
>> >
>> >Mona,
>> >
>> >
>> >
>> >Thanks for the response.
>> >
>> >
>> >
>> >That doesn't quite look like my problem though; my Hive Oozie actions
>> >are working fine. As are my Pig Oozie actions, but things start
>> >breaking when trying to use HCat from within the Pig action.
>> >
>> >
>> >
>> >Are there any additional arguments required -- or configuration
>> >options
>> >-- for  a Pig job using HCat? Or any working  examples anywhere?
>> >
>> >
>> >
>> >Thanks
>> >
>> >Garry
>> >
>> >
>> >
>> >-----Original Message-----
>> >
>> >From: Mona Chitnis [mailto:chitnis@yahoo-inc.com.INVALID]
>> >
>> >Sent: 09 June 2014 19:20
>> >
>> >To: user@oozie.apache.org<ma...@oozie.apache.org>
>> >
>> >Subject: Re: Using HCat within a Pig action
>> >
>> >
>> >
>> >Looks like some discussion on this problem already
>> >https://groups.google.com/a/cloudera.org/forum/#!topic/hue-user/m8NnJ
>> >vz
>> >xGA
>> >Q
>> >
>> >
>> >
>> >On 6/9/14, 6:00 AM, "Garry Turkington"
>> ><g.turkington@improvedigital.com<mailto:g.turkington@improvedigital.c
>> >om
>> >>>
>> >
>> >wrote:
>> >
>> >
>> >
>> >>Hi,
>> >
>> >>
>> >
>> >>I've got some Pig scripts that access data via HCat. They run fine
>> >>on
>> >
>> >>the command line but if I try to get some executed as part of an
>> >>Oozie
>> >
>> >>action it is failing. Unfortunately with very little detailed error
>> >>messages.
>> >
>> >>
>> >
>> >>So before I go into the specifics can I clarify what is needed to
>> >>get
>> >
>> >>Pig/HCat integration working with Oozie?
>> >
>> >>
>> >
>> >>I'm running this on CDH5 and the output of "oozie admin -listsharelib"
>> >
>> >>includes Pig and Hcatalog. Within my Pig scripts I am referring to
>> >>HCat
>> >
>> >>tables by  name alone, i.e. no hcat:// URI. The  hive-site.xml that
>> >
>> >>works for Hive actions is available. I have other non-HCat workflows
>> >
>> >>running fine, including Pig and Hive actions.
>> >
>> >>
>> >
>> >>When I run Pig scripts that use HCatalog from the CLI I need specify
>> >
>> >>-useHcatalog and have HCAT.BIN defined; should I be passing values
>> >>for
>> >
>> >>these to the Pig script within <argument> elements in the action
>> >
>> >>definition? (I've tried both with and without).
>> >
>> >>
>> >
>> >>Anything else that is required for this to work? Or pointers to any
>> >
>> >>documentation with examples/specs for what's needed? I found
>> >>different
>> >
>> >>parts of the picture spread around but no definitive spec or full
>> >
>> >>examples.
>> >
>> >>
>> >
>> >>I cut my script down to the following; note the commented out second
>> >
>> >>statement, we don't even get as far as trying to read the (existing
>> >>and
>> >
>> >>containing data) table:
>> >
>> >>
>> >
>> >>REGISTER
>> >
>> >>/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/pig/piggybank.j
>> >>ar
>> >
>> >>
>> >
>> >>mydata = LOAD 'testtable_hcat' USING
>> >
>> >>org.apache.hcatalog.pig.HCatLoader();
>> >
>> >>-- store mydata into '/tmp/zz.out' using PigStorage();
>> >
>> >>
>> >
>> >>I commented out the store because the only error I get includes the
>> >
>> >>seeming code JA018 and it was suggested on the Google this may be
>> >
>> >>permission related. Anything I need consider here? The cluster isn't
>> >
>> >>using any external security provider and only basic authentication:
>> >
>> >>
>> >
>> >>job_1402172905909_0054 FAILED/KILLED JA018
>> >
>> >>
>> >
>> >>Here's the workflow.xml:
>> >
>> >><workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
>> >
>> >>    <start to="pig-node"/>
>> >
>> >>    <action name="pig-node">
>> >
>> >><pig>
>> >
>> >>            <job-tracker>${jobTracker}</job-tracker>
>> >
>> >>            <name-node>${nameNode}</name-node>
>> >
>> >>            <job-xml>${workflowRoot}/hive-site.xml</job-xml>
>> >
>> >>            <configuration>
>> >
>> >>                <property>
>> >
>> >>                    <name>mapred.job.queue.name</name>
>> >
>> >>                    <value>${queueName}</value>
>> >
>> >>                </property>
>> >
>> >>            </configuration>
>> >
>> >>            <script>${workflowRoot}/pig/simple.pig</script>
>> >
>> >>        </pig>
>> >
>> >>        <ok to="end"/>
>> >
>> >>        <error to="fail"/>
>> >
>> >></action>
>> >
>> >>
>> >
>> >>    <kill name="fail">
>> >
>> >>        <message>Pig action failed, error
>> >
>> >>message[${wf:errorMessage(wf:lastErrorNode())}]</message>
>> >
>> >>    </kill>
>> >
>> >>    <end name="end"/>
>> >
>> >></workflow-app>
>> >
>> >>
>> >
>> >>Thanks
>> >
>> >>Garry
>> >
>> >>
>> >
>> >>
>> >
>> >
>> >
>> >
>> >
>>
>>


RE: Using HCat within a Pig action

Posted by Garry Turkington <g....@improvedigital.com>.
Robert, Mona,

Thanks for the pointers. It was the sharelib config element that I was missing. I did have it set in my job.properties file but I was specifying action.sharelib.for.pig instead of oozie.action.sharelib.for.pig. Oops!
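Spelled out, the corrected job.properties entry is as follows (the sharelib property is confirmed in this thread; oozie.use.system.libpath is an assumption, commonly set alongside it):

```properties
# The oozie. prefix is required; action.sharelib.for.pig alone is ignored
oozie.action.sharelib.for.pig=pig,hcatalog
# Commonly set as well so actions pick up the system sharelib (assumption)
oozie.use.system.libpath=true
```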

I can now use the HCatLoader and HCatStorer within my Pig scripts, which is cool. I'm now wondering though if it's supported to use the 'sql' command within Pig, which calls out to the hcat script. On the CLI one would specify a value for hcat.bin pointing to this script and all is well. But after trying various combinations within my workflow definition it's not apparent how to set things up for this to work in an action. Unfortunately my CDH5 VM doesn't seem to be logging the output of the failing task, just giving the very generic job-killed message.
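One hedged guess at wiring this up, not confirmed anywhere in this thread: since hcat.bin is an ordinary property on the CLI, it might be forwarded through the action's <configuration>. The path below is illustrative:

```xml
<!-- Hypothetical and unverified: forward hcat.bin so Pig's 'sql'
     command can locate the hcat script on the launcher node -->
<property>
    <name>hcat.bin</name>
    <value>/usr/bin/hcat</value>
</property>
```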

Looking at the Pig source code it does seem to be executing hcat in an external process; does this sort of thing play well within an Oozie action?

Thanks
Garry

-----Original Message-----
From: Robert Kanter [mailto:rkanter@cloudera.com] 
Sent: 17 June 2014 05:34
To: user@oozie.apache.org
Subject: Re: Using HCat within a Pig action

I was able to run the HCatalog example, which uses HCatLoader in Pig, with CDH 5.something.
I did have to modify the example to include the "hcatalog" sharelib by adding this to the pig action's <configuration> section:
<property>
     <name>oozie.action.sharelib.for.pig</name>
     <value>pig,hcatalog</value>
</property>
Did you try adding this?

Can you look at the output from the launcher job's map task (the MR job that succeeded)?  It will likely have an error message from Pig in it that should give us an idea of what's wrong.


On Mon, Jun 16, 2014 at 5:33 PM, Garry Turkington < g.turkington@improvedigital.com> wrote:

> Hi Mona,
>
> Thanks for the input. Some more info below.
>
> If I look at my workflow in the Oozie web ui it still looks a bit odd.
> Looking at the task for the Pig node I see
>
> Status: ERROR
> Error code: JA018
> Error message: Main class [org.apache.oozie.action.hadoop.PigMain], 
> exit code [2] External id: <job id> External Status: FAILED/KILLED 
> Clicking on the child job link gives "n/a" as the child job.
> The <job id> listed above shows as successful in the MapReduce logs 
> though.  Am I driving the interface wrong? :)
>
> I looked at that example -- thanks for the pointer -- and on first cut 
> it seems to be showing the same behaviour. I simplified it a bit to 
> remove the coordinator job and just run the script directly --  
> modifying job.properties to have all the right variables and the same result as above.
>
> Re hive-site.xml I tried setting permissions explicitly but no effect.
> Plus as mentioned I can get a Hive action to work so in other contexts 
> the file appears to be read correctly.
>
> I did also notice the readme for the example mentions HCatalog 0.4.1; 
> that is a quite old version. It's obviously hard to compare directly 
> since HCatalog versions are now aligned with Hive but 0.5 came out in 
> 2013 and the version I'm running with is 0.12. This is using CDH5. Has 
> HCatalog support been seen to work since the 0.4.1 release?
>
> Any other thoughts?
>
> Thanks
> Garry
>
>
> -----Original Message-----
> From: Mona Chitnis [mailto:chitnis@yahoo-inc.com.INVALID]
> Sent: 16 June 2014 20:05
> To: user@oozie.apache.org
> Subject: Re: Using HCat within a Pig action
>
> Hello Garry,
>
> Oozie first launches a "launcher" job which then launches the actual
> Pig/Hive job. The 'SUCCEEDED' job that you see on Yarn is this
> launcher job, which succeeded in its job of launching the actual pig
> job, but actual pig job failed. Oozie console will show child jobs
> under tab 'Child Jobs' pointing to the child jobs that the launcher
> spawned, so you can go there directly.
>
> What is your error exactly? If it says 'hive-site.xml Permission
> Denied', I've encountered that error before and for some reason it was
> because the hive-site.xml unix permissions were changed from 600
> (rw-------). This is the hive-site.xml that you are including in your
> oozie workflow application directory. I don't know the root cause of
> it yet but if you keep the permissions as 600 it should work.
>
> If it's not the above error, could you elaborate on it, and copy the
> stack trace here?
>
> Hive updated versions have indeed changed the package hierarchy for 
> HCatLoader and HCatStorer so the changes you made here are correct.
>
> You will find an example (examples/apps/hcatalog) in the examples 
> folder in Oozie and includes a README. I have created JIRA
> (https://issues.apache.org/jira/browse/OOZIE-1881) to append the above 
> info as an FAQ to the HCatalog integration doc as well as a 
> walkthrough of this Pig+HCat example.
>
> Regards,
> --
> Mona
>
> On 6/15/14, 2:47 PM, "Garry Turkington" 
> <g....@improvedigital.com>
> wrote:
>
> >Hi,
> >
> >
> >
> >Digging into this some more I have a little more information about 
> >the problem.
> >
> >
> >
> >The simple Pig script is as below, i.e.:
> >
> >
> >
> >REGISTER
> >/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/pig/piggybank.ja
> >r
> >
> >
> >
> >mydata = LOAD 'testtable_hcat' USING
> >org.apache.hive.hcatalog.pig.HCatLoader();
> >
> >This fails with the following message in the Oozie CLI and UI:
> >
> >0000003-140615021945919-oozie-oozi-W@pig-node
> >    ERROR     job_1402823993415_0004 FAILED/KILLED JA018
> >
> >
> >
> >Though that particular MR job id is marked as successful in the MR 
> >and YARN logs. Which is I think why it's proving difficult to find 
> >any more logging.
> >
> >
> >
> >What does work:
> >
> >* Hive actions within Oozie
> >
> >* Other Pig actions (that don't use HCatalog) within Oozie
> >
> >* This Pig script run from the CLI as either the submitting or yarn 
> >user
> >
> >
> >
> >I did change 2 things: the package name for the HCatLoader, as
> >org.apache.hcatalog.* is now deprecated in favour of
> >org.apache.hive.hcatalog.*, and the /user/yarn directory was not present.
> >But neither made an impact.
> >
> >
> >
> >I think the JA018 -- referred to in oozie-defaults.xml as being due to
> >the output dir already existing -- is actually referring to something
> >else. Possibly a missing library.
> >
> >
> >
> >To run the script from the command line I add the -useHCatalog 
> >argument to Pig which explicitly adds jars to the classpath. Though 
> >many of these would be for the hcat binary etc which I'm not using. 
> >The HCatalog adaptor for Pig though does appear to be in the Oozie sharelib:
> >
> >
> >
> >[cloudera@localhost ~]$ oozie admin -shareliblist hcatalog | grep -i 
> >pig
> >
> >
> >hdfs://localhost.localdomain:8020/user/oozie/share/lib/lib_2014040411
> >28 20/ hcatalog/hive-hcatalog-pig-adapter-0.12.0-cdh5.0.0.jar
> >
> >
> >
> >Any insight in any of the above from anyone? The fact I can't find 
> >any examples of this Oozie/Pig/Hcat combo working isn't filling me 
> >with confidence.
> >
> >
> >
> >One thing that would help -- if Pig is dropping an error log file is 
> >there any way of capturing that/making it available? I tried doing 
> >the equivalent of a "pig -l > <destination>" in the workflow.xml but 
> >that didn't seem to work either.
> >
> >
> >
> >Or any thoughts on when things would be failing in such a way that 
> >the MapReduce job is logged as successful but Oozie sees the action 
> >as failed/killed?
> >
> >
> >
> >Any pointers well received,
> >
> >Garry
> >
> >
> >
> >
> >
> >-----Original Message-----
> >From: Garry Turkington [mailto:g.turkington@improvedigital.com]
> >Sent: 11 June 2014 00:11
> >To: user@oozie.apache.org
> >Subject: RE: Using HCat within a Pig action
> >
> >
> >
> >Mona,
> >
> >
> >
> >Thanks for the response.
> >
> >
> >
> >That doesn't quite look like my problem though; my Hive Oozie actions 
> >are working fine. As are my Pig Oozie actions, but things start 
> >breaking when trying to use HCat from within the Pig action.
> >
> >
> >
> >Are there any additional arguments required -- or configuration 
> >options
> >-- for  a Pig job using HCat? Or any working  examples anywhere?
> >
> >
> >
> >Thanks
> >
> >Garry
> >
> >
> >
> >-----Original Message-----
> >
> >From: Mona Chitnis [mailto:chitnis@yahoo-inc.com.INVALID]
> >
> >Sent: 09 June 2014 19:20
> >
> >To: user@oozie.apache.org<ma...@oozie.apache.org>
> >
> >Subject: Re: Using HCat within a Pig action
> >
> >
> >
> >Looks like some discussion on this problem already 
> >https://groups.google.com/a/cloudera.org/forum/#!topic/hue-user/m8NnJ
> >vz
> >xGA
> >Q
> >
> >
> >
> >On 6/9/14, 6:00 AM, "Garry Turkington"
> ><g.turkington@improvedigital.com<mailto:g.turkington@improvedigital.c
> >om
> >>>
> >
> >wrote:
> >
> >
> >
> >>Hi,
> >
> >>
> >
> >>I've got some Pig scripts that access data via HCat. They run fine 
> >>on
> >
> >>the command line but if I try to get some executed as part of an 
> >>Oozie
> >
> >>action it is failing. Unfortunately with very little detailed error 
> >>messages.
> >
> >>
> >
> >>So before I go into the specifics can I clarify what is needed to 
> >>get
> >
> >>Pig/HCat integration working with Oozie?
> >
> >>
> >
> >>I'm running this on CDH5 and the output of "oozie admin -listsharelib"
> >
> >>includes Pig and Hcatalog. Within my Pig scripts I am referring to 
> >>HCat
> >
> >>tables by  name alone, i.e. no hcat:// URI. The  hive-site.xml that
> >
> >>works for Hive actions is available. I have other non-HCat workflows
> >
> >>running fine, including Pig and Hive actions.
> >
> >>
> >
> >>When I run Pig scripts that use HCatalog from the CLI I need specify
> >
> >>-useHcatalog and have HCAT.BIN defined; should I be passing values 
> >>for
> >
> >>these to the Pig script within <argument> elements in the action
> >
> >>definition? (I've tried both with and without).
> >
> >>
> >
> >>Anything else that is required for this to work? Or pointers to any
> >
> >>documentation with examples/specs for what's needed? I found 
> >>different
> >
> >>parts of the picture spread around but no definitive spec or full
> >
> >>examples.
> >
> >>
> >
> >>I cut my script down to the following; note the commented out second
> >
> >>statement, we don't even get as far as trying to read the (existing 
> >>and
> >
> >>containing data) table:
> >
> >>
> >
> >>REGISTER
> >
> >>/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/pig/piggybank.j
> >>ar
> >
> >>
> >
> >>mydata = LOAD 'testtable_hcat' USING
> >
> >>org.apache.hcatalog.pig.HCatLoader();
> >
> >>-- store mydata into '/tmp/zz.out' using PigStorage();
> >
> >>
> >
> >>I commented out the store because the only error I get includes the
> >
> >>seeming code JA018 and it was suggested on the Google this may be
> >
> >>permission related. Anything I need consider here? The cluster isn't
> >
> >>using any external security provider and only basic authentication:
> >
> >>
> >
> >>job_1402172905909_0054 FAILED/KILLED JA018
> >
> >>
> >
> >>Here's the workflow.xml:
> >
> >><workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
> >
> >>    <start to="pig-node"/>
> >
> >>    <action name="pig-node">
> >
> >><pig>
> >
> >>            <job-tracker>${jobTracker}</job-tracker>
> >
> >>            <name-node>${nameNode}</name-node>
> >
> >>            <job-xml>${workflowRoot}/hive-site.xml</job-xml>
> >
> >>            <configuration>
> >
> >>                <property>
> >
> >>                    <name>mapred.job.queue.name</name>
> >
> >>                    <value>${queueName}</value>
> >
> >>                </property>
> >
> >>            </configuration>
> >
> >>            <script>${workflowRoot}/pig/simple.pig</script>
> >
> >>        </pig>
> >
> >>        <ok to="end"/>
> >
> >>        <error to="fail"/>
> >
> >></action>
> >
> >>
> >
> >>    <kill name="fail">
> >
> >>        <message>Pig action failed, error
> >
> >>message[${wf:errorMessage(wf:lastErrorNode())}]</message>
> >
> >>    </kill>
> >
> >>    <end name="end"/>
> >
> >></workflow-app>
> >
> >>
> >
> >>Thanks
> >
> >>Garry
> >
> >>
> >
> >>
> >
> >
> >
> >
> >
>
>

Re: Using HCat within a Pig action

Posted by Robert Kanter <rk...@cloudera.com>.
I was able to run the HCatalog example, which uses HCatLoader in Pig, with
CDH 5.something.
I did have to modify the example to include the "hcatalog" sharelib by
adding this to the pig action's <configuration> section:
<property>
     <name>oozie.action.sharelib.for.pig</name>
     <value>pig,hcatalog</value>
</property>
Did you try adding this?

Can you look at the output from the launcher job's map task (the MR job
that succeeded)?  It will likely have an error message from Pig in it that
should give us an idea of what's wrong.
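The same launcher output can also be pulled from the command line; a sketch using IDs from earlier in the thread (substitute your own Oozie URL and job IDs, and note yarn logs requires log aggregation to be enabled):

```shell
# Oozie's own log for the workflow run
oozie job -oozie http://localhost:11000/oozie -log 0000003-140615021945919-oozie-oozi-W

# Aggregated YARN logs for the launcher application that "succeeded"
yarn logs -applicationId application_1402823993415_0004
```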


On Mon, Jun 16, 2014 at 5:33 PM, Garry Turkington <
g.turkington@improvedigital.com> wrote:

> Hi Mona,
>
> Thanks for the input. Some more info below.
>
> If I look at my workflow in the Oozie web ui it still looks a bit odd.
> Looking at the task for the Pig node I see
>
> Status: ERROR
> Error code: JA018
> Error message: Main class [org.apache.oozie.action.hadoop.PigMain], exit
> code [2]
> External id: <job id>
> External Status: FAILED/KILLED
> Clicking on the child job link gives "n/a" as the child job.
> The <job id> listed above shows as successful in the MapReduce logs
> though.  Am I driving the interface wrong? :)
>
> I looked at that example -- thanks for the pointer -- and on first cut it
> seems to be showing the same behaviour. I simplified it a bit to remove the
> coordinator job and just run the script directly --  modifying
> job.properties to have all the right variables and the same result as above.
>
> Re hive-site.xml I tried setting permissions explicitly but no effect. Plus
> as mentioned I can get a Hive action to work so in other contexts the file
> appears to be read correctly.
>
> I did also notice the readme for the example mentions HCatalog 0.4.1; that
> is a quite old version. It's obviously hard to compare directly since
> HCatalog versions are now aligned with Hive but 0.5 came out in 2013 and
> the version I'm running with is 0.12. This is using CDH5. Has HCatalog
> support been seen to work since the 0.4.1 release?
>
> Any other thoughts?
>
> Thanks
> Garry
>
>
> -----Original Message-----
> From: Mona Chitnis [mailto:chitnis@yahoo-inc.com.INVALID]
> Sent: 16 June 2014 20:05
> To: user@oozie.apache.org
> Subject: Re: Using HCat within a Pig action
>
> Hello Garry,
>
> Oozie first launches a "launcher" job which then launches the actual
> Pig/Hive job. The 'SUCCEEDED' job that you see on Yarn is this launcher
> job, which succeeded in its job of launching the actual pig job, but actual
> pig job failed. Oozie console will show child jobs under tab 'Child Jobs'
> pointing to the child jobs that the launcher spawned, so you can go there
> directly.
>
> What is your error exactly? If it says 'hive-site.xml Permission Denied',
> I've encountered that error before and for some reason it was because the
> hive-site.xml unix permissions were changed from 600 (rw-------). This is
> the hive-site.xml that you are including in your oozie workflow application
> directory. I don't know the root cause of it yet but if you keep the
> permissions as 600 it should work.
>
> If it's not the above error, could you elaborate on it, and copy the stack
> trace here?
>
> Hive updated versions have indeed changed the package hierarchy for
> HCatLoader and HCatStorer so the changes you made here are correct.
>
> You will find an example (examples/apps/hcatalog) in the examples folder
> in Oozie; it includes a README. I have created JIRA
> (https://issues.apache.org/jira/browse/OOZIE-1881) to append the above
> info as an FAQ to the HCatalog integration doc as well as a walkthrough of
> this Pig+HCat example.
>
> Regards,
> --
> Mona
>
> On 6/15/14, 2:47 PM, "Garry Turkington" <g....@improvedigital.com>
> wrote:
>
> >Hi,
> >
> >
> >
> >Digging into this some more I have a little more information about the
> >problem.
> >
> >
> >
> >The simple Pig script is as below, i.e.:
> >
> >
> >
> >REGISTER
> >/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/pig/piggybank.jar
> >
> >
> >
> >mydata = LOAD 'testtable_hcat' USING
> >org.apache.hive.hcatalog.pig.HCatLoader();
> >
> >This fails with the following message in the Oozie CLI and UI:
> >
> >0000003-140615021945919-oozie-oozi-W@pig-node
> >    ERROR     job_1402823993415_0004 FAILED/KILLEDJA018
> >
> >
> >
> >Though that particular MR job id is marked as successful in the MR and
> >YARN logs. Which is I think why it's proving difficult to find any more
> >logging.
> >
> >
> >
> >What does work:
> >
> >* Hive actions within Oozie
> >
> >* Other Pig actions (that don't use HCatalog) within Oozie
> >
> >* This Pig script run from the CLI as either the submitting or yarn
> >user
> >
> >
> >
> >I did change 2 things; the package name for the HCATLoader as the
> >org.apache.hcatalog.* is now deprecated in favour of
> >org.apache.hive.hcatalog.* and the /user/yarn directory was not present.
> >But neither made an impact.
> >
> >
> >
> >I think the JA018 -- referred to as being due to the output dir already
> >existing  in oozie-defaults.xml is actually referring to something else.
> >Possibly a missing library.
> >
> >
> >
> >To run the script from the command line I add the -useHCatalog argument
> >to Pig which explicitly adds jars to the classpath. Though many of
> >these would be for the hcat binary etc which I'm not using. The
> >HCatalog adaptor for Pig though does appear to be in the Oozie sharelib:
> >
> >
> >
> >[cloudera@localhost ~]$ oozie admin -shareliblist hcatalog | grep -i
> >pig
> >
> >
> >hdfs://localhost.localdomain:8020/user/oozie/share/lib/lib_201404041128
> >20/ hcatalog/hive-hcatalog-pig-adapter-0.12.0-cdh5.0.0.jar
> >
> >
> >
> >Any insight in any of the above from anyone? The fact I can't find any
> >examples of this Oozie/Pig/Hcat combo working isn't filling me with
> >confidence.
> >
> >
> >
> >One thing that would help -- if Pig is dropping an error log file is
> >there any way of capturing that/making it available? I tried doing the
> >equivalent of a "pig -l > <destination>" in the workflow.xml but that
> >didn't seem to work either.
> >
> >
> >
> >Or any thoughts on when things would be failing in such a way that the
> >MapReduce job is logged as successful but Oozie sees the action as
> >failed/killed?
> >
> >
> >
> >Any pointers well received,
> >
> >Garry
> >
> >
> >
> >
> >
> >-----Original Message-----
> >From: Garry Turkington [mailto:g.turkington@improvedigital.com]
> >Sent: 11 June 2014 00:11
> >To: user@oozie.apache.org
> >Subject: RE: Using HCat within a Pig action
> >
> >
> >
> >Mona,
> >
> >
> >
> >Thanks for the response.
> >
> >
> >
> >That doesn't quite look like my problem though; my Hive Oozie actions
> >are working fine. As are my Pig Oozie actions, but things start
> >breaking when trying to use HCat from within the Pig action.
> >
> >
> >
> >Are there any additional arguments required -- or configuration options
> >-- for  a Pig job using HCat? Or any working  examples anywhere?
> >
> >
> >
> >Thanks
> >
> >Garry
> >
> >
> >
> >-----Original Message-----
> >
> >From: Mona Chitnis [mailto:chitnis@yahoo-inc.com.INVALID]
> >
> >Sent: 09 June 2014 19:20
> >
> >To: user@oozie.apache.org<ma...@oozie.apache.org>
> >
> >Subject: Re: Using HCat within a Pig action
> >
> >
> >
> >Looks like some discussion on this problem already
> >https://groups.google.com/a/cloudera.org/forum/#!topic/hue-user/m8NnJvz
> >xGA
> >Q
> >
> >
> >
> >On 6/9/14, 6:00 AM, "Garry Turkington"
> ><g.turkington@improvedigital.com<mailto:g.turkington@improvedigital.com
> >>>
> >
> >wrote:
> >
> >
> >
> >>Hi,
> >
> >>
> >
> >>I've got some Pig scripts that access data via HCat. They run fine on
> >
> >>the command line but if I try to get some executed as part of an Oozie
> >
> >>action it is failing. Unfortunately with very little detailed error
> >>messages.
> >
> >>
> >
> >>So before I go into the specifics can I clarify what is needed to get
> >
> >>Pig/HCat integration working with Oozie?
> >
> >>
> >
> >>I'm running this on CDH5 and the output of "oozie admin -listsharelib"
> >
> >>includes Pig and Hcatalog. Within my Pig scripts I am referring to
> >>HCat
> >
> >>tables by  name alone, i.e. no hcat:// URI. The  hive-site.xml that
> >
> >>works for Hive actions is available. I have other non-HCat workflows
> >
> >>running fine, including Pig and Hive actions.
> >
> >>
> >
> >>When I run Pig scripts that use HCatalog from the CLI I need specify
> >
> >>-useHcatalog and have HCAT.BIN defined; should I be passing values for
> >
> >>these to the Pig script within <argument> elements in the action
> >
> >>definition? (I've tried both with and without).
> >
> >>
> >
> >>Anything else that is required for this to work? Or pointers to any
> >
> >>documentation with examples/specs for what's needed? I found different
> >
> >>parts of the picture spread around but no definitive spec or full
> >
> >>examples.
> >
> >>
> >
> >>I cut my script down to the following; note the commented out second
> >
> >>statement, we don't even get as far as trying to read the (existing
> >>and
> >
> >>containing data) table:
> >
> >>
> >
> >>REGISTER
> >
> >>/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/pig/piggybank.jar
> >
> >>
> >
> >>mydata = LOAD 'testtable_hcat' USING
> >
> >>org.apache.hcatalog.pig.HCatLoader();
> >
> >>-- store mydata into '/tmp/zz.out' using PigStorage();
> >
> >>
> >
> >>I commented out the store because the only error I get includes the
> >
> >>seeming code JA018 and it was suggested on the Google this may be
> >
> >>permission related. Anything I need consider here? The cluster isn't
> >
> >>using any external security provider and only basic authentication:
> >
> >>
> >
> >>job_1402172905909_0054 FAILED/KILLEDJA018
> >
> >>
> >
> >>Here's the workflow.xml:
> >
> >><workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
> >
> >>    <start to="pig-node"/>
> >
> >>    <action name="pig-node">
> >
> >><pig>
> >
> >>            <job-tracker>${jobTracker}</job-tracker>
> >
> >>            <name-node>${nameNode}</name-node>
> >
> >>            <job-xml>${workflowRoot}/hive-site.xml</job-xml>
> >
> >>            <configuration>
> >
> >>                <property>
> >
> >>                    <name>mapred.job.queue.name</name>
> >
> >>                    <value>${queueName}</value>
> >
> >>                </property>
> >
> >>            </configuration>
> >
> >>            <script>${workflowRoot}/pig/simple.pig</script>
> >
> >>        </pig>
> >
> >>        <ok to="end"/>
> >
> >>        <error to="fail"/>
> >
> >></action>
> >
> >>
> >
> >>    <kill name="fail">
> >
> >>        <message>Pig action failed, error
> >
> >>message[${wf:errorMessage(wf:lastErrorNode())}]</message>
> >
> >>    </kill>
> >
> >>    <end name="end"/>
> >
> >></workflow-app>
> >
> >>
> >
> >>Thanks
> >
> >>Garry
> >
> >>
> >
> >>
> >
> >
> >
> >
> >
> >-----
> >
> >No virus found in this message.
> >
> >Checked by AVG - www.avg.com<http://www.avg.com>
> >
> >Version: 2014.0.4570 / Virus Database: 3955/7637 - Release Date:
> >06/07/14
>
>

RE: Using HCat within a Pig action

Posted by Garry Turkington <g....@improvedigital.com>.
Hi Mona,

Thanks for the input. Some more info below.

If I look at my workflow in the Oozie web UI it still looks a bit odd. Looking at the task for the Pig node I see:

Status: ERROR
Error code: JA018
Error message: Main class [org.apache.oozie.action.hadoop.PigMain], exit code [2]
External id: <job id>
External Status: FAILED/KILLED
Clicking on the child job link gives "n/a" as the child job.
The <job id> listed above shows as successful in the MapReduce logs though.  Am I driving the interface wrong? :)
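
As a sketch of one way to chase the launcher logs directly (the job id below is the one from this thread; the commented commands assume the standard YARN and Oozie CLIs and are not verified against this cluster):

```shell
# The MR job id Oozie reports for the action is the launcher; swapping its
# "job_" prefix gives the YARN application id that "yarn logs" expects.
job_id="job_1402823993415_0004"       # id shown against the pig-node action
app_id="application_${job_id#job_}"
echo "$app_id"                        # prints application_1402823993415_0004
# yarn logs -applicationId "$app_id"  # launcher container stdout/stderr,
#                                     # which is where PigMain's output lands
```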

I looked at that example -- thanks for the pointer -- and on first cut it seems to show the same behaviour. I simplified it a bit to remove the coordinator job and just run the script directly, modifying job.properties to have all the right variables, and got the same result as above.

Re hive-site.xml I tried setting the permissions explicitly but no effect. Plus as mentioned I can get a Hive action to work so in other contexts the file appears to be read correctly.

I did also notice the readme for the example mentions HCatalog 0.4.1; that is quite an old version. It's obviously hard to compare directly since HCatalog versions are now aligned with Hive, but 0.5 came out in 2013 and the version I'm running with is 0.12. This is using CDH5. Has HCatalog support been seen to work since the 0.4.1 release?

Any other thoughts?

Thanks
Garry


-----Original Message-----
From: Mona Chitnis [mailto:chitnis@yahoo-inc.com.INVALID] 
Sent: 16 June 2014 20:05
To: user@oozie.apache.org
Subject: Re: Using HCat within a Pig action

Hello Garry,

Oozie first launches a "launcher" job which then launches the actual Pig/Hive job. The 'SUCCEEDED' job that you see on Yarn is this launcher job, which succeeded in its job of launching the actual Pig job, but the actual Pig job failed. Oozie console will show child jobs under tab 'Child Jobs' pointing to the child jobs that the launcher spawned, so you can go there directly.

What is your error exactly? If it says 'hive-site.xml Permission Denied', I've encountered that error before and for some reason it was because the hive-site.xml unix permissions were changed from 600 (rw-------). This is the hive-site.xml that you are including in your oozie workflow application directory. I don't know the root cause of it yet but if you keep the permissions as 600 it should work.

If it's not the above error, could you elaborate on it, and copy the stack trace here?

Hive updated versions have indeed changed the package hierarchy for HCatLoader and HCatStorer so the changes you made here are correct.

You will find an example (examples/apps/hcatalog) in the examples folder in Oozie; it includes a README. I have created JIRA
(https://issues.apache.org/jira/browse/OOZIE-1881) to append the above info as an FAQ to the HCatalog integration doc as well as a walkthrough of this Pig+HCat example.

Regards,
--
Mona

On 6/15/14, 2:47 PM, "Garry Turkington" <g....@improvedigital.com>
wrote:

>Hi,
>
>
>
>Digging into this some more I have a little more information about the 
>problem.
>
>
>
>The simple Pig script is as below, i.e.:
>
>
>
>REGISTER
>/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/pig/piggybank.jar
>
>
>
>mydata = LOAD 'testtable_hcat' USING
>org.apache.hive.hcatalog.pig.HCatLoader();
>
>This fails with the following message in the Oozie CLI and UI:
>
>0000003-140615021945919-oozie-oozi-W@pig-node
>    ERROR     job_1402823993415_0004 FAILED/KILLEDJA018
>
>
>
>Though that particular MR job id is marked as successful in the MR and 
>YARN logs. Which is I think why it's proving difficult to find any more 
>logging.
>
>
>
>What does work:
>
>* Hive actions within Oozie
>
>* Other Pig actions (that don't use HCatalog) within Oozie
>
>* This Pig script run from the CLI as either the submitting or yarn 
>user
>
>
>
>I did change 2 things; the package name for the HCATLoader as the
>org.apache.hcatalog.* is now deprecated in favour of
>org.apache.hive.hcatalog.* and the /user/yarn directory was not present.
>But neither made an impact.
>
>
>
>I think the JA018 -- referred to as being due to the output dir already 
>existing  in oozie-defaults.xml is actually referring to something else.
>Possibly a missing library.
>
>
>
>To run the script from the command line I add the -useHCatalog argument 
>to Pig which explicitly adds jars to the classpath. Though many of 
>these would be for the hcat binary etc which I'm not using. The 
>HCatalog adaptor for Pig though does appear to be in the Oozie sharelib:
>
>
>
>[cloudera@localhost ~]$ oozie admin -shareliblist hcatalog | grep -i 
>pig
>
>        
>hdfs://localhost.localdomain:8020/user/oozie/share/lib/lib_201404041128
>20/ hcatalog/hive-hcatalog-pig-adapter-0.12.0-cdh5.0.0.jar
>
>
>
>Any insight in any of the above from anyone? The fact I can't find any 
>examples of this Oozie/Pig/Hcat combo working isn't filling me with 
>confidence.
>
>
>
>One thing that would help -- if Pig is dropping an error log file is 
>there any way of capturing that/making it available? I tried doing the 
>equivalent of a "pig -l > <destination>" in the workflow.xml but that 
>didn't seem to work either.
>
>
>
>Or any thoughts on when things would be failing in such a way that the 
>MapReduce job is logged as successful but Oozie sees the action as 
>failed/killed?
>
>
>
>Any pointers well received,
>
>Garry
>
>
>
>
>
>-----Original Message-----
>From: Garry Turkington [mailto:g.turkington@improvedigital.com]
>Sent: 11 June 2014 00:11
>To: user@oozie.apache.org
>Subject: RE: Using HCat within a Pig action
>
>
>
>Mona,
>
>
>
>Thanks for the response.
>
>
>
>That doesn't quite look like my problem though; my Hive Oozie actions 
>are working fine. As are my Pig Oozie actions, but things start 
>breaking when trying to use HCat from within the Pig action.
>
>
>
>Are there any additional arguments required -- or configuration options
>-- for  a Pig job using HCat? Or any working  examples anywhere?
>
>
>
>Thanks
>
>Garry
>
>
>
>-----Original Message-----
>
>From: Mona Chitnis [mailto:chitnis@yahoo-inc.com.INVALID]
>
>Sent: 09 June 2014 19:20
>
>To: user@oozie.apache.org<ma...@oozie.apache.org>
>
>Subject: Re: Using HCat within a Pig action
>
>
>
>Looks like some discussion on this problem already 
>https://groups.google.com/a/cloudera.org/forum/#!topic/hue-user/m8NnJvz
>xGA
>Q
>
>
>
>On 6/9/14, 6:00 AM, "Garry Turkington"
><g.turkington@improvedigital.com<mailto:g.turkington@improvedigital.com
>>>
>
>wrote:
>
>
>
>>Hi,
>
>>
>
>>I've got some Pig scripts that access data via HCat. They run fine on
>
>>the command line but if I try to get some executed as part of an Oozie
>
>>action it is failing. Unfortunately with very little detailed error 
>>messages.
>
>>
>
>>So before I go into the specifics can I clarify what is needed to get
>
>>Pig/HCat integration working with Oozie?
>
>>
>
>>I'm running this on CDH5 and the output of "oozie admin -listsharelib"
>
>>includes Pig and Hcatalog. Within my Pig scripts I am referring to 
>>HCat
>
>>tables by  name alone, i.e. no hcat:// URI. The  hive-site.xml that
>
>>works for Hive actions is available. I have other non-HCat workflows
>
>>running fine, including Pig and Hive actions.
>
>>
>
>>When I run Pig scripts that use HCatalog from the CLI I need specify
>
>>-useHcatalog and have HCAT.BIN defined; should I be passing values for
>
>>these to the Pig script within <argument> elements in the action
>
>>definition? (I've tried both with and without).
>
>>
>
>>Anything else that is required for this to work? Or pointers to any
>
>>documentation with examples/specs for what's needed? I found different
>
>>parts of the picture spread around but no definitive spec or full
>
>>examples.
>
>>
>
>>I cut my script down to the following; note the commented out second
>
>>statement, we don't even get as far as trying to read the (existing 
>>and
>
>>containing data) table:
>
>>
>
>>REGISTER
>
>>/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/pig/piggybank.jar
>
>>
>
>>mydata = LOAD 'testtable_hcat' USING
>
>>org.apache.hcatalog.pig.HCatLoader();
>
>>-- store mydata into '/tmp/zz.out' using PigStorage();
>
>>
>
>>I commented out the store because the only error I get includes the
>
>>seeming code JA018 and it was suggested on the Google this may be
>
>>permission related. Anything I need consider here? The cluster isn't
>
>>using any external security provider and only basic authentication:
>
>>
>
>>job_1402172905909_0054 FAILED/KILLEDJA018
>
>>
>
>>Here's the workflow.xml:
>
>><workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
>
>>    <start to="pig-node"/>
>
>>    <action name="pig-node">
>
>><pig>
>
>>            <job-tracker>${jobTracker}</job-tracker>
>
>>            <name-node>${nameNode}</name-node>
>
>>            <job-xml>${workflowRoot}/hive-site.xml</job-xml>
>
>>            <configuration>
>
>>                <property>
>
>>                    <name>mapred.job.queue.name</name>
>
>>                    <value>${queueName}</value>
>
>>                </property>
>
>>            </configuration>
>
>>            <script>${workflowRoot}/pig/simple.pig</script>
>
>>        </pig>
>
>>        <ok to="end"/>
>
>>        <error to="fail"/>
>
>></action>
>
>>
>
>>    <kill name="fail">
>
>>        <message>Pig action failed, error
>
>>message[${wf:errorMessage(wf:lastErrorNode())}]</message>
>
>>    </kill>
>
>>    <end name="end"/>
>
>></workflow-app>
>
>>
>
>>Thanks
>
>>Garry
>
>>
>
>>
>
>
>
>
>


Re: Using HCat within a Pig action

Posted by Mona Chitnis <ch...@yahoo-inc.com.INVALID>.
Hello Garry,

Oozie first launches a "launcher" job which then launches the actual
Pig/Hive job. The 'SUCCEEDED' job that you see on Yarn is this launcher
job, which succeeded in its job of launching the actual Pig job, but the
actual Pig job failed. Oozie console will show child jobs under tab 'Child
Jobs' pointing to the child jobs that the launcher spawned, so you can go
there directly.

What is your error exactly? If it says 'hive-site.xml Permission Denied',
I've encountered that error before and for some reason it was because the
hive-site.xml unix permissions were changed from 600 (rw-------). This is
the hive-site.xml that you are including in your oozie workflow
application directory. I don't know the root cause of it yet but if you
keep the permissions as 600 it should work.
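
A minimal sketch of keeping the staged copy at 600 before uploading (the touch is a stand-in for the real file, and the HDFS destination path is hypothetical):

```shell
# Sketch: force mode 600 on the hive-site.xml staged for the workflow app
# before pushing it to HDFS.
touch hive-site.xml
chmod 600 hive-site.xml
stat -c '%a' hive-site.xml            # prints 600 (GNU stat)
# hdfs dfs -put -f hive-site.xml /user/me/wf-app/hive-site.xml  # hypothetical path
```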

If it's not the above error, could you elaborate on it, and copy the stack
trace here?

Hive updated versions have indeed changed the package hierarchy for
HCatLoader and HCatStorer so the changes you made here are correct.

You will find an example (examples/apps/hcatalog) in the examples folder
in Oozie; it includes a README. I have created JIRA
(https://issues.apache.org/jira/browse/OOZIE-1881) to append the above
info as an FAQ to the HCatalog integration doc as well as a walkthrough of
this Pig+HCat example.

Regards,
--
Mona

On 6/15/14, 2:47 PM, "Garry Turkington" <g....@improvedigital.com>
wrote:

>Hi,
>
>
>
>Digging into this some more I have a little more information about the
>problem.
>
>
>
>The simple Pig script is as below, i.e.:
>
>
>
>REGISTER 
>/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/pig/piggybank.jar
>
>
>
>mydata = LOAD 'testtable_hcat' USING
>org.apache.hive.hcatalog.pig.HCatLoader();
>
>This fails with the following message in the Oozie CLI and UI:
>
>0000003-140615021945919-oozie-oozi-W@pig-node
>    ERROR     job_1402823993415_0004 FAILED/KILLEDJA018
>
>
>
>Though that particular MR job id is marked as successful in the MR and
>YARN logs. Which is I think why it's proving difficult to find any more
>logging.
>
>
>
>What does work:
>
>* Hive actions within Oozie
>
>* Other Pig actions (that don't use HCatalog) within Oozie
>
>* This Pig script run from the CLI as either the submitting or yarn user
>
>
>
>I did change 2 things; the package name for the HCATLoader as the
>org.apache.hcatalog.* is now deprecated in favour of
>org.apache.hive.hcatalog.* and the /user/yarn directory was not present.
>But neither made an impact.
>
>
>
>I think the JA018 -- referred to as being due to the output dir already
>existing  in oozie-defaults.xml is actually referring to something else.
>Possibly a missing library.
>
>
>
>To run the script from the command line I add the -useHCatalog argument
>to Pig which explicitly adds jars to the classpath. Though many of these
>would be for the hcat binary etc which I'm not using. The HCatalog
>adaptor for Pig though does appear to be in the Oozie sharelib:
>
>
>
>[cloudera@localhost ~]$ oozie admin -shareliblist hcatalog | grep -i pig
>
>        
>hdfs://localhost.localdomain:8020/user/oozie/share/lib/lib_20140404112820/
>hcatalog/hive-hcatalog-pig-adapter-0.12.0-cdh5.0.0.jar
>
>
>
>Any insight in any of the above from anyone? The fact I can't find any
>examples of this Oozie/Pig/Hcat combo working isn't filling me with
>confidence.
>
>
>
>One thing that would help -- if Pig is dropping an error log file is
>there any way of capturing that/making it available? I tried doing the
>equivalent of a "pig -l > <destination>" in the workflow.xml but that
>didn't seem to work either.
>
>
>
>Or any thoughts on when things would be failing in such a way that the
>MapReduce job is logged as successful but Oozie sees the action as
>failed/killed?
>
>
>
>Any pointers well received,
>
>Garry
>
>
>
>
>
>-----Original Message-----
>From: Garry Turkington [mailto:g.turkington@improvedigital.com]
>Sent: 11 June 2014 00:11
>To: user@oozie.apache.org
>Subject: RE: Using HCat within a Pig action
>
>
>
>Mona,
>
>
>
>Thanks for the response.
>
>
>
>That doesn't quite look like my problem though; my Hive Oozie actions are
>working fine. As are my Pig Oozie actions, but things start breaking when
>trying to use HCat from within the Pig action.
>
>
>
>Are there any additional arguments required -- or configuration options
>-- for  a Pig job using HCat? Or any working  examples anywhere?
>
>
>
>Thanks
>
>Garry
>
>
>
>-----Original Message-----
>
>From: Mona Chitnis [mailto:chitnis@yahoo-inc.com.INVALID]
>
>Sent: 09 June 2014 19:20
>
>To: user@oozie.apache.org<ma...@oozie.apache.org>
>
>Subject: Re: Using HCat within a Pig action
>
>
>
>Looks like some discussion on this problem already
>https://groups.google.com/a/cloudera.org/forum/#!topic/hue-user/m8NnJvzxGA
>Q
>
>
>
>On 6/9/14, 6:00 AM, "Garry Turkington"
><g....@improvedigital.com>>
>
>wrote:
>
>
>
>>Hi,
>
>>
>
>>I've got some Pig scripts that access data via HCat. They run fine on
>
>>the command line but if I try to get some executed as part of an Oozie
>
>>action it is failing. Unfortunately with very little detailed error
>>messages.
>
>>
>
>>So before I go into the specifics can I clarify what is needed to get
>
>>Pig/HCat integration working with Oozie?
>
>>
>
>>I'm running this on CDH5 and the output of "oozie admin -listsharelib"
>
>>includes Pig and Hcatalog. Within my Pig scripts I am referring to HCat
>
>>tables by  name alone, i.e. no hcat:// URI. The  hive-site.xml that
>
>>works for Hive actions is available. I have other non-HCat workflows
>
>>running fine, including Pig and Hive actions.
>
>>
>
>>When I run Pig scripts that use HCatalog from the CLI I need specify
>
>>-useHcatalog and have HCAT.BIN defined; should I be passing values for
>
>>these to the Pig script within <argument> elements in the action
>
>>definition? (I've tried both with and without).
>
>>
>
>>Anything else that is required for this to work? Or pointers to any
>
>>documentation with examples/specs for what's needed? I found different
>
>>parts of the picture spread around but no definitive spec or full
>
>>examples.
>
>>
>
>>I cut my script down to the following; note the commented out second
>
>>statement, we don't even get as far as trying to read the (existing and
>
>>containing data) table:
>
>>
>
>>REGISTER
>
>>/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/pig/piggybank.jar
>
>>
>
>>mydata = LOAD 'testtable_hcat' USING
>
>>org.apache.hcatalog.pig.HCatLoader();
>
>>-- store mydata into '/tmp/zz.out' using PigStorage();
>
>>
>
>>I commented out the store because the only error I get includes the
>
>>seeming code JA018 and it was suggested on the Google this may be
>
>>permission related. Anything I need consider here? The cluster isn't
>
>>using any external security provider and only basic authentication:
>
>>
>
>>job_1402172905909_0054 FAILED/KILLEDJA018
>
>>
>
>>Here's the workflow.xml:
>
>><workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
>
>>    <start to="pig-node"/>
>
>>    <action name="pig-node">
>
>><pig>
>
>>            <job-tracker>${jobTracker}</job-tracker>
>
>>            <name-node>${nameNode}</name-node>
>
>>            <job-xml>${workflowRoot}/hive-site.xml</job-xml>
>
>>            <configuration>
>
>>                <property>
>
>>                    <name>mapred.job.queue.name</name>
>
>>                    <value>${queueName}</value>
>
>>                </property>
>
>>            </configuration>
>
>>            <script>${workflowRoot}/pig/simple.pig</script>
>
>>        </pig>
>
>>        <ok to="end"/>
>
>>        <error to="fail"/>
>
>></action>
>
>>
>
>>    <kill name="fail">
>
>>        <message>Pig action failed, error
>
>>message[${wf:errorMessage(wf:lastErrorNode())}]</message>
>
>>    </kill>
>
>>    <end name="end"/>
>
>></workflow-app>
>
>>
>
>>Thanks
>
>>Garry
>
>>
>
>>
>
>
>
>
>


RE: Using HCat within a Pig action

Posted by Garry Turkington <g....@improvedigital.com>.
Hi,



Digging into this some more I have a little more information about the problem.



The simple Pig script is as below, i.e.:



REGISTER /opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/pig/piggybank.jar



mydata = LOAD 'testtable_hcat' USING org.apache.hive.hcatalog.pig.HCatLoader();

This fails with the following message in the Oozie CLI and UI:

0000003-140615021945919-oozie-oozi-W@pig-node                                 ERROR     job_1402823993415_0004 FAILED/KILLEDJA018



Though that particular MR job id is marked as successful in the MR and YARN logs. Which is I think why it's proving difficult to find any more logging.



What does work:

* Hive actions within Oozie

* Other Pig actions (that don't use HCatalog) within Oozie

* This Pig script run from the CLI as either the submitting or yarn user



I did change 2 things: the package name for the HCatLoader, since org.apache.hcatalog.* is now deprecated in favour of org.apache.hive.hcatalog.*; and the /user/yarn directory, which was not present. But neither made an impact.



I think the JA018 -- referred to in oozie-defaults.xml as being due to the output dir already existing -- is actually referring to something else. Possibly a missing library.



To run the script from the command line I add the -useHCatalog argument to Pig which explicitly adds jars to the classpath. Though many of these would be for the hcat binary etc which I'm not using. The HCatalog adaptor for Pig though does appear to be in the Oozie sharelib:



[cloudera@localhost ~]$ oozie admin -shareliblist hcatalog | grep -i pig

        hdfs://localhost.localdomain:8020/user/oozie/share/lib/lib_20140404112820/hcatalog/hive-hcatalog-pig-adapter-0.12.0-cdh5.0.0.jar
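
One thing that may be worth checking, stated here as an assumption about the Oozie version in play: on Oozie 4.x the sharelib an action picks up can be overridden per action type, so the Pig action can be told to also load the hcatalog sharelib. A sketch, e.g. in job.properties:

```shell
# Sketch (assumes Oozie 4.x sharelib override support): have the Pig
# action load the hcatalog sharelib in addition to the pig one.
echo 'oozie.action.sharelib.for.pig=pig,hcatalog' >> job.properties
grep 'oozie.action.sharelib.for.pig' job.properties
```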



Any insight in any of the above from anyone? The fact I can't find any examples of this Oozie/Pig/Hcat combo working isn't filling me with confidence.



One thing that would help -- if Pig is dropping an error log file is there any way of capturing that/making it available? I tried doing the equivalent of a "pig -l > <destination>" in the workflow.xml but that didn't seem to work either.



Or any thoughts on when things would be failing in such a way that the MapReduce job is logged as successful but Oozie sees the action as failed/killed?



Any pointers well received,

Garry





-----Original Message-----
From: Garry Turkington [mailto:g.turkington@improvedigital.com]
Sent: 11 June 2014 00:11
To: user@oozie.apache.org
Subject: RE: Using HCat within a Pig action



Mona,



Thanks for the response.



That doesn't quite look like my problem though; my Hive Oozie actions are working fine. As are my Pig Oozie actions, but things start breaking when trying to use HCat from within the Pig action.



Are there any additional arguments required -- or configuration options -- for  a Pig job using HCat? Or any working  examples anywhere?



Thanks

Garry



-----Original Message-----

From: Mona Chitnis [mailto:chitnis@yahoo-inc.com.INVALID]

Sent: 09 June 2014 19:20

To: user@oozie.apache.org<ma...@oozie.apache.org>

Subject: Re: Using HCat within a Pig action



Looks like some discussion on this problem already https://groups.google.com/a/cloudera.org/forum/#!topic/hue-user/m8NnJvzxGAQ



On 6/9/14, 6:00 AM, "Garry Turkington" <g....@improvedigital.com>>

wrote:



>Hi,

>

>I've got some Pig scripts that access data via HCat. They run fine on

>the command line but if I try to get some executed as part of an Oozie

>action it is failing. Unfortunately with very little detailed error messages.

>

>So before I go into the specifics can I clarify what is needed to get

>Pig/HCat integration working with Oozie?

>


RE: Using HCat within a Pig action

Posted by Garry Turkington <g....@improvedigital.com>.
Mona,

Thanks for the response.

That doesn't quite look like my problem, though: my Hive Oozie actions are working fine, as are my Pig Oozie actions, but things start breaking when I try to use HCat from within a Pig action.

Are there any additional arguments or configuration options required for a Pig job using HCat? Or any working examples anywhere?

Thanks
Garry
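
[Editor's note: a configuration that commonly answers exactly this question is Oozie's per-action sharelib override. Below is a minimal sketch of the Pig action with that override added; the property name oozie.action.sharelib.for.pig comes from Oozie's sharelib mechanism, and the rest of the skeleton mirrors the workflow.xml quoted in the original post.]

<action name="pig-node">
    <pig>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <job-xml>${workflowRoot}/hive-site.xml</job-xml>
        <configuration>
            <!-- Ask Oozie to put both the Pig and HCatalog sharelib jars
                 on this action's classpath -->
            <property>
                <name>oozie.action.sharelib.for.pig</name>
                <value>pig,hcatalog</value>
            </property>
            <property>
                <name>mapred.job.queue.name</name>
                <value>${queueName}</value>
            </property>
        </configuration>
        <script>${workflowRoot}/pig/simple.pig</script>
    </pig>
</action>

[With the sharelib jars on the action classpath, the script should in principle not need -useHCatalog passed through <argument> elements, since that CLI flag only exists to put the same jars on the classpath of a locally launched Pig.]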

-----Original Message-----
From: Mona Chitnis [mailto:chitnis@yahoo-inc.com.INVALID] 
Sent: 09 June 2014 19:20
To: user@oozie.apache.org
Subject: Re: Using HCat within a Pig action

Looks like some discussion on this problem already https://groups.google.com/a/cloudera.org/forum/#!topic/hue-user/m8NnJvzxGAQ

On 6/9/14, 6:00 AM, "Garry Turkington" <g....@improvedigital.com>
wrote:

>Hi,
>
>I've got some Pig scripts that access data via HCat. They run fine on 
>the command line but if I try to get some executed as part of an Oozie 
>action it is failing. Unfortunately with very little detailed error messages.
>
>So before I go into the specifics can I clarify what is needed to get 
>Pig/HCat integration working with Oozie?
>
>I'm running this on CDH5 and the output of "oozie admin -listsharelib"
>includes Pig and Hcatalog. Within my Pig scripts I am referring to HCat 
>tables by  name alone, i.e. no hcat:// URI. The  hive-site.xml that 
>works for Hive actions is available. I have other non-HCat workflows  
>running fine, including Pig and Hive actions.
>
>When I run Pig scripts that use HCatalog from the CLI I need to specify 
>-useHcatalog and have HCAT.BIN defined; should I be passing values for 
>these to the Pig script within <argument> elements in the action 
>definition? (I've tried both with and without).
>
>Anything else that is required for this to work? Or pointers to any 
>documentation with examples/specs for what's needed? I found different 
>parts of the picture spread around but no definitive spec or full 
>examples.
>
>I cut my script down to the following; note the commented out second 
>statement, we don't even get as far as trying to read the (existing and 
>containing data) table:
>
>REGISTER
>/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/pig/piggybank.jar
>
>mydata = LOAD 'testtable_hcat' USING 
>org.apache.hcatalog.pig.HCatLoader();
>-- store mydata into '/tmp/zz.out' using PigStorage();
>
>I commented out the store because the only error I get includes what 
>seems to be the code JA018, and it was suggested online that this may be 
>permission related. Anything I need to consider here? The cluster isn't 
>using any external security provider, only basic authentication:
>
>job_1402172905909_0054  FAILED/KILLED  JA018
>
>Here's the workflow.xml:
><workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
>    <start to="pig-node"/>
>    <action name="pig-node">
>        <pig>
>            <job-tracker>${jobTracker}</job-tracker>
>            <name-node>${nameNode}</name-node>
>            <job-xml>${workflowRoot}/hive-site.xml</job-xml>
>            <configuration>
>                <property>
>                    <name>mapred.job.queue.name</name>
>                    <value>${queueName}</value>
>                </property>
>            </configuration>
>            <script>${workflowRoot}/pig/simple.pig</script>
>        </pig>
>        <ok to="end"/>
>        <error to="fail"/>
>    </action>
>
>    <kill name="fail">
>        <message>Pig action failed, error 
>message[${wf:errorMessage(wf:lastErrorNode())}]</message>
>    </kill>
>    <end name="end"/>
></workflow-app>
>
>Thanks
>Garry
>
>



Re: Using HCat within a Pig action

Posted by Mona Chitnis <ch...@yahoo-inc.com.INVALID>.
Looks like some discussion on this problem already
https://groups.google.com/a/cloudera.org/forum/#!topic/hue-user/m8NnJvzxGAQ

On 6/9/14, 6:00 AM, "Garry Turkington" <g....@improvedigital.com>
wrote:

>Hi,
>
>I've got some Pig scripts that access data via HCat. They run fine on the
>command line but if I try to get some executed as part of an Oozie action
>it is failing. Unfortunately with very little detailed error messages.
>
>So before I go into the specifics can I clarify what is needed to get
>Pig/HCat integration working with Oozie?
>
>I'm running this on CDH5 and the output of "oozie admin -listsharelib"
>includes Pig and Hcatalog. Within my Pig scripts I am referring to HCat
>tables by  name alone, i.e. no hcat:// URI. The  hive-site.xml that works
>for Hive actions is available. I have other non-HCat workflows  running
>fine, including Pig and Hive actions.
>
>When I run Pig scripts that use HCatalog from the CLI I need to specify
>-useHcatalog and have HCAT.BIN defined; should I be passing values for
>these to the Pig script within <argument> elements in the action
>definition? (I've tried both with and without).
>
>Anything else that is required for this to work? Or pointers to any
>documentation with examples/specs for what's needed? I found different
>parts of the picture spread around but no definitive spec or full
>examples.
>
>I cut my script down to the following; note the commented out second
>statement, we don't even get as far as trying to read the (existing and
>containing data) table:
>
>REGISTER  
>/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/pig/piggybank.jar
>
>mydata = LOAD 'testtable_hcat' USING org.apache.hcatalog.pig.HCatLoader();
>-- store mydata into '/tmp/zz.out' using PigStorage();
>
>I commented out the store because the only error I get includes what
>seems to be the code JA018, and it was suggested online that this may be
>permission related. Anything I need to consider here? The cluster isn't
>using any external security provider, only basic authentication:
>
>job_1402172905909_0054  FAILED/KILLED  JA018
>
>Here's the workflow.xml:
><workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
>    <start to="pig-node"/>
>    <action name="pig-node">
>        <pig>
>            <job-tracker>${jobTracker}</job-tracker>
>            <name-node>${nameNode}</name-node>
>            <job-xml>${workflowRoot}/hive-site.xml</job-xml>
>            <configuration>
>                <property>
>                    <name>mapred.job.queue.name</name>
>                    <value>${queueName}</value>
>                </property>
>            </configuration>
>            <script>${workflowRoot}/pig/simple.pig</script>
>        </pig>
>        <ok to="end"/>
>        <error to="fail"/>
>    </action>
>
>    <kill name="fail">
>        <message>Pig action failed, error
>message[${wf:errorMessage(wf:lastErrorNode())}]</message>
>    </kill>
>    <end name="end"/>
></workflow-app>
>
>Thanks
>Garry
>
>
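
[Editor's note: a related knob on the submission side is that the Oozie system sharelib is only consulted when the job asks for it. Below is a sketch of a job.properties enabling that; host names and paths are illustrative placeholders, not values from this thread, while oozie.use.system.libpath and oozie.wf.application.path are standard Oozie submission properties.]

nameNode=hdfs://namenode.example.com:8020
jobTracker=jobtracker.example.com:8032
queueName=default
workflowRoot=${nameNode}/user/garry/pig-hcat-demo
# Without this, actions cannot see the Pig/HCatalog jars in the sharelib
oozie.use.system.libpath=true
oozie.wf.application.path=${workflowRoot}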